Design
The project's desired functionality is to organize a tabletop of objects. That is, our system should be able to recognize a number of unique objects in the workspace and move them to their desired positions with a relatively low rate of failure. Ideally, we would want to maximize the number of unique objects our system can handle, but from a more grounded perspective we set the following design criteria:
- Using AR tags, consistently move only one specific object from its current position to the desired position.
- Using AR tags, consistently move two or more objects from their current positions to the desired new positions.
- Without AR tags, consistently move the objects from their current layout to the desired layout via computer vision techniques.
To meet our design criteria, our project was modularized into two parts: object recognition and localization & robot manipulation and grasping.
For object recognition and localization, we chose to use deep learning. Convolutional neural networks perform very well at computer vision tasks such as image classification and object detection, and many state-of-the-art algorithms are available with open-source implementations. We therefore chose Faster R-CNN, one of the state-of-the-art detection algorithms in this field, which relies on region proposal networks to hypothesize object locations.
To implement the robotic manipulation and grasping, we used the Baxter robot from Rethink Robotics. Baxter has two arms, each with 7 degrees of freedom. The left arm was used to visualize our workspace with a built-in camera, and the right arm was equipped with a gripper to grasp and move our objects. Our workspace consisted of colored wooden blocks on a table with a white backdrop.
In terms of object recognition and localization, we first decided to use easily graspable wooden blocks with AR tags to help us implement preliminary control of Baxter's kinematics. The blocks sit on the table, essentially constraining our workspace to two dimensions, and several freely available packages and libraries helped us localize, identify, grasp, and move the objects. However, this setup has two limitations. First, while easily graspable objects are convenient, we cannot ensure that our grasping method will work for many real-world objects. Second, while AR tags were handy to get started, they are not meaningful to humans, and attaching AR tags to real objects is impractical.
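The planar constraint is what makes grasping tractable: a top-down grasp reduces to a position and yaw on the table plane plus a fixed approach height. A minimal sketch (our assumption, not Baxter-specific SDK code; the heights and angle convention are made up):

```python
# Sketch of planar pick waypoints under the 2-D workspace assumption.
# TABLE_Z and HOVER_OFFSET are hypothetical values in the robot base frame.
import math

TABLE_Z = -0.14      # table height (meters), hypothetical
HOVER_OFFSET = 0.10  # approach from 10 cm above the block

def planar_grasp_waypoints(x, y, yaw):
    """Return hover and grasp poses for a block at (x, y) rotated by yaw.

    Each pose is (x, y, z, roll, pitch, yaw); roll/pitch are fixed so the
    gripper points straight down, and only (x, y, yaw) vary per object.
    """
    hover = (x, y, TABLE_Z + HOVER_OFFSET, 0.0, math.pi, yaw)
    grasp = (x, y, TABLE_Z, 0.0, math.pi, yaw)
    return [hover, grasp]

waypoints = planar_grasp_waypoints(0.60, -0.20, math.radians(45))
```

An inverse-kinematics solver would then turn each pose into joint angles for the arm.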
To reorganize the objects without AR tags, we combine object recognition with robotic manipulation. When the Baxter robot receives an image of the table layout, it uses the camera calibration parameters to rectify the image. After rectification, the detection network predicts the coordinates of the different objects. Knowing the object locations, the robot can manipulate the objects and reorganize the table back to the original layout.
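Because the workspace is a plane, the link between the two modules can be sketched as a single 3x3 homography that maps rectified pixel coordinates to table coordinates. The matrix values below are made up for illustration; in practice it would come from a one-time calibration against known points on the table:

```python
# Hedged sketch: pixel-to-table-plane mapping via a homography.
# H is a hypothetical calibration result, not our measured matrix.
import numpy as np

H = np.array([[0.001,  0.0,   -0.32],
              [0.0,   -0.001,  0.85],
              [0.0,    0.0,    1.0 ]])

def pixel_to_table(u, v):
    """Project a rectified pixel (u, v) onto table-plane coordinates (x, y)."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Center of a detected bounding box -> workspace coordinates for the gripper.
x, y = pixel_to_table(320, 240)
```

The perspective divide by `p[2]` is a no-op for this particular matrix but is needed for a general homography.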
In a real-world application, Baxter would not be the best choice. Baxter is intended mainly for education and designed with safety in mind, so its joint velocity and precision are limited, making it less efficient and precise in an actual engineering application. However, a system that can recognize distinct objects and move them to desired locations in 2-D space would be valuable for assembly-line production or even home use. Our project sets the foundation on which more advanced and generalized methods can be implemented later.