
Research enables robots to manipulate objects and recognize new places

By Damir Beciri
3 September 2011

Programming a computer to observe events and find commonalities among them is called machine learning. Researchers around the world are working to perfect this process, and a team from the Cornell University Personal Robotics Laboratory is teaching robots to find their way around new environments and to manipulate objects with greater ease.

“We just show the robot some examples and it learns to generalize the placing strategies and applies them to objects that were not seen before,” said lead researcher Ashutosh Saxena, assistant professor of computer science at Cornell University. “It learns about stability and other criteria for good placing for plates and cups, and when it sees a new object – a bowl – it applies them.”

Saxena and colleague Thorsten Joachims, associate professor of computer science, have developed a system that enables a robot to scan its surroundings and identify the objects around it. Pictures from the robot’s 3D camera are stitched together to form a 3D image of an entire room. That image is then divided into segments based on discontinuities and distances between objects.
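
To make that pipeline concrete, here is a minimal Python sketch of the two steps described above: merging per-view point clouds into one room cloud, then splitting that cloud into segments wherever points are separated by a distance gap. The function names, the known camera poses, and the greedy clustering with a fixed radius are illustrative assumptions, not the team’s actual implementation.

    import numpy as np
    from scipy.spatial import cKDTree

    def merge_clouds(clouds, poses):
        # clouds: list of (N_i, 3) point arrays in camera coordinates
        # poses:  list of 4x4 camera-to-world transforms (assumed known,
        #         e.g. from the robot's odometry)
        merged = []
        for pts, T in zip(clouds, poses):
            homo = np.hstack([pts, np.ones((len(pts), 1))])
            merged.append((homo @ T.T)[:, :3])
        return np.vstack(merged)

    def segment_by_distance(points, radius=0.05):
        # Greedy Euclidean clustering: points closer than `radius` fall
        # into the same segment, so segment boundaries appear exactly at
        # depth discontinuities and gaps between objects.
        tree = cKDTree(points)
        labels = -np.ones(len(points), dtype=int)
        label = 0
        for seed in range(len(points)):
            if labels[seed] != -1:
                continue
            stack = [seed]
            labels[seed] = label
            while stack:
                idx = stack.pop()
                for nb in tree.query_ball_point(points[idx], radius):
                    if labels[nb] == -1:
                        labels[nb] = label
                        stack.append(nb)
            label += 1
        return labels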


The researchers trained the robot on 24 office scenes and 28 home scenes in which most of the objects had been labeled. By examining features such as color, texture and what is nearby, the robot’s software works out what characteristics all objects with the same label have in common. In a new environment, it compares each segment of its scan with the objects in its memory and chooses the ones with the best fit.
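
The sketch below continues the example with a deliberately simplified version of that idea: compute a feature vector per segment, average the features of like-labeled training segments, and label each new segment with the best-fitting class. The specific features and the nearest-mean classifier are stand-ins; the actual system learns a considerably richer model.

    import numpy as np

    def segment_features(points, colors):
        # Toy feature vector: mean color, color variation (a crude
        # texture proxy), and the segment's spatial extent.
        return np.concatenate([
            colors.mean(axis=0),
            colors.std(axis=0),
            points.max(axis=0) - points.min(axis=0),
        ])

    def train(labeled_segments):
        # labeled_segments: list of (points, colors, label) tuples.
        # Learn what same-labeled segments have in common by averaging
        # their feature vectors per label.
        feats = {}
        for points, colors, label in labeled_segments:
            feats.setdefault(label, []).append(segment_features(points, colors))
        return {label: np.mean(fs, axis=0) for label, fs in feats.items()}

    def classify(points, colors, class_means):
        # Label a new segment with the class whose mean features fit best.
        f = segment_features(points, colors)
        return min(class_means, key=lambda c: np.linalg.norm(f - class_means[c]))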

In tests, the robot correctly identified objects about 83 percent of the time in home scenes and 88 percent of the time in offices. In a final test, it successfully located a keyboard in an unfamiliar room. Although the keyboard shows up as only a few pixels in the image, the monitor is easy to find, and the robot uses that contextual information to locate the keyboard.
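
One way to read that contextual step, again as an illustrative sketch rather than the paper’s actual formulation: combine each candidate segment’s weak appearance score with a score for how well its position matches a learned spatial relation such as “keyboards sit below monitors”. The offset, scale, and blending weights below are made-up numbers, not learned values.

    import numpy as np

    def context_score(candidate_center, monitor_center,
                      expected_offset=np.array([0.0, 0.0, -0.4]),
                      scale=0.2):
        # How well does the candidate sit where a keyboard usually sits
        # relative to a monitor? (z-up coordinates assumed; offset and
        # scale are illustrative, not learned values)
        d = candidate_center - (monitor_center + expected_offset)
        return float(np.exp(-d @ d / (2 * scale ** 2)))

    def keyboard_score(appearance, candidate_center, monitor_center):
        # Blend weak appearance evidence (the keyboard is only a few
        # pixels) with strong contextual evidence from the monitor.
        return 0.3 * appearance + 0.7 * context_score(candidate_center,
                                                      monitor_center)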

The placing side works similarly: after observing its surroundings with the 3D camera, the robot randomly tests small volumes of space as candidate locations for placement. After training, it placed most objects correctly 98 percent of the time when it had seen the objects and environments previously, and 95 percent of the time when working with new objects in a new environment. In some cases it will also test whether an object can be held upright by vertical supports, and it can learn “preferred locations” from demonstrations, which help it decide, for example, to place a plate flat on a table but upright in a dishwasher.
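
A sketch of that placement loop might look like the following: sample small volumes of space inside the scene, then score each one by whether a flat, well-supported patch of points sits just beneath it. The flatness heuristic and all thresholds here are stand-ins for the stability and preference criteria the robot actually learns.

    import numpy as np

    def sample_placements(cloud, n=200, seed=0):
        # Randomly test small volumes of space inside the scene's
        # bounding box as candidate placement sites.
        rng = np.random.default_rng(seed)
        return rng.uniform(cloud.min(axis=0), cloud.max(axis=0), size=(n, 3))

    def placement_score(site, cloud, support_radius=0.08):
        # Favor sites with a flat, nearly horizontal patch of points
        # underneath them (a crude stand-in for the learned stability
        # and "preferred location" criteria; z-up coordinates assumed).
        near = cloud[np.linalg.norm(cloud[:, :2] - site[:2], axis=1)
                     < support_radius]
        if len(near) < 10:
            return 0.0                    # nothing below to rest on
        return 1.0 / (1.0 + near[:, 2].std())

    def best_placement(cloud, n=200):
        sites = sample_placements(cloud, n)
        return max(sites, key=lambda s: placement_score(s, cloud))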


“The novelty of this work is to learn the contextual relations in 3D,” said Saxena. “For identifying a keyboard, it may be easier to locate the monitors first, because the keyboards are found below the monitors.”

Although its performance could be improved by longer training, the researchers admit that robots still have a long way to go before they learn like humans.

For more information, visit Cornell’s Personal Robotics Laboratory publications page, where you can read various publications related to this research.
