Donut algorithm allows robots to learn from our mistakes

By Damir Beciri

26 May 2011

Instead of being treated as useless mistakes, failed demonstrations can provide great insights into better learning, claim scientists from EPFL’s Learning Algorithms and Systems Laboratory (LASA). Their unusual point of view has led to the development of novel algorithms that enable machines to learn more rapidly and to outperform humans by starting from failed or inaccurate demonstrations.

“We inversed the principle, generally accepted in robotics, of acquisition by imitation, and considered cases in which humans are inaccurate in certain tasks”, said professor Aude Billard, head of LASA.

The robot’s instructor demonstrates failed attempts at the desired action, and the robot iterates over possible solutions until it finds one that works. The research uses ideas from Reinforcement Learning (RL) to deal with noisy demonstrations and to improve the robot’s performance beyond that of the human. However, unlike traditional RL, where the human has successfully completed the desired task, this method assumes that the humans have failed and uses their demonstrations as a negative constraint on exploration.

Postdoctoral researcher Dan Grollman based his work on what he calls the “Donut as I do” theory. He developed an algorithm that tells the robot not to reproduce a demonstrator’s inaccurate gesture, but rather to search for alternative solutions. Hence the play on words with “do not”: the donut’s hole is the incorrect gesture, which must be excluded, and the surrounding dough represents the field of potential solutions to explore.
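The donut picture can be illustrated with a small toy sketch. This is not the authors’ implementation: it simply assumes a one-dimensional task parameter and scores candidates with a wide bump minus a narrow bump centred on the failed demonstration, so the failed value itself scores zero (the hole) and nearby values score highest (the dough). The names `donut_score`, `outer_std`, `inner_std` and the value of `failed_demo` are hypothetical.

```python
import math

def gaussian(x, mean, std):
    """Unnormalized Gaussian bump with peak value 1 at x == mean."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

def donut_score(x, failed, outer_std=2.0, inner_std=0.5):
    """Toy 'donut' preference (an assumption, not the published model):
    a wide bump minus a narrow bump, both centred on the failed attempt.
    The score is zero at the failed value and fades to zero far away."""
    return gaussian(x, failed, outer_std) - gaussian(x, failed, inner_std)

# Hypothetical failed demonstration, e.g. a release velocity that missed.
failed_demo = 1.0

# Candidate parameters on a grid around the failed attempt.
candidates = [failed_demo + 0.1 * k for k in range(-40, 41)]

# The robot prefers candidates in the ring around the failure, not the failure itself.
best = max(candidates, key=lambda x: donut_score(x, failed_demo))
print("failed:", failed_demo, "suggested next attempt:", round(best, 2))
```

In this sketch the suggested attempt sits a moderate distance from the failed one: close enough to reuse what the demonstration got roughly right, but outside the excluded hole.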

“This approach allows the robot to go further, to learn more quickly and above all, outperform the human”, said Grollman, who was recently awarded a “Best Paper Award” for an article on the subject presented at the International Conference on Robotics and Automation (ICRA), in Shanghai.

“We were inspired by the way in which humans learn”, said Billard. “Children often progress by making mistakes or by observing others’ mistakes and assimilating the fact that they must not reproduce them.”

The researchers are looking for ways to improve Donut’s performance. One approach is to use sampling in an attempt to find the global maximum instead of a local one. However, each additional sample would require its own gradient ascent, which is computationally costly and could lead to potentially unsafe velocities and torques.
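The trade-off between a single gradient ascent and sampling several starting points can be sketched as follows. This is a generic multi-start illustration under stated assumptions, not the Donut algorithm itself: the two-peak `objective`, the step size, the iteration count and the starting points are all made up for the example.

```python
import math

def objective(x):
    """Hypothetical objective with a local peak near x = -1
    and a higher, global peak near x = 2."""
    return math.exp(-(x + 1.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)

def gradient(x, eps=1e-5):
    """Central finite-difference estimate of the derivative."""
    return (objective(x + eps) - objective(x - eps)) / (2.0 * eps)

def gradient_ascent(x0, step=0.1, iters=200):
    """Plain gradient ascent from one starting point; finds the nearest peak."""
    x = x0
    for _ in range(iters):
        x += step * gradient(x)
    return x

def multi_start(starts):
    """Run one full gradient ascent per sample and keep the best result.
    The cost grows linearly with the number of samples."""
    results = [gradient_ascent(x0) for x0 in starts]
    return max(results, key=objective)

single = gradient_ascent(-1.5)               # gets stuck on the local peak
best = multi_start([-1.5, 0.0, 1.5, 3.0])    # some samples reach the global peak
print("single start:", round(single, 2), "multi-start:", round(best, 2))
```

Four starting points mean four complete ascents here, which is the computational cost the researchers mention. On a real robot each sampled start would also have to be checked against velocity and torque limits, which this sketch does not model.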

For more information, read the paper “Donut as I do: Learning from failed demonstrations” (PDF).

This entry was posted on Thursday, May 26th, 2011 at 4:40PM and filed under Robotics.

Tags: algorithm, donut, epfl, programming through demonstration, Reinforcement Learning, robotics research
