To operate in household environments, robots must be able to manipulate a large variety of objects and appliances, such as stoves, coffee dispensers, and juice extractors. Consider the espresso machine above: even without having seen this particular machine before, a person can prepare a latte by visually observing the machine and reading its instruction manual. This is possible because humans have vast prior experience manipulating differently-shaped objects. In this project, our goal is to enable robots to generalize in the same way to different objects and tasks.
In this project, we use crowd-sourcing to build a large collection of manipulation demonstrations for robots. Help improve our model by teaching the robot yourself: you can help our Robobarista learn about different objects!
There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging to program a robot for each of these object types and for each of their instantiations.
In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts.
We formulate manipulation planning as a structured prediction problem and design a deep learning model that handles the large noise present in crowd-sourced manipulation demonstrations and learns features from three different modalities: point-clouds, language, and trajectories.
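To make the idea of learning from three modalities concrete, here is a minimal sketch of how such a model could score candidate trajectories for a given task: each modality is projected into a shared embedding space, and the trajectory whose embedding best matches the combined (point-cloud, language) task embedding is selected. All weights, feature dimensions, and function names below are hypothetical stand-ins for a learned network, not the actual architecture used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random weights standing in for learned network parameters;
# feature dimensions (64, 50, 20) are arbitrary choices for illustration.
W_pc = rng.standard_normal((32, 64))    # point-cloud features -> shared space
W_lang = rng.standard_normal((32, 50))  # language features    -> shared space
W_traj = rng.standard_normal((32, 20))  # trajectory features  -> shared space

def embed(W, x):
    """Project a feature vector into the shared space (a single linear
    layer with a tanh nonlinearity as a stand-in for a deeper network),
    then normalize to unit length."""
    h = np.tanh(W @ x)
    return h / np.linalg.norm(h)

def score(pc, lang, traj):
    """Cosine-style similarity between the combined (point-cloud,
    language) task embedding and a candidate trajectory embedding."""
    task = embed(W_pc, pc) + embed(W_lang, lang)
    task = task / np.linalg.norm(task)
    return float(task @ embed(W_traj, traj))

# Pick the best trajectory from a small library of candidates.
pc = rng.standard_normal(64)
lang = rng.standard_normal(50)
library = [rng.standard_normal(20) for _ in range(5)]
best = max(range(len(library)), key=lambda i: score(pc, lang, library[i]))
```

In this sketch, transfer across objects comes for free: any demonstrated trajectory in the library can be scored against a new object's point-cloud and instruction, which mirrors the intuition that similarly-operated parts admit similar manipulation trajectories.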
When the PR2 robot stands in front of an object it has never seen before, it is given a natural language instruction (a manual) and a segmented point-cloud of the object. Using our algorithm, the robot was even able to make a latte.