MIT Teaches Soft Robots Body Awareness Through AI And Vision

Researchers from the Massachusetts Institute of Technology's (MIT) CSAIL lab have developed a new system that teaches robots to understand their own bodies using vision alone. Watching itself through consumer-grade cameras, the robot observed its own movements and built an internal model of its geometry and controllability.
According to the researchers, this could dramatically expand what's possible in soft and bio-inspired robotics, enabling affordable, sensor-free machines that adapt to their environments in real time.
The team at MIT said the research is a major step toward more adaptable, accessible robots that can operate in the wild with no GPS, simulations or sensors. The work was published in June in Nature.
Daniela Rus, director of MIT CSAIL, said that with Neural Jacobian Fields, CSAIL's soft robotic hands were able to learn to grasp objects entirely through visual observation, with no sensors, no prior model and no manual programming.
“By watching its own movements through a camera and performing random actions, the robot built an internal model of how its body responds to motor commands. Neural Jacobian Fields mapped these visual inputs to a dense visuomotor Jacobian field, enabling the robot to control its motion in real time based solely on what it sees,” added Rus.
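To make that control loop concrete, here is a minimal sketch, in illustrative Python rather than the CSAIL code, of how a learned visuomotor Jacobian field could drive a robot: given each tracked point's Jacobian (its motion per unit of actuation), solve for the command that best produces a desired motion. The array shapes and the `plan_command` helper are assumptions for illustration.

```python
# Minimal sketch of closed-loop control with a learned visuomotor Jacobian
# field. Each tracked 3D point on the robot has a 3xA Jacobian: how that
# point moves per unit change of each of the A actuation channels.
import numpy as np

def plan_command(jacobians, desired_motions, damping=1e-3):
    """Solve for the command that best produces the desired point motions.

    jacobians:       (P, 3, A) array, one 3xA Jacobian per tracked point
    desired_motions: (P, 3) array, target displacement of each point
    returns:         (A,) command vector (damped least-squares solution)
    """
    # Stack the per-point Jacobians and targets into one linear system J u = dx.
    J = jacobians.reshape(-1, jacobians.shape[-1])   # (3P, A)
    dx = desired_motions.reshape(-1)                  # (3P,)
    # Damped least squares keeps the command small when J is ill-conditioned.
    A = J.T @ J + damping * np.eye(J.shape[-1])
    return np.linalg.solve(A, J.T @ dx)
```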
Rus added that this reframing of control has major implications.
“Traditional methods require detailed models or embedded sensors, but Neural Jacobian Fields lifts those constraints, enabling control of unconventional, deformable, or sensor-less robots in real time, using only a single monocular camera.”
Vincent Sitzmann, an assistant professor in MIT's Department of Electrical Engineering and Computer Science and a CSAIL principal investigator, said the researchers relied on techniques from computer vision and machine learning. The neural network observes a single image and learns to reconstruct a 3D model of the robot, relying on a technique called differentiable rendering that allows machine learning algorithms to reconstruct 3D scenes from 2D images alone.
“We use motion tracking algorithms – point tracking and optical flow – to track the motion of the robot during training,” said Sitzmann. “By relating the motion of the robot to the commands that we instructed it with, we reconstruct our proposed Neural Jacobian Field, which endows the 3D model of the robot with an understanding of how each 3D point would move under a particular robot action.”
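The training signal Sitzmann describes can be pictured as follows: the field predicts how each reconstructed point should move under the command that was actually sent, and that prediction is compared against the motion observed by point tracking or optical flow. The snippet below is a hedged PyTorch sketch of such a loss; the `field` module, its inputs and the tensor shapes are assumptions for illustration, not the paper's implementation.

```python
# Sketch of a training loss that relates observed point motion to commands,
# assuming a hypothetical module `field` that predicts each point's Jacobian
# with respect to the actuation command. Tracked displacements would come
# from off-the-shelf point tracking or optical flow.
import torch

def jacobian_field_loss(field, image_code, points, commands, tracked_motion):
    """
    image_code:     (B, C) per-frame encoding of the camera image
    points:         (B, P, 3) 3D points reconstructed on the robot's body
    commands:       (B, A) actuation command applied between the two frames
    tracked_motion: (B, P, 3) displacement of each point observed by tracking
    """
    J = field(image_code, points)                      # (B, P, 3, A)
    # Linearized prediction: each point moves by its Jacobian times the command.
    predicted = torch.einsum("bpma,ba->bpm", J, commands)
    return torch.nn.functional.mse_loss(predicted, tracked_motion)
```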
Sitzmann says this represents a shift towards robots possessing a form of bodily self-awareness and away from pre-programmed 3D models and precision-engineered hardware. “This moves us towards more generalist sensors, such as vision, combined with artificial intelligence that allows the robot to learn a model of itself instead of a human expert,” said Sitzmann. “This also signals a new class of adaptable, machine-learning driven robots that can perceive and understand themselves.”
The researchers said that three different types of robots acquired awareness of their bodies and the actions they could take as a result of that understanding:
A 3D-printed DIY toy robot arm with loose joints and no sensors learned to draw letters in the air with centimeter-level precision. It discovered which visual region corresponds to each actuation channel, mapping ‘which joint moves when I command actuator X’ just from seeing motion.
A soft pneumatic hand learned which air channel controls each finger, not by being told, but just by watching itself wiggle. The system inferred depth and geometry from color video alone, reconstructing the hand's 3D shape before and after actions.
A soft, wrist-like robot platform, physically disturbed with added weight, learned to balance and follow complex trajectories. The system also quantified motion sensitivity, measuring, for example, how a small change to a single actuator command produces millimeter-level translations of the gripper.
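That kind of sensitivity check amounts to reading off the learned Jacobian one actuator column at a time. The sketch below is illustrative only; the `point_jacobian` input is assumed to come from the learned field for a single tracked point on the gripper.

```python
# Illustrative sensitivity measurement: perturb each actuation channel by a
# small amount and read off how far one tracked gripper point is predicted
# to translate.
import numpy as np

def actuator_sensitivity(point_jacobian, delta=0.01):
    # point_jacobian: (3, A) Jacobian of one gripper point w.r.t. the commands.
    # Column a is the point's motion per unit change of actuator a, so a small
    # change `delta` on that channel moves the point by roughly delta * J[:, a].
    return np.linalg.norm(delta * point_jacobian, axis=0)  # (A,) displacements
```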
Changing soft robotics
The CSAIL researchers said that soft robots are hard to model because they deform in complex ways. One researcher said in an email interview that the method used in the research doesn't require any manual modeling. The robot watches itself move and figures out how its body behaves, much as a human learns to move their arm by watching themselves in a mirror.
Sitzmann says conventional robots are rigid: discrete joints connected by rigid links built to low manufacturing tolerances. “Compare that to your own body, which is soft: first, of course, your skin and muscles are not perfectly solid but give in when you grasp something.”
“However, your joints also aren’t perfectly rigid like those of a robot, they can similarly bend and give in, and while you can sense the approximate position of your joints, your highest-precision sensors are vision and touch, which is how you solve most manipulation tasks,” said Sitzmann. “Soft robots are inspired by these properties of living creatures to be similarly compliant, and must therefore necessarily also rely on different sensors than their rigid cousins.”
Sitzmann says that this kind of understanding could revolutionize industries like soft robotics, low‑cost manufacturing, home automation and agricultural robotics.
“Any sector that can profit from automation but does not require sub-millimeter accuracy can benefit from vision‑based calibration and control, dramatically lowering cost and complexity,” said Sitzmann. “In the future, with inclusion of tactile sensing (touch), this paradigm may even extend to applications that require high accuracy.”
A new approach to soft robotics
Researchers say their approach removes the need for experts to build an accurate model of the robot, a process that can take months. It also eliminates reliance on expensive sensor systems or manual calibration. The simplified process entails recording the robot moving randomly; the model learns everything it needs to know from that video.
“Instead of painstakingly measuring every joint parameter or embedding sensors in every motor, our system heavily relies on a camera to control the robot,” said Sitzmann. “In the future, for applications where sub-millimeter accuracy is not critical, we will see that conventional robots with all their embedded sensors will increasingly be replaced by mass-producible, affordable robots that rely on sensors more similar to our own: vision and touch.”
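As a rough picture of the data-collection step the researchers describe, the sketch below shows how such a self-supervised recording loop might look. The `robot` and `camera` objects are hypothetical placeholders, not the CSAIL hardware interface.

```python
# Minimal sketch of self-supervised data collection: command random actions
# and record what the camera sees before and after each one.
import numpy as np

def record_random_motion(robot, camera, num_steps, num_actuators, seed=0):
    rng = np.random.default_rng(seed)
    frames, commands = [], []
    for _ in range(num_steps):
        u = rng.uniform(-1.0, 1.0, size=num_actuators)  # random actuation
        frames.append(camera.capture())                  # image before the action
        robot.apply(u)                                   # execute the command
        commands.append(u)
    frames.append(camera.capture())                      # final frame
    # Consecutive frame pairs plus the command between them form the dataset.
    return frames, commands
```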