What imitation learning means for the enterprise


Imitation learning is a powerful concept in AI. A type of learning where behaviors are acquired by mimicking a person’s actions, it enables a system to perform a task by creating a “mapping” between observations and actions. As Carnegie Mellon computer scientist Dean Pomerleau demonstrated in 1989, imitation learning can be used in driverless vehicle systems, which can learn to map data from sensors into steering angles and drive autonomously. It can also be used to train off-the-shelf robots, for example to manipulate objects. And it’s been applied to robotic process automation (RPA) to automate digital tasks.

But imitation learning has limitations. For example, it can be difficult to train systems for complex applications or when an expert demonstrator isn’t available. In many cases, it’s not possible to obtain high-quality demonstration data, and systems trained via imitation learning — whether self-driving or object-manipulating — will at best be as good as the best demonstrator.

Still, imitation learning can be a valuable tool in the enterprise as interest grows in AI broadly and in robotics in particular. According to one study (albeit from a robotics vendor), nearly half of retailers will take part in some kind of in-store robotics project in 2022. A separate 2021 survey by Automation World found that between 43.5% and 56.5% of manufacturers and warehouse operators plan to purchase robots within the next year. (Statista estimates that the global market for industrial robots was worth around $43.8 billion in 2021.)

Imitation learning explained

The simplest form of imitation learning is behavioral cloning, where a system learns from an expert (e.g., a human driver) through supervised learning. In supervised learning, the system is trained on input data annotated with the desired output (e.g., an image of a black bear captioned with the text “black bear”) until it picks up on the underlying relationship between inputs and outputs.

Typically, input data for imitation learning comes in the form of a log that captures the demonstrator’s actions. The system outputs a set of rules that reproduce the behavior, associating actions with states in the same way a computer vision system maps captions to images. In the context of an autonomous car, for example, data from sensors like cameras, accelerometers, and radars is mapped to the human driver’s actions, including steering angle, gear shifts, accelerator pushes, and brake pedal pushes.
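To make the state-to-action mapping concrete, here is a minimal behavioral cloning sketch. It assumes a hypothetical linear “expert” steering controller and a synthetic demonstration log; the feature names and numbers are illustrative, not drawn from any real system.

```python
import numpy as np

# Hypothetical setup: the "expert" is a linear steering controller that
# maps a 3-feature state (speed error, lateral offset, heading error)
# to a steering angle. All names and values here are illustrative.
rng = np.random.default_rng(0)
expert_weights = np.array([0.0, -0.5, -1.2])

# Demonstration log: 50 recorded (state, action) pairs from the expert.
states = rng.normal(size=(50, 3))
actions = states @ expert_weights

# Behavioral cloning reduces to supervised regression: fit a policy
# that reproduces the expert's actions from the logged states.
learned_weights, *_ = np.linalg.lstsq(states, actions, rcond=None)

def policy(state):
    """Steering angle predicted by the cloned policy."""
    return float(np.asarray(state) @ learned_weights)

# The cloned policy matches the expert even on a state it never saw,
# because the underlying mapping was recovered from the log.
new_state = np.array([0.1, 0.3, -0.2])
print(abs(policy(new_state) - new_state @ expert_weights) < 1e-9)
```

Real systems replace the linear fit with a deep network and raw sensor inputs, but the structure is the same: logged states in, expert actions out, and a regression loss between them.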

An example of a system trained via imitation learning to drive autonomously in different types of conditions.

Several companies in the autonomous driving space have used imitation learning to develop their products, including Latent Logic, which Alphabet-backed Waymo acquired in 2019. Latent Logic analyzed data collected from traffic cam videos to develop “policies” that simulate realistic models of human behavior on the road, from which autonomous driving systems can learn.

Tesla, too, is reportedly leveraging imitation learning to improve its Autopilot advanced driver assistance system. As The Information reports, the Autopilot team “can examine what traditional human driving looks like in various driving scenarios and mimic it … Tesla’s engineers believe that by putting enough data from good human driving through a neural network, that network can learn how to directly predict the correct steering, braking and acceleration in most situations.”

FortressIQ, an RPA vendor recently acquired by Automation Anywhere, is also employing imitation learning. Like other RPA platforms, FortressIQ’s can mimic the way humans interact with software to perform tasks like logging into applications, entering data, and copying data between applications. But FortressIQ claims to take this a step further by “replicating human behavior through observation,” capturing low-bandwidth movies of software processes and then transcribing the footage into a series of “software interactions.”

In the robotics domain, Google researchers last November detailed a behavioral cloning technique that they say achieves “state-of-the-art” results on human-expert tasks. By having a robotic arm observe and map actions from demonstrations, they “taught” the system to accomplish tasks like sliding a block across a table and inserting it into a slot. Google researchers have also published work showing a robot learning how to walk by mimicking a dog’s movements, and research lab OpenAI has used imitation learning to teach robots to grasp objects.

Imitation learning limitations

While imitation learning has clear applications, it isn’t without shortcomings. Systems trained via imitation learning don’t always generalize well to scenarios absent from the training data, and generalization issues can arise from dataset biases even after many demonstrations.

The backend of FortressIQ’s RPA system.

“Control policies from imitation learning can often fail to generalize to novel environments due to imperfect demonstrations or the inability of imitation learning algorithms to accurately infer the expert’s policies,” Princeton coauthors wrote in a 2020 paper. “This may be due to the expert’s demonstrations not being safe or generalizable, or due to the imitation learning algorithm not accurately inferring the expert’s policy.”

Other unsolved problems in imitation learning include enabling systems to learn effectively from demonstrations performed by multiple experts and removing unwanted “noise” from demonstrations (like accidental bumps of a steering wheel). Imitation learning also assumes that humans can demonstrate the desired task, which isn’t always possible — especially where a robot has a physical advantage.

Emerging methods, tools, and applications promise to address some of the challenges in imitation learning, however. For example, last month, researchers at New York University released VINN, an imitation learning framework that doesn’t require large training datasets. By focusing on representation learning — the process by which systems learn to identify “task-relevant” features in a scene — the researchers were able to efficiently teach a robot how to open a door by looking at similar demonstration images.
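The nearest-neighbor idea behind frameworks like VINN can be sketched in a few lines: embed each demonstration frame into a feature space, then act by copying the action of the closest demonstration. The featurizer below is a stand-in; VINN itself learns visual representations from images, which this simplified illustration omits, and the observations and action names are invented for the example.

```python
import numpy as np

def embed(observation):
    # Stand-in featurizer; a real system would use a learned visual encoder.
    return np.asarray(observation, dtype=float)

# Demonstration set: (observation, action) pairs, e.g. frames from a
# door-opening demonstration. Values here are purely illustrative.
demos = [
    ([0.0, 0.0], "reach"),
    ([0.5, 0.1], "grasp_handle"),
    ([0.9, 0.4], "pull_open"),
]
demo_embeddings = np.array([embed(obs) for obs, _ in demos])

def nearest_neighbor_policy(observation):
    """Return the action of the closest demonstration frame."""
    distances = np.linalg.norm(demo_embeddings - embed(observation), axis=1)
    return demos[int(np.argmin(distances))][1]

print(nearest_neighbor_policy([0.55, 0.15]))  # closest to the grasp frame
```

The appeal of this approach is that there is no policy training at all; once the representation identifies task-relevant features, a handful of demonstrations is enough to act.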

As the coauthors of a 2018 paper published in ACM Computing Surveys write, imitation learning has a long runway ahead of it. For instance, it could be a boon for companies developing self-navigating forklifts and other heavy equipment on factory floors.

“The paradigm of learning by imitation is gaining popularity because it facilitates teaching complex tasks with minimal expert knowledge of the tasks,” the coauthors said. “Generic imitation learning methods could potentially reduce the problem of teaching a task to that of providing demonstrations, without the need for explicit programming or designing reward functions specific to the task. Modern sensors are able to collect and transmit high volumes of data rapidly, and processors with high computational power allow fast processing that maps the sensory data to actions in a timely manner. This opens the door for many potential AI applications that require real-time perception and reaction such as humanoid [and home] robots, self-driving vehicles, human computer interaction, and computer games, to name a few.”

Originally appeared on: TheSpuzz