Learning to grasp is one of the most significant open problems in robotics, as it requires complex interaction with previously unseen objects. Solving this problem would open up many industrial applications, for example in logistics or agriculture, and it is also the first step toward a wide range of more sophisticated object manipulation behaviors.
To operate in nondeterministic environments, the system should be able to adapt to unexpected dynamics (obstacles, misinterpreted situations) and, in some cases, perform pre-manipulation to isolate the targeted object. Standard robotics methods are not flexible enough for this on their own. They must be combined with data-driven approaches, which are very effective at approximating functions in high-dimensional problems.
Recent works have applied modern Reinforcement Learning (RL) algorithms to robotic grasping by having them interact with an experimental environment for months. This approach is data-inefficient and cannot be scaled without drastically increasing the money, time, and energy cost of training. The problem is mostly due to exploration: Deep Reinforcement Learning (Deep-RL) methods, which are performance-driven, often leave the learning process stuck in suboptimal solutions.
To tackle this issue, we propose to leverage the exploration capabilities of evolutionary algorithms, such as Novelty Search (NS) or Quality-Diversity (QD) approaches. With these methods, we could build a repertoire of grasping trajectories that contains many ways to reach a single point in the behavior space. This repertoire could then be used to train a policy network, bypassing the exploration problem.
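As an illustration of the exploration idea only, here is a minimal Novelty Search sketch in Python. It is not the project's implementation: the toy two-link planar arm, the choice of the end-effector position as behavior descriptor, and all names and parameter values (`behavior`, `novelty`, `novelty_search`, `sigma`, `n_archived`) are assumptions made for this example. Individuals are selected for how different their behavior is from what has already been seen, not for performance, and the most novel ones are stored in an archive that plays the role of the repertoire.

```python
import math
import random


def behavior(genome):
    """Toy behavior descriptor (assumption): end-effector (x, y) of a
    two-link planar arm with unit-length links and joint angles `genome`."""
    a, b = genome
    return (math.cos(a) + math.cos(a + b), math.sin(a) + math.sin(a + b))


def novelty(b, others, k=5):
    """Mean Euclidean distance from `b` to its k nearest neighbours in `others`."""
    nearest = sorted(math.dist(b, o) for o in others)[:k]
    return sum(nearest) / len(nearest)


def novelty_search(generations=30, pop_size=20, n_archived=3, sigma=0.2, seed=0):
    """Return an archive of (genome, behavior) pairs found by novelty alone."""
    rng = random.Random(seed)
    # Start from nearly identical individuals so that exploration has to
    # come from the search itself.
    population = [(rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
                  for _ in range(pop_size)]
    archive = []
    for _ in range(generations):
        behaviors = [behavior(g) for g in population]
        # Novelty is measured against the current population and the archive.
        reference = behaviors + [b for _, b in archive]
        scores = []
        for i, b in enumerate(behaviors):
            others = reference[:i] + reference[i + 1:]  # exclude the individual itself
            scores.append(novelty(b, others))
        ranked = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
        # Store the most novel individuals of this generation.
        for i in ranked[:n_archived]:
            archive.append((population[i], behaviors[i]))
        # The more novel half reproduces through Gaussian mutation.
        parents = [population[i] for i in ranked[:pop_size // 2]]
        population = [tuple(x + rng.gauss(0, sigma) for x in rng.choice(parents))
                      for _ in range(pop_size)]
    return archive
```

Calling `novelty_search()` returns a list of genome and behavior pairs. In a grasping setting, the genome would instead encode a trajectory and the behavior descriptor would characterize the resulting grasp; both choices are left open here.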
Additionally, this work aims to rely on expert knowledge in robotics to make the learning problem easier and to cross the reality gap between simulation and real robot manipulation. It would also increase the interpretability of the trained policies, a known weakness of data-driven methods.
This project is therefore at the crossroads of Artificial Intelligence (AI) and robotics.
Credit: Images by Any Lane and Tara Winstead from Pexels, edited by Lou Hacquet-Delepine.
PhD student: Johann HUBER
PhD supervisor: Stéphane DONCIEUX
Research laboratory: ISIR – Institut des Systèmes Intelligents et de Robotique