
Common Software Platform for Learning-based Robots


OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software; Trusted AI and Autonomy


OBJECTIVE: Develop an open-source common software platform that is independent of robotics hardware and can be widely used to incorporate artificial intelligence (AI) skills, such as perception and manipulation, into learning-based robots, in order to shorten the transfer of robotics research into application products.


DESCRIPTION: In recent years, significant advances in AI have been made, from image recognition to generation, from large-scale language models to dialogue, and from locomotion to diverse manipulation. A key feature of this advancement has been the rapid transfer of technology: the transition time from fundamental research to deployment is unusually short, typically only a couple of months. Examples include Jasper AI (fast content generation), Stability AI (image generation), Photoshop, Hugging Face (natural language understanding), and others. Interestingly, while advances in computer vision (CV) and natural language processing (NLP) have shown this rapid deployment, AI research in robotics has seen little transition to application products, and only within the very narrow segment of table-top grasping. Most other robotic products, whether in the defense, consumer, service, or industry domain, still rely on classical control-theoretic and optimization-based approaches and have difficulty with machine learning (ML) and generalization. In learning-based control, assessing safety and performance limits is still challenging. A common software platform will enable expedited research on these issues.

The absence of a common software platform has created an increasing gap between robot learning research and deployment. One of the key reasons is the lack of infrastructure and software platforms for reproducibility and fast transfer of robot learning technology. Robotics hardware varies widely in its capabilities across tasks, so robot learning, unlike CV or NLP, is not independent of hardware variability. As a result, it has become standard practice for robotics companies to build the full stack from hardware to software. This not only lengthens the development cycle, but also forces most robotics companies to develop their own AI infrastructure and expertise, which makes it difficult to keep up with cutting-edge advances in research.


The above issue has created a unique opportunity. ONR is seeking development of a common software platform for the rapid technology transfer of data-driven robotic algorithms. Such a software platform would go beyond current platforms such as the Robot Operating System (ROS) framework, which focuses on resource scheduling and communication rather than AI capabilities, and Mission Oriented Operating Suite Interval Programming (MOOS-IvP), which offers similar capabilities. The proposed software platform would build a mid-level AI layer with state-of-the-art perception, locomotion, and manipulation skills. The goal is to abstract low-level robotic skills so that developers do not need robotics expertise and can focus on creative applications of the robots. Ideally, this platform could be shared by different robotics companies, allowing them to focus on their application vertical with faster iteration cycles while easily incorporating the latest algorithmic developments in robot learning. Selected references related to certain skills such as manipulation and locomotion are included below.
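As a rough illustration of the hardware-independent mid-level layer described above, the following Python sketch separates a skill from the robot it runs on. Every name in it (RobotBackend, Skill, MoveToPose, SimulatedArm, run_skill) is hypothetical and chosen only for this example; the topic does not prescribe an interface, and the toy proportional rule stands in for a learned policy.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Observation:
    """Hardware-neutral sensor snapshot (hypothetical schema)."""
    joint_positions: list[float]


class RobotBackend(ABC):
    """Hypothetical hardware adapter: one subclass per robot model."""

    @abstractmethod
    def observe(self) -> Observation:
        ...

    @abstractmethod
    def apply(self, joint_targets: list[float]) -> None:
        ...


class Skill(ABC):
    """Hypothetical mid-level skill that never touches hardware directly."""

    @abstractmethod
    def step(self, obs: Observation) -> list[float]:
        ...


class MoveToPose(Skill):
    """Toy stand-in for a learned policy: move a fraction of the way to a target."""

    def __init__(self, target: list[float], gain: float = 0.5) -> None:
        self.target = target
        self.gain = gain

    def step(self, obs: Observation) -> list[float]:
        return [q + self.gain * (t - q)
                for q, t in zip(obs.joint_positions, self.target)]


class SimulatedArm(RobotBackend):
    """In-memory fake robot that reaches commanded joint targets instantly."""

    def __init__(self, dof: int) -> None:
        self.joints = [0.0] * dof

    def observe(self) -> Observation:
        return Observation(list(self.joints))

    def apply(self, joint_targets: list[float]) -> None:
        self.joints = list(joint_targets)


def run_skill(skill: Skill, robot: RobotBackend, steps: int) -> Observation:
    """Run the observe/act loop; the skill code is identical for any backend."""
    for _ in range(steps):
        robot.apply(skill.step(robot.observe()))
    return robot.observe()
```

Under these assumptions, supporting a new robot means writing one more RobotBackend subclass, while MoveToPose and run_skill stay unchanged.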


PHASE I: Design and demonstrate the feasibility of a shared platform for efficient transfer and implementation of data-driven robotic algorithms. Validate the platform's ability to meet key parameters on custom reference hardware, which is to be scaled to multiple platforms in Phase II. The key parameters to be met in Phase I are: 90% success rate on simple terrain locomotion, 80% success rate on complex rough terrain locomotion, 85% success rate for point-to-point navigation for both legged and wheeled robots, and 70% grasping rate on a selection of at least 50 everyday objects. Produce detailed design specifications and capabilities descriptions for Phase II prototype development.
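The Phase I key parameters can be read as per-task pass/fail thresholds on measured success rates. The Python sketch below encodes them that way. The task keys, the trial-log format (a list of booleans per task), and the function names are assumptions made for illustration, and the sketch does not check the additional conditions on robot type (legged and wheeled) or object count.

```python
# Phase I thresholds taken from the text above; the task keys are hypothetical.
PHASE_I_TARGETS = {
    "simple_terrain_locomotion": 0.90,
    "rough_terrain_locomotion": 0.80,
    "point_to_point_navigation": 0.85,
    "grasping": 0.70,
}


def success_rate(outcomes):
    """Fraction of successful trials in a list of booleans."""
    return sum(outcomes) / len(outcomes)


def evaluate(trials, targets):
    """Map each task to (measured rate, whether the target is met).

    A task with no recorded trials is reported as rate 0.0 and not met.
    """
    report = {}
    for task, target in targets.items():
        outcomes = trials.get(task, [])
        rate = success_rate(outcomes) if outcomes else 0.0
        report[task] = (rate, rate >= target)
    return report


if __name__ == "__main__":
    # Made-up trial logs, for demonstration only.
    demo = {
        "simple_terrain_locomotion": [True] * 19 + [False],
        "rough_terrain_locomotion": [True] * 9 + [False],
        "point_to_point_navigation": [True] * 18 + [False] * 2,
        "grasping": [True] * 3 + [False] * 2,
    }
    for task, (rate, met) in evaluate(demo, PHASE_I_TARGETS).items():
        print(f"{task}: {rate:.2f} {'met' if met else 'not met'}")
```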


PHASE II: Develop and deliver a deployable prototype of the platform, including perception and action capabilities such as locomotion, navigation, and targeted class-conditioned manipulation. Validate the prototype's ability to run on multiple hardware configurations, such as the Franka robotic arm, UR5 arm, and X-arm, as well as legged and wheeled robots with arms, in home and warehouse settings. The key parameters to be met at this stage are: operation on a total of at least 5 different commercial hardware platforms across tasks, more than 98% success rate on simple terrain locomotion, more than 95% success rate on rough terrain locomotion, 95% accuracy of point-to-point navigation, and 75% grasping rate on a selection of at least 100 everyday objects. In parallel, produce a detailed Phase III plan for partnering on commercial as well as DoD applications.
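Phase II adds a platform-count requirement and phrases two of its thresholds as "more than". One possible reading is sketched below in Python. It assumes that "more than" means a strict inequality, that the navigation and grasping figures are minimums, and that each platform only needs to cover the tasks it physically supports. The data layout and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    threshold: float
    strict: bool  # True for "more than" (assumed >), False for "at least" (>=)

    def met(self, rate: float) -> bool:
        return rate > self.threshold if self.strict else rate >= self.threshold


# Thresholds from the Phase II text; keys and strictness flags are assumptions.
PHASE_II_TARGETS = {
    "simple_terrain_locomotion": Target(0.98, strict=True),
    "rough_terrain_locomotion": Target(0.95, strict=True),
    "point_to_point_navigation": Target(0.95, strict=False),
    "grasping": Target(0.75, strict=False),
}
MIN_PLATFORMS = 5


def phase_ii_met(rates_by_platform: dict) -> bool:
    """rates_by_platform maps platform name -> {task: measured success rate}.

    Passes only if at least MIN_PLATFORMS platforms were tested, every
    reported rate meets its target, and every task is covered somewhere.
    """
    if len(rates_by_platform) < MIN_PLATFORMS:
        return False
    covered = set()
    for rates in rates_by_platform.values():
        for task, rate in rates.items():
            if not PHASE_II_TARGETS[task].met(rate):
                return False
            covered.add(task)
    return covered == set(PHASE_II_TARGETS)
```

With this reading, a measured rate of exactly 0.98 on simple terrain would not pass, whereas exactly 0.95 on navigation would.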


PHASE III DUAL USE APPLICATIONS: Perform additional experiments in a variety of situations and environments. Begin testing with external partners.


This technology could be used in commercial sectors such as medical robotics, warehousing, and delivery to develop versatile robots capable of performing maintenance, providing services in homes or workplaces, and carrying out other tasks.



REFERENCES:
  3. A. Agarwal, et al. Legged Locomotion in Challenging Terrains Using Egocentric Vision. CoRL 2022.
  4. Z. Cao, et al. Reconstructing Hand-Object Interactions in the Wild. ICCV 2021.
  5. T. Chen, et al. Visual Dexterity: In-Hand Dexterous Manipulation from Depth. ICML Workshop 2023.
  6. S. Pandian, et al. Dexterous Imitation Made Easy: A Learning-Based Framework for Efficient Dexterous Manipulation. ICRA 2023.
  7. L. Pinto and A. Gupta. Supersizing Self-Supervision: Learning to Grasp from 50K Tries and 700 Robot Hours. ICRA 2016.
  8. A. Simeonov, et al. Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation. ICRA 2022.


KEYWORDS: Software platform, robot manipulation, robot perception, robot Artificial Intelligence skills, learning robots

