
Deep Reinforcement Learning for Collaborative Multi-Robot Systems with Low-Latency Wireless Networking

Award Information
Agency: Department of Defense
Branch: Navy
Contract: N68335-23-C-0708
Agency Tracking Number: N23B-T031-0016
Amount: $139,942.00
Phase: Phase I
Program: STTR
Solicitation Topic Code: N23B-T031
Solicitation Number: 23.B
Solicitation Year: 2023
Award Year: 2023
Award Start Date (Proposal Award Date): 2023-09-13
Award End Date (Contract End Date): 2024-03-11
Small Business Information
10041 Wild Orchid Way
Elk Grove, CA 95757-4345
United States
DUNS: 084613536
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: Yes
Principal Investigator
 Amitav Mukherjee
 (408) 687-8288
Business Contact
 Amitav Mukherjee
Phone: (408) 687-8288
Research Institution
 University of Massachusetts Boston
 Shala Bonyun
100 Morrissey Boulevard
Boston, MA 02125-3393
United States

 (617) 287-5592
 Nonprofit College or University

In this Phase I effort, Tiami, LLC, aims to develop and demonstrate a hardware proof of concept for a collaborative multi-robot system (MRS) that leverages imitative augmented deep reinforcement learning (IADRL) among heterogeneous uncrewed systems (robots) to achieve a common task. Collaboration is based on low-latency machine-to-machine wireless links between robots that use both RF and optical wireless communications (OWC) for multimode resiliency. To accomplish this task, the Phase I personnel include a principal engineer from Tiami, LLC, with 12 years of experience in RF systems design and 124+ patents in low-latency 4G/5G wireless systems, and a subaward to the University of Massachusetts Boston to leverage Dr. Annavajjala’s and Prof. Michael Rahaim’s expertise in optical wireless communications research and their lab facilities for rapid software-defined radio (SDR) prototyping. A letter of support is provided from Lockheed Martin Missile Fire Control and Lockheed Martin Space, an anticipated Phase II/Phase III subcontractor to transition our technology to military and commercial users. The system objective is to track the desired target of interest while the adversary MRS attempts to disrupt the ally MRS cyber topology (i.e., communications and intelligence). Strategic motion planning will be implemented with the IADRL methodology, while real-time mission updates are intelligently distributed among nodes using a combination of RF and OWC. For strategic motion planning, our MRS will use a single node as a mission leader to aggregate information, implement the IADRL model, and distribute motion plans. The remaining agents continuously transmit their position and motion state (longitude, latitude, altitude/height, velocity, roll, pitch, yaw/heading, angular rates, acceleration) and status to the mission leader node.
Each node acquires absolute positioning, navigation, and timing (PNT) information from an alternative signal of opportunity, such as commercial LEO systems (e.g., Starlink), if GPS is denied. The system will adapt to network segmentation (or any loss of a mission leader) by dynamic reselection of the mission leader(s). For the communications component of our solution, individual nodes will actively decide whether to send information over the RF network, the OWC network, or some combination of the two for each round of sensor data aggregation. The M3P network utilizes post-quantum cryptography (e.g., CRYSTALS-DILITHIUM) for security. Traffic routing decisions will be based on channel state information for RF/OWC links and statistical characterization of each network’s reliability (i.e., likelihood of connecting to the mission leader in a desired timeframe).
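
The per-round choice between RF and OWC described above can be illustrated with a minimal sketch. It is not the awardee's algorithm. Every name in it (LinkEstimate, choose_links, snr_db, p_on_time, target, min_snr_db) is hypothetical. It assumes each link reports a single SNR value as its channel state, that each link has an estimated probability of reaching the mission leader within the deadline, and that RF and OWC failures are independent.

```python
from dataclasses import dataclass


@dataclass
class LinkEstimate:
    """Hypothetical per-link summary a node keeps for one aggregation round."""
    name: str         # e.g. "RF" or "OWC"
    snr_db: float     # assumed stand-in for channel state information
    p_on_time: float  # estimated probability of reaching the leader in time


def choose_links(links, target=0.99, min_snr_db=0.0):
    """Pick the smallest set of links whose combined on-time probability
    meets `target`, assuming link failures are independent.

    Links below `min_snr_db` are treated as unusable. If the target cannot
    be met, all usable links are returned as a best effort.
    """
    usable = [link for link in links if link.snr_db >= min_snr_db]
    # Try the most reliable link first so a single link is used when enough.
    usable.sort(key=lambda link: link.p_on_time, reverse=True)

    chosen = []
    p_all_fail = 1.0
    for link in usable:
        chosen.append(link.name)
        p_all_fail *= 1.0 - link.p_on_time
        if 1.0 - p_all_fail >= target:
            break
    return chosen


if __name__ == "__main__":
    round_links = [
        LinkEstimate("RF", snr_db=10.0, p_on_time=0.90),
        LinkEstimate("OWC", snr_db=15.0, p_on_time=0.95),
    ]
    print(choose_links(round_links))
```

With these example numbers neither link alone reaches the 0.99 target, so the sketch sends on both; if one link's estimate alone met the target, only that link would be used.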

* Information listed above is at the time of submission. *
