HAMMR: Hierarchical Autonomous Malthusian Multi-Agent Reinforcement Learning

Award Information
Agency: Department of Defense
Branch: Army
Contract: W912CG-20-P-0005
Agency Tracking Number: A201-061-1265
Amount: $111,498.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: A20-061
Solicitation Number: 20.1
Timeline
Solicitation Year: 2020
Award Year: 2020
Award Start Date (Proposal Award Date): 2020-06-23
Award End Date (Contract End Date): 2021-03-17
Small Business Information
15400 Calhoun Drive Suite 190
Rockville, MD 20855-2814
United States
DUNS: 161911532
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
Name: Akshay Rao
Phone: (240) 406-7616
Email: arao@i-a-i.com
Business Contact
Name: Mark James
Phone: (301) 294-5221
Email: mjames@i-a-i.com
Research Institution
N/A
Abstract

The basic reinforcement learning algorithms behind recent advances have been around since the 1990s. Deep learning function approximators, first made popular in computer vision, quickly made their way into game-theoretic approaches. These high-dimensional function approximators made it possible to learn complex value functions and behavior policies that were previously out of reach. It is now possible to teach an agent to play Atari games at human-level performance from raw pixels, on a desktop CPU, in a matter of days. Learned agents in multiplayer strategy games (e.g., Dota 2 and StarCraft) as well as in realistic physics-based simulators (e.g., “Hide and Seek”) have demonstrated the potential for the emergence of complex cooperative behavior. During training, complex group behaviors emerged in stages, each more impressive than the last. The ideas behind these advances can be applied to NeuralMMO and transitioned to current and future wargame environment tools, allowing military planners to see how strategies evolve and how the measure-countermeasure cycle would play out over very long time periods. The strategies at each transition point have the potential to exceed human-created strategies, in the same way that AlphaZero developed superhuman winning strategies for chess, Go, and shogi.

* Information listed above is at the time of submission. *