HAMMR: Hierarchical Autonomous Malthusian Multi-Agent Reinforcement Learning
Phone: (240) 406-7616
Email: arao@i-a-i.com
Phone: (301) 294-5221
Email: mjames@i-a-i.com
The basic Reinforcement Learning algorithms behind recent advances have been around since the 1990s. Deep learning function approximators, first popularized in computer vision, quickly made their way into game-theoretic approaches. These high-dimensional function approximators made it possible to learn complex value functions and behavior policies that were previously out of reach. It is now possible to teach an agent to play Atari games at human-level performance directly from pixels, on a desktop CPU, in a matter of days. Learned agents in competitive multiplayer games (e.g., Dota 2 and StarCraft II) as well as realistic physics-based simulators (e.g., "Hide and Seek") have demonstrated the potential for the emergence of complex cooperative behavior. During training, complex group behaviors emerged in stages, each more impressive than the last. The ideas behind these advances can be applied to NeuralMMO and transitioned to current and future wargame environment tools, allowing military planners to fully realize strategy evolution and see how the measure-countermeasure cycle would play out over very long time periods. The strategies that emerge at each transition point have the potential to exceed human-created strategies, in the same way AlphaZero developed super-human winning strategies for Chess, Go, and Shogi.
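To make the "basic RL algorithms from the 1990s" concrete, the sketch below shows tabular Q-learning on a toy five-state corridor invented for illustration; deep RL methods of the kind described above replace the lookup table with a neural-network function approximator, but the underlying update rule is the same. All environment details here are assumptions for the example, not part of the HAMMR system.

```python
# Minimal tabular Q-learning sketch -- the kind of basic RL algorithm the
# text refers to. The 5-state corridor environment is hypothetical.
import random

N_STATES = 5          # states 0..4; entering state 4 yields reward 1
ACTIONS = [-1, +1]    # move left or right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

# Q-table: one value per (state, action) pair. Deep RL would instead
# approximate Q(s, a) with a neural network.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Deterministic corridor dynamics: move, clip to bounds, reward at goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

random.seed(0)
for _ in range(500):                      # training episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the best next-state value
        best_next = 0.0 if done else max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# Greedy policy at each non-terminal state after training
greedy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(greedy)
```

After training, the greedy policy moves right toward the rewarding state from every position, illustrating how a value function learned purely from trial-and-error interaction induces a behavior policy.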
* Information listed above is at the time of submission. *