TECHNOLOGY AREA(S): Info Systems
OBJECTIVE: Develop a method for improving the performance of computer vision algorithms using simulated imagery and reinforcement learning.
DESCRIPTION: While great advances have been made in the field of computer vision (CV), there remain challenges when algorithms are deployed in mission-critical applications that change continually or involve objects rarely seen in training datasets—or perhaps not seen at all. Several groups have addressed these scenarios using synthetic imagery to supplement a dearth of available training data for targets of interest. While they have demonstrated promising results, these approaches tend to be labor intensive. Reinforcement learning (RL) is a framework that seeks to optimize some notion of reward through the exploration of actions taken by an agent within an environment. RL seeks to learn a policy that maps observations to actions in a way that maximizes long-term reward. This notion of long-term reward is important to ensure exploration of the solution space, as opposed to a more greedy approach. In the context of using deep RL to train object detection CV algorithms, the “action” of the RL framework could be thought of as positioning synthetic objects based on 3D CAD models within a scene (which itself could be synthetic or real) in various ways such that the overall reward of improving the performance of object detection algorithms is optimized. This allows CV algorithms to be trained on rare targets in a variety of environments to improve performance on unexpected scenarios.
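The placement-as-action idea above can be illustrated with a deliberately simplified sketch. In this toy example (all names, action spaces, and numbers are illustrative assumptions, not part of the topic), the agent chooses where and at what scale to insert a synthetic object, and `detector_gain` is a stand-in for the true reward signal, which in practice would be the measured change in detector performance after training on the augmented imagery:

```python
import random

# Toy, single-state (bandit-style) Q-learning over synthetic-object placements.
# Action = (position index, object scale) for inserting a CAD-model render
# into a scene. The reward function below is a hypothetical stand-in for
# "improvement in object detection performance" and is purely illustrative.

POSITIONS = list(range(5))      # candidate scene positions for the synthetic object
SCALES = [0.5, 1.0, 2.0]        # candidate rendering scales
ACTIONS = [(p, s) for p in POSITIONS for s in SCALES]

def detector_gain(pos, scale):
    """Stand-in reward: pretend mid-scale objects in varied positions help most."""
    return (1.0 if scale == 1.0 else 0.2) + 0.1 * pos

def train(episodes=2000, eps=0.1, lr=0.5, seed=0):
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}           # value estimate per placement action
    for _ in range(episodes):
        # Epsilon-greedy exploration: mostly exploit, occasionally explore,
        # so the agent does not settle greedily on an early local optimum.
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            a = max(q, key=q.get)
        r = detector_gain(*a)
        q[a] += lr * (r - q[a])             # incremental update toward observed reward
    return q

q = train()
best = max(q, key=q.get)
```

In a real system the single-step bandit above would become a sequential decision process (e.g., composing many placements per scene) and the reward would come from retraining and re-evaluating the detector, which is far more expensive than this sketch suggests.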
PHASE I: Design and develop a deep reinforcement learning framework that demonstrates a proof-of-concept system capable of improving object detection performance on objects with limited training data (i.e., few-shot learning) and the ability to improve generalizability of object detection to new environments. The demonstration task for this phase will be detection of rare objects within commercial high-resolution satellite imagery. Define a set of metrics for assessing performance. Deliverables will include a system architecture design; a block diagram identifying data flows and interfaces; and a technical report describing the approach, evaluation results, and proposed future research directions.
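One candidate building block for the performance metrics called for above is intersection-over-union (IoU), which underlies standard detection scores such as precision, recall, and average precision at a chosen IoU threshold. A minimal sketch (the box format and function name are assumptions for illustration):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle corners (empty if boxes do not overlap).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

A detection is then typically counted as a true positive when its IoU with an unmatched ground-truth box exceeds a threshold (0.5 is a common choice).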
PHASE II: Develop a robust deep reinforcement learning framework based on the lessons learned in Phase I. Test and fully characterize the performance of the framework with respect to both limited training data and generalizability of detectors to new background environments.
PHASE III: Computer vision algorithms with reduced training data requirements are useful across a wide variety of applications. Because training data is difficult and expensive to acquire, reducing reliance on it provides substantial benefits. Additionally, generalizability through the use of deep RL enables successful CV models to be rapidly adapted to new environments to meet the changing needs of the DoD. Commercially, similar benefits exist, enabling algorithms to adapt more easily to changing conditions.
REFERENCES:
1. Stoica, Ion, et al., "A Berkeley View of Systems Challenges for AI", 15 Dec 2017, arXiv:1712.05855.
2. Hinterstoisser, Stefan, et al., "On Pre-Trained Image Features and Synthetic Images for Deep Learning", 16 Nov 2017, arXiv:1710.10710.
3. Shah, Shital, et al., "AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles", 18 Jul 2017, arXiv:1705.05065.
4. Arulkumaran, Kai, et al., "A Brief Survey of Deep Reinforcement Learning", 28 Sept 2017, arXiv:1708.05866.
5. Qiu, W. and Yuille, A., "UnrealCV: Connecting Computer Vision to Unreal Engine", European Conference on Computer Vision Workshops, 2016.
KEYWORDS: Deep Reinforcement Learning, Deep Learning, Augmented Reality, Simulated Imagery, Synthetic Imagery, Model Training
OSD Program Manager