SLACA: Self-Learned Agents for Collective Aerial Video Analysis

Award Information
Agency: Department of Defense
Branch: Air Force
Contract: FA8650-19-P-6014
Agency Tracking Number: F18B-002-0134
Amount: $150,000.00
Phase: Phase I
Program: STTR
Solicitation Topic Code: AF18B-T002
Solicitation Number: 18.B
Timeline
Solicitation Year: 2018
Award Year: 2019
Award Start Date (Proposal Award Date): 2019-02-19
Award End Date (Contract End Date): 2019-02-19
Small Business Information
Intelligent Automation, Inc.
15400 Calhoun Drive, Suite 190
Rockville, MD 20855
United States
DUNS: 161911532
HUBZone Owned: No
Woman Owned: Yes
Socially and Economically Disadvantaged: No
Principal Investigator
 Huiping Li
 Lead Scientist
 (240) 406-7614
 hli@i-a-i.com
Business Contact
 Mark James
Phone: (301) 294-5200
Email: mjames@i-a-i.com
Research Institution
 Univ. of Maryland, College Park
 Chris Jones
 
Room 3112 Lee Building, 7809 Regents Drive
College Park, MD 20742
United States

 (301) 405-6269
 Nonprofit College or University
Abstract

For this STTR, Intelligent Automation, Inc. teams with researchers from the University of Maryland, College Park to develop SLA, a self-learned agent system for analyzing collective human activities and events in aerial videos. Aerial video analytics often faces challenges such as low resolution, shadows, and varied spatio-temporal dynamics. Traditional methods that depend on object detection and tracking often fail under these conditions. Although deep learning has shown promise in recent years, it requires large ground-truthed datasets for training, which are not always available. We formulate SLA as a reinforcement learning agent that interacts with a video over time. Each agent is trained to perform a specific user-defined task, and multiple agents can interact and communicate to detect more complex events. Compared with existing work [1], our solution has the following advantages: 1) SLA needs only simple webcast-style text as annotation instead of a detailed labeled dataset; 2) for an event, our solution detects not only temporal boundaries but also spatial attention; and 3) our solution captions detected activities by generating short phrases describing the event.
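
To make the agent formulation concrete, below is a minimal sketch, not the SLA implementation, of a single reinforcement-learning agent that steps through a video, marks temporal event boundaries, and shifts a spatial attention window. The action set, class names, and the random stand-in for a learned policy are all illustrative assumptions, not details from the award.

# Minimal sketch (illustrative only): an agent that interacts with a video
# over time, marking event start/end frames and moving a spatial attention
# window. A real system would replace the random policy with a learned one.
import random

ACTIONS = ["advance", "mark_start", "mark_end", "shift_attention"]

class VideoAgent:
    def __init__(self, num_frames, frame_size=(64, 64)):
        self.num_frames = num_frames
        self.frame_size = frame_size
        self.t = 0                       # current frame index
        self.attention = (0, 0, 32, 32)  # (x, y, w, h) spatial attention window
        self.events = []                 # detected (start, end) frame pairs
        self._open_start = None

    def policy(self, observation):
        # Placeholder for a learned policy; a trained agent would score actions
        # from features extracted inside the current attention window.
        return random.choice(ACTIONS)

    def step(self, action):
        if action == "mark_start" and self._open_start is None:
            self._open_start = self.t
        elif action == "mark_end" and self._open_start is not None:
            self.events.append((self._open_start, self.t))
            self._open_start = None
        elif action == "shift_attention":
            x, y, w, h = self.attention
            self.attention = ((x + 8) % self.frame_size[0], y, w, h)
        self.t += 1  # every action also advances time in this toy loop

    def run(self):
        while self.t < self.num_frames:
            observation = (self.t, self.attention)  # stand-in for frame features
            self.step(self.policy(observation))
        return self.events

if __name__ == "__main__":
    agent = VideoAgent(num_frames=100)
    print("Detected event spans (frame indices):", agent.run())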

* Information listed above is at the time of submission. *
