
Graphical Methods for Discovering Structure and Context in Large Datasets

Award Information
Agency: Department of Defense
Branch: National Geospatial-Intelligence Agency
Contract: HM047622C0060
Agency Tracking Number: NGA-P1-22-06
Amount: $99,901.00
Phase: Phase I
Program: SBIR
Solicitation Topic Code: NGA203-005
Solicitation Number: 20.3
Solicitation Year: 2020
Award Year: 2022
Award Start Date (Proposal Award Date): 2022-02-08
Award End Date (Contract End Date): 2022-11-14
Small Business Information
5266 Hollister Avenue, Suite 229
Santa Barbara, CA 93111-1111
United States
DUNS: 097607852
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
Elliot Staudt
Phone: (805) 967-9828
Business Contact
Bangalore S. Manjunath
Phone: (805) 448-8227
Research Institution

The ubiquity of image sensors for data collection creates a glut of data, which leads to bottlenecks in the processing capabilities of modern systems. Processing this data requires meticulously labeled datasets that must be reviewed by humans in order to guarantee state-of-the-art performance. In this effort we endeavor to create a system that automatically exploits salient information in a training set and uses human capital efficiently to produce accurate models that identify the target objects. State-of-the-art algorithms for image processing, speech recognition, and object detection often assume an abundance of labeled data. Indeed, the big-data domain is where deep learning algorithms are known to outperform their competitors. An abundance of data also lets practitioners ignore overfitting concerns. For instance, deep neural networks (DNNs) are often trained in the over-parameterized regime, where the network has more parameters than there are training samples and is prone to overfitting if the data contains outliers. These practices can be detrimental and sub-optimal in the few-labels regime, where quality data is a finite resource and we need to extract the most from our dataset while preventing overfitting. This proposed effort will explore several strategies to reduce the amount of labeled data required and to find exemplars for training and adapting models, through: 1) exploiting contextual knowledge implicit in the data, 2) exploiting the structure in the problem, 3) searching for robust models to address overfitting, 4) transferring knowledge from related domains, and 5) applying semi-supervised learning (SSL) strategies. These will be followed by evaluation on relevant object detection and classification datasets.

* Information listed above is at the time of submission. *
