Self-Supervised Training in Geospatial Applications with a Robust Hierarchical Vision Transformer (STAR)

Award Information
Agency: Department of Defense
Branch: National Geospatial-Intelligence Agency
Contract: HM047622C0043
Agency Tracking Number: O22A-001-0029
Amount: $99,988.23
Phase: Phase I
Program: STTR
Solicitation Topic Code: OSD22A-001
Solicitation Number: 22.A
Solicitation Year: 2022
Award Year: 2022
Award Start Date (Proposal Award Date): 2022-06-27
Award End Date (Contract End Date): 2023-04-04
Small Business Information
6411 Ivy Lane, Suite 108
Greenbelt, MD 20770-1111
United States
DUNS: 829354062
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 Kemal Davaslioglu
 (301) 345-8664
Business Contact
 Eric Heidhausen
Phone: (301) 345-8664
Research Institution
 Maryland Advanced Development Lab (URF)
 Melissa Allen
6411 Ivy Lane
Greenbelt, MD 20770-1406
United States

 (301) 345-8664
 Domestic Nonprofit Research Organization

Satellite imagery in Geospatial Intelligence (GEOINT), in conjunction with imagery intelligence (IMINT), geospatial information, and other intelligence sources, has greatly improved the capabilities of warfighters and decision makers, giving them a more comprehensive perspective, an in-depth understanding, and cross-functional awareness of the operational environment. Artificial Intelligence (AI) and Deep Learning (DL) techniques have been among the main drivers of improved GEOINT capabilities, extracting knowledge from geospatial data at scale. However, current DL techniques require training on massive labeled datasets, which are costly and time-consuming to produce for satellite imagery, which generally covers areas of several square kilometers. In addition, many DL-based models (e.g., object detectors and semantic segmentation models) rely on a Convolutional Neural Network (CNN) backbone pretrained on an image classification task to extract visual features from the input image. Although these backbone networks play an important role in the performance of object detectors, they are pretrained on ground-level images, which do not transfer well to overhead satellite imagery. Recent advances in computer vision show a shift from CNN backbones to Transformer-based backbones to achieve state-of-the-art (SoTA) results. However, there is currently no geospatial computer vision pipeline software that supports pretraining a model on satellite imagery while leveraging the benefits of self-supervised learning and hierarchical vision transformers. Such a solution would enable prototyping SoTA models for almost all geospatial downstream tasks. Thus, innovative solutions are needed to improve the training of DL-based computer vision algorithms for geospatial tasks using satellite imagery.

* Information listed above is at the time of submission. *
