Description:
OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Trusted AI and Autonomy
OBJECTIVE: Develop advanced AI/ML algorithms that combine generative AI with discriminative AI to enhance 3D geospatial analytics.
DESCRIPTION: Current artificial intelligence and machine learning (AI/ML) techniques for geospatial analysis use pixel or voxel information for semantic segmentation, detection, and classification tasks, but do not exploit the rich contextual and relational information in the scene (e.g., cars drive on roads, ships and boats sail on water). Scenes are fundamentally compositional and adhere to relational rules, which can be exploited to improve geospatial analytics. We seek approaches that combine generative AI, which can describe relations between objects, with discriminative AI to improve multi-task geospatial analysis. Specifically, this topic seeks approaches that leverage generative AI in the form of large language models capable of generating relationships between objects while capturing the relational diversity present in the real world (e.g., cars drive on roads, cars park in driveways, driveways connect houses to roads). The proposed approaches must combine generative AI with discriminative AI, such as deep convolutional neural network models for segmentation and classification, in an end-to-end system. The relations produced by the generative AI can be treated as constraints and regularization that aid the discriminative AI in solving highly under-constrained problems in 3D geospatial analysis.
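As an illustration only, and not a prescribed method, the sketch below shows one way relations could act as a regularization term on a segmentation model's output. Everything in it is an assumption made for the example: the class list, the hard-coded `RELATIONS` triples (standing in for relations that a large language model might generate), the function names, and the choice of a pairwise adjacency penalty. Class pairs with no listed relation are treated as incompatible, which is a strong simplification.

```python
import numpy as np

# Hypothetical class set and relation triples; in a real system the
# triples would be produced by the generative model, not hard-coded.
CLASSES = ["road", "car", "water", "boat"]
RELATIONS = [("car", "on", "road"), ("boat", "on", "water")]


def compatibility_matrix(classes, relations):
    """Return a symmetric 0/1 matrix: 1 where two classes may be adjacent."""
    idx = {name: i for i, name in enumerate(classes)}
    compat = np.eye(len(classes))  # a class is always compatible with itself
    for subject, _predicate, obj in relations:
        compat[idx[subject], idx[obj]] = 1.0
        compat[idx[obj], idx[subject]] = 1.0
    return compat


def relational_penalty(probs, compat):
    """Mean expected incompatibility over 4-connected neighbor pairs.

    probs: array of shape (H, W, C) holding per-pixel class probabilities.
    compat: array of shape (C, C) from compatibility_matrix.
    """
    height, width, _ = probs.shape
    incompat = 1.0 - compat
    # Sum of p_i^T (1 - compat) p_j over horizontal, then vertical, neighbors.
    horiz = np.einsum("hwc,cd,hwd->", probs[:, :-1], incompat, probs[:, 1:])
    vert = np.einsum("hwc,cd,hwd->", probs[:-1, :], incompat, probs[1:, :])
    n_pairs = height * (width - 1) + (height - 1) * width
    return (horiz + vert) / n_pairs


def regularized_loss(task_loss, probs, compat, weight=0.1):
    """Add the relational penalty to a discriminative task loss.

    The weight value is an arbitrary placeholder.
    """
    return task_loss + weight * relational_penalty(probs, compat)
```

Under these assumptions, a prediction that places a car next to road pixels incurs no penalty, while one that places a boat next to road pixels does. In an end-to-end system the same term would be written in a differentiable framework so that it can shape training rather than only score outputs.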
NGA anticipates that Phase II work will involve input data that may be Controlled Unclassified Information (CUI) or classified.
PHASE I: Demonstrate a proof-of-concept approach capable of combining generative AI with discriminative AI using open-source 3D datasets derived from commercial satellite (COMSAT) imagery and full motion video (FMV). For a given set of classes (buildings, houses, cars, roads, trees), demonstrate that the approach improves multi-task performance in semantic segmentation and object detection beyond baseline approaches that use only discriminative AI. Develop a Phase II plan that includes integration, test, and validation of the end-to-end system.
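The comparison against a discriminative-only baseline could be scored, for the segmentation task, with a metric such as mean intersection-over-union (mIoU). The topic does not name a metric, so the choice of mIoU, the function name, and the convention of skipping classes absent from both prediction and ground truth are assumptions in this minimal sketch.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for integer label maps.

    pred, target: integer arrays of identical shape (2D pixels or 3D voxels).
    Classes that appear in neither array are left out of the mean.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:
            continue  # class absent from both; no contribution
        intersection = np.logical_and(pred_mask, target_mask).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")
```

Calling `mean_iou` once on the baseline prediction and once on the combined model's prediction, against the same ground truth, gives a like-for-like comparison on the segmentation task; object detection would need its own metric.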
PHASE II: Optimize and integrate the selected generative AI and discriminative AI components into an end-to-end model. Demonstrate that the proposed model can be trained with an expanded class/target set and can perform inference on 3D datasets derived from COMSAT and FMV. Develop a technology transition plan and business case assessment.
PHASE III DUAL USE APPLICATIONS: 3D geospatial analytics software leveraging generative AI capable of supporting DoD use cases including scene segmentation, classification, and target detection; civil engineering missions such as surveying, urban mapping and city planning; commercial robotics applications such as route planning.
REFERENCES:
1. Zhao, Wayne Xin, et al. "A survey of large language models." arXiv preprint arXiv:2303.18223 (2023).
2. Xiao, Aoran, et al. "Unsupervised Point Cloud Representation Learning with Deep Neural Networks: A Survey." IEEE Transactions on Pattern Analysis and Machine Intelligence (2023).
KEYWORDS: 3D geospatial analytics; generative AI; AI/ML