
Multi-Modal Synthetic Data Corpus to Support Machine Intelligence Development


OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software, Trusted AI and Autonomy

OBJECTIVE:
1. Synthetically create a multi-modal data corpus that can be used to train Artificial Intelligence/Machine Learning (AI/ML) algorithms to support multi-Intelligence (multi-INT) data fusion and machine intelligence.
2. Develop a scenario-based tool that enables the Army to create an environment in which future multi-modal AI/ML capabilities can be developed and tested.

DESCRIPTION: Multi-modal data includes text, images, and sound, among other types. A corpus of synthetic multi-modal data allows the Army to fuse this data together and rapidly generate higher-performing AI/ML algorithms. The Army also seeks an Army-owned environment in which to develop and test future AI/ML capabilities, with a focus on multi-INT data fusion and machine intelligence. This environment should be able to use the synthetic data to simulate different scenarios for AI/ML training and validation. Some scenarios may involve distributing AI to edge devices.

PHASE I: Conduct research and complete the initial design of the scenario-based tool for testing and developing AI/ML capabilities, together with a baseline dataset for the multi-modal synthetic data corpus.

PHASE II: Create the scenario-based prototype tool for testing and developing AI/ML capabilities, along with the multi-modal synthetic data corpus that can train high-fidelity AI/ML algorithms.

PHASE III DUAL USE APPLICATIONS: Mature the prototype into a planned operational system that can be demonstrated in the operational environment.

REFERENCES:
1. Microsoft
2. TensorFlow
3. Penn State University SYNCOIN Data Set

KEYWORDS: Data Fusion, Multi-Modal Data, Multi-INT Data, Machine Intelligence, Artificial Intelligence/Machine Learning (AI/ML), Testing and Validation