
Learning from Massive Data Sets Generated by Physics Based Simulations


TECHNOLOGY AREAS: Air Platform, Information Systems

OBJECTIVE:  To develop and evaluate novel methods for understanding and visualizing information obtained from (ultra) massive data sets generated through physics-based simulation.

DESCRIPTION:  Large-scale simulations are carried out routinely in modern computational physics, and they produce enormous data sets.  New, more efficient methods are needed for extracting coherent and usable information from these data sets, along with measures of the uncertainty associated with information derived using only the given data.  As an example, vortex detection [1] is an important aspect of mining data obtained from steady-state computational fluid dynamics simulations on curvilinear grids.  A systematic theory for an ab initio, data-driven organization of information and representation of data complexity is needed.
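
To make the vortex-detection example concrete, the following minimal sketch flags vortical regions in a two-dimensional velocity field using the Q-criterion (Q > 0 where rotation dominates strain). It is illustrative only and is not necessarily the method of [1]. It assumes a uniform Cartesian grid rather than the curvilinear grids mentioned above, which would require metric terms in the derivatives. The names q_criterion_2d, vortex_cells, and make_field are hypothetical helpers introduced for this example.

```python
# Illustrative sketch: Q-criterion on a uniform 2D Cartesian grid.
# Assumption: uniform spacing dx, dy; curvilinear grids need metric terms.

def q_criterion_2d(u, v, dx, dy):
    """Return Q at interior grid points; boundary points are left as None.

    u[j][i] and v[j][i] are the velocity components at x = i*dx, y = j*dy.
    """
    ny, nx = len(u), len(u[0])
    q = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            # Second-order central differences for the velocity gradient.
            du_dx = (u[j][i + 1] - u[j][i - 1]) / (2.0 * dx)
            du_dy = (u[j + 1][i] - u[j - 1][i]) / (2.0 * dy)
            dv_dx = (v[j][i + 1] - v[j][i - 1]) / (2.0 * dx)
            dv_dy = (v[j + 1][i] - v[j - 1][i]) / (2.0 * dy)
            # Q = 0.5 * (|Omega|^2 - |S|^2), written out for two dimensions.
            q[j][i] = -0.5 * (du_dx ** 2 + dv_dy ** 2) - du_dy * dv_dx
    return q


def vortex_cells(q, threshold=0.0):
    """List the (j, i) indices where Q exceeds the threshold."""
    return [(j, i)
            for j, row in enumerate(q)
            for i, val in enumerate(row)
            if val is not None and val > threshold]


def make_field(fu, fv, n=5, h=0.5):
    """Sample analytic velocity functions on an n-by-n grid centred at 0."""
    xs = [(i - n // 2) * h for i in range(n)]
    u = [[fu(x, y) for x in xs] for y in xs]
    v = [[fv(x, y) for x in xs] for y in xs]
    return u, v


# Solid-body rotation (u = -y, v = x): every interior point is vortical.
u_rot, v_rot = make_field(lambda x, y: -y, lambda x, y: x)
q_rot = q_criterion_2d(u_rot, v_rot, 0.5, 0.5)

# Pure strain (u = x, v = -y): no point is vortical.
u_str, v_str = make_field(lambda x, y: x, lambda x, y: -y)
q_str = q_criterion_2d(u_str, v_str, 0.5, 0.5)

print(len(vortex_cells(q_rot)), len(vortex_cells(q_str)))
```

A pointwise criterion such as this is only a first step. On massive data sets the flagged cells would still need to be grouped into coherent features and accompanied by some measure of uncertainty, which is the kind of capability this topic seeks.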

The effort should focus on novel approaches for developing a systematic theory for exploiting the relevant physical principles [2] in the simulation of physical phenomena. The approach should lend itself to a variety of secondary transformations for visualization [3]. Technologies may include, but are not limited to, information fusion, cognitive architectures, robust statistical learning, search and optimization, automated reasoning, and possibly new applications of mathematics [4]. Emphasis should be placed on identifying broadly applicable principles, supporting theoretical constructs, practical methods, algorithmic realizations, and measures of effectiveness.

PHASE I:  Perform research to develop a set of new technologies as described above and define or identify a representative data set to be used in demonstrating the effectiveness of the resulting technologies.

PHASE II:  Extend the Phase I knowledge products to develop a prototype comprehensive framework applicable to massive amounts of data obtained from physics-based simulations.

PHASE III DUAL USE COMMERCIALIZATION:

Military Application:  New techniques for computer simulations, including feature analysis in multi-physics simulations.  The technology developed is also applicable to identifying threats hidden in massive data sets.

Commercial Application:  This technology could be adapted to the efficient analysis of gene expression and other high-throughput molecular data.  More broadly, analysis of massive, high-dimensional data from the life sciences would benefit.
