A Framework for In Situ Analysis and Visualization of Large-Scale Cosmology Simulations
Numerical simulations play a crucial role in the DOE High Energy Physics (HEP) Cosmic Frontier program. They provide a theoretical framework against which observational data from surveys can be compared. Such comparisons lead to significant breakthroughs that guide scientists in calibrating the underlying cosmological model so that simulations match the observed sky, which in turn improves our overall understanding of the universe. The science requirements of these surveys demand simulations at extreme scales. The ability to perform these simulations, however, is limited by the scale at which the resulting data can be stored, managed, and processed. Chief among the pressing challenges are: (i) the explosive growth of data, (ii) the lack of domain-specific analysis algorithms, and (iii) the disconnect between simulation (theory) and observation (experiment). Large-scale simulations are also used in manufacturing to reduce design costs, so these challenges extend across multiple scientific domains and are not limited to cosmology. The goal of the proposed work is to develop an open-source, in situ analysis framework targeting large-scale simulations. The library will address the explosive growth of data by employing in situ analysis, and it will implement science-aware algorithms to address the lack of domain-specific solutions. Lastly, to bridge the gap between simulation and experiment, the library will be integrated with a data management system that leverages web-based mining and visualization technologies to simplify scientific workflows. In Phase I, CosmologyTools was successfully developed as a lightweight in situ analysis framework and coupled with HACC (Hybrid/Hardware Accelerated Cosmology Code), one of the core DOE gravity-only N-body codes. The Phase I effort demonstrated the feasibility of our technical approach.
By employing in situ analysis, the data are reduced to the features of interest, I/O overheads and associated costs are minimized, and the scientific workflow is simplified. The Phase I effort also clearly demonstrated the need for data management to enable direct comparisons of simulation and observations. The Phase II project will focus on developing a full-featured, production-level, in situ analysis library with a number of advanced features, including novel feature identification, extraction, and tracking. Further, the library will be integrated with data management to enable direct comparison of simulation results with experimental data. The team will also develop web-based technologies for data mining and visualization to simplify scientific workflows. Commercial Applications and Other Benefits: Upon completion of Phase II, the resulting infrastructure will offer a number of key benefits and broader impacts across science and manufacturing. Specifically, it will (i) enable analysis of massive scientific datasets, (ii) enable scientists to glean previously unavailable insights by embedding analysis within the simulation, (iii) enable direct comparison of simulation data and experiments, and (iv) simplify data management and scientific workflows.
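The in situ idea described in the abstract, analyzing data while it is still in memory and keeping only the reduced features, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy one-dimensional particle update, the histogram reduction, and the function names `advance`, `density_histogram`, and `run_in_situ` are hypothetical and do not reflect the CosmologyTools or HACC interfaces.

```python
import random


def advance(positions, velocities, dt=0.1):
    """Toy update (assumption): free-streaming particles in a periodic unit box."""
    return [(x + v * dt) % 1.0 for x, v in zip(positions, velocities)]


def density_histogram(positions, n_bins=8):
    """Reduce the full particle list to a small per-bin particle count."""
    counts = [0] * n_bins
    for x in positions:
        # Clamp the index so a position that rounds to 1.0 lands in the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return counts


def run_in_situ(n_particles=1000, n_steps=5, n_bins=8, seed=0):
    """Run the toy simulation, keeping only one reduced summary per step."""
    rng = random.Random(seed)
    positions = [rng.random() for _ in range(n_particles)]
    velocities = [rng.uniform(-1.0, 1.0) for _ in range(n_particles)]
    summaries = []
    for _ in range(n_steps):
        positions = advance(positions, velocities)
        # In situ step: analyze while the data is in memory and store only
        # the reduction, rather than writing every particle to disk.
        summaries.append(density_histogram(positions, n_bins))
    return summaries
```

In this sketch each time step stores `n_bins` integers instead of `n_particles` positions, which is the kind of data reduction the abstract attributes to in situ analysis; a production framework would substitute domain-specific feature extraction for the histogram.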
Small Business Information at Submission:
28 Corporate Dr. Clifton Park, NY 12065-8688