Enabling scientific discovery from complex data at extreme scales

Award Information
Agency: Department of Energy
Branch: N/A
Contract: DE-SC0017069
Agency Tracking Number: 0000227725
Amount: $212,799.00
Phase: Phase I
Program: STTR
Solicitation Topic Code: 01b
Solicitation Number: DE-FOA-0001618
Timeline
Solicitation Year: 2017
Award Year: 2017
Award Start Date (Proposal Award Date): 2017-02-21
Award End Date (Contract End Date): 2018-02-20
Small Business Information
5184 Tehachapi Way
Antioch, CA 94531-8821
United States
DUNS: 080231972
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 James Brown
 (510) 486-7147
 jbbrown@lbl.gov
Business Contact
 Jason Phillips
Phone: (559) 759-0930
Email: jphillips@preminon.com
Research Institution
 Lawrence Berkeley National Laboratory
 James Brown
 
One Cyclotron Rd
Berkeley, CA 94720
United States

 (510) 486-7147
 Federally Funded R&D Center (FFRDC)
Abstract

Statistical machine learning has had a substantial impact on many business areas, including finance, supply chain management, cyber security, and bioengineering; it is at the core of companies including Google, Facebook, TrueCar, and many others. However, existing tools for predictive analytics provide little or no insight into the underlying processes. Hence, while machine learning has been enormously beneficial to industry, it has yet to enable engineering-level insights into complex systems. A central challenge is to develop “open box” learning machines that provide deep insights into the systems they model, enabling the application of marketplace engineering principles to all industrial sectors. Preminon LLC, in collaboration with the Brown Group at Lawrence Berkeley National Laboratory, will develop a new, indefinitely scalable algorithm for feature discovery in supervised, unsupervised, and semi-supervised regimes on massively multi-dimensional, hybrid (structured and unstructured) data, in both streaming and static settings. Our techniques build on our previous work on iterative Random Forests (iRF, https://github.com/sumbose/iRF). In nonlinear complex systems, obtaining engineering-level insights from big data requires identifying and mapping nonlinear interactions. This mapping is done with forward approaches, which become computationally intractable at relatively low orders of interaction. Our approach, for the first time, decouples the order of an interaction from the cost of detecting it: we have developed importance sampling in the space of all-subsets nonparametric regression. We aim to generate HPC-compatible open box learning machines that substantially improve the informativeness of predictive analytics. We will commercialize this technology as licensable software in the rapidly growing business intelligence sector, currently at $2.7B and projected to reach $9.7B by 2020.
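
As a rough illustration of why interaction searches become intractable at low orders, the sketch below counts how many distinct feature subsets of a given order exist among p features, which is the binomial coefficient C(p, k). It is not the method described in the abstract and does not reproduce iRF; the function name `candidate_interactions` and the feature count of 1,000 are assumptions chosen only for illustration.

```python
from math import comb


def candidate_interactions(num_features: int, order: int) -> int:
    """Count the distinct feature subsets of size `order`.

    This is the binomial coefficient C(num_features, order), i.e. the
    number of candidate interactions an exhaustive search would have to
    consider at that order. It says nothing about the cost of testing
    each candidate.
    """
    return comb(num_features, order)


if __name__ == "__main__":
    p = 1000  # hypothetical feature count, used only for illustration
    for k in range(2, 7):
        count = candidate_interactions(p, k)
        print(f"order {k}: {count:,} candidate subsets")
```

Under these assumptions the count grows from about half a million pairs to over a hundred million triples, which is the kind of growth that motivates decoupling interaction order from detection cost.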

* Information listed above is at the time of submission. *
