Enabling scientific discovery from complex data at extreme scales
Phone: (510) 486-7147
Email: jbbrown@lbl.gov
Phone: (559) 759-0930
Email: jphillips@preminon.com
Contact: James Brown
Address:
Phone: (510) 486-7147
Type: Federally Funded R&D Center (FFRDC)
Statistical machine learning has had a substantial impact on many business areas, including finance, supply chain management, cybersecurity, and bioengineering, and it is at the core of companies such as Google, Facebook, and TrueCar. However, existing tools for predictive analytics provide little or no insight into the underlying processes. Hence, while machine learning has been enormously beneficial to industry, it has yet to enable engineering-level insights into complex systems. A central challenge is to develop “open box” learning machines that provide deep insights into the systems they model, enabling the application of marketplace engineering principles to all industrial sectors. Preminon LLC, in collaboration with the Brown Group at Lawrence Berkeley National Laboratory, will develop a new, indefinitely scalable algorithm for feature discovery in supervised, unsupervised, and semi-supervised regimes on massively multidimensional, hybrid (structured and unstructured) data, in both streaming and static settings. Our techniques build on our previous work on iterative Random Forests (iRF, https://github.com/sumbose/iRF). In nonlinear complex systems, obtaining engineering-level insights from big data requires identifying and mapping nonlinear interactions. This mapping is conventionally done with forward approaches, which become computationally intractable at relatively low interaction orders. Our approach, for the first time, decouples the order of an interaction from the cost of detecting it: we have developed importance sampling in the space of all-subsets nonparametric regression. We aim to generate HPC-compatible open box learning machines that substantially improve the informativeness of predictive analytics. We will commercialize this technology as licensable software in the rapidly growing business intelligence sector, currently valued at $2.7B and projected to reach $9.7B by 2020.
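To illustrate why searching over interactions becomes intractable at relatively low orders, the sketch below counts the candidate feature subsets at each interaction order. The feature count of 1,000 and the helper name `candidate_interactions` are assumptions chosen for illustration only. The code does not implement iRF or the sampling method described above.

```python
from math import comb


def candidate_interactions(num_features: int, order: int) -> int:
    """Count the distinct feature subsets of size `order` (hypothetical helper)."""
    return comb(num_features, order)


if __name__ == "__main__":
    p = 1000  # assumed number of features, for illustration only
    for k in range(2, 7):
        # Each subset of k features is one candidate order-k interaction.
        print(f"order {k}: {candidate_interactions(p, k):,} candidates")
```

With 1,000 features there are already 166,167,000 candidate three-way interactions, and the count keeps growing steeply with the order. This is the scaling that a method whose cost does not depend on interaction order would avoid.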
* Information listed above is at the time of submission. *