SHAPE-BASED GENERALIZATION BOUNDS FOR DEEP LEARNING
Phone: (623) 261-7734
Email: jay.hineman@geomdata.com

Phone: (919) 670-0808
Email: megan.hohenstein@geomdata.com

Contact: Keith Owen
Phone: (919) 681-8687
Type: Nonprofit College or University
We propose to develop a theoretical understanding of the relationship between intrinsic geometric structure in both training and latent data and the characteristics of functions learned from that data by deep neural network (DNN) architectures. Along the way, we also propose to understand the structure of the neural networks best trained on a given dataset. Both of these theories will lead to generalization bounds in the form of theorems that will be tested rigorously and at scale, using both synthetic datasets and real imagery datasets. NGA’s mission is to provide timely, relevant, and accurate geospatial intelligence in support of national security. We will develop partnerships that test the theory at the points of greatest value to NGA, as well as to potential commercial partners; that is, on deep architectures at a scale not explored in the research literature.

Theoretical tools for measuring generalization in deep neural networks lag behind practical cross-validation techniques that simply apply trained models to unseen data. In many cases generalization bounds are vacuous, or apply only to simple architectures that are not used in practice. Recent principled work has provided improved generalization bounds, but a gap remains between practical evaluation and theoretical bound. A common thread in recent work is the resurfacing of compression as inference. There is a clear opportunity to measure compression geometrically, and our proposed work is a key step in that direction.

More specifically, in Phase I we propose to: 1) leverage techniques from stratified space theory, persistent homology, and tropical geometry to investigate how the shape of data changes as it moves through the layers of a deep network, enabling us to prove theorems constraining the function classes learnable by such networks, and thus prove generalization bounds on the predictions such networks can make; 2) study the relationship between geometric structure in training data and the structure of the neural networks best trained on that data, in order to reduce the time it takes to analyze a class of data; 3) determine the class of functions appropriate to train with a given training dataset; 4) empirically validate our theorems on modern architectures by measuring estimates of network generalization on data drawn from previously unseen distributions. We will focus on models trained on ImageNet.
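To make item (1) above concrete, the following is a minimal sketch of what "measuring how data shape changes through the layers" might look like in practice. It is illustrative only and not the proposal's method: it pushes a synthetic point cloud with known topology (two concentric circles) through a small, randomly initialized multilayer perceptron and computes a persistent-homology summary of the activations after each layer. The `two_circles` generator, the layer widths, and the `total_persistence` summary are all assumptions introduced for this sketch; it assumes the `torch` and `ripser` Python packages are installed.

```python
# Hedged sketch: track how the "shape" of a point cloud changes as it
# passes through the layers of a small MLP, using persistent homology.
# Assumes torch and ripser (pip install torch ripser) are available.
import numpy as np
import torch
import torch.nn as nn
from ripser import ripser


def two_circles(n=200, r1=1.0, r2=2.0, noise=0.05, seed=0):
    """Synthetic point cloud with known H1 structure (two loops)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    inner = np.c_[r1 * np.cos(theta), r1 * np.sin(theta)]
    outer = np.c_[r2 * np.cos(theta), r2 * np.sin(theta)]
    X = np.vstack([inner, outer]) + noise * rng.standard_normal((2 * n, 2))
    return X.astype(np.float32)


def total_persistence(dgm):
    """Sum of (death - birth) over finite bars: a crude shape summary."""
    finite = dgm[np.isfinite(dgm).all(axis=1)]
    return float((finite[:, 1] - finite[:, 0]).sum())


X = two_circles()
layers = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 2),
)

# Push the cloud through the (untrained) network and compute an H1
# persistence summary of the activations after each layer.
with torch.no_grad():
    dgms = ripser(X, maxdim=1)["dgms"]
    print(f"input: total H1 persistence = {total_persistence(dgms[1]):.3f}")
    act = torch.from_numpy(X)
    for i, layer in enumerate(layers):
        act = layer(act)
        dgms = ripser(act.numpy(), maxdim=1)["dgms"]
        print(f"after layer {i} ({layer.__class__.__name__}): "
              f"total H1 persistence = {total_persistence(dgms[1]):.3f}")
```

Total H1 persistence is used here only as a crude scalar proxy for how much loop structure survives each layer; the proposed work would replace it with the stratified-space and tropical-geometric invariants described above.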
* Information listed above is at the time of submission. *