Graphical Methods for Discovering Structure and Context in Large Datasets
Phone: (805) 967-9828
Email: staudt@mayachitra.com
Phone: (805) 448-8227
Email: manj@mayachitra.com
The ubiquity of image sensors for data collection creates a glut of data, which leads to bottlenecks in the processing capabilities of modern systems. Processing this data requires meticulously labeled datasets, which must be reviewed by humans to guarantee state-of-the-art performance. In this effort we endeavor to create a system that can automatically exploit salient information in a training set and use human capital efficiently to produce accurate models that identify the target objects. State-of-the-art algorithms for image processing, speech recognition, and object detection often assume an abundance of labeled data. Indeed, the big data domain is where deep learning algorithms are known to outperform their competitors. This domain also allows practitioners to ignore overfitting concerns. For instance, deep neural networks (DNNs) are often trained in the over-parameterized regime, where the network has more parameters than there are training examples and is prone to overfitting if the data contains outliers. These practices can be detrimental and sub-optimal in the few-label regime, where quality data is a finite resource and we need to extract the most from our dataset while preventing overfitting. The proposed effort will explore several strategies to reduce the amount of labeled data required, find exemplars for training, and adapt models through: 1) exploiting contextual knowledge implicit in the data, 2) exploiting the structure in the problem, 3) searching for robust models to address overfitting, 4) transferring knowledge from related domains, and 5) applying semi-supervised learning (SSL) strategies. These strategies will then be evaluated on relevant datasets in object detection and classification.
* Information listed above is at the time of submission. *