Multi-Modal Knowledge Acquisition from Documents

Award Information
Agency:
Department of Defense
Branch:
Navy
Amount:
$69,908.00
Award Year:
2010
Program:
STTR
Phase:
Phase I
Contract:
N00014-10-M-0296
Award Id:
95103
Agency Tracking Number:
N10A-019-0065
Solicitation Year:
n/a
Solicitation Topic Code:
NAVY 10T019
Solicitation Number:
n/a
Small Business Information
11600 Sunrise Valley Drive, Suite # 290, Reston, VA, 20191
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
038732173
Principal Investigator:
Gaurav Aggarwal
Principal Investigator
(703) 654-9300
gaggarwal@objectvideo.com
Business Contact:
Paul Brewer
VP, New Technology
(703) 654-9314
pbrewer@objectvideo.com
Research Institution:
University of Arizona
Kobus Barnard
1040 E. 4th Street
Gould-Simpson Building
Tucson, AZ, 85721
(520) 621-4632
Nonprofit college or university
Abstract
Images with associated text are now available in vast quantities and provide a rich resource for mining the relationship between visual information and the semantics encoded in language. In particular, the quantity of such data means that sophisticated machine learning approaches can be applied to learn effective models of objects, backgrounds, and scenes. Such models can then be used to (1) understand, label, and index images that do not have text and (2) augment the semantic understanding of images that do have text. This points to great potential for searching, browsing, and mining documents that contain image data. To this end, this STTR effort proposes a pipeline-based framework that focuses on the difficult task of text-image alignment (or correspondence). The proposed pipeline will take images and their associated text as input and reduce correspondence ambiguity in stages. The framework will include both feed-forward and feedback controls that pass partially inferred information from one stage to another, enriching the information available at each stage and potentially providing inputs for learning and understanding novel objects and concepts. Ideas from both stochastic grammar representations and (joint) probabilistic representations will be investigated to facilitate the modeling of text-image associations and the visual modeling of objects, scenes, and the like.

* Information listed above is as of the time of submission.
