
Multi-Modal Knowledge Acquisition from Documents

Award Information
Agency: Department of Defense
Branch: Navy
Contract: N00014-10-M-0296
Agency Tracking Number: N10A-019-0065
Amount: $69,908.00
Phase: Phase I
Program: STTR
Solicitation Topic Code: N10A-T019
Solicitation Number: 2010.A
Timeline
Solicitation Year: 2010
Award Year: 2010
Award Start Date (Proposal Award Date): 2010-06-28
Award End Date (Contract End Date): 2011-04-30
Small Business Information
ObjectVideo, Inc.
11600 Sunrise Valley Drive, Suite 290
Reston, VA 20191
United States
DUNS: 038732173
HUBZone Owned: No
Woman Owned: No
Socially and Economically Disadvantaged: No
Principal Investigator
 Gaurav Aggarwal
 Principal Investigator
 (703) 654-9300
 gaggarwal@objectvideo.com
Business Contact
 Paul Brewer
Title: VP, New Technology
Phone: (703) 654-9314
Email: pbrewer@objectvideo.com
Research Institution
 University of Arizona
 Kobus Barnard
 
1040 E. 4th Street, Gould-Simpson Building
Tucson, AZ 85721
United States

 (520) 621-4632
 Nonprofit College or University
Abstract

Images with associated text are now available in vast quantities and provide a rich resource for mining the relationship between visual information and the semantics encoded in language. In particular, the quantity of such data means that sophisticated machine learning approaches can be applied to determine effective models for objects, backgrounds, and scenes. Such understanding can then be used to: (1) understand, label, and index images that do not have text; and (2) augment the semantic understanding of images that do have text. This points to great potential for searching, browsing, and mining documents containing image data. To this end, this STTR effort proposes a pipeline-based framework that focuses on the difficult task of text-image alignment (or correspondence). The proposed pipeline will take images and associated text and reduce correspondence ambiguity in stages. The framework will include both feed-forward and feed-back controls that pass partially inferred information from one stage to another, leading to information enrichment and the potential to provide inputs for learning and understanding novel objects and concepts. Ideas from both stochastic grammar representations and (joint) probabilistic representations will be investigated to facilitate modeling of text-image associations and visual modeling of objects, scenes, etc.
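As an illustrative aside (not part of the award record), the staged reduction of text-image correspondence ambiguity described in the abstract could be sketched roughly as follows. The function name, data layout, and scoring rule are hypothetical assumptions for illustration only, not the proposers' implementation.

```python
def align_words_to_regions(caption_nouns, regions):
    """Toy two-stage word-to-region alignment (illustrative assumption only).

    caption_nouns : list of nouns extracted from the associated text
    regions       : dict mapping region id -> {label: detector confidence}
    Returns a dict mapping each caption noun to its best-matching region id.
    """
    # Stage 1: prune region labels that never appear in the caption,
    # discarding obviously irrelevant word/region pairings.
    pruned = {
        rid: {lbl: conf for lbl, conf in labels.items() if lbl in caption_nouns}
        for rid, labels in regions.items()
    }

    # Stage 2: for each caption noun, keep the region whose surviving
    # label has the highest detector confidence (a stand-in for a richer
    # joint probabilistic score).
    alignment = {}
    for noun in caption_nouns:
        best_conf, best_rid = max(
            ((labels.get(noun, 0.0), rid) for rid, labels in pruned.items()),
            default=(0.0, None),
        )
        if best_conf > 0.0:
            alignment[noun] = best_rid
    return alignment


if __name__ == "__main__":
    regions = {"r1": {"boat": 0.8, "car": 0.1}, "r2": {"water": 0.9, "sky": 0.4}}
    print(align_words_to_regions(["boat", "water"], regions))
    # -> {'boat': 'r1', 'water': 'r2'}
```

In the proposed framework these stages would additionally exchange feed-forward and feed-back information; the sketch above only conveys the general idea of narrowing correspondences stage by stage.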

Information listed above is as of the time of submission.
