SBIR Phase I: Quantifying consumer rationale expressed in free text online discussions

Award Information
Agency: National Science Foundation
Branch: N/A
Contract: 1248768
Agency Tracking Number: 1248768
Amount: $150,000.00
Phase: Phase I
Program: SBIR
Awards Year: 2013
Solicitation Year: 2012
Solicitation Topic Code: EI
Solicitation Number: N/A
Small Business Information
dMetrics Inc.
181 North 11th St, Brooklyn, NY, 11211-1175
DUNS: 800752441
HUBZone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
Name: Paul Nemirovsky
Phone: (617) 642-7163
Email: paul.nemirovsky@gmail.com
Business Contact
Name: Paul Nemirovsky
Phone: (617) 642-7163
Email: paul.nemirovsky@gmail.com
Abstract
This Small Business Innovation Research (SBIR) Phase I project aims to develop a targeted semantic processing framework for the analysis of online conversations. It focuses on improving the state of the art in the design of semi-supervised syntactic and semantic parsers capable of processing informal conversations about consumer products and services across industry verticals. The project will address key shortcomings of existing natural language processing (NLP) frameworks when they are applied to the noisy language typical of user-generated content, whose linguistic phenomena, including colloquialisms, misspellings, grammatical errors, and incomplete sentences, diverge significantly from those of carefully edited content such as news reports. Given the proliferation, volume, and richness of available user-generated content, combined with the limitations of currently available shallow textual representations, better NLP models are critical for extracting useful information from text that is largely informal and for which no annotated data is available. Successful completion of this work will fulfill the urgent need for NLP models that are accurate across multiple domains and will introduce novel information extraction models for consumer decisions, experiences, and rationale, while advancing our capability to model, quantify, and understand the reasons underlying product adoption and attrition.

The broader impact/commercial potential of this project relates to the accuracy of automation that can be achieved when analyzing large amounts of unstructured text. Millions of Americans use the Internet to share, search for, and communicate information about products and services. Whether considering a new product, relaying past product experience, or analyzing product performance, consumers and companies alike face an overwhelming range of information sources, which makes gathering information about products both slow and costly. Using in-depth text processing technology to analyze product reports found across social networks, forums, blogs, and company-driven websites will increase the value of consumer-driven product reports and improve the ability of companies and regulators to learn from real-world product usage data, and so to improve the performance and public perception of marketed products and services. If successful, this project will lead to advancements in knowledge discovery, enable the automation of manual processes associated with large-scale consumer studies, and change the dominant methodologies for conducting market research.

* Information listed above is at the time of submission.
