SBIR Phase I: Hybrid Question Answering Combining a Search Index with an RDF Store
National Science Foundation
Agency Tracking Number:
Solicitation Topic Code:
Small Business Information
1701 N. Collins Blvd., Suite 2200, Richardson, TX, 75080-3587
Socially and Economically Disadvantaged:
Abstract
This Small Business Innovation Research (SBIR) Phase I project addresses the problem enterprises face in linking their disparate structured databases with unstructured text documents such as articles, manuals, reports, emails, blogs, and folksonomies. There is no easy way to perform a federated search over such diverse data sources, let alone to build more intelligent applications on top of them, without experts spending considerable time and effort customizing systems and data models. With the recent emergence of commercial-grade Resource Description Framework (RDF) triple stores, it becomes possible to merge massive amounts of structured and unstructured data by defining a common ontology model for the DBMS schemas and representing the structured content as semantic triples. Lymba proposes novel methods to transform unstructured data sources inside corporate firewalls into a consolidated RDF store, merge it with other ontologies and structured data, and offer a natural language question answering (QA) interface for ease of use. To make the QA robust, an innovative hybrid approach is proposed that draws answers from the RDF store as well as directly from indexed text documents. The potential impact of delivering a question answering system that operates on a commercial-grade RDF store is significant: it fills a need for users of such a store to access more information easily and to implement intelligent applications quickly, using natural language questions as the main vehicle. The proposed work also leads to enabling software technology that advances the semantic web. If successfully deployed, the proposed research has the potential to translate into a viable commercial product with significant revenues.
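The hybrid idea in the abstract, drawing answers from an RDF store and from indexed text, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy triples and documents, the function names, the pre-parsed `subject` and `predicate` arguments (a real system would derive these from the question), and the "try the triples first, then fall back to text" strategy, which the abstract does not specify.

```python
from collections import defaultdict

# Hypothetical toy data; none of these names come from the proposal.
TRIPLES = [
    ("AcmePump", "manufacturedBy", "Acme Corp"),
    ("AcmePump", "maxPressure", "150 psi"),
]
DOCUMENTS = {
    "manual-1": "The AcmePump must be serviced every six months.",
    "email-7": "Reminder: the warranty on the pump expires in March.",
}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,:;?!").lower() for w in text.split())
    return [w for w in words if w]


def build_index(documents):
    """Build an inverted index mapping each token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def query_triples(triples, subject, predicate):
    """Return every object whose triple matches the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def search_text(index, documents, question):
    """Rank documents by how many distinct question tokens they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(question)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [documents[d] for d in ranked]


def answer(question, subject, predicate, triples, index, documents):
    """Prefer a structured answer; fall back to text passages when no triple matches."""
    structured = query_triples(triples, subject, predicate)
    if structured:
        return ("rdf", structured)
    return ("text", search_text(index, documents, question))
```

In this sketch, a question whose fact is present as a triple is answered from the structured side, while a question about something recorded only in a manual or email is answered with ranked text passages, so the text index covers gaps in the triple store.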
* Information listed above is as of the time of submission.