Achieving a High Level of Scalability in Federated Information Retrieval

Award Information
Agency: Department of Energy
Branch: N/A
Contract: DE-FG02-06ER84659
Agency Tracking Number: 80762S06-I
Amount: $99,982.00
Phase: Phase I
Program: SBIR
Awards Year: 2006
Solicitation Year: 2005
Solicitation Topic Code: 45
Solicitation Number: DE-FG01-05ER05-28
Small Business Information
Deep Web Technologies, LLC
122 Longview Drive, Los Alamos, NM, 87544
Duns: N/A
Hubzone Owned: N
Woman Owned: N
Socially and Economically Disadvantaged: N
Principal Investigator
 Abe Lederman
Title: Mr.
Phone: (505) 672-0007
Email: abe@deepwebtech.com
Business Contact
 Abe Lederman
Title: Mr.
Phone: (505) 672-0007
Email: abe@deepwebtech.com
Research Institution
N/A
Abstract
At present, no federated search engine can search, aggregate, and rank more than a small fraction of the scientific content produced by the research community at large and by DOE researchers in particular. Although thousands of sources of valuable content exist, no solution has been developed that ensures that scientific discoveries already made can be easily found and leveraged to further advance science. This project will identify and implement an architecture that enables access to, and processing of, massive numbers of documents from thousands of distributed heterogeneous resources. The approach will involve the creation of nested virtual collections, composed of individual content sources, which can be accessed in a cascading fashion. Phase I will identify and specify an architecture and a set of requirements for a distributed computing solution that implements the high-performance, scalable information retrieval system. Simple prototypes (which employ federation, dynamic workflow management, and support for custom document processing filters) will be built and used to prove the feasibility of the proposed architecture. In Phase II, the architecture will be implemented so that documents can be easily accessed and so that customizable, meaningful extraction functions can be applied to these documents.
Commercial Applications and Other Benefits as described by the Applicant: The high-performance, scalable information retrieval solution should find use in organizations that need to access vast numbers of distributed collections and documents and to apply custom filtering to the text in order to extract meaning from it. In addition to national laboratories and research universities, potential customers include pharmaceutical and biotech companies, oil and gas companies, legal research firms, and financial institutions.

* Information listed above is as of the time of submission.
