Distributed Relevance Ranking in Heterogeneous Document Collections
72223S03-I Given the large and ever-growing volume of scientific information spread across the Internet, a researcher with limited time needs help determining which documents are most relevant to review. No satisfactory tools exist for retrieving the most relevant documents across different collections. Therefore, this project will develop, test, and implement key components of a distributed approach for ranking the relevance of documents acquired from an in-depth search of multiple sources. Machine-learning heuristics will be introduced to minimize the processing required to find the best documents. Phase I will conduct experiments to demonstrate that the automated approach can find a greater number of relevant documents, and miss fewer important ones, than a human-based approach. The use of computational grids will be investigated as a framework for implementing a scalable, resource-intensive solution.
Commercial Applications and Other Benefits as described by awardee: The relevance ranking system should be useful in the research divisions of companies that need high-quality, exhaustive document search and retrieval, especially where time-to-market is critical (e.g., in the pharmaceutical and oil and gas industries).
Small Business Information at Submission:
Deep Web Technologies, LLC
154 Piedra Loop, Los Alamos, NM 87544
Number of Employees: