Feasibility Study of a Multilingual Text Retrieval System
Phone: (315) 443-4456
This SBIR Phase I effort promises to yield a major advance in document retrieval, with direct benefits to government, industry, and education resulting from improved network communication, CD-ROM retrieval, and scientific/technical information transfer within the United States and at the international level. All retrieval technology to date has assumed that the information being sought will be found in documents written in the same language as the user's query. TextWise Inc.'s proposed CINDOR system offers the opportunity to extend technology developed under the auspices of ARPA funding to a heretofore unattained multilingual capability, whereby a speaker of one language can search text with high precision and recall in any of several languages simultaneously, even without speaking the target languages. This extension is accomplished via two machine-readable lexical knowledge bases. The first, a monolingual hierarchical concept dictionary, is used to assign concept group codes to foreign-language words. The second, a multilingual translation database, converts a concept group code into an English semantic code. The semantic codes are used to produce semantic vectors, which allow texts in the supported languages to be retrieved, routed, classified, and clustered as if they were English documents.

Anticipated Benefits: Formidable industrial, military, government, and economic benefits from improved technical and scientific communication would be realized by this breakthrough from monolingual to multilingual text searching. In addition, the feasibility shown by Phase I will allow us to demonstrate multilingual text retrieval technology to potential Phase II commercial partners who can provide part of the support and investment needed to move toward the commercialization of CINDOR.
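The two-stage lookup described in the abstract (foreign word to concept group code, concept group code to English semantic code, semantic codes to a semantic vector) can be illustrated with a minimal Python sketch. This is not the CINDOR implementation: the dictionary contents, the code labels such as `fr:C12` and `FINANCE`, the function names, and the use of cosine similarity for matching are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy knowledge bases; the real CINDOR resources are not
# described in the abstract beyond their roles.

# Stage 1: per-language concept dictionaries (word -> concept group code).
CONCEPT_DICTIONARIES = {
    "fr": {"banque": "fr:C12", "prêt": "fr:C13", "rivière": "fr:C40"},
    "es": {"banco": "es:C07", "préstamo": "es:C08", "río": "es:C31"},
}

# Stage 2: translation database (concept group code -> English semantic code).
TRANSLATION_DB = {
    "fr:C12": "FINANCE", "fr:C13": "FINANCE", "fr:C40": "GEOGRAPHY",
    "es:C07": "FINANCE", "es:C08": "FINANCE", "es:C31": "GEOGRAPHY",
}

# English words are assumed to map to semantic codes directly.
ENGLISH_CODES = {"bank": "FINANCE", "loan": "FINANCE", "river": "GEOGRAPHY"}


def semantic_vector(text, lang):
    """Count the English semantic codes found in a text of the given language."""
    counts = Counter()
    for word in text.lower().split():
        if lang == "en":
            code = ENGLISH_CODES.get(word)
        else:
            group = CONCEPT_DICTIONARIES.get(lang, {}).get(word)
            code = TRANSLATION_DB.get(group)
        if code is not None:  # unknown words are simply skipped
            counts[code] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, documents):
    """Rank documents of any supported language against an English query."""
    q = semantic_vector(query, "en")
    scored = [
        (cosine(q, semantic_vector(text, lang)), doc_id)
        for doc_id, (lang, text) in documents.items()
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    docs = {"fr-1": ("fr", "banque prêt"), "es-1": ("es", "río")}
    for score, doc_id in search("bank loan", docs):
        print(doc_id, round(score, 3))
```

Because every document is reduced to a vector over the same English semantic codes, the English query and the French and Spanish documents become directly comparable, which is the property the abstract relies on for retrieval, routing, classification, and clustering.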
* Information listed above is at the time of submission. *