Near-Real Time Arabic/English Machine Translation by Integrated Statistical and Linguistic Learning Methods

Award Information
Agency:
Department of Defense
Branch:
Army
Amount:
$69,998.00
Award Year:
2004
Program:
SBIR
Phase:
Phase I
Contract:
W909MY-04-C-0015
Award Id:
68174
Agency Tracking Number:
A032-2396
Solicitation Year:
n/a
Solicitation Topic Code:
n/a
Solicitation Number:
n/a
Small Business Information
1202 Delafield Place, NW, Washington, DC, 20011
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
155774941
Principal Investigator:
Evelyne Tzoukermann
Director of Research
(202) 722-2440
evelyne.tzoukermann@streamsage.com
Business Contact:
Seth Murray
President
(202) 722-2440
seth.murray@streamsage.com
Research Institute:
n/a
Abstract
StreamSage proposes an approach to the automatic translation of Arabic and Arabic dialect texts to and from English that significantly extends the state of the art in integrating statistical and traditional machine translation techniques. This research will greatly increase translation accuracy while decreasing the need for domain-specific training. The proposed near-real-time translation system will use automatically induced transfer rules between English and Arabic syntactic structures, statistically trained on a feature set of unprecedented sophistication. This feature set will be generated automatically using tools that have not previously been applied to Arabic machine translation, such as language-wide noun and verb sense disambiguation, a TAG-based stochastic parser, and a hierarchical representation of Arabic dialect morphology, lexical features, and syntactic structures. Additional innovations include the application of state-of-the-art Arabic morphological analysis throughout the translation process, from word sense disambiguation to transfer rule induction to generation, and the automatic induction of rules for generating target-language text from syntactic structures. This research will build on past work in machine translation, Arabic parsing, Arabic dialect analysis, and word sense disambiguation by StreamSage, Columbia University, and CoGenTex.

* Information listed above is as of the time of submission.
