Generic Automatic Recognition System for Handwritten Arabic-Style Script…

Award Information

Department of Defense
Award ID:
Program Year/Program:
2009 / SBIR
Agency Tracking Number:
Solicitation Year:
Solicitation Topic Code:
Solicitation Number:
Small Business Information
Optimal Synthesis, Inc.
95 First Street, Suite 240 Los Altos, CA 94022-2777
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
Phase 2
Fiscal Year: 2009
Title: Generic Automatic Recognition System for Handwritten Arabic-Style Script Documents
Agency / Branch: DOD / ARMY
Contract: W911QX-09-C-0096
Award Amount: $729,973.00


This project addresses the development of a generic system framework for automatically recognizing handwritten text in non-Arabic languages that use Arabic-style script, such as Urdu or Pashto. The goal of the Phase II work is to develop a prototype of the generic handwritten Arabic-style script recognition system usable for screening Urdu documents, such as personal letters, for key terms and general subject matter. The proposed Phase II SBIR builds on a successful feasibility demonstration of a handwritten Urdu word recognition system carried out during the Phase I SBIR project. The Phase I project demonstrated the feasibility of building a generic recognition framework, based on the Hidden Markov Model (HMM) approach, for non-Arabic languages that use Arabic-style script. We developed and evaluated different types of feature extraction methods under the HMM recognition framework. In particular, we developed a novel Contourlet-based feature extraction algorithm to exploit the cursive nature of Arabic-style scripts. To further enhance the performance of the recognition system, a more elaborate feature extraction approach that integrates the Contourlet feature and a graph-based feature was also developed. The script recognition system was evaluated using a handwritten Urdu database collected during Phase I. Experimental results show that both the Contourlet-based feature extraction method and the integrated Contourlet- and graph-based feature extraction method outperform the state-of-the-art baseline approaches. Based on the successful Phase I feasibility study, the Phase II work will develop a prototype system capable of recognizing personal letters written in Urdu. The basic performance of the prototype system will be determined. The prototype system will serve as the baseline system for integration with the software packages that are being developed under the Army's Sequoyah Machine Language Translation program.
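
The abstract does not describe the recognizer in detail, so the following is only a generic illustration of HMM-based word recognition, not the awardee's system. It assumes that each word has its own HMM, that feature vectors have already been quantized into discrete symbols, and that the recognized word is the one whose model gives the observation sequence the highest likelihood under the forward algorithm. The model parameters, the labels `word_a` and `word_b`, and all function names are hypothetical; no Contourlet or graph-based feature extraction is implemented here.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """log(p) that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete-observation HMM (forward algorithm).

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, word_models):
    """Return the label of the word model that scores obs highest."""
    return max(
        word_models,
        key=lambda w: forward_log_likelihood(obs, *word_models[w]),
    )


# Hypothetical two-state, left-to-right word models over three symbols.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
WORD_MODELS = {
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], WORD_MODELS))  # word_a
    print(recognize([2, 2, 0, 0], WORD_MODELS))  # word_b
```

Working in the log domain keeps long observation sequences from underflowing. A practical system would more likely use continuous feature vectors and trained emission densities than the hand-set discrete tables assumed here.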

Principal Investigator:

Hui-Ling Lu
Director, Signal Processi

Business Contact:

P. K. Menon
Small Business Information at Submission:

95 First Street Suite 240 Los Altos, CA 94022

EIN/Tax ID: 770484755
Number of Employees:
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No