Detecting Identity of Authors from Lexical Elements and Cognitive Topics (DIALECT)

Award Information
Agency:
Department of Defense
Branch:
n/a
Amount:
$79,999.00
Award Year:
2012
Program:
SBIR
Phase:
Phase I
Contract:
N00014-12-M-0205
Award Id:
n/a
Agency Tracking Number:
N121-080-0319
Solicitation Year:
2012
Solicitation Topic Code:
N121-080
Solicitation Number:
2012.1
Small Business Information
12 Gill Street, Suite 1400, Woburn, MA, -
Hubzone Owned:
N
Minority Owned:
N
Woman Owned:
N
Duns:
967259946
Principal Investigator:
Charlotte Shabarekh
Senior Research Scientist
(781) 496-2465
cshabarekh@aptima.com
Business Contact:
Thomas McKenna
Chief Financial Officer
(781) 496-2443
mckenna@aptima.com
Research Institute:
n/a
Abstract
Exploiting the anonymous nature of the internet, terrorists are able to cloak their identity when authoring blogs, posting to chatrooms and sending tweets by using pseudonyms and creating multiple usernames. This makes it difficult to ascertain the true author of a web post, and to determine whether posts made under different profiles and across websites can be attributed to the same author. Detecting Identity of Authors from Lexical Elements and Cognitive Topics (DIALECT) addresses the challenge of authorship attribution facing intelligence analysts working with Open-Source Intelligence (OSINT). Using an inherently language-independent approach, DIALECT automatically learns a profile of linguistic, idiosyncratic and content-based features that form a unique fingerprint for an author. Additionally, DIALECT uses social science theory to guide the core machine learning algorithm's selection of dialectal and semantic features for distinguishing which cultural, tribal, religious or political groups the author belongs to. By associating authors with their socio-cultural group, DIALECT provides insight into the authors' cognitive processes, such as their political leanings and ideological affiliations. By modeling feature sets at both the individual author and group levels, DIALECT is able to attribute documents to groups, even when it is unable to determine the specific author.
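
The abstract does not specify DIALECT's features or learning algorithm. As a rough illustration of the general idea of a language-independent author "fingerprint", the sketch below uses character n-gram counts and cosine similarity, a common stylometric baseline that is swapped in here as an assumption and is not the method described in the award. All function names and the toy texts are hypothetical. Because a profile is keyed by an arbitrary label, the same code can build profiles per author or per group.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profiles(labeled_texts, n=3):
    """Merge the n-gram counts of all known texts for each label.

    A label may name an individual author or a group (assumption:
    group membership of the training texts is already known).
    """
    profiles = {}
    for label, text in labeled_texts:
        profiles.setdefault(label, Counter()).update(char_ngrams(text, n))
    return profiles


def attribute(text, profiles, n=3):
    """Return (label, score) for the profile most similar to the text."""
    query = char_ngrams(text, n)
    scored = ((label, cosine(query, prof)) for label, prof in profiles.items())
    return max(scored, key=lambda pair: pair[1])


# Toy, made-up training texts for two hypothetical authors.
known = [
    ("author_a", "the quick brown fox jumps over the lazy dog"),
    ("author_a", "the lazy dog sleeps while the fox runs"),
    ("author_b", "zzz yyy xxx www vvv uuu"),
]
profiles = build_profiles(known)
best_label, best_score = attribute("the fox jumps over the dog", profiles)
```

A real system would need far more text per author, held-out evaluation, and a threshold below which no attribution is made; falling back to a group-level profile when no individual profile scores well is one way to mirror the group-level attribution the abstract mentions.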

* Information listed above is at the time of submission.
