Detecting Identity of Authors from Lexical Elements and Cognitive Topics…

Award Information

Agency:
Department of Defense
Branch:
N/A
Award ID:
Program Year/Program:
2012 / SBIR
Agency Tracking Number:
N121-080-0319
Solicitation Year:
2012
Solicitation Topic Code:
N121-080
Solicitation Number:
2012.1
Small Business Information
Aptima, Inc.
12 Gill Street, Woburn, MA
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No
 
Phase 1
Fiscal Year: 2012
Title: Detecting Identity of Authors from Lexical Elements and Cognitive Topics (DIALECT)
Agency: DOD
Contract: N00014-12-M-0205
Award Amount: $79,999.00
 

Abstract:

Exploiting the anonymous nature of the internet, terrorists are able to cloak their identity when authoring blogs, posting to chatrooms, and sending tweets by using pseudonyms and creating multiple usernames. This makes it difficult to ascertain who the true author of a web post is, and to determine whether posts under different profiles and across websites can be attributed to the same author. Detecting Identity of Authors from Lexical Elements and Cognitive Topics (DIALECT) addresses the challenge of authorship attribution facing intelligence analysts working with Open-Source Intelligence (OSINT). Using an inherently language-independent approach, DIALECT automatically learns a profile of linguistic, idiosyncratic, and content-based features that form a unique fingerprint for an author. Additionally, DIALECT uses social science theory to influence the core machine learning algorithm's selection of dialectal and semantic features for use in distinguishing which cultural, tribal, religious, or political groups the author belongs to. By associating authors with their socio-cultural group, DIALECT provides insight into the authors' cognitive processes, such as their political leanings and ideological affiliations. By modeling feature sets at both the individual author and group levels, DIALECT is able to attribute documents to groups even when it is unable to determine the specific author.

Principal Investigator:

Charlotte Shabarekh
Senior Research Scientist
(781) 496-2465
cshabarekh@aptima.com

Business Contact:

Thomas McKenna
Chief Financial Officer
(781) 496-2443
mckenna@aptima.com
Small Business Information at Submission:

Aptima, Inc.
12 Gill Street, Suite 1400, Woburn, MA

EIN/Tax ID: 043281859
DUNS: N/A
Number of Employees:
Woman-Owned: No
Minority-Owned: No
HUBZone-Owned: No