
Pedigree Reconstruction for Identifying Terrorist Networks


TECHNOLOGY AREA(S): Human Systems, Information Systems

OBJECTIVE: To develop a software platform for pedigree reconstruction that can use naïve DNA profiles to establish familial relationships between individuals and within groups of interest.

DESCRIPTION: A presently under-utilized tool for military applications is pedigree analysis that combines DNA data with computational approaches to derive information regarding the nature of familial relationships. The advent of DNA sequencing and its adoption as the gold-standard method for biological characterization led to the development of expansive genetic databases, which are now available for use in a wide variety of studies [1,2]. Genetic typing provides a compelling means to establish identity in cases where biological evidence is available. Pedigree reconstruction further extends its utility by allowing inferences of relatedness [3]. DNA markers like single nucleotide polymorphisms (SNPs) have been shown to be informative for evaluating ancestry as well as for forensic reconstruction of lineages, and sustained efforts like the 1000 Genomes Project, GEDmatch, Family Tree DNA and others provide a wealth of accessible information that underpins the fidelity of a given reconstruction [4]. Researchers have used various approaches, including alternative statistical methods and different combinations of markers, to improve heritability estimates and thus the veracity of results. Groups report different levels of success dependent upon specific project goals [4,5,6], but some report detection of relatedness out to 9th degree relationships and deduction of the precise degree of relatedness between 6th degree relatives (e.g., second cousins, once removed). However, because the primary aim of such projects is maintaining high levels of accuracy, the supporting analysis requires long periods of time and substantial computational resources. Although direct-to-consumer genetic testing companies make similar assurances and enjoy the advantages associated with access to massive amounts of data, their processes tend to lack scientific rigor, and the companies fall short of making quality assurance guarantees with respect to their analyses [7].

Reconstruction of family lineages has multiple military and civilian applications where identifying the probable contributors of samples of human origin is desirable. For example, one notable application is identification of missing service members. The Armed Forces presently require provision of DNA samples from service members upon processing through enlisted basic training or officer training school, so that the samples can be used for identification of remains if needed [8]. However, prior to the establishment of the Armed Forces Repository of Specimen Samples for the Identification of Remains (AFRSSIR) in 1992, such samples were not required; thus, establishing the identity of war casualties from fifty-plus years ago can present a considerable technical challenge. The Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory (AFMES-AFDIL) is presently charged with “providing human remains DNA testing in support of current day operations (AFMES), past accounting operations (Defense POW/MIA Accounting Agency) and other U.S. Department of Defense Agency missions” [9]. AFMES-AFDIL uses the most modern tools available for genomic sequencing, but pedigree reconstruction for the purpose of identification remains a lengthy process because a high level of stringency is desirable. Tradeoffs between surety and timeliness similarly plague those tasked with establishing the likely identity of sample origin for the purpose of criminal investigations.

Another application of importance is identification of familial lineages that are significantly represented in terrorist network nodes. Family ties can serve as the mobilizing infrastructure for establishment of terrorist groups. Many social scientists argue that other “preconditions” are irrelevant without an organizational structure that brings together friends and family members, with the strength of their relationships acting as the precipitant rather than other factors like the nature of their grievances or religious affiliations [10]. The history of groups like the Irish Republican Army underscores the foundational role of kinship in the incipient formation of terrorist cells. Several similar examples exist in other regions of the world, including those of specific interest to the U.S. Department of Defense, and recent high-profile acts of terrorism executed by groups of relatives and friends have refocused attention on the importance of family ties for establishing terrorist networks and garnering commitment to a common cause [11]. Analyzing the relationships among the perpetrators of terrorist acts will, in the short term, allow identification of likely sources of recruitment and radicalization. In the longer term, analysis of the causal and contributing factors will allow development of more effective de-radicalization strategies so that such acts can be subverted.

The specific goal of the present topic is to combine the useful elements of small-scale and large-scale approaches in order to develop a process that handles new data with high efficiency while maintaining quality in terms of analytical stringency. The overarching aims are to validate the concept that DNA is a theater-relevant biometric and to develop a software platform that supports its operational use.
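
As a point of reference for the degrees of relatedness mentioned in the description, the following is a minimal sketch, not part of the topic requirements, of how an estimated kinship coefficient might be mapped to a degree of relationship. It assumes the textbook expectation for an outbred population, in which a relationship of degree d has an expected kinship coefficient of 2^-(d+1), and it places cut points halfway between expectations on a log2 scale. The function names and the `max_degree` cutoff are hypothetical, and realized sharing varies around these expectations, increasingly so for distant relatives.

```python
import math
from typing import Optional


def expected_kinship(degree: int) -> float:
    """Expected kinship coefficient for a relationship of the given degree.

    Assumes an outbred population: 1st degree -> 1/4, 2nd -> 1/8, and so on.
    """
    if degree < 1:
        raise ValueError("degree must be >= 1")
    return 2.0 ** -(degree + 1)


def infer_degree(kinship: float, max_degree: int = 9) -> Optional[int]:
    """Map an estimated kinship coefficient to a degree of relationship.

    Returns 0 for duplicate samples or identical twins, 1..max_degree for
    relatives, and None when the pair is treated as unrelated. Cut points
    sit halfway between expected values on a log2 scale (an assumption).
    """
    if kinship <= 0.0:
        return None
    degree = math.floor(-math.log2(kinship) - 0.5)
    if degree > max_degree:
        return None
    return max(degree, 0)


# Second cousins once removed have expected kinship 1/128, i.e. 6th degree.
second_cousins_once_removed = infer_degree(1.0 / 128.0)
```

Under these assumptions the bands narrow quickly in absolute terms (the 9th degree expectation is roughly 0.001), which is one reason distant relationships demand dense marker sets and careful confidence reporting.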

PHASE I: Leverage or create a computational architecture for pedigree analysis that can be scaled to incorporate successively larger DNA datasets while maintaining the operational efficiency and veracity of analyses normally conducted at smaller scales, with the end-state goal of developing a software platform that can accept new information and generate pedigrees as the data arrive. Identify criteria for final selection of ancestry-estimation methods and markers. Explicitly identify the genetic databases used for the project and indicate the means by which potentially sensitive data were protected. Develop metrics to evaluate performance of the new architecture as compared to presently available approaches, and standards to represent statistical confidence in resultant pedigrees. Initiate development of a quality assurance protocol. Phase I deliverables will include (1) a final report and (2) demonstration of the preliminary architecture to the cognizant project officer. The report should also provide results on architecture performance using unambiguous statistical methods, describe development including parameterization, and identify limitations and weaknesses. The report should include plans for development of a user interface that will address Phase II expectations. Operating system, other software requirements, and data compatibility should be specifically addressed, as should the proposed location of the final interface.
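
One way to read the Phase I goal of generating pedigrees as new data come in is as an incremental update: each new profile is compared only against profiles already held, rather than recomputing every pair. The sketch below illustrates that idea under stated assumptions. The class name, the pluggable estimator callable, and the 2^-7.5 threshold are all hypothetical, and the toy estimator simply looks up made-up coefficients; it stands in for whatever validated kinship method a performer selects.

```python
from typing import Any, Callable, Dict, List, Tuple

# Any function that takes two profiles and returns a kinship estimate.
KinshipEstimator = Callable[[Any, Any], float]


class IncrementalKinshipIndex:
    """Holds profiles and the related pairs found so far (hypothetical design)."""

    def __init__(self, estimator: KinshipEstimator, threshold: float) -> None:
        self._estimator = estimator
        self._threshold = threshold
        self._profiles: Dict[str, Any] = {}
        self.edges: Dict[Tuple[str, str], float] = {}
        self.comparisons = 0  # running count of pairwise comparisons

    def add_profile(self, sample_id: str, profile: Any) -> List[Tuple[str, float]]:
        """Compare one new profile against stored ones and record related pairs."""
        if sample_id in self._profiles:
            raise ValueError(f"duplicate sample id: {sample_id}")
        hits: List[Tuple[str, float]] = []
        for other_id, other in self._profiles.items():
            self.comparisons += 1
            phi = self._estimator(profile, other)
            if phi >= self._threshold:
                self.edges[(other_id, sample_id)] = phi
                hits.append((other_id, phi))
        self._profiles[sample_id] = profile
        return hits


# Toy stand-in estimator: made-up coefficients keyed by unordered sample pair.
TOY_KINSHIP = {
    frozenset(("s1", "s2")): 0.25,
    frozenset(("s2", "s3")): 0.0625,
}


def toy_estimator(a: str, b: str) -> float:
    return TOY_KINSHIP.get(frozenset((a, b)), 0.0)


index = IncrementalKinshipIndex(toy_estimator, threshold=2.0 ** -7.5)
index.add_profile("s1", "s1")
hits_s2 = index.add_profile("s2", "s2")
hits_s3 = index.add_profile("s3", "s3")
```

Each arrival costs one comparison per stored profile, so total work still grows quadratically with the size of the collection; a scalable architecture would likely need indexing or pre-filtering on top of this, which the sketch does not attempt.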

PHASE II: Phase II efforts will focus on iterative improvement to the proof-of-concept approach developed during Phase I. The performer will mature the architecture by improving performance as compared to the preliminary architecture evaluated as part of the Phase I effort and will modify the software, as needed, to provide for ease of use and ease of interpretation of results. The performer will identify weaknesses in performance that could be improved through additional data, modified statistical approaches, and/or additional pre-processing steps and will document and relay these observations to the project officer. Phase II deliverables will be (1) a proof-of-concept demonstration of the software platform with the introduction of novel genetic profiles whose pedigrees have been established by other means; (2) a report detailing (a) a description of the approach, including optimization techniques and performance outcomes, (b) testing and validation methods, and (c) advantages, disadvantages, and limitations of the method; and (3) a user interface with any associated executables.
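
The Phase II demonstration relies on profiles whose pedigrees are already known. A minimal sketch of how inferred degrees might be scored against such ground truth follows. The metric names, the dictionary layout, and the choice to report exact and within-one-degree agreement are assumptions for illustration, not metrics prescribed by the topic.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[str, str]
# Degree of relationship for a pair, or None when the pair is unrelated.
DegreeMap = Dict[Pair, Optional[int]]


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return a proportion, or None when there is nothing to score."""
    return numerator / denominator if denominator else None


def score_degrees(truth: DegreeMap, predicted: DegreeMap) -> Dict[str, Optional[float]]:
    """Compare inferred degrees with degrees established by other means.

    A pair missing from `predicted` is treated as called unrelated.
    """
    related = [p for p, d in truth.items() if d is not None]
    unrelated = [p for p, d in truth.items() if d is None]

    detected = sum(1 for p in related if predicted.get(p) is not None)
    exact = sum(1 for p in related if predicted.get(p) == truth[p])
    within_one = sum(
        1
        for p in related
        if predicted.get(p) is not None and abs(predicted[p] - truth[p]) <= 1
    )
    false_positives = sum(1 for p in unrelated if predicted.get(p) is not None)

    return {
        "detection_rate": _ratio(detected, len(related)),
        "exact_degree_rate": _ratio(exact, len(related)),
        "within_one_degree_rate": _ratio(within_one, len(related)),
        "false_positive_rate": _ratio(false_positives, len(unrelated)),
    }


# Illustrative, made-up validation set of four pairs.
truth = {("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 6, ("a", "d"): None}
predicted = {("a", "b"): 1, ("a", "c"): 3, ("b", "c"): None, ("a", "d"): None}
scores = score_degrees(truth, predicted)
```

Reporting these rates separately by true degree, rather than pooled, would show where accuracy falls off for distant relationships.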

PHASE III: In addition to implementing further improvements that would enhance use of the developed product by the sponsoring office, identify and exploit features that would be attractive for commercial or other private sector pedigree analysis applications.

KEYWORDS: Pedigree reconstruction, kinship, familial relatedness, genotyping, terrorism, terrorist, radicalization


REFERENCES:
[1] Goudet J, Kay T, Weir BS. 2018. How to Estimate Kinship. Molecular Ecology 27:4121-4135.
[2] Auton A, Abecasis GR. 2015. A Global Reference for Human Genome Variation. Nature 526:68-87.
[3] Budowle B, van Daal A. 2008. Forensically Relevant SNP Classes. BioTechniques 44:603-610.
[4] Huisman J. 2017. Pedigree Reconstruction from SNP Data: Parentage Assignment, Sibling Clustering and Beyond. Molecular Ecology Resources 17:1009-1024.
[5] Morimoto C et al. 2016. Pairwise Kinship Analysis by the Index of Chromosome Sharing Using High-Density Single Nucleotide Polymorphisms. PLOS ONE DOI:10.1371/journal.pone.0160287.
[6] Wang J. 2019. Pedigree Reconstruction from Poor Quality Genotype Data. Heredity 122:719-728.
[7] Royal CD et al. 2010. Inferring Genetic Ancestry: Opportunities, Challenges, and Implications. The American Journal of Human Genetics 86:661-673.
[8] De Castro M et al. 2016. Genomic Medicine in the Military. Genomic Medicine 1:1-4.
[9]
[10] Noricks D et al. 2009. Social Science for Counterterrorism: Putting the Pieces Together. Rand Corporation Technical Report (ISBN 978-0-8330-4706-9).
[11] Copeland S. 2017. The Importance of Terrorists’ Family and Friends.
