BASALT: Biosecurity Assessment using Semantic Analysis of Life-sciences Text
Phone: (703) 414-5009
Phone: (703) 414-5016
The BASALT (Biosecurity Assessment using Semantic Analysis of Life-sciences Text) system accelerates DURC (Dual Use Research of Concern) review by automatically detecting concepts of concern in research proposals. Concepts of concern are statements, expressed as phrases, sentences, or even whole paragraphs, that evoke ideas related to the 15 pathogens and 7 experimental categories defined by the NSABB as potential triggers for dual-use categorization. BASALT is a novel application of new Natural Language Processing technologies to automatically identify concepts of concern in life sciences text. Our approach is designed to require a minimal amount of labeled training data, allowing us to achieve high-performance concept labeling with a modest data-labeling effort. BASALT can be trained on a dataset that contains only a handful of examples for each type of process or outcome that is of concern. DAC has teamed with the Texas Biomedical Research Institute to construct a dataset of examples of life sciences text containing concepts of concern, which can be used for training and evaluation of BASALT and for research into DURC analysis by the research community.
* Information listed above is at the time of submission. *