
Computational Genomics and Data Science Opportunities for Small Business (R43/R44 Clinical Trial Not Allowed)



Since its inception, the field of genomics has been grounded in computational approaches. All facets of genomic research, such as processing raw sequencing signals, assembling genomes, calling variants, deriving insight from population sequencing studies, and designing and studying the implementation of genomics in clinical settings, depend on computational, analytical, statistical, and bioinformatics approaches. The scale of genomic data and the commitment of genomics researchers to share data resources have necessitated new computational paradigms for data processing, storage, organization, and access.

As the science of genomics continues to develop, producing data is no longer rate-limiting for genomic discovery; instead, processing, storing, accessing, analyzing, and deriving insight from genomic data, all computationally based efforts, are emerging as the major challenges and bottlenecks. Understanding the complex relationships through which genotypes influence phenotypes, a key goal of the NHGRI, is increasingly dependent upon analytical, statistical, and computational approaches. The rapid pace of sequencing technology development remains a driving force in genomics, and new genomic data types produced by novel technologies demand new modes of commercial analytical and computational support. Genomics increasingly underlies the study of complex networks and systems ranging in scale from single cells to complete organisms, presenting opportunities for commercial computational approaches to address previously intractable problems in basic biological sciences. The broadening adoption of genomics in clinical settings also requires new commercial computational approaches to enable improved outcomes, while the sensitive nature of some genomic data demands new commercial computational methods to balance data sharing and privacy considerations. Existing tools also require improvement and hardening, and the exponential growth of genomic data demands new scalable algorithms and new solutions for making genomic data findable, accessible, interoperable, and reusable (FAIR).

In recognition of the central role of computation and data science in genomics, and to identify future needs and emerging opportunities, the NHGRI held an Informatics and Data Science workshop on September 29-30, 2016. Participants considered bioinformatics for genomics in both basic biology and clinical sciences and prioritized scientific opportunities for the NHGRI Computational Genomics and Data Science program over the next 3-5 years. Details from this workshop, including a workshop report, can be found here:

Workshop participants identified several areas where continued or expanded support by NHGRI was considered important. Key recommendations highlighted the importance of maintained or enhanced support for development of: interactive tools for visualization and analysis of genomic data in both basic and clinical sciences; computational methods to investigate how genotype translates to phenotype; tools and approaches to enhance genomic data sharing; scalable algorithms for analysis of genomic data; methods to make genomic and phenotypic data and metadata FAIR; and others (for the full list of recommendations, please see the workshop report).


Through this FOA, NHGRI seeks to fund innovative commercial product development in computational genomics, data science, statistics, and bioinformatics for basic or clinical genomic sciences and broadly applicable to human health and disease, as well as commercial product development stemming from improvement of existing software or approaches demonstrated to be in broad use by the genomics community.

Research topics appropriate for this FOA include, but are not limited to, development of commercial computational, bioinformatics, statistical, or analytical approaches, tools, or software for:

    • Interactive analysis and visualization of large genomic data sets.
    • Identification or prioritization of disease-causal genetic variants.
    • Causal statistical modeling related to genomic research.
    • Analysis of single-cell or sub-cellular genomic data both in situ and in dissociated cells.
    • Integrating model organism data with human data to derive biomedical insight.
    • Integrating and interpreting various genomic data types, including sequence data, functional data, phenotypic data, and clinical data.
    • Processing and integrating genome sequence data to enhance representation of population variation.
    • Processing sequence data for sequence assembly, variant detection (SNPs and SVs), imputation, and resolution of haplotypes.
    • Development of efficient and scalable algorithms for compute-intensive genomic applications, or otherwise achieving major cost reductions in genomic data processing and analysis.
    • Enabling scalable and cost-effective curation of FAIR metadata for genomic and phenotypic data.
    • Enhancing secure sharing and use of genomic data in combination with clinical data.
    • Processing or analyzing new genomic data types, or major improvement in processing or analyzing existing genomic data types.
    • Hardening an existing widely-used genomic data processing pipeline to enable its reproducible implementation by the biomedical research community.
    • Improved and novel methods for integrating prior biological knowledge into machine learning models.

This FOA does not support:

    • Development, maintenance, or curation of genomic databases and other genomic data resources. Applicants considering developing such resources are directed to the Genomic Community Resources (U24) program:
    • Research not generalizable beyond one or a small number of diseases or biological systems. Research utilizing a small number of disease models or biological systems for proof-of-concept studies may be acceptable when the resulting methods, tools, approaches, or software are generalizable.
    • Development and application of ontologies or controlled vocabularies, or manual curation efforts.
    • Basic data science research that is not developed for genomics.
    • Significant experimental work. Applicants may propose limited experimental work to test predictions generated as a result of computational approaches and/or inform modeling efforts, but this should not be a major focus of the application.
    • Approaches not clearly pertaining to computational genomics and data science and/or lacking relevance to human health and disease.
    • Work focused on microbial genomics or the microbiome.

In addition to this PAR, NHGRI participates in several funding opportunities, including the parent R01 and R21 announcements.

All applicants are strongly encouraged to contact NHGRI Program Staff to discuss the alignment of their proposed work with the goals of this FOA prior to submitting an application.

See Section VIII. Other Information for award authorities and regulations.