Technology Development for Single-Molecule Protein Sequencing (R43/R44 Clinical Trial Not Allowed)


Purpose

This Funding Opportunity Announcement (FOA) solicits R43/R44 grant applications to accelerate innovation and early development in the emerging field of single-molecule protein sequencing (SMPS). This FOA seeks novel technologies or significant improvements to existing technologies. Exploration of technologies other than those currently being commercialized is highly encouraged. The short-term goal of this initiative is technological improvements that lead to significant gains in one or more of throughput, cost, accuracy, sensitivity, dynamic range, and scale, or in combinations of these. The long-term goal is to achieve technological advances that enable generation of protein sequencing data at sufficient scale, speed, cost, and accuracy for routine use in studies of genome biology and function, and in biomedical and clinical research in general.

Background

Modern next-generation nucleic acid sequencing (NGS) has transformed the fields of genomics and biomedicine through its high throughput, low cost, and generalizability, and has enabled many clinical applications and research projects that are producing striking insights into biology and disease. Proteomics has not had the same success or widespread adoption, partly because proteomics technologies lack sufficient sensitivity and dynamic range for protein detection.

The human proteome is also extremely complex. The 20 amino acid building blocks of proteins and their modifications convey a great array of chemical diversity. The size of any given proteome is a matter of debate. For pathogens, the range varies depending on genome size and pathogen lifestyle. For humans, it theoretically ranges from about 20,000 (a single representative protein for every gene) to several million when factoring in combinations of DNA-, RNA-, and posttranslational modification (PTM)-level variations (proteoforms). Not all genes and their gene-product combinations are expressed in a given cell, however.
In humans, it is estimated that about half of all genes (roughly 10,000) are expressed as proteins at over 20 copies per cell in a typical cell type, and proteins per cell may range from 1,000 to 15,000 depending on cell type and cell cycle stage. In addition, protein abundance within a human cell can span 7 to 10 orders of magnitude. These factors compound the technological challenge of comprehensively characterizing a given proteome.

Currently, the two main approaches to measuring proteins are affinity reagents and mass spectrometry. While these two approaches are extremely valuable, both have limits with respect to proteome-wide detection. Affinity reagents have excellent spatial resolution, but they rely on custom reagents, which limits unbiased work and scale. Mass spectrometry has long been the mainstay technology for large-scale protein sequencing and quantitation; however, it lacks the dynamic range and sensitivity needed to routinely detect low-abundance proteins. Both techniques are constrained to operate above a given abundance level. To achieve proteome-scale capability, proteomics methods need to improve significantly in speed, sensitivity, and dynamic range.

An emerging approach, single-molecule protein sequencing (SMPS), is based on reading amino acid sequence from individual protein or peptide molecules and could possibly address these needs. The most common single-molecule techniques being developed are based on nanopores, parallel-in-space fluorescent methods, and tunneling currents across nanogaps. These SMPS techniques have the potential to detect and quantify very small amounts of protein and proteoforms, and to approach the scale needed to enable the analysis of complex human protein samples. The realization of SMPS technologies would revolutionize the field of proteomics, allowing for a more facile, unbiased cataloguing and assessment of protein-coding gene products in the human genome.
Objectives

This FOA seeks to fund technology development research efforts in instrumentation innovation and sample preparation/processing approaches for single-molecule protein sequencing. The technology development proposed should have the potential to significantly propel the field of SMPS forward in the next five years, and to have a large impact on future studies of genome biology or genome function in general, but particularly in the context of cancer, cancer therapy, and infectious and immune-mediated diseases. The proposed research must also have clear potential to scale proteome-wide. The technology proposed can introduce substantially novel approaches or significantly improve (i.e., by no less than tenfold) existing methodologies for SMPS.

The FOA deliberately does not specify cost, quality, scale, sensitivity, dynamic range, throughput, or other key metrics, since achievable endpoints are likely to improve during the course of this initiative and can differ substantially from one technology to another. However, the applicant must propose quantitative metrics so that progress can be evaluated, and must provide a convincing rationale that the proposed technology has the potential to scale long-term and to achieve a throughput compatible with widespread adoption by the proteogenomics, biomedical, and clinical research community.

Given the complexity of a given proteome, as noted above, for the purposes of this initiative “potential to scale proteome-wide” can be defined in the context of the specific technology and applications being proposed, and will depend on the chosen species, cell type, and sample amount, as well as breadth vs. depth.
Examples include: routinely identifying and quantifying, in a single experiment, unique protein products from 10,000 genes from any human cell type; analyzing thousands of proteins from small volumes of body fluids (serum, plasma, cerebrospinal fluid, sputum, etc.); identifying and quantifying unique proteins from 1,000 genes from a single human cell; in-depth sequencing and proteoform analysis of pathogen antigens present in human samples; or a combined analysis of thousands of proteins and their proteoforms for unbiased biomarker discovery and clinical application. Applications that do not address long-term potential to scale will be considered non-responsive to the FOA.

Applicants may choose to work with any species and cell types, but the resulting technological advancements must be applicable to protein sequencing in human cells, to increase our understanding of basic biological processes related to human health and disease. Further, the proposed technology should have the potential to go beyond the particular disease, PTM type, or cell/tissue type chosen for study and be more broadly applicable to the research community.

It is expected that applicants will develop and detail scientific and practical definitions of optimal throughput, cost, accuracy, sensitivity, dynamic range, and scale. Applicants are expected to propose innovations or improvements of no less than an order of magnitude, based on the state of the art at the time the application is submitted. Such improvements may be achieved by focusing on one critical factor, or on a combination of important ones, to develop complete systems or novel key components for SMPS. While this opportunity is intended to focus on amino acid sequencing approaches, this FOA is open to nucleic acid readouts as a proxy for protein sequencing (i.e., reverse-translation technology that turns peptide sequences into DNA).
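As an illustration of the order-of-magnitude expectation described above, the following is a minimal sketch of how proposed quantitative metrics might be compared against a state-of-the-art baseline. It is not part of the FOA: the metric names, the baseline and proposed values, and the helper functions `fold_improvement` and `summarize` are all hypothetical and chosen only to show the arithmetic.

```python
import math

# Hypothetical metrics; names and numbers are illustrative, not from the FOA.
# Each entry: (baseline value, proposed value, True if a higher value is better).
METRICS = {
    "reads_per_run": (1e6, 2e7, True),
    "cost_per_sample_usd": (500.0, 40.0, False),
    "error_rate": (0.10, 0.02, False),
}


def fold_improvement(baseline, proposed, higher_is_better):
    """Return the improvement factor, oriented so that a value above 1 means better."""
    if baseline <= 0 or proposed <= 0:
        raise ValueError("baseline and proposed values must be positive")
    return proposed / baseline if higher_is_better else baseline / proposed


def meets_threshold(baseline, proposed, higher_is_better, min_fold=10.0):
    """True if the improvement is at least min_fold (tenfold by default)."""
    return fold_improvement(baseline, proposed, higher_is_better) >= min_fold


def summarize(metrics, min_fold=10.0):
    """Map each metric name to (fold, orders of magnitude, meets threshold)."""
    rows = {}
    for name, (baseline, proposed, higher_is_better) in metrics.items():
        fold = fold_improvement(baseline, proposed, higher_is_better)
        rows[name] = (fold, math.log10(fold), fold >= min_fold)
    return rows


if __name__ == "__main__":
    for name, (fold, orders, ok) in summarize(METRICS).items():
        status = "meets" if ok else "below"
        print(f"{name}: {fold:.1f}x ({orders:.2f} orders of magnitude), {status} tenfold")
```

Because the text allows the improvement to come from one critical factor or from a combination, a check of this kind would more naturally ask whether any metric clears the threshold than whether all of them do.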
Research primarily focused on computational approaches (such as data analysis/modeling of complex data sets, or understanding error in assigning amino acid position or side-chain identity) is not appropriate for this FOA. However, the proposed experimental research will frequently be paired with computational components, such as analysis of the new or improved data types generated, to maximize usefulness and derive information content.

Examples of possible research topics include, but are not limited to:

- Instrumentation development for single-molecule amino acid resolution (e.g., nanopore, fluorescence, tunneling currents), either as complete systems or components of complete systems
- Advanced microfluidics to allow interfacing of heterogeneous, subnanomolar samples to single-molecule sensors
- Development of novel chemistries, reagents, or dyes for capturing molecules or for differentiating amino acids
- Engineering protein or solid-state nanopores
- Development of nucleic acid readouts as a proxy for protein sequencing, if readouts are at true single-molecule level
- Development of affinity reagent arrays, only if there is a high likelihood of achieving proteome-scale in a single experiment
- SMPS for unbiased sampling of the variable regions of TCR, BCR/antibody, and the binding pockets of MHC/HLA molecules
- Applications of SMPS from samples infected with viral, bacterial, parasite, or fungal pathogens
- SMPS for identifying tumor-specific antigens
- SMPS for sampling HLA/MHC:Ag complexes

Research topics focused on mass spectrometry would not be considered within scope for this FOA.
Applications Not Responsive to this FOA

In summary, applications that would not be considered responsive include:

- Applications that do not address the potential of the proposed technology to scale proteome-wide
- Applications that do not propose significant improvements to existing technologies in one or more critical factors (throughput, cost, accuracy, sensitivity, dynamic range, and scale)
- Applications focused on mass spectrometry technology development
- Applications primarily focused on computational approaches

If applicable, awardees are expected to comply with the NIH Genomic Data Sharing (GDS) Policy. More information on this expectation can be found in the Resource Sharing Plan section.