
Generative Text Engine for Form Completion

Description:

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software; Human-Machine Interfaces; Trusted AI and Autonomy

 

OBJECTIVE: This SBIR topic solicits tools and techniques that facilitate generating semi-structured reports containing free-form text. There is research interest in applying generative text artificial intelligence (AI) (such as ChatGPT, GPT-3/4, etc.) to facilitate filling in text-based data collection forms; however, other tools and approaches will be considered if the proposal explains how they would contribute to the requested capability. The data generated by this general-purpose form completion engine will reduce the data curation needed for subsequent analytics. The desired solution will:

(1) generate a general-purpose curation/creation text engine that facilitates completing a variety of text-based forms.

(2) describe a mechanism for incorporating technical terminology and phrasing appropriate for specific usage domains (potentially including sensitive or classified terminology and phrases) along with a general baseline generative text engine.

(3) be designed to be useful with minimal compute, and without immediate or sustained connection to cloud-based processing resources. Cloud-based, processing-intensive resources may be used in developing the general-purpose engine and achieving threshold performance, but the proposal must describe how the initial capability will be refined to be useful with a minimal compute and storage footprint. Further, the proposal must state the size and processing capabilities required to achieve threshold and objective (final) performance in the desired system.

(4) describe any key technologies being used in creating the capability, and clearly characterize the data usage rights associated with those capabilities.
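
As a rough illustration of the sizing statement requested in item (3), the storage needed for model weights alone can be estimated as parameter count times bits per weight, divided by eight. The sketch below is only that arithmetic; the 7-billion-parameter figure and the bit widths are assumed example values, not requirements of this topic, and the estimate excludes activation memory, caches, and runtime overhead.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate storage for model weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed example: a 7-billion-parameter model at three common precisions.
for bits in (16, 8, 4):
    print(f"7B parameters at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

An estimate of this kind gives only a lower bound on the device footprint; a proposal would still need measured figures for the complete system.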

 

The concept being proposed in this SBIR topic shall demonstrate the use of generative text algorithms to curate text entries as they are being created. The desired solutions should:

(1) focus on a workflow / process for a prompting dialog between the generative text engine and the user vice developing large language models. It is expected that some tuning of large language models may be required to address a specific technical domain, but that should be as constrained as possible to focus on the process whereby users interact with the models to facilitate form completion.

(2) be easily adapted for incorporating technical jargon and domain specific phrases for different usage domains. The technique(s) for incorporating specialized technical language into the application must be described.

(3) address anticipated prompt tuning techniques to adapt to specific technical domains enabling techniques for one-shot or few-shot learning.

(4) generate appropriate, correctly structured phrases and descriptions in different task domains, reflecting an understanding of what is being described, and produce consistent and appropriate technical descriptions.

(5) be scalable for use from PCs/Tablets/Phones with limited connectivity to a local server and be cloud-connected, not cloud-dependent.

(6) provide for the use of instructions + answers as a sustainable workflow for maintaining / utilizing the authoring / curation engine.
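
One way to picture items (2), (3), and (6) together is a prompt assembled from a small domain glossary and a few stored instruction + answer pairs, followed by the user's own notes. The sketch below is a minimal illustration under stated assumptions: the names `Example` and `build_prompt`, the prompt layout, and the sample glossary and report wording are all hypothetical, and no particular language model or API is implied.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One instruction + answer pair used as a few-shot demonstration."""
    instruction: str
    answer: str


def build_prompt(field_name, user_notes, glossary, examples):
    """Assemble a few-shot prompt for one free-text form field.

    glossary maps informal wording to the preferred domain term.
    """
    lines = [f"Complete the '{field_name}' field of a maintenance form."]
    if glossary:
        lines.append("Use these preferred terms:")
        for informal, preferred in sorted(glossary.items()):
            lines.append(f"- write '{preferred}' instead of '{informal}'")
    # Stored demonstrations come first, then the new request left open.
    for ex in examples:
        lines.append(f"Instruction: {ex.instruction}")
        lines.append(f"Answer: {ex.answer}")
    lines.append(f"Instruction: {user_notes}")
    lines.append("Answer:")
    return "\n".join(lines)


# Hypothetical glossary entry and demonstration, for illustration only.
glossary = {"evap": "distilling plant"}
examples = [Example("pump 2 leaking at seal", "No. 2 pump mechanical seal leaking.")]
prompt = build_prompt("Problem Description", "evap not making water", glossary, examples)
print(prompt)
```

Because the glossary and the demonstrations are plain data, adapting to a new technical domain in this sketch means swapping those two inputs rather than retraining a model.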

 

DESCRIPTION: This effort is aimed at enabling the creation of text-based forms with consistent terminology and phrasing by applying generative text artificial intelligence (AI) technology during the authoring of form content. The desired technology will assist content creators by offering interactive curation during the content authoring. The application of the developed technology will result in more consistent form content that is amenable to automated analytics on the generated text and will therefore accelerate and improve accuracy of ship maintenance reporting.

 

New advances in integrating Large Language Models (LLMs) into application pipelines have demonstrated the potential to support a wide range of technical reporting domains; however, there are significant challenges in generating text with relevant content and terminology when completing maintenance reports. While LLMs show impressive general knowledge and reasoning capabilities, they have inherent limitations and lack capabilities required for broad language understanding and use in the real world (e.g., specialized or proprietary knowledge of terms, facts, and concepts). Fine-tuning, parameterizing, and combining LLMs with external tools should produce capabilities that make LLMs more useful in real-world settings, such as facilitating the completion of form-based descriptions of technical problems and their impacts. The desired applications will provide customized content to support maintenance reporting workflows and answer technical questions across a variety of maintenance reporting use cases.

 

PHASE I: Conduct research on open-source LLMs with commercially permissive licenses (e.g., Apache 2.0, MIT) to identify, select, and track appropriate models that have the potential to perform well for the Navy domain and desired downstream tasks. Selected models must be usable in both research and commercial settings. The solution will need to work on resource-constrained devices (e.g., tablets, laptops), which may be disconnected from the Internet and cloud-based resources during form authoring. To improve the performance of models in deployment environments, different techniques (e.g., distillation, supervised fine-tuning, parameterization) should be identified, explored, and evaluated to ensure correct information is generated for the defined downstream tasks. Define the task and data sources that will be used to act as a suitable proxy for ship maintenance reporting, which involves consistently generating the text necessary to fill in ship maintenance forms. The longer-term technical objective is a general-purpose form-completion engine that can be readily adapted to various technical domains and terminologies and utilize alternative technical jargon and phraseology. The selected LLM and a systems-based approach will minimize model behaviors that generate incorrect content for the selected domains and defined tasks. It is assumed that the task being performed will require new knowledge that was not part of the pre-training data of a general large language model. Successful approaches will securely combine new private data into the workflow and customize the LLM for a target domain and authoring task. Phase I should result in proof-of-concept demonstrations of key capabilities that show how a prototype tool will be built and demonstrated during Phase II. The primary metrics for Phase I success will be the quality of the proposed workflows for user interaction and a demonstrated use case showing how forms would be completed using a representative large language model.

 

PHASE II: Build on the tools and results of Phase I to create a viable prototype tool for form completion. Utilize real-world form completion tasks. Ideally the problems and real-world data sources would relate to Navy ship maintenance reporting and ship material readiness, although use cases for other transition customers would be acceptable. A prototype tool will be built and tested to demonstrate a proof of concept involving a user interacting with the system to produce a complete and accurate report. The Ship's Maintenance Action Form (OPNAV 4790 or two-kilo) is an example of a primary maintenance data system (MDS) form that would be of interest; it is used to report both deferred and completed maintenance actions. The mission-degrading casualty report (CASREP) is another example; it is used to report to the operational commander an equipment degradation that impacts mission readiness. Automated tools will (1) generate text and fill in these semi-structured forms with free-form text fields, (2) reduce data curation requirements, and (3) enable analytics on the curated data.
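
To make "semi-structured forms with free-form text fields" concrete, the following minimal sketch models a form as a list of fields, some flagged as free text, and derives the next prompting question from whichever required field is still empty. The class names, field names, and sample values are hypothetical and do not reproduce the layout of the OPNAV 4790 form or the CASREP.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    name: str
    free_text: bool = False   # True for narrative fields the engine may draft
    required: bool = True
    value: str = ""


@dataclass
class Form:
    title: str
    fields: list = field(default_factory=list)

    def missing(self):
        """Names of required fields that are still empty."""
        return [f.name for f in self.fields if f.required and not f.value.strip()]

    def next_question(self):
        """Return the next question to ask the user, or None when complete."""
        todo = self.missing()
        return f"Please provide: {todo[0]}" if todo else None


# Hypothetical report with one structured field and two narrative fields.
report = Form("Hypothetical maintenance report", [
    FormField("Equipment", value="No. 2 fire pump"),
    FormField("Problem description", free_text=True),
    FormField("Remarks", free_text=True, required=False),
])
print(report.next_question())
```

In a fuller prototype, the `free_text` flag is where a generative engine would be asked to draft or curate the entry, while structured fields would be validated directly.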

 

For Naval applications, the contractor will need to be able to process Controlled Unclassified Information (CUI) and/or classified data sources up to the Secret level. The government team will provide contractor access to historical reports to support development and evaluation of the proposed techniques, automated tools, and analytics (e.g., text generators, classifiers). The historical text was often written inconsistently, which makes it challenging to automate analytics across this data. Address inconsistencies and unique language in the various text reporting workflows and describe how the proposed capabilities will support generation of high-quality data for reporting. Describe and demonstrate analytics/metrics that assess the quality of the text being generated. Assess how the tool will run on resource-constrained hardware (e.g., tablets, laptops) with reasonable compute capabilities and document its ability to run on-line and off-line (i.e., that the developed technology would be suitable for shipboard/at-sea use with limited access to cloud/remote computing capabilities). The tool will provide a tailorable vocabulary database suitable for use across different technical reporting domains (e.g., electrical systems, distillation systems, turbine mechanics, etc.). The workflow and user interface will be fully described and demonstrated as appropriate. The workflow shall be demonstrably easy to use and will demonstrate valid, predictive results. Technical evaluations, capability demonstrations, and metrics will focus on the quality of the human-machine interaction (HMI), the completeness and correctness of reports, and the generalizability of the approach across technical reporting domains; these shall be addressed at the completion of Phase II.
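
The tailorable vocabulary database and the text-quality metrics described above could, in a very simple form, look like the sketch below: a mapping from non-preferred variants to preferred terms, a normalization pass, and a consistency score equal to the fraction of reports that already use only preferred terms. The function names, the vocabulary entries, and the sample reports are assumptions made for illustration; a real metric would likely be richer.

```python
import re


def normalize(text, vocabulary):
    """Replace each known variant with its preferred term (case-insensitive)."""
    for variant, preferred in vocabulary.items():
        pattern = r"\b" + re.escape(variant) + r"\b"
        # A function replacement inserts the preferred term literally.
        text = re.sub(pattern, lambda m, p=preferred: p, text, flags=re.IGNORECASE)
    return text


def consistency(reports, vocabulary):
    """Fraction of reports that contain no non-preferred variants."""
    if not reports:
        return 1.0
    clean = sum(1 for r in reports if normalize(r, vocabulary) == r)
    return clean / len(reports)


# Hypothetical vocabulary and report texts.
vocabulary = {"evap": "distilling plant", "xfmr": "transformer"}
reports = [
    "Evap secured due to low output.",
    "Distilling plant secured due to low output.",
    "xfmr overheating",
]
print(normalize(reports[2], vocabulary))
print(round(consistency(reports, vocabulary), 2))
```

Swapping in a different `vocabulary` mapping is the only change needed to apply this sketch to another reporting domain, which is the sense in which such a database is tailorable.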

 

PHASE III DUAL USE APPLICATIONS: Integrate and transition the developed tools for support of the NAVSEA SEA21 Ship Maintenance Data Improvement Initiative (SMDII) Program of Record (POR) to support automated text processing requirements for Navy ship maintenance reporting and ship material readiness. The tools being developed are expected to be applicable to a broad range of form completion applications, including for medical, maintenance, and other domains reliant on text-based data entry.

 

REFERENCES:

  1. DoD Instruction: Casualty Reporting (CASREP) Policy (Material) (2014). https://media.defense.gov/2023/Jan/05/2003140406/-1/-1/0/CI_3501_3G.PDF
  2. COMUSFLTFORCOMINST 4790.3B, Joint Fleet Maintenance Manual (JFMM). https://www.navsea.navy.mil/Portals/103/Documents/SUBMEPP/JFMM/Searchable_JFMM_Rev_D-1.pdf
  3. CNRMC Fleet Desk Guide (FDG). https://dodcac.portal.navy.mil/navsea/CNRMC/fdg/default.aspx
  4. Shen, Yongliang; Song, Kaitao; Tan, Xu; Li, Dongsheng; Lu, Weiming and Zhuang, Yueting. (2023). “HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face.” arXiv:2303.17580v2 [cs.CL] 2 Apr 2023. https://github.com/microsoft/JARVIS
  5. Brown, Tom B. et al. “Language Models are Few-Shot Learners.” Conference on Neural Information Processing Systems (NeurIPS), 2020.
  6. Ouyang, Long et al. “Training language models to follow instructions with human feedback.” ArXiv, 2022. https://arxiv.org/abs/2203.02155
  7. Chowdhery, Aakanksha et al. “PaLM: Scaling language modeling with pathways.” ArXiv, abs/2204.02311, 2022.
  8. Zhang, Susan et al. “OPT: Open Pre-trained Transformer Language Models.” ArXiv, abs/2205.01068, 2022.
  9. Zeng, Aohan et al. “GLM-130B: An Open Bilingual Pre-trained Model.” ICLR 2023 poster, 2023.
  10. Touvron, Hugo et al. “LLaMA: Open and Efficient Foundation Language Models.” ArXiv, abs/2302.13971, 2023.
  11. Xie, Sang Michael; Raghunathan, Aditi; Liang, Percy and Ma, Tengyu. “An Explanation of In-context Learning as Implicit Bayesian Inference.” ICLR 2022 Poster, 2022.
  12. Min, Sewon et al. “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2022.

 

KEYWORDS: Automated Text Curation, Large Language Models (LLM), 2-Kilos, CASREP, Casualty Report, Form authoring, Artificial Intelligence, AI
