
Chief Digital and Artificial Intelligence Office (CDAO) Data Mesh Reference Design (REFDES)


OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software; Integrated Network Systems-of-Systems

 

OBJECTIVE: To break the Department of Defense (DoD) enterprise data out of stovepipes created for single use cases, and to make all data seamlessly interoperable across the department, while retaining federated control, hosting, and ownership.

 

DESCRIPTION:

The Department aims to establish a set of software services to allow data users across the Department of Defense (DoD) enterprise to discover DoD data products, understand their structure and meaning, seamlessly negotiate access, and consume them via self-service API. Per DoD rulemaking, data access must support attribute-based access control (ABAC) and operate in a zero-trust environment.
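To illustrate the ABAC and zero-trust requirement, the following is a minimal sketch of a default-deny access decision evaluated per request. Everything in it is an assumption made for illustration: the names `AccessRequest`, `is_access_permitted`, `clearance_dominates`, and `device_is_managed`, the attribute keys, and the simplified classification ordering are hypothetical and are not drawn from any DoD specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical attribute bags; real attribute names would come from DoD policy.
Attributes = Dict[str, str]


@dataclass
class AccessRequest:
    subject: Attributes      # attributes of the requesting user or service
    resource: Attributes     # attributes of the requested data product
    environment: Attributes  # request context, e.g. device posture


# A rule returns True when its condition over the request attributes holds.
Rule = Callable[[AccessRequest], bool]

# Simplified, illustrative ordering only.
LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2}


def clearance_dominates(req: AccessRequest) -> bool:
    # Unknown subject clearance ranks below everything; an unmarked
    # resource ranks above everything, so missing data leads to denial.
    subject_level = LEVELS.get(req.subject.get("clearance", ""), -1)
    resource_level = LEVELS.get(req.resource.get("classification", ""), len(LEVELS))
    return subject_level >= resource_level


def device_is_managed(req: AccessRequest) -> bool:
    return req.environment.get("device_managed") == "true"


def is_access_permitted(request: AccessRequest, rules: List[Rule]) -> bool:
    """Default-deny: permit only if at least one rule exists and all rules pass."""
    return bool(rules) and all(rule(request) for rule in rules)


if __name__ == "__main__":
    request = AccessRequest(
        subject={"clearance": "SECRET"},
        resource={"classification": "CONFIDENTIAL"},
        environment={"device_managed": "true"},
    )
    print(is_access_permitted(request, [clearance_dominates, device_is_managed]))
```

In this sketch no decision is cached and no caller is trusted by network location: each request carries its own subject, resource, and environment attributes, and an empty rule set or a missing attribute results in denial.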

 

The Department has determined that data sharing models with a unified schema and single system of record (sometimes called a “Data Warehouse”) or even semi-unstructured data in a single system of record (“Data Lake”) are not good operational fits for the Department’s requirements. A closer analogue is a “Data Mesh” as described in Dehghani (2022)[1].

 

This program consists of three phases, described in the sections below. First, however, a few definitions.

 

Peer-Cooperative Microservices
Each unique microservice is able to communicate with all other similar/identical microservices to form a specific community.

 

Services vs. Microservices
The rest of this document refers to both services and microservices as “services”, while recognizing the distinction between them. Larger applications built on a single code base typically consist of a client-side UI, a database, and a server-side application. These are services. Microservices, on the other hand, are built for a fully distributed system, and each implements a single feature or piece of business logic. Instead of exchanging data within the same code base, microservices communicate through an API.

 

The DoD has identified 15 core functional capabilities that an enterprise data mesh at the Department must have:

  1. UIDs: Tools to describe how data transforms and flows as it is transported from source to destination across the entire data lifecycle. Data versioning for tracking data and models as they change. [A prototype of this is available, accompanied by a whitepaper describing its recommended structure]
  2. Semantic Services: Tools to promote sharing, collaboration, and reuse of data models and ontologies; alias re-referencing to build a canonical controlled vocabulary. [A prototype of this is available, accompanied by a whitepaper describing its recommended structure]
  3. Federated Data Catalog: Virtually federated catalog enabling defense-wide visibility of data and interfaces through pointers to DoD assets and services. [Multiple instantiations exist]
  4. Data and Metadata Profiles (xBOMs): Managed service providing attribution and characteristics that describe the meaning and intended use for data, metadata, algorithms, hardware, software, and data objects. [A whitepaper describing its structure and the recommended schema are available]
  5. Policy Access Control: Tools for ensuring proper access restrictions and identity verification for all consumers and producers in the data mesh.
  6. Digital Policy Administration: Policy administration points feeding enforcement points enabling managed data access across environments.
  7. Data Exchange Management: Handles and routes requests via any exchange method (e.g., API, cloud storage location, access-denied environments) to appropriate services.
  8. Data Product Search: Tool for fast, relatable, and semantically congruent searching of all data products. Provides intuitive result finding for ingenuity and novel discovery of data products.
  9. Data Mesh Pub/Sub: Systems of producers and consumers connected by asynchronous service-to-service communication.
  10. Mesh Performance Analytics: Track the flow and usage of data across the mesh. Flow monitoring and alerting.
  11. Data Product Lifecycle Management: Submits data products for registration to the domain and enterprise catalogs. Updates/maintains/revokes registration as necessary. Manage recalled data products. Provide recall and other data product-associated notifications to data product consumers.
  12. Data Security Classification: Tools and policy for proper marking of all types of sensitive data across the DoD. Includes an approach to handling escalation of classification due to data aggregation.
  13. Quality Management Services: Tools for properly computing quality metrics on data and marking the data appropriately with its quality level.
  14. Mediation Hub Services: Managed service for coordinating automated translation capabilities from data producer schemas and contexts to those of consumers, for immediate use without further transformation. The managed service consumes structured metadata about the schema and content of the producer data, available on the Mesh (e.g., from its xBOMs), and the target information about the consumer’s schema and context, and then sends the producer-provided data to one or more translation services as required to return the translated data to the consumer. Implementation of specific translation services is outside the scope of this proposal; the Mediation Hub only coordinates their use and manages the mapping of producer to consumer and the required metadata.
  15. Mesh Instrumentation Tools: Behavioral analytics on data streams to allow performance optimization and asset value determination.
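The coordination role described for the Mediation Hub (capability 14) can be sketched as follows. This is only an illustration under stated assumptions: the class `MediationHub`, its methods, the schema names, and the record fields are all hypothetical, and the translators are stand-in functions for external translation services, whose implementation the list above places out of scope. The sketch assumes the hub knows which schema pairs each translator covers and picks the shortest chain of translators with a breadth-first search.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]
Translator = Callable[[Record], Record]
Hop = Tuple[str, str]  # (source schema, target schema)


class MediationHub:
    """Coordinates translators; it does not implement translation itself."""

    def __init__(self) -> None:
        # Maps (source_schema, target_schema) to a translation service.
        self._translators: Dict[Hop, Translator] = {}

    def register(self, source: str, target: str, translator: Translator) -> None:
        self._translators[(source, target)] = translator

    def _find_route(self, source: str, target: str) -> Optional[List[Hop]]:
        """Breadth-first search for the shortest chain of registered hops."""
        queue = deque([(source, [])])
        visited = {source}
        while queue:
            schema, route = queue.popleft()
            if schema == target:
                return route
            for src, dst in self._translators:
                if src == schema and dst not in visited:
                    visited.add(dst)
                    queue.append((dst, route + [(src, dst)]))
        return None

    def mediate(self, record: Record, producer_schema: str, consumer_schema: str) -> Record:
        route = self._find_route(producer_schema, consumer_schema)
        if route is None:
            raise LookupError(f"no translation route from {producer_schema} to {consumer_schema}")
        for hop in route:
            record = self._translators[hop](record)
        return record


# Stand-ins for external translation services (hypothetical schemas and fields).
def logistics_to_canonical(rec: Record) -> Record:
    return {"item_id": rec["part_no"], "quantity": rec["qty"]}


def canonical_to_dashboard(rec: Record) -> Record:
    return {"id": rec["item_id"], "count": rec["quantity"]}


if __name__ == "__main__":
    hub = MediationHub()
    hub.register("logistics.v1", "canonical", logistics_to_canonical)
    hub.register("canonical", "dashboard.v2", canonical_to_dashboard)
    print(hub.mediate({"part_no": "A-100", "qty": 4}, "logistics.v1", "dashboard.v2"))
```

In a real mesh the schema identifiers would presumably be resolved from producer and consumer metadata (for example, from xBOMs) rather than passed in by hand. Routing through an intermediate schema is one design choice among several; it keeps the number of translators linear in the number of schemas rather than quadratic.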

 

Several of these services are thought to be available as commercial off-the-shelf (COTS) software products. In all cases, the Department is interested in keeping the resulting mesh services modular, with clear interfaces and clear separation of concerns.

 

PHASE I: The output of Phase I is a formal REFDES composed of textual documentation and visual images, as appropriate, to convey all concepts and their interoperability. The REFDES must be produced with DoD-approved architecture tools and document creation software (e.g., Cameo) and must be in accordance with (IAW) the DoD CIO Reference Design guidance[2]. For any COTS or hybrid COTS/GOTS (Commercial Off-the-Shelf/Government Off-the-Shelf) service, the REFDES must include the interoperability approach with all other services. It must also describe the enterprise interoperability services that promote uniform, pattern-based communication among all services and data. A separate output, for Phase I performers that proposed an option, is a program plan that includes a detailed Phase I option plan bridging to the performer’s level 3 breakdown structure for the Phase II effort.

 

The required REFDES must address the key concepts identified in the provided outline. Any deviations from this outline must be approved by the federal government lead. The complete REFDES must clearly articulate how all services will achieve both service-level communication interoperability and data interoperability. The end product shall enable any developer to design, develop, and implement any or all of the services independent of any other developer while ensuring full interoperability among all delivered capability. It should be accompanied by a time-phased roadmap for service evolution.

 

The REFDES outline is below.

Reference Design Outline

Introduction

Background

Purpose

Scope

Document Overview

Assumptions and Principles

Assumptions

Principles

Capability Concepts

Key Terms and Conceptual Model

Lifecycle Management

Management Environment

Organization

Process

Technology

Governance

Ecosystem

Planning

Production Services

Operations

External Systems

Tools and Activities

Planning Tools and Activities

Develop

Build

Test

Release and Deliver

Production Operation Tools and Activities

Deploy

Operate

Monitor

Sustain

Support

Security

Deployment

Operation

Monitoring

Acronym Table

Glossary

References

 

Key concepts for Data Mesh componentry shall at a minimum include:

    • Visible, Accessible, Understandable, Linked, Trustworthy, Interoperable, and Secure (VAULTIS) compliance
    • Services communication model and framework
    • Data Templating
      • Machine-readability
      • Machine-comprehensibility
    • Dynamic Attribution association
    • Automated notification services
    • Cybersecurity and Zero Trust support

 

The REFDES concept of operations (CONOPS) should consider the provided information describing service, COTS, GOTS, and the implementation model (Figure 3).

 

The DoD-supplied information papers found in the References section provide the minimally acceptable architecture and REFDES concepts. The strong default is for these to be individual services. Any deviation from the specified approaches shall be approved by the Government.

 

Figure 3

 

PHASE II: In Phase II, participants will create a Minimum Viable Product (MVP) version of the chosen design, building versions of the systems in the selected Reference Design(s) that are complete enough to demonstrate that they can achieve the DoD’s final objective. The Phase II deliverables provide the foundational understanding or capability basis for Phase III. Phase II should include a viable proof of concept (POC), matured to MVP 1, for each of the 15 services, demonstrated both independently and cooperatively as a mesh component.

 

PHASE III DUAL USE APPLICATIONS: In Phase III, participants will create the balance of the required services and deliver a full production capability that meets all requirements for infrastructure compliance while delivering to the end-user community the advantages outlined in VAULTIS. The result is a fully operating data mesh that achieves full data interoperability with minimal to no human intervention for specific data exchange. The resulting mesh will support interoperability for applications both on the battlefield (e.g., Coalition Joint All Domain Command and Control (CJADC2), military exercises) and in the boardroom (e.g., dashboarding, regular reporting).

 

REFERENCES:

  1. https://www.oreilly.com/library/view/data-mesh/9781492092384/
  2. https://dodcio.defense.gov/Portals/0/Documents/Library/DoD%20Enterprise%20DevSecOps-Pathway%20to%20a%20Reference%20Design_DoD-CIO_20211018.pdf
  3. https://aws.amazon.com/compare/the-difference-between-monolithic-and-microservices-architecture/
  4. Data Mesh Reference Architecture (DMRA) paper: https://media.defense.gov/2024/Mar/15/2003414274/-1/-1/1/dmra_paper.PDF
  5. Unique Identifier (UID) Whitepaper: https://media.defense.gov/2024/Mar/15/2003414275/-1/-1/1/unique_identifier_wp.PDF
  6. Canonical Controlled Vocabulary (CCV) Whitepaper: https://media.defense.gov/2024/Mar/15/2003414273/-1/-1/1/canonical_controlled_VOC_wp.PDF
  7. eXtensible Bill of Material ([x]BOM) paper: https://media.defense.gov/2024/Mar/15/2003414075/-1/-1/1/xBOM_paper.PDF

 

KEYWORDS:  Microservices; Data Mesh; Data Interoperability; Data Sharing Capability; VAULTIS
