
Finding Guaranteed RL Control for Satellite Systems


OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Trusted AI and Autonomy; Space Technology


The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.


OBJECTIVE: The focus of this topic is to discover and design machine learning or reinforcement learning architectures for satellite control that are capable of providing guaranteed closed-loop behavior in a nonlinear control setting.


DESCRIPTION: Machine and reinforcement learning techniques have proven useful for creating control schemes that can handle complicated inputs and maneuver vehicles as intended. A fundamental drawback of these controllers is that their capabilities have grown far faster than the theoretical guarantees normally required to ensure their safety; at present, machine learning-based controllers are being rapidly adopted despite this lack of guaranteed safety. The future of satellite autonomy may involve such controllers, but they must be shown to be safe for flight, with guaranteed closed-loop behavior, before they can be deployed in space. This topic aims to discover whether control structures, activation functions, and training results for machine/reinforcement learning controllers exist that can provide formal guarantees on a satellite's behavior and, if so, what they may be.
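The kind of closed-loop guarantee described above can be illustrated in the one setting where it is currently well understood: linear models. The sketch below is purely illustrative and not part of the topic text; it assumes a discretized double-integrator model of single-axis satellite attitude and assumed feedback gains (as might be extracted from a trained agent), and certifies closed-loop stability by checking the spectral radius of the closed-loop system matrix.

```python
import numpy as np

# Illustrative sketch (assumed model and gains, not from the topic text):
# single-axis attitude dynamics discretized as a double integrator,
# state x = [angle, angular rate], input u = torque.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])

# Hypothetical linear policy u = K x, e.g. gains recovered from a trained agent.
K = np.array([[-1.0, -1.8]])

# For the linear closed loop x[t+1] = (A + B K) x[t], stability is
# guaranteed exactly when the spectral radius of (A + B K) is below 1 --
# the type of formal certificate this topic seeks for nonlinear,
# learned controllers.
closed_loop = A + B @ K
rho = max(abs(np.linalg.eigvals(closed_loop)))
print(f"spectral radius = {rho:.3f} -> {'stable' if rho < 1 else 'unstable'}")
```

For nonlinear or neural-network controllers no comparably simple certificate exists, which is precisely the gap the topic asks offerors to close.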


PHASE I: Awardee(s) will conduct a comprehensive review of current research in machine/reinforcement learning techniques and architectures capable of providing performance guarantees in a control context. Awardee(s) will investigate and compile candidate requirements for a satellite controller, along with the theorems needed to demonstrate guaranteed behavior. Awardee(s) will devise a test plan for demonstrating control systems that use machine/reinforcement learning techniques and for comparing their theoretical guarantees against their behavior in practice.


PHASE II: Awardee(s) will develop the theorems and control structures that establish which types of machine learning-based models and training approaches can be shown to provide provable stability guarantees under certain conditions. Awardee(s) will implement these control systems on multiple vehicles, including a representative satellite, showcasing the adherence of these controllers to their theoretical guarantees.


PHASE III DUAL USE APPLICATIONS: In cooperative efforts with one or more satellite software manufacturers and military satellite system developers, awardee(s) should expect to integrate the proposed control algorithms with satellite software. Awardee(s) should expect to demonstrate the control system's performance while it runs on board a satellite. Awardee(s) should expect to evaluate transition opportunities for utilization in approved Government civilian applications.



REFERENCES:
  1. RL uncertainty characterization still relies on statistical estimation: Clements, William R., et al. "Estimating Risk and Uncertainty in Deep Reinforcement Learning." arXiv preprint arXiv:1905.09638 (2019).
  2. Trained RL agents still suffer from brittleness and data issues: Lockwood, Owen, and Mei Si. "A Review of Uncertainty for Deep Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 18, no. 1, 2022.
  3. State-of-the-art guarantees still exist at most only for linear models: Z. Marvi and B. Kiumarsi, "Reinforcement Learning With Safety and Stability Guarantees During Exploration For Linear Systems," IEEE Open Journal of Control Systems, vol. 1, pp. 322-334, 2022, doi: 10.1109/OJCSYS.2022.3209945.


KEYWORDS: Reinforcement learning; nonlinear controls; optimization; satellite control; autonomy
