
LAGRANGIAN RELAXATION DEEP REINFORCEMENT LEARNING SYSTEMS AND METHODS FOR WEAKLY COUPLED MARKOV DECISION PROCESSES

Application US20260080256A1 Kind: A1 Mar 19, 2026

Inventors

Ibrahim EL SHAR, Haiyan WANG, Chetan GUPTA

Abstract

Systems and methods described herein train a deep reinforcement learning agent to solve weakly coupled Markov decision processes using Lagrangian relaxation in a model-free setting. By relaxing the linking constraints, separate subproblems may be obtained that are easier to solve when considered individually. In embodiments, this is accomplished by collecting experience tuples from a main problem, decomposing them into subproblems, and introducing Lagrangian multipliers to manage the linking constraints. Transition experiences are stored in a replay buffer, and Lagrangian action-values are learned for each subproblem via DQN using a relaxed Bellman equation. The method includes estimating the overall Lagrangian action-value function, solving an optimization problem over the Lagrangian multipliers, and choosing actions greedily. Various embodiments iteratively improve a policy and integrate subproblem solutions into a main problem solution, applying a policy learned by subagents using a single deep Q-network in real-world scenarios without prior knowledge of the environment.
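The abstract's loop (decompose transitions into subproblems, learn relaxed action-values per subproblem, update the multipliers, act greedily) can be illustrated with a minimal sketch. This is a hedged, illustrative toy, not the claimed implementation: it uses tabular Q-learning in place of the DQN, a single scalar multiplier, and a made-up weakly coupled MDP (the names `N_SUB`, `BUDGET`, `reward`, `step` are all assumptions for illustration).

```python
import numpy as np

# Illustrative tabular stand-in for the Lagrangian-relaxed DQN described
# in the abstract. All problem parameters and dynamics below are invented
# for the sketch; they do not come from the patent filing.
rng = np.random.default_rng(0)

N_SUB = 3        # number of weakly coupled subproblems
N_STATES = 4     # states per subproblem
N_ACTIONS = 2    # action 1 consumes one unit of the shared resource
BUDGET = 1.0     # linking constraint: total consumption per step <= BUDGET
GAMMA = 0.9      # discount factor
ALPHA = 0.1      # Q-learning step size
LAM_LR = 0.01    # dual (multiplier) step size

# One Q-table per subproblem, learned from decomposed transitions.
Q = np.zeros((N_SUB, N_STATES, N_ACTIONS))
lam = 0.0  # Lagrange multiplier pricing the linking constraint


def reward(s, a):
    # Toy subproblem reward: acting in higher-index states pays more.
    return float(s) * a


def step(s, a):
    # Toy random-walk subproblem dynamics.
    return (s + a + rng.integers(0, 2)) % N_STATES


states = rng.integers(0, N_STATES, size=N_SUB)
for _ in range(5000):
    # Greedy actions w.r.t. the relaxed values: each subproblem
    # maximizes Q_i(s_i, a) - lam * cost(a) independently.
    actions = np.array([
        int(np.argmax(Q[i, states[i]] - lam * np.arange(N_ACTIONS)))
        for i in range(N_SUB)
    ])
    next_states = np.array([step(states[i], actions[i]) for i in range(N_SUB)])

    # Relaxed Bellman update per subproblem: the resource cost enters
    # the reward priced by the current multiplier.
    for i in range(N_SUB):
        r = reward(states[i], actions[i]) - lam * actions[i]
        target = r + GAMMA * Q[i, next_states[i]].max()
        Q[i, states[i], actions[i]] += ALPHA * (target - Q[i, states[i], actions[i]])

    # Dual subgradient step: raise lam when the budget is exceeded,
    # keeping it nonnegative.
    lam = max(0.0, lam + LAM_LR * (actions.sum() - BUDGET))
    states = next_states

print(round(lam, 3), Q.shape)
```

In a full implementation along the lines the abstract describes, the per-subproblem tables would be replaced by a single deep Q-network shared across subagents, transitions would be sampled from a replay buffer, and the multiplier update would come from solving the stated optimization problem over Lagrangian multipliers rather than a plain subgradient step.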

CPC Classifications

G06N 3/092

Filing Date

2024-09-16

Application No.

18886832