Inverse Safety Filtering: Inferring Constraints from Safety Filters for Decentralized Coordination

Published 3 Apr 2026 in eess.SY | (2604.02687v1)

Abstract: Safe multi-agent coordination in uncertain environments can benefit from learning constraints from other agents. Implicitly communicating safety constraints through actions is a promising approach, allowing agents to coordinate and maintain safety without expensive communication channels. This paper introduces an online method to infer constraints from observing the safety-filtered actions of other agents. We approach the problem by using safety filters to ensure forward safety and exploit their structure to work backwards and infer constraints. We provide sufficient conditions under which we can infer these constraints and prove that our inference method converges. This constraint inference procedure is coupled with a decentralized planning method that ensures safety when the constraint activation distance is sufficiently large. We then empirically validate our method with Monte Carlo simulations and hardware experiments with quadruped robots.

Authors (4)

Summary

The paper demonstrates that safety filters based on quadratic control barrier functions allow the inference of unknown constraints from observed control deviations.
It introduces a KKT-based analytical method combined with a regularized Newton solver to resolve underdetermined constraint inference in multi-agent settings.
Empirical results and hardware experiments validate nearly perfect collision avoidance and scalable, decentralized coordination under asymmetric information.

Inverse Safety Filtering for Decentralized Multi-Agent Constraint Inference

Introduction and Motivation

Robust multi-agent coordination under uncertainty is central to robotic applications in navigation, manipulation, and monitoring. A central challenge emerges when decentralized agents have asymmetric observations: critical environment constraints (e.g., obstacles) may be known only to subsets of agents, complicating safe coordination without centralized communication. "Inverse Safety Filtering: Inferring Constraints from Safety Filters for Decentralized Coordination" (2604.02687) develops a formal framework allowing agents to infer each other's unknown environmental constraints by observing their safety-filtered control actions. This endows the system with implicit communication capabilities, enabling safety and cooperation without explicit message passing.

Problem Formulation and Constraints

The paper models the system as an $N$-agent, discrete-time dynamic game in which traditional centralized coordination is unattainable due to communication latency or partial observability. Each agent enforces parameterized constraint sets, typically represented by quadratic control barrier functions (CBFs):

$h(s, \theta) = (s - \theta)^\top Q (s - \theta) - r^2$

where $s$ is the state projection (e.g., position), $\theta$ parameterizes the constraint (e.g., obstacle center), $Q$ is positive definite, and $r$ is a safe distance. Agents apply CBF-based safety filters to minimally alter their nominal input, ensuring forward invariance of the safe set over time.
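As a rough illustration of the forward filter, the sketch below implements a minimally invasive safety filter for the quadratic $h$ defined above. It is not the paper's implementation. It assumes single-integrator dynamics ($\dot{s} = u$), a single constraint, and the common continuous-time CBF condition $\nabla_s h^\top u + \alpha h \ge 0$ with a hypothetical gain `alpha`. Under those assumptions the filter's QP reduces to a closed-form projection. All function names are illustrative.

```python
import numpy as np

def h(s, theta, Q, r):
    """Quadratic barrier h(s, theta) = (s - theta)^T Q (s - theta) - r^2."""
    d = s - theta
    return float(d @ Q @ d - r**2)

def safety_filter(u_nom, s, theta, Q, r, alpha=1.0):
    """Solve min ||u - u_nom||^2  s.t.  a^T u + b >= 0 in closed form.

    Assumed setup (not from the paper): single integrator, one constraint,
    a = grad_s h = 2 Q (s - theta), b = alpha * h(s, theta).
    Returns the filtered input and the KKT multiplier (0 if inactive).
    """
    a = 2.0 * Q @ (s - theta)
    b = alpha * h(s, theta, Q, r)
    slack = a @ u_nom + b
    if slack >= 0.0:
        # Nominal input already satisfies the CBF condition.
        return u_nom.copy(), 0.0
    # Project u_nom onto the half-space boundary along a.
    lam = -slack / (a @ a)
    return u_nom + lam * a, lam
```

For example, with `s = [2, 0]`, an obstacle centered at the origin, `Q = I`, `r = 1`, and a nominal input `[-5, 0]` pointing straight at the obstacle, this sketch returns the filtered input `[-0.75, 0]`, which sits exactly on the boundary of the CBF condition.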

When constraints are only partially known, the core problem is for one agent to infer $\theta$ by observing another agent's nominal and safety-filtered actions, $(u_\text{nom}, u_\text{safe})$, together with the relevant state.

Inverse Constraint Inference via KKT Structure

The key insight is that the KKT conditions of the quadratic CBF safety filter permit one to analytically recover the unknown constraint parameter $\theta$ under sufficient excitation and actuation.

Figure 1: Illustrative scenario where asymmetric constraint knowledge is resolved through decentralized inference; one agent infers the obstacle known only to its peer via observed control deviations.

When a single constraint is active (e.g., obstacle avoidance only), a closed-form solution for $\theta$ exists due to the quadratic structure of $h$. The paper formalizes sufficient conditions for identifiability and uniqueness, which rely on the invertibility of the projection and constraint sensitivity, as well as the actuation matrix.
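To make the single-constraint case concrete, here is a minimal sketch of one possible closed-form inversion. It is a simplified stand-in, not the paper's derivation. It assumes the demonstrator's filter solves $\min_u \|u - u_\text{nom}\|^2$ subject to $\nabla_s h^\top u + \alpha h \ge 0$ with $\nabla_s h = 2Q(s - \theta)$ (single-integrator dynamics), and that the learner knows $Q$, $r$, and the hypothetical gain `alpha`. Stationarity then gives $u_\text{safe} - u_\text{nom} = 2\lambda Q(s - \theta)$ with $\lambda > 0$, so $s - \theta = c\,Q^{-1}\Delta u$ for some $c > 0$. Substituting into the active constraint yields a scalar quadratic in $c$ with exactly one positive root.

```python
import numpy as np

def infer_theta(s, u_nom, u_safe, Q, r, alpha=1.0, tol=1e-9):
    """Recover theta from one (u_nom, u_safe) pair under the assumed filter.

    Returns None when the filter was inactive (no information about theta).
    """
    du = u_safe - u_nom
    if np.linalg.norm(du) < tol:
        return None
    v = np.linalg.solve(Q, du)   # direction of s - theta, up to scale c > 0
    q = du @ v                   # du^T Q^{-1} du, positive since Q is PD
    p = du @ u_safe
    # Active constraint: alpha*q*c^2 + 2*p*c - alpha*r^2 = 0; take the positive root.
    c = (-p + np.sqrt(p**2 + alpha**2 * q * r**2)) / (alpha * q)
    return s - c * v
```

With `s = [2, 0]`, `u_nom = [-5, 0]`, `u_safe = [-0.75, 0]`, `Q = I`, `r = 1`, and `alpha = 1`, this sketch recovers an obstacle center at the origin. The uniqueness of the positive root is specific to these simplifying assumptions.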

Extension to Multi-Constraint and Multi-Agent Settings

Realistic scenarios require simultaneous enforcement of formation-keeping alongside obstacle avoidance. Here, safety filters have multiple active constraints, rendering the KKT-based inference underdetermined. The paper develops a regularized Newton solver, initialized from the closed-form KKT result, to resolve such cases:

Strong regularity and compact sublevel sets guarantee convergence to the true constraint.
The region of convergence is shown empirically: Newton's method substantially outperforms direct input-matching optimizers, which are sensitive to initialization.

Figure 3: Regularized Newton method demonstrates a large region of convergence for constraint recovery, compared to rapid divergence for input matching outside a narrow initialization region.
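The general shape of a regularized Newton iteration on KKT residuals can be sketched as follows. This is a generic Levenberg-Marquardt-style update with backtracking, not the paper's solver. For brevity it is applied to a single-constraint residual in the unknowns $(\theta, \lambda)$, assuming a filter of the form $\min_u \|u - u_\text{nom}\|^2$ subject to $2(Q(s-\theta))^\top u + \alpha h \ge 0$ with symmetric $Q$. In a multi-constraint setting the unknown vector would also carry additional multipliers, and the damping term `mu` keeps the linear system solvable when the Jacobian is rank-deficient. The names `mu`, `alpha`, and the function names are assumptions.

```python
import numpy as np

def kkt_residual(z, s, u_nom, u_safe, Q, r, alpha):
    """Stationarity and active-constraint residuals for z = (theta, lam)."""
    theta, lam = z[:-1], z[-1]
    g = Q @ (s - theta)
    stat = (u_safe - u_nom) - 2.0 * lam * g
    act = 2.0 * (g @ u_safe) + alpha * ((s - theta) @ g - r**2)
    return np.append(stat, act)

def kkt_jacobian(z, s, u_safe, Q, alpha):
    """Analytic Jacobian of kkt_residual (assumes Q is symmetric)."""
    theta, lam = z[:-1], z[-1]
    n = theta.size
    g = Q @ (s - theta)
    J = np.zeros((n + 1, n + 1))
    J[:n, :n] = 2.0 * lam * Q
    J[:n, n] = -2.0 * g
    J[n, :n] = -2.0 * (Q @ u_safe) - 2.0 * alpha * g
    return J

def regularized_newton(z0, s, u_nom, u_safe, Q, r, alpha=1.0,
                       mu=1e-3, iters=50, tol=1e-10):
    """Damped Gauss-Newton steps with backtracking on the residual norm."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(iters):
        F = kkt_residual(z, s, u_nom, u_safe, Q, r, alpha)
        if np.linalg.norm(F) < tol:
            break
        J = kkt_jacobian(z, s, u_safe, Q, alpha)
        step = -np.linalg.solve(J.T @ J + mu * np.eye(z.size), J.T @ F)
        t = 1.0
        # Halve the step until the residual norm decreases.
        while t > 1e-8 and np.linalg.norm(
                kkt_residual(z + t * step, s, u_nom, u_safe, Q, r, alpha)
        ) >= np.linalg.norm(F):
            t *= 0.5
        z = z + t * step
    return z[:-1], z[-1]
```

The initial guess `z0` is left to the caller here; a closed-form estimate, where available, is a natural choice.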

Decentralized Inference and Planning Protocol

A decentralized, round-robin framework decomposes the agent team into one demonstrator and $N-1$ learners per iteration:

Dem: The demonstrator uses private and public constraints.
Lrn: Learners operate with only public constraints.
When the demonstrator's safety filter activates ($u_\text{safe} \neq u_\text{nom}$), learners infer and publicize the new constraint.

This protocol guarantees that, within a bounded number of steps, all agents have sufficient information to maintain safety (provided the minimum formation slack is strictly less than the constraint activation radius). The theoretical results establish forward invariance and show that, once a constraint is inferred, all future control actions preserve safety for every agent with respect to all constraints learned up to that point.
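The role rotation can be sketched in a few lines. This is only a bookkeeping illustration under strong assumptions: constraints are opaque labels, the demonstrator is chosen as `k % n_agents`, and inference is treated as always succeeding whenever the hypothetical predicate `is_active` reports that the demonstrator's filter was triggered. None of these names come from the paper.

```python
def roles(k, n_agents):
    """Return (demonstrator index, learner indices) for iteration k."""
    demo = k % n_agents
    learners = [i for i in range(n_agents) if i != demo]
    return demo, learners

def run_round_robin(private, n_iters, is_active=lambda agent, c: True):
    """Simulate which private constraints become public over time.

    private: list of sets, private[i] holds agent i's private constraint labels.
    is_active: assumed predicate, True if the demonstrator's filter is
               triggered by constraint c during its turn.
    """
    public = set()
    log = []
    for k in range(n_iters):
        demo, learners = roles(k, len(private))
        # The demonstrator plans with private and public constraints;
        # learners only see the public set.
        hidden = private[demo] - public
        newly = {c for c in hidden if is_active(demo, c)}
        public |= newly  # learners infer and publicize
        log.append((k, demo, learners, sorted(newly)))
    return public, log
```

With three agents where agents 0 and 2 each hold one private obstacle, three iterations of this sketch make both obstacles public, assuming every demonstration actually triggers the filter.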

Figure 5: Visualizations of three- and four-agent teams in decentralized navigation, color-coded by asymmetric obstacle knowledge and subsequent successful inference and avoidance.

Multi-Team and Moving Obstacle Extensions

The method generalizes to multi-team interaction by treating other agent teams as moving obstacles (with bounded velocity). Robust CBFs are constructed by inflating the safe radius to account for maximal expected movement during the planning step. This leads to provable guarantees that safety is preserved even in adversarial, dynamic multi-team environments.
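One simple way to realize the radius inflation is sketched below. It is an assumption-laden illustration, not the paper's construction. It accounts only for obstacle motion over one planning step of hypothetical length `dt` with speed bound `v_max`, and scales the margin by $\sqrt{\lambda_{\max}(Q)}$ so that the bound holds in the $Q$-weighted norm.

```python
import numpy as np

def inflated_radius(r, v_max, dt, Q):
    """Safe radius grown by the worst-case obstacle displacement in one step.

    In the Q-norm, a Euclidean displacement of at most v_max * dt has
    length at most sqrt(lambda_max(Q)) * v_max * dt.
    """
    lam_max = np.linalg.eigvalsh(Q).max()
    return r + np.sqrt(lam_max) * v_max * dt

def h_robust(s, theta, Q, r, v_max, dt):
    """Quadratic barrier evaluated with the inflated radius."""
    d = s - theta
    return float(d @ Q @ d - inflated_radius(r, v_max, dt, Q) ** 2)
```

If `h_robust` is nonnegative for the obstacle's current center, the triangle inequality implies the original $h$ stays nonnegative for any center the obstacle can reach within `dt`, under the stated speed bound.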

Figure 4: Two teams performing crossing maneuvers; robust CBF constraints enforce decentralized, collision-free interactions with empirical confirmation that inter-team distances always exceed safety thresholds.

Empirical Validation

Comprehensive Monte Carlo experiments compare the proposed KKT-based inference (with CBF) to input matching and to non-CBF constraints in a series of two-agent navigation tasks:

CBF+KKT achieves nearly perfect collision-avoidance and over 90% successful real-time constraint discovery, orders of magnitude better than alternative methods.
Input matching is highly sensitive, suffering both higher error and false positives ("ghost constraints").
In multi-agent and multi-team environments, decentralized round-robin inference achieves high success rates and scales efficiently.

Figure 2: In a single Monte Carlo rollout with two double-integrator agents, only the CBF+KKT method successfully avoids all collisions; methods using input matching or non-CBF constraints fail with regularity.

Hardware Experiments

The method is validated on real quadruped robots connected by a constraint (e.g., a rope). When only the rear robot perceives obstacles, the leading robot accurately infers hidden constraints from the peer's filtered actions and both maintain safety and formation compliance.

Figure 6: Two quadrupeds, with asymmetric constraint knowledge, coordinate to avoid obstacles and negotiate gaps solely through decentralized, implicit inference from filtered actions.

Figure 7: Hardware experiment trajectories confirm preservation of safety and formation, matching theoretical predictions.

Theoretical and Practical Implications

The framework provides several significant contributions:

Provable constraint identifiability and uniqueness from single-step action-state observations with CBF-based safety filters.
A robust Newton-based solver for underdetermined, multi-constraint inference.
Formal guarantees of decentralized safety with minimal (implicit) communication, applicable to static and dynamic environments.
Empirical and hardware evidence of real-time applicability and scalability.

These results challenge the prevailing assumption that explicit communication or centralization is required for robust multi-agent safety. From a formal methods and control theory viewpoint, this work further bridges safety filtering, inverse constraint inference, and decentralized planning with rigorous performance and convergence properties.

Conclusion

Inverse Safety Filtering establishes that it is both tractable and reliable to infer private environment constraints in decentralized multi-agent systems by observing only locally available filtered control actions. The proposed framework provides a pathway for scalable, robust, safety-critical coordination in the presence of information asymmetries—enabling implicit communication and coordination mechanisms suitable for future, large-scale, intelligent robotic collectives.

Future work lies in extending the approach to partial observability, richer classes of constraints, and integration with reinforcement learning-based decentralized planners. The framework's formal basis and real-world efficacy position it as a promising template for safe, information-efficient cooperation in heterogeneous agent systems.
