Learning and Enforcing Context-Sensitive Control for LLMs

Published 12 Apr 2026 in cs.CL, cs.AI, and cs.LG | (2604.10667v1)

Abstract: Controlling the output of LLMs through context-sensitive constraints has emerged as a promising approach to overcome the limitations of Context-Free Grammars (CFGs) in guaranteeing generation validity. However, such constraints typically require manual specification -- a significant barrier demanding specialized expertise. We introduce a framework that automatically learns context-sensitive constraints from LLM interactions through a two-phase process: syntactic exploration to gather diverse outputs for constraint learning, followed by constraint exploitation to enforce these learned rules during generation. Experiments demonstrate that our method enables even small LLMs (1B parameters) to learn and generate with perfect constraint adherence, outperforming larger counterparts and state-of-the-art reasoning models. This work represents the first integration of context-sensitive grammar learning with LLM generation, eliminating manual specification while maintaining generation validity.


Authors (4)

Summary

The paper's main contribution is a two-phase framework that automates learning and enforcing context-sensitive constraints, ensuring valid LLM outputs.
It combines CFG-based masking with temperature sampling and oracle labeling to generate examples for ILASP-driven logic-based constraint learning.
Empirical evaluations on synthetic tasks demonstrate that learned ASG constraints achieve 100% accuracy, matching manually specified rules and outperforming unconstrained models.

Automated Context-Sensitive Constraint Learning for LLM Generation

Motivation and Problem Statement

The paper "Learning and Enforcing Context-Sensitive Control for LLMs" (2604.10667) addresses the challenge of guaranteeing semantic and syntactic validity in LLM-generated sequences, focusing on scenarios where the required constraints exceed the expressiveness of context-free grammars (CFGs). Many structured generation tasks (e.g., semantic parsing, agent planning) require interplay between elements that cannot be encoded using CFGs alone. Existing approaches to enforcing context-sensitive constraints, such as those expressed by context-sensitive grammars (CSGs), rely on manual rule specification, which demands specialized expertise and limits scale and generalization.

Framework Overview

The authors propose a two-phase framework for neuro-symbolic constraint learning and enforcement, leveraging Answer Set Grammars (ASGs) and ILASP-based logic learning:

Syntactic Exploration: LLMs are masked using CFGs, and temperature-based sampling generates diverse syntactically valid sequences. An oracle labels these outputs as valid or invalid with respect to target CSGs, creating positive and negative examples for learning context-sensitive constraints.
Constraint Exploitation: The learned context-sensitive grammar (as ASG annotations over the CFG) is applied to the LLM generation process, ensuring strict adherence to the constraints without further oracle intervention.

Figure 1: Two-phase methodology. Syntactic exploration collects CFG-valid outputs, labels them via oracle, and learns context-sensitive ASG constraints; constraint exploitation enforces the learned ASG mask during LLM generation to guarantee validity.

Technical Approach

Formalism

CFGs and CSGs: A CFG production rewrites a nonterminal independently of its surrounding context, so CFGs cannot express cross-dependencies such as the equal counts in $a^n b^n c^n$, whereas CSGs can represent such constraints (equal counts and ordering).
ASGs: Extend CFGs with logic-based context-sensitive constraints via Answer Set Programming (ASP), interpreted over parse trees.
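The split between a context-free skeleton and a context-sensitive annotation can be illustrated with a minimal Python sketch. This is not the paper's ASG or ASP encoding: the names `CFG_SHAPE`, `parse_blocks`, `satisfies_annotation`, and `accepts` are hypothetical, the skeleton $a^+ b^+ c^+$ is a simple (in fact regular) superset of $a^n b^n c^n$ chosen for illustration, and the "annotation" is an ordinary Python predicate standing in for a logic constraint evaluated over the parse.

```python
import re

# Hypothetical context-free skeleton: a+ b+ c+ (a superset of a^n b^n c^n).
CFG_SHAPE = re.compile(r"(a+)(b+)(c+)")

def parse_blocks(s):
    """Return the lengths of the a-, b-, and c-blocks, or None if the shape is wrong."""
    m = CFG_SHAPE.fullmatch(s)
    return None if m is None else tuple(len(g) for g in m.groups())

def satisfies_annotation(blocks):
    """Stand-in for a context-sensitive annotation: all three counts must be equal."""
    n_a, n_b, n_c = blocks
    return n_a == n_b == n_c

def accepts(s):
    """Accept only strings that parse under the skeleton AND satisfy the annotation."""
    blocks = parse_blocks(s)
    return blocks is not None and satisfies_annotation(blocks)
```

Here `"aabbc"` parses under the skeleton but is rejected by the annotation, which is the gap between syntactic validity and context-sensitive validity that the formalism targets.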

Learning Protocol

Diverse CFG-masked Sampling: A temperature schedule is employed to maximize diversity and expose constraint violations under CFG masking. The constraint function $\mathcal{C}$ restricts token sampling, guaranteeing syntactic validity.
Oracle Labeling: Outputs are labeled as valid or invalid with respect to the target context-sensitive language, separating positive ($E^+$) and negative ($E^-$) samples.
Logic-based Constraint Learning: The ASG learner (ILASP) induces minimal, sound, and complete ASP annotations over the CFG, covering all labeled examples.
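The three steps above can be sketched end to end in Python under strong simplifying assumptions. Nothing here calls an LLM or ILASP: `sample_cfg_valid` is a toy random sampler standing in for CFG-masked generation, the way temperature maps to block length is invented for illustration, and `consistent_candidates` is a brute-force filter over a tiny hand-written hypothesis space (`CANDIDATES`), used only to show what "covering all labeled examples" means.

```python
import random

def sample_cfg_valid(rng, temperature):
    """Toy stand-in for CFG-masked sampling over a+ b+ c+.
    Assumption: a higher temperature allows more varied block lengths."""
    max_len = max(1, round(1 + 2 * temperature))
    return "".join(ch * rng.randint(1, max_len) for ch in "abc")

def oracle(s):
    """Oracle for the target language a^n b^n c^n with n >= 1."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n

def explore(num_samples, temperatures, seed=0):
    """Cycle through a temperature schedule and split samples into E+ and E-."""
    rng = random.Random(seed)
    e_pos, e_neg = set(), set()
    for i in range(num_samples):
        s = sample_cfg_valid(rng, temperatures[i % len(temperatures)])
        (e_pos if oracle(s) else e_neg).add(s)
    return e_pos, e_neg

# Hypothetical hypothesis space: candidate constraints over block counts.
CANDIDATES = {
    "no_constraint": lambda a, b, c: True,
    "a_eq_b": lambda a, b, c: a == b,
    "b_eq_c": lambda a, b, c: b == c,
    "a_eq_b_and_b_eq_c": lambda a, b, c: a == b == c,
}

def counts(s):
    return s.count("a"), s.count("b"), s.count("c")

def consistent_candidates(e_pos, e_neg):
    """Keep candidates that accept every positive and reject every negative example."""
    return [
        name for name, rule in CANDIDATES.items()
        if all(rule(*counts(s)) for s in e_pos)
        and not any(rule(*counts(s)) for s in e_neg)
    ]
```

For example, `consistent_candidates({"abc", "aabbcc"}, {"aabbc", "abbcc", "aabcc"})` leaves only `"a_eq_b_and_b_eq_c"`: each weaker candidate accepts at least one negative example. This also shows why diverse negatives matter, since without `"abbcc"` the weaker rule `b_eq_c` would not be ruled out.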

Controlled Decoding

Learned ASG constraints are integrated into the LLM decoding process, dynamically masking invalid tokens to ensure generated outputs strictly satisfy context-sensitive constraints, overcoming the limitations of both pure CFG masking and unconstrained sampling.
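A minimal sketch of the masking idea for $a^n b^n c^n$, assuming a greedy decoder and a hand-written mask. The function `allowed_tokens` is a hypothetical stand-in for the mask that a learned ASG would induce, `toy_scores` is an invented scorer that replaces LLM logits, and `max_n` is an illustrative length cap. Dropping disallowed tokens before the argmax plays the role of setting their logits to negative infinity.

```python
VOCAB = ["a", "b", "c", "<eos>"]

def allowed_tokens(prefix, max_n):
    """Tokens that keep the prefix extendable to some a^n b^n c^n with n <= max_n.
    Assumes the prefix was itself built under this mask."""
    n_a, n_b, n_c = prefix.count("a"), prefix.count("b"), prefix.count("c")
    if n_b == 0:                      # still inside the a-block
        allowed = set()
        if n_a < max_n:
            allowed.add("a")
        if n_a >= 1:
            allowed.add("b")
        return allowed
    if n_b < n_a:                     # b-block must catch up with the a-block
        return {"b"}
    if n_c < n_a:                     # c-block must catch up as well
        return {"c"}
    return {"<eos>"}                  # counts are equal: only stopping is valid

def toy_scores(prefix):
    """Hypothetical stand-in for LLM logits: prefers two a's, then wants to stop."""
    return {
        "a": 1.0 if len(prefix) < 2 else -1.0,
        "b": 0.5,
        "c": 0.2,
        "<eos>": 0.8 if prefix else -5.0,
    }

def decode(score_fn, max_n=8, constrained=True):
    """Greedy decoding, optionally restricted to the allowed token set at each step."""
    prefix = ""
    for _ in range(3 * max_n + 1):
        scores = score_fn(prefix)
        if constrained:
            allowed = allowed_tokens(prefix, max_n)
            scores = {t: s for t, s in scores.items() if t in allowed}
        token = max(scores, key=scores.get)
        if token == "<eos>":
            break
        prefix += token
    return prefix
```

With this toy scorer, unconstrained decoding stops after `"aa"`, while the masked decoder is forced to complete the string as `"aabbcc"`. The scorer still chooses how many `a` tokens to emit; the mask only removes continuations that can no longer lead to a valid string.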

Empirical Evaluation

The method is validated on two synthetic grammar tasks: $L_1 = \{ a^n b^n c^n \mid n \geq 1 \}$ (perfect context-sensitive dependency) and $L_2 = \{ a^n b^n c^m \mid n,m \geq 1 \}$ (partial dependency). Multiple open- and closed-source LLMs (Llama 1B, 3B, 8B, 70B; GPT-4.1; DeepSeek-R1) are evaluated across unconstrained, manual ASG-constrained, and automatically learned ASG settings.
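To make the two target languages concrete, the following Python sketch gives membership checks for $L_1$ and $L_2$ together with a simple accuracy measure. The helper names are hypothetical, and treating accuracy as the fraction of generated strings accepted by the membership check is an assumption about the metric rather than a detail confirmed here.

```python
import re

# Shared shape for both languages: a block of a's, then b's, then c's.
SHAPE = re.compile(r"(a+)(b+)(c+)")

def in_l1(s):
    """Membership in L1 = { a^n b^n c^n | n >= 1 }: all three counts equal."""
    m = SHAPE.fullmatch(s)
    return m is not None and len(m.group(1)) == len(m.group(2)) == len(m.group(3))

def in_l2(s):
    """Membership in L2 = { a^n b^n c^m | n, m >= 1 }: only a- and b-counts tied."""
    m = SHAPE.fullmatch(s)
    return m is not None and len(m.group(1)) == len(m.group(2))

def accuracy(outputs, oracle):
    """Assumed metric: fraction of outputs accepted by the given membership check."""
    return sum(oracle(s) for s in outputs) / len(outputs)
```

For the sample `["abc", "aabbc", "ab", "aabbcc"]`, `accuracy` gives 0.5 against `in_l1` and 0.75 against `in_l2`, because `"aabbc"` satisfies the partial dependency of $L_2$ but not the full dependency of $L_1$.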

Key numerical findings:

Perfect constraint adherence: All Llama models (including 1B parameters) achieve 100% accuracy under both manually specified and learned ASG constraints.
Unconstrained failure: Even large models (Llama 70B, GPT-4.1) reach at most 76.7% accuracy on $a^n b^n c^n$ under unconstrained sampling, and state-of-the-art reasoning models (DeepSeek-R1, o4-mini) do not achieve correctness guarantees.
Automated constraint learning matches manual specification: Manual inspection confirms that the learned ASG matches the handcrafted ground-truth ASG, indicating that the learner recovered the complete target constraint.

Practical and Theoretical Implications

Automation of Symbolic Constraints: The approach removes the need for manual constraint engineering in context-sensitive language control, making formal grammars usable for LLM control without specialized expertise.
Robustness and Reliability: Neuro-symbolic masking based on learned ASG constraints provides correctness guarantees that cannot be matched by model scale or longer inference (multi-step reasoning) alone.
Scalability and Transfer: The framework adapts to arbitrary context-sensitive grammar tasks via oracle labeling, supporting generalization across domains.

Limitations and Future Directions

Sampling Coverage: No formal guarantees exist for convergence in syntactic exploration; temperature-based heuristic sampling may not capture the entire constraint space in real-world, high-complexity tasks.
Hard Constraint Focus: Because validity is binary, the method does not directly apply to domains that require soft constraint handling or graded acceptability.
LLM Prior Dependency: Success is contingent on LLMs being exposed to terminals and structural priors relevant to the target formal grammar.

Future research will extend the method to domains with semantic relationships and soft constraints, and explore lifelong/active learning with one-shot sample-efficient ASG refinement.

Conclusion

The framework automates the learning and enforcement of context-sensitive constraints for controlled LLM generation, achieving through symbolic masking correctness guarantees that unconstrained or scale-based approaches do not provide. The combination of syntactic exploration (CFG masking and oracle labeling) with logic-based learning (ASG and ILASP) shows potential for robust, domain-adaptive, neuro-symbolic LLM control. Anticipated future work includes extension to semantic parsing, active learning paradigms, and support for soft-constraint domains.
