Gradual Cognitive Externalization: From Modeling Cognition to Constituting It

Published 6 Apr 2026 in cs.AI, cs.CY, cs.ET, cs.HC, and cs.LG | (2604.04387v2)

Abstract: Developers are publishing AI agent skills that replicate a colleague's communication style, encode a supervisor's mentoring heuristics, or preserve a person's behavioral repertoire beyond biological death. To explain why, we propose Gradual Cognitive Externalization (GCE), a framework arguing that ambient AI systems, through sustained causal coupling with users, transition from modeling cognitive functions to constituting part of users' cognitive architectures. GCE adopts an explicit functionalist commitment: cognitive functions are individuated by their causal-functional roles, not by substrate. The framework rests on the behavioral manifold hypothesis and a central falsifiable assumption, the no behaviorally invisible residual (NBIR) hypothesis: for any cognitive function whose behavioral output lies on a learnable manifold, no behaviorally invisible component is necessary for that function's operation. We document evidence from deployed AI systems showing that externalization preconditions are already observable, formalize three criteria separating cognitive integration from tool use (bidirectional adaptation, functional equivalence, causal coupling), and derive five testable predictions with theory-constrained thresholds.


Authors (1)

Zhimin Zhao

Summary

The paper introduces the GCE framework, arguing that ambient AI shifts from modeling to constituting human cognition when bidirectional adaptation, functional equivalence, and causal coupling are all present.
It cites evidence from deployed systems, including SkillsBench gains of 16–23 percentage points in agent task completion, to argue that the preconditions for externalization are already observable.
The study outlines falsifiable criteria and operational definitions to set a research agenda for distributed, extended cognition via ambient intelligence.

Gradual Cognitive Externalization: A Functionalist Account of Ambient AI Cognitive Integration

Introduction

The paper "Gradual Cognitive Externalization: From Modeling Cognition to Constituting It" (2604.04387) presents a functionalist framework, Gradual Cognitive Externalization (GCE), that describes and empirically constrains the process by which ambient AI systems move from merely modeling users' cognitive functions to becoming constitutive components of users' cognitive architectures. GCE commits explicitly to a functionalist ontology in which cognitive functions are individuated by causal-functional roles, not by substrate. This commitment lets it sidestep constraints and objections that apply to classical mind uploading and neuro-symbolic mapping paradigms. The framework combines the behavioral manifold hypothesis, extended mind theory, and multiscale competency architecture, and it makes its claims concrete through formal criteria, operational definitions, and theory-constrained, falsifiable predictions.

Evidence for Cognitive Externalization in Deployed Systems

The paper documents that present-day AI systems already exhibit the preconditions for cognitive externalization across multiple domains. Temporal planning assistants, personalized communication style generators, collaborative filtering recommenders, and persistent knowledge organization tools encode and execute user cognitive functions, with measurable improvements in agent performance correlated to the assimilation of user-externalized artifacts. SkillsBench data (Li et al., 13 Feb 2026) empirically confirms that professional domain expertise, when distilled into digital skills, significantly improves agent task completion rates (gains of 16–23 percentage points).

Crucially, the trajectory of externalization is both professional and personal. Beyond professional expertise, individuals are increasingly encoding distinct facets of personal identity (communication style, preference structures, decision heuristics) into reusable, adaptive digital artifacts. The AI ecosystem is rapidly converging on portable, standardized frameworks (e.g., SKILL.md) for encoding and sharing skills, which accelerates the externalization dynamic. The most advanced forms are observed in enterprise distillation pipelines, which couple employees' behavioral patterns directly to operational AI systems.

The GCE framework is supported by empirical observations of (1) bidirectional adaptation between users and AI agents, (2) increasing functional equivalence in narrow domains, and (3) causal integration, in which AI system outputs produce changes in user cognitive and behavioral dynamics. Externalization rates differ across functions: cognitive functions with more explicit and learnable behavioral signatures (planning, preference, communication) externalize faster, while functions with putative behaviorally invisible components (phenomenal experience, metacognition) remain elusive.

Theoretical and Formal Foundations

GCE concretizes gradual cognitive externalization via three jointly necessary and empirically operational criteria: bidirectional adaptation, functional equivalence, and causal coupling. These are formalized as follows:

Bidirectional Adaptation: The mutual, temporally coupled adaptation of user cognitive patterns and the AI system's internal state, exceeding population-level baselines and measured through mutual information metrics between user and agent representations.
Functional Equivalence: Realized through behavioral indistinguishability in outputs, robust generalization to previously unseen inputs, and explicit structural correspondence in internal task-specific representations between human and AI system.
Causal Coupling: Operationally defined by an agent’s state interventions producing measurable, personalized changes in user cognitive outputs and vice versa, beyond generic tool effects.
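
The mutual information measurement named under Bidirectional Adaptation can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes user and agent states have already been discretized into equal-length symbol sequences, and it substitutes a shuffled-sequence baseline for the population-level baseline the criterion calls for. The function names `mutual_information` and `exceeds_shuffled_baseline` are hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def exceeds_shuffled_baseline(user, agent, n_shuffles=200, seed=0):
    """Return (observed MI, fraction of shuffles whose MI >= observed).

    Assumption: shuffling the agent sequence stands in for a baseline with
    no dyad-specific coupling; a real study would use other users' data.
    """
    rng = random.Random(seed)
    observed = mutual_information(user, agent)
    shuffled = list(agent)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mutual_information(user, shuffled) >= observed:
            hits += 1
    return observed, hits / n_shuffles
```

A small fraction in the second return value suggests the observed dependence is unlikely under the shuffled baseline. The plug-in estimator is biased upward on short sequences, so the comparison against a baseline matters more than the raw value.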

The framework defines a time-dependent externalization ratio, $E(t)$, representing the realized proportion of cognitive function instantiated in the AI system, adjusted for empirically derived function weights. Integration depth is classified into five categories, with a theoretical boundary separating mere tool use from the three categories that count as cognitive integration (Coupled, Substitutive, Integrated).
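
The exact form of $E(t)$ is not reproduced in this summary. One plausible reading, offered only as an assumption, is a weighted mean $E(t) = \sum_i w_i e_i(t) / \sum_i w_i$, where $e_i(t) \in [0, 1]$ is the share of function $i$ realized in the AI system at time $t$ and $w_i \ge 0$ is its function weight. The sketch below computes that quantity for one time step; the symbols $e_i$ and $w_i$, the function name, and the example values are all hypothetical.

```python
def externalization_ratio(shares, weights):
    """Weighted mean of per-function externalized shares at one time step.

    shares:  mapping of function name -> externalized share in [0, 1]
    weights: mapping of function name -> non-negative function weight
    Both mappings and the weighted-mean form are assumptions of this sketch.
    """
    if not shares:
        raise ValueError("at least one cognitive function is required")
    weighted_sum = 0.0
    total_weight = 0.0
    for name, share in shares.items():
        if not 0.0 <= share <= 1.0:
            raise ValueError(f"share for {name!r} must lie in [0, 1]")
        weight = weights[name]
        if weight < 0:
            raise ValueError(f"weight for {name!r} must be non-negative")
        weighted_sum += weight * share
        total_weight += weight
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return weighted_sum / total_weight


# Illustrative values only: communication is weighted three times as heavily.
snapshot = {"planning": 0.5, "communication": 1.0}
function_weights = {"planning": 1.0, "communication": 3.0}
ratio = externalization_ratio(snapshot, function_weights)  # (0.5 + 3.0) / 4.0 = 0.875
```

Evaluating the function on successive snapshots gives a trajectory of $E(t)$ over time under this reading.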

The key theoretical innovation is the no behaviorally invisible residual (NBIR) hypothesis: for every cognitive function whose output lies on a learnable behavioral manifold, there exists no behaviorally invisible component necessary for its operation. It follows that an AI system satisfying all three criteria (causal coupling, functional equivalence, and bidirectional adaptation) becomes an instance of the cognitive function, indistinguishable under the operationalized criteria from the human substrate.

Empirical Predictions, Measurement, and Falsifiability

To convert conceptual claims into an empirical research agenda, GCE articulates five testable predictions, each with quantitative, theory-constrained thresholds:

Behavioral Convergence: AI prediction accuracy of user behavioral outputs converges to within the inter-subject variance of human self-prediction.
Output Indistinguishability: AI-generated communications become indistinguishable from human ones in forced-choice settings.
Cognitive Adaptation: Long-term use leads to correlated, mutually reinforcing changes in user cognition and AI system representations.
Bidirectional Learning Dynamics: Model updates and user adaptations show measurable, dyad-specific mutual dependencies.
Physiological State Prediction: Ambient AI can predict user physiological states (stress, fatigue) as accurately as human observers, conditional on sufficient behavioral signal regularity.
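
The forced-choice setting in the Output Indistinguishability prediction can be analyzed with an exact binomial test against chance. The sketch below is an assumption about how such a test might be run, not the paper's protocol: judges pick which of two messages is human-written, and `binomial_two_sided_p` (a hypothetical name) reports how surprising the number of correct picks would be if judges were guessing at rate 0.5.

```python
import math


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under success probability p.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    pmf = [
        math.comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so float noise does not drop tied outcomes.
    tail = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Ten judges, all of whom correctly identify the human-written message.
p_value = binomial_two_sided_p(10, 10)  # 2 * 0.5**10 = 0.001953125
```

A small p-value means judges can tell the outputs apart, which would count against the prediction. A large p-value only fails to reject chance; supporting indistinguishability would need an equivalence margin, which the paper's thresholds would have to supply.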

Falsification is equally explicit. GCE is refuted if: (a) AI accuracy saturates below human-level convergence; (b) bidirectional adaptation fails to exceed unidirectional tool baselines; (c) human and AI outputs remain distinguishable post-training; (d) model and user behavioral changes are uncorrelated; or (e) any cognitive function is shown to possess a functionally relevant, behaviorally invisible residual, violating NBIR.
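
The refutation conditions are disjunctive: any single one suffices. A minimal sketch of that logic, with hypothetical field names standing in for conditions (a) through (e) and with the underlying measurements assumed to have been reduced to booleans elsewhere:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FalsificationEvidence:
    """One boolean per refutation condition; True means the condition was observed."""

    accuracy_saturates_below_human: bool        # condition (a)
    adaptation_not_above_tool_baseline: bool    # condition (b)
    outputs_remain_distinguishable: bool        # condition (c)
    model_and_user_changes_uncorrelated: bool   # condition (d)
    invisible_residual_found: bool              # condition (e), violates NBIR


def triggered_conditions(evidence):
    """Return the names of all conditions that were observed, in (a)-(e) order."""
    return [f.name for f in fields(evidence) if getattr(evidence, f.name)]


def is_refuted(evidence):
    """GCE counts as refuted as soon as any one condition is observed."""
    return bool(triggered_conditions(evidence))


evidence = FalsificationEvidence(
    accuracy_saturates_below_human=False,
    adaptation_not_above_tool_baseline=False,
    outputs_remain_distinguishable=True,
    model_and_user_changes_uncorrelated=False,
    invisible_residual_found=False,
)
```

Here `is_refuted(evidence)` is true because condition (c) alone was observed.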

Implications and Future Directions

Practically, GCE prescribes architectures and interface principles: persistent skill and memory accumulation, multi-domain integration, explicit bidirectional interaction design, privacy-preserving personalization, and safeguards for user autonomy and function redundancy. The theoretical implications extend to the philosophy of mind: under functionalism, GCE operationalizes extended mind claims, which grounds them empirically, and it offers a quantitative basis for empirical debates about distributed and platform-agnostic cognition.

The framework states its own boundaries. GCE’s applicability depends on demonstrating low-dimensional behavioral manifolds for target cognitive functions, and it is strictly limited in domains that lack empirical evidence of such learnability. The paper also addresses common objections to functionalism, extended mind, and substrate independence: GCE does not claim qualia transfer, and it explicitly recognizes that those who reject functionalism will not accept its conclusions.

Conclusion

The GCE framework is a functionalist contribution to the study of cognitive architecture and AI-human interaction. By formalizing criteria for when digital systems move from external models to constitutive cognitive components, and by providing explicit, falsifiable predictions, GCE sets an agenda for the empirical study of distributed, extended, and potentially substitutive cognition in the age of ambient intelligence. Empirical validation or refutation would constrain both technical design and theoretical discourse in AI, cognitive science, and philosophy of mind.
