How to sketch a learning algorithm

Published 8 Apr 2026 in cs.LG | (2604.07328v1)

Abstract: How does the choice of training data influence an AI model? This question is of central importance to interpretability, privacy, and basic science. At its core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ in the deep learning setting. Our precomputation and prediction algorithms are only $\mathrm{poly}(1/\varepsilon)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\mathrm{poly}(1/\varepsilon)$ models. Our proof is based on an assumption that we call "stability." In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.

Abstract PDF Upgrade to Chat

Authors (1)

Sam Gunn

Summary

The paper introduces a novel two-phase data deletion scheme that precomputes auxiliary information to predict model behavior after data removal with arbitrarily small error.
It leverages random projections and truncated Taylor expansions via forward-mode AD to efficiently compute high-order derivative effects without full tensor storage.
The approach significantly outperforms linear surrogate models by providing provable error guarantees for arbitrary measurement functions, enabling fast privacy audits and model unlearning.

Sketch-Based Data Deletion in Deep Learning: Theory and Practice

Motivation and Problem Formulation

The paper "How to sketch a learning algorithm" (2604.07328) addresses the data deletion problem in modern deep learning: after precomputing auxiliary information during training, efficiently predict how the model would have behaved if a subset of training data were excluded. This question is central to interpretability, privacy, data attribution, and machine unlearning. Prior approaches are either heuristic or rely on surrogate models (e.g., linear additivity assumptions), leading to prediction errors that cannot be reduced at will and often failing to match ground truth in practical deep learning settings.

The data deletion scheme introduced here consists of two phases:

Precomputation: Takes the learning algorithm $A$ and dataset $T$ , outputs the trained model $A(T)$ and auxiliary information.
Prediction: Quickly approximates a measurement $\phi(A(T\setminus D))$ for any subset $D$ , representing how the model would behave if examples in $D$ were omitted.

The desiderata are:

Prediction error $\varepsilon$ can be made arbitrarily small at polynomial cost.
Prediction is orders of magnitude faster than retraining for small $d$ (size of deletion set).
Storage overhead scales as $poly(1/\varepsilon)$ models (not scaling with dataset size).
Assumptions are compatible with practical deep learning; no foreknowledge of $\phi$ required during precomputation.

Theoretical Foundations: Stability and Local Sketching

The key technical assumption is stability of the function $T$ 0, formalized as rapid decay of higher-order derivatives in the Taylor expansion at the origin:

$T$ 1

for some superlinear $T$ 2. If $T$ 3 is stable, Taylor truncation to $T$ 4 yields error $T$ 5 for constant $T$ 6.

However, storing the full tensor $T$ 7 is infeasible for large $T$ 8. The scheme leverages sketching via random projections: instead of storing the full derivatives, store random projections $T$ 9 for random directions $A(T)$ 0. These can be computed efficiently using forward-mode automatic differentiation, bypassing the need to materialize $A(T)$ 1.

Prediction for deletions is performed via unbiased estimators utilizing these sketches, with variance controlled via the Johnson-Lindenstrauss lemma and exponential concentration achieved by employing a median-of-means strategy.

To support arbitrarily many deletions, stability is precisely defined in terms of Frobenius norm decay and modulus of continuity for arithmetic circuits, allowing uniform bounds under compositional measurement functions.

Algorithmic Scheme

The proposed data deletion algorithm operates as follows:

Precomputation:
- Sample $A(T)$ 2 random directions in $A(T)$ 3.
- Compute truncated Taylor expansions of the learning circuit in each direction to order $A(T)$ 4.
- Use forward-mode AD to efficiently obtain all required projections.
Prediction:
- For any subset $A(T)$ 5, approximate $A(T)$ 6 using the sketches.
- Use Faà di Bruno's formula for compositional measurements, relying on fast subset convolution algorithms for multi-directional derivatives.
- Achieve prediction error $A(T)$ 7 in $A(T)$ 8 time.

Strong claims are made:

For any constant number of deletions $A(T)$ 9, prediction error is $\phi(A(T\setminus D))$ 0 with precomputation $\phi(A(T\setminus D))$ 1 times slower than training and prediction $\phi(A(T\setminus D))$ 2 times slower than inference.
The storage requirements are $\phi(A(T\setminus D))$ 3 models.
Uniform guarantees are provided regardless of the measurement function $\phi(A(T\setminus D))$ 4 (no foreknowledge needed), provided $\phi(A(T\setminus D))$ 5 is stable.

Empirical Validation: Stability in MicroGPT

Minimal experiments with MicroGPT confirm the stability condition in practical settings. The decay of $\phi(A(T\setminus D))$ 6 is empirically superlinear, and the sketched Taylor expansions offer vanishing error for moderate $\phi(A(T\setminus D))$ 7. Moreover,

Higher-order approximate predictions (degree $\phi(A(T\setminus D))$ 8) are necessary to achieve accurate post-deletion behavior.
The method outperforms linear surrogate schemes in both numerical and estimator concentration properties.

Empirical plots show that predicted losses after deletion closely match the actual values, with variance controlled and error bounded as theorized.

Comparison to Prior Work

Previous methods in data influence and unlearning (e.g., datamodels, influence functions, post-hoc gradient ascent) rely heavily on surrogate models or linear additivity, failing to guarantee vanishing prediction error or independence from the measurement function. Notably:

Additivity models yield irreducible approximation errors as model complexity grows.
Post hoc certified unlearning cannot materially use the trained model and is essentially equivalent to retraining when considered under rigorous correctness guarantees.
Fourier analysis insights partially explain random subset influence, but are insufficient for worst-case deletions.

This scheme is the first to offer provable guarantees for arbitrary measurement functions and deletion sets, with prediction error tunable via computation.

Implications and Future Directions

The presented scheme is foundational for privacy-preserving learning, interpretable training pipelines, and principled machine unlearning. Theoretical implications include:

Demonstration of practical sketching-based Taylor expansions for deep learning circuits.
Efficient computation of high-order derivatives via forward-mode AD and subset convolution.
Uniform guarantees for diverse measurement functions, crucial in real-world attribution, copyright, and privacy scenarios.

Practically, these guarantees enable:

Auditable and legal compliance for data requests (right to be forgotten).
Fast privacy audits and counterfactual data attribution in real-world LLMs and generative models.
Potential extensions to discrete influence regimes, robust statistics, and derivative-based training meta-optimization.

Key directions for future research include:

Scaling stability verification for large models and datasets.
Translating the stability condition to broader neural architectures.
Reducing practical inefficiencies (e.g., fast subset convolution bottlenecks).
Exploring connections to harmonic analysis, statistical learning, and privacy-protecting deployments.

Conclusion

The paper provides a mathematically rigorous, computationally feasible framework for sketch-based data deletion in deep learning. By leveraging stability properties and efficient derivative sketching, it achieves vanishing prediction error for arbitrary deletions, measurement functions, and models, with polynomial computational and storage overhead. The results represent a substantial step forward in provable data attribution, unlearning, and privacy for AI systems, indicating robust directions for theory, practice, and interpretability in model training and deployment (2604.07328).

Markdown Report Issue