🤖 24 AI
🟡 🛡️ Security · Thursday, April 16, 2026 · 2 min read

ArXiv: RePAIR Enables LLMs to 'Forget' Targeted Information Without Retraining

Why it matters

RePAIR is a new framework for interactive machine unlearning that enables users to instruct large language models to forget specific information in real time via natural language prompts. The key innovation, the STAMP method, redirects MLP activations toward the refusal subspace using a closed-form formula, without any model retraining, achieving near-zero forgetting scores while preserving model utility.

A research team led by Jagadeesh Rachapudi has introduced RePAIR — a framework that establishes the concept of Interactive Machine Unlearning (IMU). The system enables users to instruct an LLM to forget targeted information through natural language prompts, in real time and without retraining.

How Does the Three-Model Architecture Work?

RePAIR uses an architecture with three specialized components. The Watchdog model acts as a sentinel — it detects when a user’s prompt contains a request to forget specific information. The Surgeon model generates precise “repair” instructions — defining which activations in the neural network need to be redirected. The Patient model — the LLM in use — autonomously applies those repairs.

This three-part architecture means a user simply says something like “forget everything about person X” or “remove knowledge of process Y,” and the system automatically identifies, localizes, and neutralizes the relevant information in the model.

What Is STAMP and Why Is It the Key Innovation?

STAMP (Steering Through Activation Manipulation with PseudoInverse) is the core of RePAIR. The method redirects MLP (Multi-Layer Perceptron) layer activations toward the refusal subspace — the part of the activation space corresponding to answer refusal — using a closed-form pseudoinverse formula.

Critically, STAMP requires no training whatsoever. Changes are computed analytically, meaning the forgetting is performed in seconds rather than the hours or days that retraining requires. Results show near-zero forgetting scores (the information is genuinely removed) while the overall utility of the model is preserved — the model continues to function normally for all other tasks.
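A toy version of this idea can be written in a few lines of NumPy. This is a sketch under stated assumptions, not STAMP's actual formula: it represents the refusal subspace as a single direction and computes a least-squares linear map (via the pseudoinverse) that sends forget-related activations onto that direction.

```python
# Illustrative sketch of pseudoinverse-based activation steering.
# The refusal direction, layer width, and data are all toy stand-ins.
import numpy as np

rng = np.random.default_rng(0)
d = 16  # hidden width of one MLP layer (toy value)

# Activations the model produces on forget-related prompts, and the
# activations we want instead: their projections onto a "refusal" direction.
A_forget = rng.normal(size=(8, d))
refusal_dir = rng.normal(size=d)
refusal_dir /= np.linalg.norm(refusal_dir)
A_refusal = np.outer(A_forget @ refusal_dir, refusal_dir)

# Closed-form repair: the pseudoinverse gives the least-squares linear map
# sending each forget activation to its refusal-subspace target. No
# gradients, no training loop -- one analytic computation.
W = np.linalg.pinv(A_forget) @ A_refusal


def stamp_patch(activation: np.ndarray) -> np.ndarray:
    """Redirect an activation vector toward the refusal direction."""
    return activation @ W


# After patching, the forget activations lie on the refusal direction:
patched = stamp_patch(A_forget)
off_subspace = patched - np.outer(patched @ refusal_dir, refusal_dir)
print(np.allclose(off_subspace, 0.0, atol=1e-8))
```

Because `W` is computed in closed form, the cost is one pseudoinverse per repaired layer, which is why this style of edit takes seconds rather than the hours a fine-tuning pass would.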

Why Is This Important for Regulation and Privacy?

RePAIR addresses three concrete scenarios: suppressing harmful knowledge (such as instructions for creating dangerous substances), correcting misinformation (removing inaccurate facts the model learned), and deleting personal data on user request.

The last scenario is particularly relevant in the context of the European GDPR and the right to erasure. Until now, removing specific data from a trained model required costly and time-consuming retraining. RePAIR offers a practical alternative — on-demand forgetting, in real time, without performance degradation.

Results across multiple benchmarks show that RePAIR outperforms six existing state-of-the-art machine unlearning methods, offering a better trade-off between completeness of forgetting and preservation of useful capabilities.

🤖

This article was generated using artificial intelligence from primary sources.