Physics Steering: Scientists Unlock Causal Control Over Scientific Foundation Models
In a breakthrough that could transform scientific research, a team has demonstrated the ability to directly control a physics foundation model by manipulating its internal representations. Their approach, detailed in a recent arXiv paper, reveals that these models learn fundamental physical principles rather than merely recognizing patterns, a discovery with profound implications for the future of AI-enabled scientific discovery.
The Quest to Understand Scientific Models
Recent advances in mechanistic interpretability have shown that large language models (LLMs) develop internal representations corresponding not just to concrete entities, but also to distinct, human-understandable abstract concepts. Researchers have even found they can directly manipulate these hidden features to steer model behavior.
But a critical question remained: was this phenomenon unique to models trained on structured data like language and images, or was it a general property of foundation models regardless of their domain?
“It remains an open question whether this phenomenon is unique to models trained on inherently structured data (i.e. language, images) or if it is a general property of foundation models.” — Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model
This question motivated a research team led by Rio Alexa Fear to investigate the internal representations of a large physics-focused foundation model.
The Physics Steering Technique
Drawing inspiration from recent work that identified single directions in activation space for complex behaviors in LLMs, the researchers developed a novel approach for scientific models. Their method involves extracting activation vectors from the model during forward passes over simulation datasets representing different physical regimes.
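To make this concrete, the sketch below shows one way such activations could be collected in PyTorch using forward hooks. This is a minimal illustration, not the paper's implementation: the `model`, `block`, and data-loader objects are hypothetical stand-ins, and the assumption of a (batch, tokens, hidden) activation shape is ours.

```python
import torch

def collect_mean_activation(model, block, loader, device="cpu"):
    """Average one block's output activations over a dataset of simulations."""
    captured = []

    def hook(module, inputs, output):
        # Assumes the block emits a (batch, tokens, hidden) tensor; average
        # over the batch and token dimensions to get one (hidden,) vector.
        captured.append(output.detach().mean(dim=(0, 1)))

    handle = block.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for batch in loader:              # each batch holds simulation snapshots
            model(batch.to(device))
    handle.remove()                       # always detach hooks when done
    return torch.stack(captured).mean(dim=0)
```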
The key innovation was computing "delta" representations between these different regimes. These delta tensors act as concept directions in activation space, encoding specific physical features. By injecting these concept directions back into the model during inference, the researchers could steer its predictions with remarkable precision.
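Under the same assumptions, a delta direction is simply the difference between the mean activations of two regimes. The buoyancy example and the layer choice below are illustrative only, reusing the `collect_mean_activation` helper sketched above.

```python
layer_idx = 6                              # assumed mid-network block, purely illustrative
block = model.blocks[layer_idx]            # assumes a transformer-style block list

# Mean activations for two physical regimes (regime names are illustrative).
act_with_feature = collect_mean_activation(model, block, buoyancy_loader)
act_without_feature = collect_mean_activation(model, block, no_buoyancy_loader)

# The delta tensor points from "feature absent" toward "feature present".
concept_direction = act_with_feature - act_without_feature
```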
"By injecting these concept directions back into the model during inference, we can steer its predictions, demonstrating causal control over physical behaviours, such as inducing or removing some particular physical feature from a simulation." — Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model
This technique effectively allows researchers to manipulate a model's understanding of physical phenomena directly, rather than just adjusting input parameters—a capability that could revolutionize how we interact with scientific AI.
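An inference-time steering pass might then look like the sketch below: a forward hook adds a scaled copy of the concept direction to the chosen block's output. The scale `alpha` and the injection site are our assumptions; the paper's exact injection scheme may differ.

```python
def make_steering_hook(direction, alpha=1.0):
    """Build a hook that adds alpha * direction to a block's output."""
    def hook(module, inputs, output):
        # Returning a value from a forward hook replaces the block's output;
        # the (hidden,) direction broadcasts across batch and token dims.
        return output + alpha * direction.to(output.device)
    return hook

# A positive alpha pushes predictions toward the target regime; a negative
# alpha should instead suppress the corresponding physical feature.
handle = model.blocks[layer_idx].register_forward_hook(
    make_steering_hook(concept_direction, alpha=2.0)
)
with torch.no_grad():
    steered_rollout = model(initial_state)  # `initial_state` is hypothetical
handle.remove()                             # restore the unsteered model
```

Because the hook is removed after the pass, the same checkpoint can be probed repeatedly with different concept directions and scales without retraining.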
Implications for Scientific Discovery
The results suggest that scientific foundation models learn generalized representations of physical principles rather than merely exploiting superficial correlations and patterns in their training simulations.
This insight has significant implications for AI-enabled scientific discovery. If models truly understand the underlying principles, they could potentially:
- Generate novel hypotheses by exploring concept spaces that humans haven't considered
- Identify subtle patterns in complex physical systems that might be missed by traditional analysis
- Accelerate scientific discovery by rapidly testing theoretical predictions across different physical regimes
The researchers note that their findings "open new avenues for understanding and controlling scientific foundation models."
Toward More Transparent Scientific AI
As AI models become increasingly integrated into scientific research, techniques like physics steering will be crucial for ensuring these models are both interpretable and controllable. The ability to directly manipulate a model's understanding of physical concepts represents a significant step toward creating more transparent and reliable scientific AI.
The paper, titled "Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model," is available on arXiv with identifier 2511.20798. The authors note that code for their approach will be made available soon.
This research joins a growing body of work focused on making AI models more interpretable and controllable, particularly in high-stakes domains like scientific research. As these techniques mature, we may see AI systems that not only process data but can genuinely reason about and manipulate the fundamental principles they've learned—potentially accelerating our understanding of the universe in ways we can only begin to imagine.
Source: Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model by Rio Alexa Fear, Payel Mukhopadhyay, Michael McCabe, Alberto Bietti, and Miles Cranmer. arXiv:2511.20798, 2025.