Building reliable AI for a changing world
Foundation models can analyse images, draft reports, and support decisions across complex workflows. But in real-world settings, even highly capable AI systems can become unreliable when conditions change. A model trained in one context may encounter new data, new equipment, or new environments at deployment – and still produce confident answers, even when those answers are no longer trustworthy.
That challenge is at the centre of Behzad Bozorgtabar’s research. As an Associate Professor in AI and Computer Vision at the Department of Electrical and Computer Engineering at Aarhus University, Bozorgtabar develops adaptive AI systems that can respond safely when real-world conditions shift. His goal is not simply to make models more powerful, but to make them more reliable, transparent, and useful in practice – especially in safety-critical areas such as healthcare, industrial inspection, and autonomous systems.
Why changing conditions are a problem for AI
AI models are often built and evaluated under controlled conditions. But deployment is rarely controlled for long. A hospital may install a new scanner. An imaging workflow may be updated. A production line may switch cameras. In each case, the data can change in ways that are subtle to humans but significant to a model. This is where problems begin.
An AI system may still produce output that appears plausible, even when its assumptions no longer match the setting in which it operates. In safety-critical contexts, that is a serious risk: the danger is not only that a model gets something wrong, but that it does so confidently.
“Conditions change all the time in the real world,” says Bozorgtabar. “If we want to use AI in settings like medicine or industry, we need systems that remain safe and reliable under those changes – not only under ideal conditions.”
Adapting at deployment time
A central focus of Bozorgtabar’s research is test-time adaptation – methods that allow a pre-trained model to adjust when it encounters new conditions during deployment, without requiring new labelled data or full retraining.
The idea is simple in principle but difficult in practice. If a model is exposed to new conditions, it may need to adapt – but adaptation must be done carefully. In high-stakes settings, adaptation should not be uncontrolled or automatic for its own sake. It needs to happen within clear limits.
For Bozorgtabar, the goal is not maximum flexibility. It is safe adaptation: systems that can respond when conditions change, while preserving reliability, recognising uncertainty, and avoiding unstable behaviour.
This is a shift away from the traditional assumption that AI can be trained once and then deployed unchanged. In practice, real-world systems operate in moving environments. Reliable AI, therefore, must account for change as part of deployment itself.
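To make the idea concrete, here is a minimal, generic sketch of one common flavour of test-time adaptation: updating only a model's batch-normalisation parameters to reduce its own prediction uncertainty on unlabelled deployment batches, in the spirit of published methods such as TENT. It illustrates the general technique under simple assumptions, not Bozorgtabar's specific approach; the model, optimiser settings, and batch names are placeholders.

```python
# Generic sketch of test-time adaptation via entropy minimisation over
# batch-norm parameters (TENT-style). Illustrative only: `pretrained_model`
# and `test_batch` are placeholders, not components of a specific system.
import torch
import torch.nn as nn

def collect_bn_params(model: nn.Module):
    """Return only batch-norm scale/shift parameters for bounded adaptation."""
    params = []
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)):
            if module.weight is not None and module.bias is not None:
                module.requires_grad_(True)      # adapt only these layers
                params += [module.weight, module.bias]
    return params

def entropy(logits: torch.Tensor) -> torch.Tensor:
    """Mean prediction entropy of a batch (lower = more confident)."""
    probs = logits.softmax(dim=1)
    return -(probs * probs.log().clamp(min=-20)).sum(dim=1).mean()

def adapt_step(model: nn.Module, optimizer: torch.optim.Optimizer,
               test_batch: torch.Tensor) -> torch.Tensor:
    """One small adaptation step on an unlabelled deployment batch."""
    logits = model(test_batch)
    loss = entropy(logits)       # no labels needed: confidence is the signal
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return logits.detach()

# Usage sketch: freeze everything except the batch-norm layers, then take
# small adaptation steps as new, unlabelled batches arrive at deployment.
# model = pretrained_model.train()        # BN uses current batch statistics
# for p in model.parameters():
#     p.requires_grad_(False)
# optimizer = torch.optim.SGD(collect_bn_params(model), lr=1e-3)
# predictions = adapt_step(model, optimizer, test_batch)
```

The point of restricting the update to a few parameters is exactly the kind of limit the paragraph above describes: the model can respond to changed conditions without being free to rewrite itself.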
Different ways to adapt safely
Bozorgtabar’s work explores several ways an AI system can adapt to changing environments.
In some cases, the safest option is to adjust how the input is handled, aligning new data with the conditions under which the model was originally trained. In other cases, it may be possible to update selected parts of the model in a controlled way. More advanced approaches can combine multiple models or expert systems, allowing predictions to be cross-checked or routed more intelligently.
These strategies differ in the extent to which they intervene in the system. Some are lightweight and cautious. Others are more flexible but also require stronger safeguards.
This matters because the best response depends on the situation. If conditions change only slightly, a small correction may be enough. If the system becomes uncertain, it may be better to verify its output, rely on another model, or defer to a human expert.
That broader perspective increasingly shapes Bozorgtabar’s research: adaptation is not only a technical mechanism but also a decision about which intervention is justified.
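A hedged sketch of what such a decision might look like in code follows: accept confident predictions, cross-check moderately uncertain ones against an independent model, and defer to a human when uncertainty stays high or the models disagree. The thresholds, model handles, and the `human_review` hook are illustrative assumptions, not details of Bozorgtabar's systems.

```python
# Illustrative uncertainty-aware routing: accept, cross-check, or defer.
# Thresholds, models, and the `human_review` hook are hypothetical.
import torch

ACCEPT_ENTROPY = 0.5   # below this, trust the primary model
DEFER_ENTROPY = 1.5    # above this, always hand the case to a person

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    probs = logits.softmax(dim=-1)
    return -(probs * probs.clamp(min=1e-12).log()).sum(dim=-1)

def route_prediction(x: torch.Tensor, primary, backup, human_review):
    """Decide for a single input whether to accept, cross-check, or defer."""
    logits = primary(x)
    uncertainty = predictive_entropy(logits).mean().item()

    if uncertainty < ACCEPT_ENTROPY:
        return logits.argmax(dim=-1), "accepted"

    if uncertainty < DEFER_ENTROPY:
        # Moderate uncertainty: verify against an independent model.
        backup_logits = backup(x)
        if backup_logits.argmax(dim=-1).equal(logits.argmax(dim=-1)):
            return logits.argmax(dim=-1), "cross-checked"

    # High uncertainty or disagreement: defer to a human expert.
    return human_review(x), "deferred"
```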
From adaptive AI to agentic AI with guardrails
This also connects to Bozorgtabar’s work on agentic AI. Agentic systems are often described as systems that can plan, choose actions, or use tools autonomously. But in safety-critical settings, autonomy alone is not the goal. The real question is how such systems can act within clear boundaries.
Bozorgtabar’s vision is therefore one of structured autonomy – AI systems that monitor whether conditions are changing, assess their own reliability, and choose a bounded response. That response might involve adjusting inputs, updating selected model components, consulting multiple experts, or deferring to a human user.
What matters is not only what the system can do, but what it is allowed to do – and whether its decisions remain understandable and auditable.
“In safety-critical applications, the key is not to give AI unlimited freedom,” Bozorgtabar says. “It is to design systems that can respond intelligently while staying within explicit safety constraints.”
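One simple way to picture such a guardrail, sketched here under illustrative assumptions, is to monitor how far deployment data has drifted from a stored training-time reference and pick a bounded response accordingly: do nothing when conditions still match, apply a light correction when drift is moderate, and escalate to human oversight when it is large. The drift measure, thresholds, and response names below are placeholders, not components of a specific deployed system.

```python
# Sketch of a guardrail around adaptation: compare incoming feature
# statistics with a stored training-time reference and choose a bounded
# response. Thresholds and response names are illustrative assumptions.
import torch

MILD_DRIFT = 0.1    # small shift: a light input realignment is enough
LARGE_DRIFT = 0.5   # large shift: stop adapting and escalate

def feature_drift(features: torch.Tensor, ref_mean: torch.Tensor,
                  ref_std: torch.Tensor) -> float:
    """Distance between deployment and training feature statistics."""
    mean_gap = (features.mean(dim=0) - ref_mean).abs().mean()
    std_gap = (features.std(dim=0) - ref_std).abs().mean()
    return (mean_gap + std_gap).item()

def choose_response(features, ref_mean, ref_std) -> str:
    """Select a bounded response instead of adapting unconditionally."""
    drift = feature_drift(features, ref_mean, ref_std)
    if drift < MILD_DRIFT:
        return "no_adaptation"       # conditions still match training closely
    if drift < LARGE_DRIFT:
        return "realign_inputs"      # e.g. re-normalise toward training stats
    return "escalate_to_human"       # outside the system's safe operating limits
```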
Trust matters more than scale
At a time when much attention is focused on ever-larger models, Bozorgtabar’s research highlights a different priority: making AI dependable over time.
Large models may perform impressively in demonstrations, but real-world value depends on whether they remain trustworthy when reality changes. In domains such as healthcare, that means supporting professionals with systems they can rely on – systems that help when appropriate, signal uncertainty when needed, and do not hide their limitations behind confident outputs.
“Big models can look impressive on day one,” he says. “But in safety-critical settings, what matters is day one thousand.”
That long-term view is central to his research. Rather than treating deployment as a final step after training, Bozorgtabar studies what it takes for AI systems to remain useful under ongoing change.
Research with real-world purpose
For Bozorgtabar, that practical focus is essential. His work sits at the intersection of machine learning, computer vision, and medical image analysis, but the driving question is broader: how to make adaptive AI something society can actually depend on.
“That’s a major motivation for me,” he says. “We want to work on problems that matter in the real world – problems where better reliability and safety could make a real difference for people.”
That also shapes the long-term vision of the A3 Lab, the research group he founded at Aarhus University. The lab studies adaptive and agentic AI systems that can operate under changing conditions while remaining safe, transparent, and aligned with human oversight.
Rather than designing AI only for benchmark performance, the ambition is to help build systems that can function responsibly in practice.
About Behzad Bozorgtabar
Behzad Bozorgtabar is an Associate Professor in AI and Computer Vision at the Department of Electrical and Computer Engineering at Aarhus University, where he leads the A3 Lab (Adaptive & Agentic AI Lab). He is also a member of the European Laboratory for Learning and Intelligent Systems (ELLIS) and a Principal Investigator at the Pioneer Centre for AI (P1). His research focuses on trustworthy test-time adaptation, adaptive multimodal foundation models, and agentic AI systems for safety-critical deployment. Before joining Aarhus University, he led a computer vision research team at EPFL and previously worked as a postdoctoral researcher at IBM Research Australia.