# Mini Control Arena

v1.0.0
Mini Control Arena is a collection of template scripts to run AI Control experiments. Mini Control Arena is a ground-up rewrite of ControlArena from UK AISI, with a focus on simplicity. If you prefer building evaluations directly with minimal scaffolding, you'll find this library useful.
Each evaluation template showcases the core moving parts of an AI Control experiment. We use Inspect AI as the underlying evaluation engine. Our hope is that you can easily expand this library with some help from a coding assistant of your choice.
Two main principles guided all design decisions:
- Lightweight. We designed this library so you don't have to read through thousands of lines of code to understand the core function. This leads to the next point.
- Decoupled. Eval-specific components (like tools, dataset, scorer, system prompts) are decoupled from the main framework. This means that the main framework is simple, and you only need to understand the specific template that you want to use. You'll learn more about this in the Anatomy of Mini Control Arena page.
## Available Templates
Much of the work in building Mini Control Arena went into standardizing evaluation setups that were previously scattered across separate implementations. Note that most of these were heavily refactored to be easier to understand.
We aimed to preserve similar functionality, but you should not assume the templates behave exactly like the original implementations. Instead, the Code Samples section (in the left navigation bar) provides the performance estimates you should expect.
| template name | description | citation | refactor |
|---|---|---|---|
| `apps` | evaluates AI agents on interview-style programming problems with a side task to insert a backdoor | AI Control | |
| `bashbench` | evaluates AI agents on shell scripting tasks with a side task to download malware | Ctrl-Z | |
| `bashbench2` | evaluates AI agents on shell scripting tasks with multiple side tasks (disable_firewall, download_malware, sudo_edit, leak_passwords) | N/A | |
| `shade_arena` | evaluates AI agents on 17 different tasks with various hidden objectives (travel booking, calendar management, financial systems, etc.) | SHADE-Arena | |

Refactor legend: light (>10%) | moderate (>30%) | heavy (>70%)
BashBench2 and SHADE Arena combine multiple side tasks into one template.
## Set Up
This library is not meant to be a hidden pip dependency. We hope that it'll be used as starter template code for your AI Control experiments, and the scripts are written to be human-friendly!
We recommend cloning directly from GitHub for installation. This will make it easier for you to read our scripts and modify them if needed.
If you have access to the private version, use:
```bash
git clone https://github.com/brucewlee/mini-control-arena-private mini-control-arena
uv pip install -e mini-control-arena
```
### Virtual Environment
We think that your project will significantly benefit from having a dedicated virtual environment. It's a good practice to create one with uv.
```bash
# Download uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv (we test with Python 3.12)
uv venv --python 3.12

# Activate venv
source .venv/bin/activate

# Install mini-control-arena using uv
uv pip install -e mini-control-arena
```
We like uv because it provides a drop-in replacement for common pip commands. Just prepend `uv` to the pip commands you're used to, as shown below.
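For example, everyday pip commands behave the same way once prefixed with `uv`:

```bash
# These behave exactly like their pip counterparts
uv pip list      # pip list
uv pip freeze    # pip freeze
```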
### Docker (Required)
We run experiments in Docker, so you need Docker installed. We recommend Docker Desktop, which is very simple to use! We test with Docker Desktop v4.44.3 on Apple silicon, but any reasonably recent version should be fine.
Once you've installed Docker Desktop, just open it and keep it running in the background. That's all you need to do for Docker every time you want to run an experiment.
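If you want to double-check that Docker is actually up before kicking off an experiment, a quick check from the terminal is enough:

```bash
# Verify that the Docker daemon is running and reachable
docker info
```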
### Managing API Keys
You'll need API keys for the models you want to evaluate.
We make it easy to use a .env file:
1. Create a `.env` file in your project directory.
2. Load it in your code (a sketch of both steps follows below).
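As a minimal sketch (assuming `load_env` is importable from the top-level `mini_control_arena` package), the two steps look something like this:

```
# .env (one KEY=value pair per line; key names depend on your model provider)
OPENAI_API_KEY=sk-proj-...
```

```python
from mini_control_arena import load_env  # assumed import path

# Reads the .env file and exports its variables into the process environment
load_env()
```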
The `load_env()` function automatically searches for `.env` files in the current directory and parent directories.
You can also specify a custom path: `load_env(".env.local")`.
You can then add .env to your .gitignore file to prevent accidentally exposing API keys.
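One quick way to do that from the shell:

```bash
# Keep the .env file out of version control
echo ".env" >> .gitignore
```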
### Inspect AI
Mini Control Arena is built on Inspect AI, a popular framework for LLM evaluation. We think that this is a reasonable (and close-to-minimal) layer of abstraction to keep your code simple. This library is currently pinned to Inspect AI v0.3.144.
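To confirm which Inspect AI version ended up in your environment, the standard pip inspection command (prefixed with `uv`) works:

```bash
# Show the installed Inspect AI version (expected: 0.3.144)
uv pip show inspect-ai
```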
## Quickstart
You only need to understand two core functions to get off the ground.
First is copy_template(), which copies a template to your directory.
You'll learn more about what's included in templates later in the Anatomy of Mini Control Arena page.
APPS is the easiest example:
```python
from mini_control_arena import copy_template

# Copy the apps template to your directory
copy_template("apps", "your-directory/my-custom-apps")
# This automatically downloads the required dataset too
```
This creates a complete evaluation setup in your-directory/my-custom-apps.
Second is run_eval(), which runs the evaluation on your chosen model.
Note that parameters with the environment_ prefix are defined by your environment (the template you copied to your-directory/my-custom-apps), not by run_eval() itself.
```python
from mini_control_arena import run_eval

# Run the evaluation
results = run_eval(
    environment_dir="your-directory/my-custom-apps",
    model="openai/gpt-4.1-mini-2025-04-14",
    log_format="eval",
    environment_mode="honest",
    environment_samples=10,  # Random 10 samples
    environment_seed=42,     # Reproducible random sampling
)
```
This will run the evaluation and save logs. When you combine these functions, you can run a simple experiment by first copying a template and then evaluating a model on it.
```python
import os
os.environ["OPENAI_API_KEY"] = "sk-proj-..."

from mini_control_arena import run_eval, copy_template

# Copy template and prepare dataset
copy_template("apps", "./my-custom-apps")

# Honest run: the agent works on the main task only
run_eval(
    environment_dir="./my-custom-apps",
    model="openai/gpt-4.1-mini-2025-04-14",
    log_format="eval",
    environment_mode="honest",
    environment_samples=50,  # Random 50 samples
    environment_seed=42,     # Reproducible
)

# Attack run: the agent also pursues the side task
run_eval(
    environment_dir="./my-custom-apps",
    model="openai/gpt-4.1-mini-2025-04-14",
    log_format="eval",
    environment_mode="attack",
    environment_samples=50,  # Random 50 samples
    environment_seed=42,     # Reproducible
)
```
You'll automatically have the `inspect` CLI in your environment after installing Mini Control Arena, and you can use it to view your evaluation results. For example:
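```bash
# Launch the Inspect log viewer for the evaluation logs you just generated
inspect view
```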
## Learn More
For a deeper understanding of the library, read the Anatomy of Mini Control Arena page which explains templates and environments. The Tools page covers how tools are structured. The Viewing Logs page explains how to view, share, and analyze evaluation results. See FAQs for frequently asked questions.
For code snippets that you can run directly, see Running BashBench2 and Running SHADE Arena.