HUD Documentation — Evaluations and RL Environments.

Turn any scenario into a callable API endpoint. Define what your agent does, train it (optional), and get a single cURL command that runs it on demand. This is useful for:

Automations — Trigger agent runs from CI/CD, webhooks, or other services
Integrations — Embed HUD agents in your existing tools
Demos — Share a simple endpoint that showcases your agent’s capabilities

Overview

The workflow is:

Create a Scenario — Define setup and evaluation in your environment
Train (Optional) — Fine-tune a model on successful runs
Call the API — Use the /v1/agent/run endpoint to trigger runs

Step 1: Create a Scenario

Scenarios define what your agent does. In your environment code:

from hud import Env

env = Env("my-assistant")

@env.scenario()
async def answer_question(query: str):
    """Answer a user question using available tools."""
    
    yield env.setup(f"Answer this question: {query}")
    
    result = yield  # Agent runs here
    
    # Evaluate the result
    if "helpful" in result.lower():
        yield env.reward(1.0)
    else:
        yield env.reward(0.0)
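
The handshake in the scenario above (yield the setup, pause while the agent runs, then yield a reward) follows Python's async generator protocol. The sketch below shows how a runner could drive such a generator. It is an illustration only, not HUD's implementation: `drive` is a hypothetical helper, and plain dictionaries stand in for `env.setup` and `env.reward`.

```python
import asyncio


async def answer_question(query: str):
    # Stand-in for the scenario above: dicts replace env.setup / env.reward
    yield {"prompt": f"Answer this question: {query}"}
    result = yield  # the driver sends the agent's answer in here
    yield {"reward": 1.0 if "helpful" in result.lower() else 0.0}


async def drive(scenario, agent_answer: str):
    """Hypothetical runner: step the generator through setup, run, and reward."""
    setup = await scenario.__anext__()           # runs up to the first yield
    await scenario.__anext__()                   # advances to the bare `yield`
    reward = await scenario.asend(agent_answer)  # resumes with the agent's result
    await scenario.aclose()                      # finish the generator cleanly
    return setup, reward
```

Calling `asyncio.run(drive(answer_question("What is HUD?"), "A helpful answer"))` returns the setup dictionary and the reward dictionary in that order.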

Deploy your environment through the platform:

Go to hud.ai/environments → New Environment
Connect your GitHub repository or upload your code
The platform builds and deploys automatically

See Environments for the full deployment workflow.

Step 2: Train a Model (Optional)

If you want better performance, train a model on successful trajectories:

Create a Taskset with tasks using your scenario
Run evaluations to generate trajectories
Go to the Models tab in your taskset and click Train Model
Select successful runs and a base model for fine-tuning

The platform handles training and creates a model checkpoint you can use in API calls.

Training is optional. You can use any available model directly without fine-tuning.

Step 3: Call the API

Once your scenario exists, call it via the REST API:

cURL

curl -X POST https://api.hud.so/v1/agent/run \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HUD_API_KEY" \
  -d '{
    "env_name": "my-assistant",
    "scenario_name": "answer_question",
    "scenario_args": {
      "query": "What is the capital of France?"
    },
    "model": "claude-opus-4-5",
    "max_steps": 100
  }'

Response

{
  "trace_id": "abc123-def456-..."
}

The trace_id lets you look up the full run on hud.ai or via the API.
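
The same request can be issued from Python without the SDK. This is a minimal sketch using only the standard library, assuming the endpoint, headers, and JSON fields shown in the cURL example. The helper names `build_run_request` and `start_run` are hypothetical and not part of HUD.

```python
import json
import urllib.request

API_URL = "https://api.hud.so/v1/agent/run"  # endpoint from the cURL example


def build_run_request(api_key, env_name, scenario_name, scenario_args,
                      model, max_steps=100):
    """Build (but do not send) the POST request for one agent run."""
    payload = {
        "env_name": env_name,
        "scenario_name": scenario_name,
        "scenario_args": scenario_args,
        "model": model,
        "max_steps": max_steps,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def start_run(request, timeout=30):
    """Send the request and return the trace_id from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["trace_id"]
```

Build the request with your API key and scenario arguments, then pass it to `start_run` to get the `trace_id`. Keeping construction separate from sending makes the payload easy to inspect or log before any network call is made.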

Python

import asyncio

import hud
from hud import Environment

env = Environment()
env.connect_hub("my-assistant")

# Create a task with scenario and args
task = env("answer_question", query="What is the capital of France?")

async def main():
    # Run evaluation
    async with hud.eval(task) as ctx:
        result = await ctx.call_tool("respond", answer="Paris is the capital of France.")

# `async with` only works inside a coroutine, so run it through asyncio
asyncio.run(main())

Request Parameters

Parameter	Type	Required	Description
`env_name`	string	✓	Environment name
`scenario_name`	string	✓	Scenario to run
`scenario_args`	object	✓	Arguments for the scenario
`model`	string	✓	Model to use (see Models)
`max_steps`	integer		Maximum agent actions (default: 100, max: 500)
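
Checking a payload against this table before sending it can catch mistakes early. The sketch below is a client-side check under stated assumptions: `validate_run_payload` is a hypothetical helper, the lower bound of 1 for `max_steps` is assumed (the table only gives the default and the maximum), and the server may apply rules not listed here.

```python
REQUIRED_FIELDS = ("env_name", "scenario_name", "scenario_args", "model")
DEFAULT_MAX_STEPS = 100  # default from the table
MAX_STEPS_LIMIT = 500    # maximum from the table


def validate_run_payload(payload: dict) -> dict:
    """Return a copy of payload with max_steps filled in, or raise ValueError."""
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")

    for name in ("env_name", "scenario_name", "model"):
        if not isinstance(payload[name], str):
            raise ValueError(f"{name} must be a string")
    if not isinstance(payload["scenario_args"], dict):
        raise ValueError("scenario_args must be an object (dict)")

    max_steps = payload.get("max_steps", DEFAULT_MAX_STEPS)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_steps, bool) or not isinstance(max_steps, int):
        raise ValueError("max_steps must be an integer")
    # Assumption: at least one step; the table only states the upper bound.
    if not 1 <= max_steps <= MAX_STEPS_LIMIT:
        raise ValueError(f"max_steps must be between 1 and {MAX_STEPS_LIMIT}")

    return {**payload, "max_steps": max_steps}
```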

Viewing Results

Each run creates a trace. View it at:

https://hud.ai/traces/<trace_id>

Traces show:

Full agent trajectory (actions + observations)
Tool calls and responses
Final evaluation result and reward
Timing and token usage
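
To go from an API response straight to its trace page, the URL can be built from the `trace_id` field. This is a small sketch assuming the response body and URL pattern shown in this guide; `trace_url_from_response` is a hypothetical helper name.

```python
import json
from urllib.parse import quote

TRACE_BASE_URL = "https://hud.ai/traces/"  # URL pattern from this guide


def trace_url_from_response(body: str) -> str:
    """Extract trace_id from a JSON response body and return its trace URL."""
    trace_id = json.loads(body)["trace_id"]
    # Percent-encode the id so unexpected characters cannot alter the path.
    return TRACE_BASE_URL + quote(trace_id, safe="")
```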

From the Platform

Every scenario card has a code snippet panel with ready-to-copy commands:

Go to your environment’s Scenarios tab
Click the </> icon on any scenario card
Select cURL or Python tab
Copy the command with your scenario args pre-filled

Arguments you enter in the scenario card get inserted into the code snippet automatically.

Use Cases

Webhook Handler

Trigger agent runs from external events:

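import os

import httpx
from fastapi import FastAPI  # assumes a FastAPI app; other async frameworks work similarly

app = FastAPI()
HUD_API_KEY = os.environ["HUD_API_KEY"]

@app.post("/webhook")
async def handle_webhook(event: dict):
    # Use an async HTTP client so the call does not block the event loop
    async with httpx.AsyncClient(timeout=30) as client:
        response = await client.post(
            "https://api.hud.so/v1/agent/run",
            headers={"Authorization": f"Bearer {HUD_API_KEY}"},
            json={
                "env_name": "support-agent",
                "scenario_name": "handle_ticket",
                "scenario_args": {"ticket_id": event["id"]},
                "model": "claude-opus-4-5"
            }
        )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
    return {"trace_id": response.json()["trace_id"]}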

Scheduled Tasks

Run agents on a schedule with cron or similar:

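# Run daily cleanup agent (a crontab entry must fit on one line; HUD_API_KEY must be set in the crontab environment)
0 0 * * * curl -X POST https://api.hud.so/v1/agent/run -H "Content-Type: application/json" -H "Authorization: Bearer $HUD_API_KEY" -d '{"env_name": "ops", "scenario_name": "daily_cleanup", "scenario_args": {}, "model": "claude-sonnet-4-5"}'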

Chat Interface

Power a chat UI with agent capabilities:

async function sendMessage(message) {
  const response = await fetch("https://api.hud.so/v1/agent/run", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${HUD_API_KEY}`
    },
    body: JSON.stringify({
      env_name: "chat-assistant",
      scenario_name: "respond",
      scenario_args: { message },
      model: "claude-opus-4-5"
    })
  });
  return response.json();
}

Best Practices

Set appropriate max_steps — Lower for simple tasks, higher for complex ones
Handle errors — The API returns standard HTTP error codes
Monitor usage — Each run consumes compute; check your usage dashboard
Use trained models — Fine-tuned models often perform better on specific tasks
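
For the error-handling advice above, one common pattern is to retry transient failures with exponential backoff. The sketch below assumes that 429 and 5xx responses are worth retrying, which follows general HTTP conventions and is not documented HUD behavior. `is_retryable` and `call_with_retries` are hypothetical helpers, and `send` is any zero-argument callable you supply that returns `(status_code, body)`.

```python
import time


def is_retryable(status: int) -> bool:
    """Treat rate limiting (429) and server errors (5xx) as transient."""
    return status == 429 or 500 <= status <= 599


def call_with_retries(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status_code, body) pair. Delays double after each
    failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if not is_retryable(status) or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
```

Retrying a POST can start a duplicate run if the first request actually reached the server, so keep `max_attempts` small and check the returned traces when in doubt.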

