Server Runtime

Agent Air provides a pluggable frontend architecture that enables agents to run without a terminal interface. The server runtime powers headless deployments—web servers, background processes, API backends, and embedded integrations. The same agent logic that runs in the TUI works identically in server mode, just with different input and output handling.

This architecture separates the agent’s core functionality from its presentation layer through three abstractions: EventSink for output, InputSource for input, and PermissionPolicy for automated decision-making. By implementing these traits, you can integrate agents into any environment that can send and receive messages.

When to Use Server Mode

Server mode is the right choice when your agent needs to run without direct terminal access or when you’re building a custom frontend. The headless architecture handles all the complex coordination—streaming responses, permission management, session state—while you focus on your specific integration needs.

Common use cases include:

- Web applications: expose agent capabilities through WebSocket or HTTP APIs
- Background workers: process tasks from queues without user interaction
- Chat platforms: integrate with Slack, Discord, or custom chat systems
- API backends: provide agent functionality as a service
- Embedded systems: run agents inside larger applications
- Batch processing: automate workflows with programmatic control

For interactive terminal use, the TUI Runtime provides a complete interface. Server mode offers the same functionality without the visual interface.

Architecture Overview

The server runtime uses three pluggable interfaces that together handle all communication between your application and the agent. This design keeps the agent logic independent of how messages are transported.

Your Application          Agent Air              LLM Controller
      │                        │                        │
      │◄── EventSink ─────────│◄─── Controller ────────│
      │    (events out)        │     Events             │
      │                        │                        │
      │─── InputSource ───────►│──── User Input ───────►│
      │    (messages in)       │                        │
      │                        │                        │
      └── PermissionPolicy ────┘                        │
          (auto-decisions)                              │

EventSink receives all events from the agent—streaming text, tool executions, permission requests, errors, and completion signals. Your sink implementation decides how to deliver these to your frontend.

InputSource provides user messages to the agent. Your source implementation receives input from wherever your users are—HTTP requests, WebSocket messages, queue items, or programmatic calls.

PermissionPolicy makes automatic decisions about permission requests. In server environments without interactive users, policies can auto-approve safe operations or deny sensitive ones based on your security requirements.
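To make the division of labour concrete, here is a minimal sketch of how the three roles fit together. The trait definitions below are simplified, synchronous stand-ins written for this illustration; they are not the real agent_air traits, whose methods and signatures may differ. `VecSink`, `QueueSource`, `ReadOnlyPolicy`, and `run_toy_agent` are hypothetical names.

```rust
use std::collections::VecDeque;

// Simplified stand-ins for illustration only (NOT the agent_air definitions).

/// Output side: receives every event the "agent" emits.
trait EventSink {
    fn send(&mut self, event: String);
}

/// Input side: yields user messages until it returns None.
trait InputSource {
    fn recv(&mut self) -> Option<String>;
}

/// Decision side: answers permission requests without a human.
trait PermissionPolicy {
    fn allow(&self, tool: &str) -> bool;
}

/// Sink that buffers events in memory.
struct VecSink {
    events: Vec<String>,
}

impl EventSink for VecSink {
    fn send(&mut self, event: String) {
        self.events.push(event);
    }
}

/// Source backed by a queue of pre-loaded messages.
struct QueueSource {
    pending: VecDeque<String>,
}

impl InputSource for QueueSource {
    fn recv(&mut self) -> Option<String> {
        self.pending.pop_front()
    }
}

/// Policy that only permits the (hypothetical) read_file tool.
struct ReadOnlyPolicy;

impl PermissionPolicy for ReadOnlyPolicy {
    fn allow(&self, tool: &str) -> bool {
        tool == "read_file"
    }
}

/// Toy driver standing in for the agent: it only touches the three interfaces.
fn run_toy_agent(
    sink: &mut impl EventSink,
    source: &mut impl InputSource,
    policy: &impl PermissionPolicy,
) {
    while let Some(message) = source.recv() {
        // Pretend that messages starting with "write:" need the write_file tool.
        let tool = if message.starts_with("write:") { "write_file" } else { "read_file" };
        if policy.allow(tool) {
            sink.send(format!("ok: {}", message));
        } else {
            sink.send(format!("denied: {}", message));
        }
    }
}

/// Wires the three pieces together and returns what the sink collected.
fn demo_three_interfaces() -> Vec<String> {
    let mut sink = VecSink { events: Vec::new() };
    let mut source = QueueSource {
        pending: VecDeque::from(vec!["read: a.txt".to_string(), "write: b.txt".to_string()]),
    };
    run_toy_agent(&mut sink, &mut source, &ReadOnlyPolicy);
    sink.events
}
```

Because the driver depends only on the traits, swapping `VecSink` for a WebSocket-backed sink or `QueueSource` for an HTTP-fed source would not change the loop itself.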

The run_with_frontend Method

The primary entry point for server mode is run_with_frontend(), which takes your three interface implementations and runs the agent until the input source closes. This method handles all the internal coordination—event routing, permission management, and graceful shutdown.

use agent_air::{AgentAir, EventSink, InputSource, PermissionPolicy};
use agent_air::policy::AutoApprovePolicy;

let mut agent = AgentAir::with_config(
    "my-server-agent",
    "~/.config/agent/config.yaml",
    "You are a helpful assistant."
)?;

// Create your sink, source, and policy
let sink = MyCustomSink::new();
let source = MyCustomSource::new();
let policy = AutoApprovePolicy::new();

// Run until the source closes
agent.run_with_frontend(sink, source, policy).await?;

The method returns when the input source signals completion (returns None from recv()). For long-running servers, keep the source open; for batch jobs, close it when processing completes.
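The "run until the source closes" contract can be illustrated with the standard library alone. The sketch below uses `std::sync::mpsc` and a thread rather than Agent Air or Tokio, so the receiver reports closure as `Err` instead of `None`; `run_until_closed` and `batch_job` are hypothetical helper names.

```rust
use std::sync::mpsc;
use std::thread;

/// Processes messages until every sender is dropped, then returns how many were handled.
fn run_until_closed(rx: mpsc::Receiver<String>) -> usize {
    let mut handled = 0;
    // recv() returns Err once all senders are gone, which ends the loop.
    while let Ok(_message) = rx.recv() {
        handled += 1;
    }
    handled
}

/// Batch-style usage: send everything, close the source, wait for the worker.
fn batch_job(messages: Vec<String>) -> usize {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || run_until_closed(rx));
    for message in messages {
        tx.send(message).expect("worker is still receiving");
    }
    // Dropping the only sender closes the source and lets the worker return.
    drop(tx);
    worker.join().expect("worker thread panicked")
}
```

A long-running server would keep a sender alive for as long as it accepts requests, so the loop would not exit until the server drops it.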

Channel-Based Communication

For simpler integrations, Agent Air provides channel-based implementations of the interfaces. Channels let you send and receive messages directly without implementing traits, which works well for in-process communication.

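use agent_air::{AgentAir, ControllerInputPayload, TurnId, UiMessage};

// `config_path` and `prompt` are assumed to be defined by the caller
let mut agent = AgentAir::with_config("agent", config_path, prompt)?;
agent.start_background_tasks();
// Keep the session ID; it is needed to address the message below
let (session_id, _model_name, _model_info) = agent.create_initial_session()?;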

// Get the communication channels
let tx = agent.to_controller_tx();
let mut rx = agent.take_from_controller_rx().unwrap();

// Send a message
tx.send(ControllerInputPayload::data(
    session_id,
    "Hello, agent!",
    TurnId::new_user_turn(1)
)).await?;

// Receive events
while let Some(event) = rx.recv().await {
    match event {
        UiMessage::TextChunk { text, .. } => print!("{}", text),
        UiMessage::Complete { .. } => break,
        _ => {}
    }
}

Channels are useful for quick prototypes and situations where you need direct access to the message streams. For production servers, implementing the traits provides better encapsulation and error handling.

Event Flow

Events flow from the LLM controller through your EventSink to your application. Understanding this flow helps you build responsive integrations that handle each event type appropriately.

1. User sends message via InputSource
2. Agent processes with LLM and tools
3. Events stream through EventSink:
- TextChunk for streaming response text
- ToolExecuting when a tool starts
- ToolCompleted when a tool finishes
- PermissionRequired if a tool needs approval
- TokenUpdate with usage statistics
- Complete when the turn finishes

Your sink receives these events in real-time, allowing you to stream responses to users rather than waiting for complete messages.
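As a rough illustration of handling each event type, the sketch below folds one turn's events into a summary. The `Event` enum is a hypothetical stand-in: its variant names follow the list above, but the fields are assumptions and the real `UiMessage` type may carry different data.

```rust
/// Hypothetical event type; variant names follow the list above, fields are assumed.
#[derive(Debug, Clone)]
enum Event {
    TextChunk(String),
    ToolExecuting(String),
    ToolCompleted(String),
    PermissionRequired(String),
    TokenUpdate { input: u32, output: u32 },
    Complete,
}

/// What a sink might accumulate over one turn.
#[derive(Default)]
struct TurnSummary {
    text: String,
    tools_run: Vec<String>,
    pending_permissions: Vec<String>,
    total_tokens: u32,
    finished: bool,
}

/// Applies a single event to the running summary.
fn handle_event(summary: &mut TurnSummary, event: Event) {
    match event {
        // Append streamed text as it arrives instead of waiting for the full reply.
        Event::TextChunk(chunk) => summary.text.push_str(&chunk),
        // A real sink might show a spinner here; this sketch ignores the start signal.
        Event::ToolExecuting(_name) => {}
        Event::ToolCompleted(name) => summary.tools_run.push(name),
        Event::PermissionRequired(what) => summary.pending_permissions.push(what),
        // Treated here as a running total that replaces the previous value.
        Event::TokenUpdate { input, output } => summary.total_tokens = input + output,
        Event::Complete => summary.finished = true,
    }
}

/// Consumes events until the turn completes; anything after Complete is left unread.
fn summarize(events: Vec<Event>) -> TurnSummary {
    let mut summary = TurnSummary::default();
    for event in events {
        handle_event(&mut summary, event);
        if summary.finished {
            break;
        }
    }
    summary
}
```

Matching exhaustively, with no wildcard arm, means the compiler flags any variant the sink forgets to handle.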

Permission Handling

Server mode requires a strategy for handling permission requests because there is usually no user at a terminal to click “Allow” or “Deny”. The PermissionPolicy trait encapsulates this decision-making.

Three built-in policies cover common scenarios:

Policy	Behavior	Use Case
`AutoApprovePolicy`	Approve all requests	Trusted environments, automation
`DenyAllPolicy`	Deny all requests	Sandboxed execution, read-only agents
`InteractivePolicy`	Always ask user	When you can forward to a real user

For more nuanced control, implement a custom policy that makes decisions based on the request type, target resource, or other context. See Permission Policies for details.
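The following sketch shows the kind of decision logic a custom policy might contain: allow reads inside a workspace directory and deny everything else. `Request`, `Decision`, and `WorkspacePolicy` are hypothetical types invented for this example. The real PermissionPolicy trait and its request type are not shown here and may look quite different.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical decision type for this sketch.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny,
}

/// Hypothetical request type: what the tool wants to do and to which target.
enum Request {
    ReadFile(PathBuf),
    WriteFile(PathBuf),
    RunCommand(String),
}

/// Allows reads under `root`; denies writes, commands, and reads elsewhere.
struct WorkspacePolicy {
    root: PathBuf,
}

impl WorkspacePolicy {
    fn decide(&self, request: &Request) -> Decision {
        match request {
            Request::ReadFile(path) if self.is_inside_root(path) => Decision::Allow,
            Request::ReadFile(_) => Decision::Deny,
            Request::WriteFile(_) | Request::RunCommand(_) => Decision::Deny,
        }
    }

    /// Lexical check only: the path must start with `root` (compared component
    /// by component) and must not contain `..`. Symlinks are not resolved.
    fn is_inside_root(&self, path: &Path) -> bool {
        path.starts_with(&self.root)
            && !path.components().any(|c| matches!(c, Component::ParentDir))
    }
}
```

Denying by default and allowing only what is explicitly recognised keeps the policy safe if new request kinds are added later.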

Session Management

Server agents support multiple concurrent sessions, each with independent conversation history and context. Sessions are identified by integer IDs and can be created, switched, and cleaned up programmatically.

// Create a session
let (session_id, model_name, model_info) = agent.create_initial_session()?;

// Create additional sessions
let second_session = agent.create_session()?;

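// Include session_id in all input messages
// (`tx` is the sender returned by agent.to_controller_tx();
//  `turn_id` is a TurnId such as TurnId::new_user_turn(1))
tx.send(ControllerInputPayload::data(
    session_id,  // Route to correct session
    "User message",
    turn_id
)).await?;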

Events include their session ID, so your sink can route responses to the appropriate user or context. This enables multi-tenant servers where each user maintains a separate conversation.
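One way a sink could route by session ID is to keep a per-session outbox. The sketch below is a standalone illustration: `SessionEvent` and `SessionRouter` are hypothetical, and the use of `i64` for session IDs is an assumption (the text only says IDs are integers).

```rust
use std::collections::HashMap;

/// Hypothetical event carrying the ID of the session it belongs to.
struct SessionEvent {
    session_id: i64,
    text: String,
}

/// Keeps one outbox per registered session.
#[derive(Default)]
struct SessionRouter {
    outboxes: HashMap<i64, Vec<String>>,
}

impl SessionRouter {
    /// Starts tracking a session; registering twice keeps the existing outbox.
    fn register(&mut self, session_id: i64) {
        self.outboxes.entry(session_id).or_default();
    }

    /// Delivers the event to its session's outbox.
    /// Returns false if the session is unknown, so the caller can log or drop it.
    fn route(&mut self, event: SessionEvent) -> bool {
        match self.outboxes.get_mut(&event.session_id) {
            Some(outbox) => {
                outbox.push(event.text);
                true
            }
            None => false,
        }
    }

    /// Stops tracking a session and hands back whatever was queued for it.
    fn close(&mut self, session_id: i64) -> Option<Vec<String>> {
        self.outboxes.remove(&session_id)
    }
}
```

In a multi-tenant server each outbox would typically be a channel or socket handle for one user rather than a `Vec`.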

Graceful Shutdown

Proper shutdown ensures in-flight requests complete and resources are released cleanly. The agent provides a shutdown method that cancels background tasks and waits for completion.

use std::time::Duration;

// Signal shutdown
agent.shutdown();

// Or with timeout (this form assumes shutdown() returns a future)
tokio::select! {
    _ = agent.shutdown() => println!("Clean shutdown"),
    _ = tokio::time::sleep(Duration::from_secs(30)) => {
        println!("Forced shutdown after timeout");
    }
}

For run_with_frontend(), shutdown happens automatically when the input source closes. For channel-based usage, call shutdown() explicitly when your server stops.
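The wait-with-a-deadline pattern is independent of the async runtime. As a sketch under that assumption, the same idea can be written with standard-library threads and `recv_timeout` instead of Tokio. `shutdown_with_timeout` is a hypothetical helper, and the sleeping thread merely stands in for draining in-flight requests.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Waits up to `timeout` for cleanup that takes `work` to finish.
/// Returns true for a clean shutdown and false if the deadline passed first.
fn shutdown_with_timeout(work: Duration, timeout: Duration) -> bool {
    let (done_tx, done_rx) = mpsc::channel::<()>();

    thread::spawn(move || {
        // Stand-in for finishing in-flight requests and releasing resources.
        thread::sleep(work);
        // The receiver may already have given up; ignore the send error.
        let _ = done_tx.send(());
    });

    // Ok(()) means cleanup signalled completion before the deadline.
    done_rx.recv_timeout(timeout).is_ok()
}
```

When the deadline passes first, the cleanup thread is simply left behind; a real server would decide at that point whether to log, abort, or exit the process.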

Error Handling

Server mode surfaces errors through the EventSink as UiMessage::Error events and through the return value of run_with_frontend(). Design your sink to handle error events gracefully and log or report them appropriately.

match event {
    UiMessage::Error { message, .. } => {
        log::error!("Agent error: {}", message);
        // Optionally forward to user or monitoring system
    }
    // ... handle other events
}

Transient errors like rate limits are typically retried internally. Persistent errors like invalid API keys cause the agent to stop and return an error from the run method.
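The distinction between transient and persistent errors can be sketched as a small retry loop. This is an illustration of the general idea only, not Agent Air's internal retry logic. `AgentError` and `call_with_retry` are hypothetical, and a production version would normally add a backoff delay between attempts.

```rust
/// Hypothetical error kinds: one transient, one persistent.
#[derive(Debug, PartialEq)]
enum AgentError {
    RateLimited,
    InvalidApiKey,
}

/// Calls `op` up to `max_attempts` times, passing the 1-based attempt number.
/// Transient errors are retried; persistent errors are returned immediately.
fn call_with_retry<F>(max_attempts: u32, mut op: F) -> Result<String, AgentError>
where
    F: FnMut(u32) -> Result<String, AgentError>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Transient: try again while attempts remain (no backoff in this sketch).
            Err(AgentError::RateLimited) if attempt < max_attempts => attempt += 1,
            // Persistent error, or attempts exhausted: give up and report it.
            Err(error) => return Err(error),
        }
    }
}
```

A sink that sees an error event after retries are exhausted can then treat it as final and report it to the user or a monitoring system.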

Complete Example

A minimal server agent that processes a single message and prints the response:

use agent_air::{AgentAir, ControllerInputPayload, UiMessage, TurnId};
use agent_air::sink::SimpleEventSink;
use agent_air::source::ChannelInputSource;
use agent_air::policy::AutoApprovePolicy;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the agent
    let mut agent = AgentAir::with_config(
        "example",
        "~/.config/example/config.yaml",
        "You are a helpful assistant."
    )?;

    // Set up channels for this example
    let (input_tx, input_rx) = tokio::sync::mpsc::channel(100);
    let source = ChannelInputSource::new(input_rx);

    // Simple sink that prints text chunks
    let sink = SimpleEventSink::new(|event| {
        if let UiMessage::TextChunk { text, .. } = event {
            print!("{}", text);
        }
    });

    // Auto-approve all permissions for this example
    let policy = AutoApprovePolicy::new();

    // Start the agent in a background task
    let agent_handle = tokio::spawn(async move {
        agent.run_with_frontend(sink, source, policy).await
    });

    // Address the initial session (its ID is assumed to be 1 here) and send a message
    let session_id = 1;
    input_tx.send(ControllerInputPayload::data(
        session_id,
        "What is 2 + 2?",
        TurnId::new_user_turn(1)
    )).await?;

    // Close input to signal completion
    drop(input_tx);

    // Wait for agent to finish
    agent_handle.await??;

    Ok(())
}

This example demonstrates the core pattern: create an agent, set up the three interfaces, run until input closes. Real servers would keep the input channel open and route messages from HTTP requests, WebSocket connections, or other sources.
