Cohere Provider

The Cohere provider enables integration with Cohere’s Command models through the Chat API v2. It supports both synchronous and streaming message completion with full tool calling capabilities. Command-R models are designed for enterprise use cases with strong performance on retrieval-augmented generation (RAG) and multi-step tool use.

The provider handles Cohere-specific API formats, including the message structure with tool calls and the streaming event format. Tool results are sent with the special tool role as required by Cohere’s API.


CohereProvider Struct

The provider is defined in src/client/providers/cohere/mod.rs:

pub struct CohereProvider {
    api_key: String,
    model: String,
}

impl CohereProvider {
    pub fn new(api_key: String, model: String) -> Self {
        Self { api_key, model }
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}
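As a standalone sketch, the struct can be constructed and queried like this (the definition is repeated so the snippet compiles on its own; make_provider is illustrative, not part of the crate):

```rust
// Reproduced from the definition above so this sketch is self-contained.
pub struct CohereProvider {
    api_key: String,
    model: String,
}

impl CohereProvider {
    pub fn new(api_key: String, model: String) -> Self {
        Self { api_key, model }
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}

// Construct a provider for command-r-plus; in practice the key comes
// from configuration or the COHERE_API_KEY environment variable.
fn make_provider(api_key: String) -> CohereProvider {
    CohereProvider::new(api_key, "command-r-plus".to_string())
}
```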

API Configuration

The Cohere provider uses the following API settings:

| Setting | Value |
|---|---|
| Endpoint | https://api.cohere.com/v2/chat |
| Content-Type | application/json |

Request headers:

Content-Type: application/json
Authorization: Bearer <api_key>
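Assembling the header set above can be sketched as a small helper (request_headers is illustrative, not a crate API):

```rust
/// Assemble the request headers listed above: a JSON content type
/// plus a bearer token built from the API key.
fn request_headers(api_key: &str) -> Vec<(&'static str, String)> {
    vec![
        ("Content-Type", "application/json".to_string()),
        ("Authorization", format!("Bearer {}", api_key)),
    ]
}
```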

LLMSessionConfig Builder

Create a Cohere session configuration using the cohere() builder method:

use agent_air::controller::LLMSessionConfig;

let config = LLMSessionConfig::cohere("co-...", "command-r-plus");

The cohere() method sets these defaults:

| Option | Default Value |
|---|---|
| max_tokens | 4096 |
| streaming | true |
| context_limit | 128,000 |
| compaction | Threshold (default) |

Builder Methods

Customize the configuration using builder methods:

let config = LLMSessionConfig::cohere("co-...", "command-r-plus")
    .with_max_tokens(4096)
    .with_system_prompt("You are a helpful assistant.")
    .with_temperature(0.3)
    .with_streaming(true)
    .with_context_limit(128_000);

Available methods:

| Method | Description |
|---|---|
| with_max_tokens(u32) | Set maximum response tokens |
| with_system_prompt(impl Into<String>) | Set the system prompt |
| with_temperature(f32) | Set sampling temperature |
| with_streaming(bool) | Enable or disable streaming |
| with_context_limit(i32) | Set context window size |
| with_threshold_compaction(config) | Configure compaction |
| without_compaction() | Disable compaction |

Streaming Support

The Cohere provider fully supports streaming. When streaming is enabled, responses arrive incrementally as StreamEvent values:

pub enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: Value },
    MessageStop,
    Error(String),
}

Enable streaming by setting stream: true in the request body.
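The events above are typically consumed with a match; this sketch mirrors the enum, simplifying the tool input to a String (instead of serde_json::Value) so it stands alone:

```rust
// Mirrors the StreamEvent enum above; the tool input is simplified to a
// String here so the sketch has no external dependencies.
enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: String },
    MessageStop,
    Error(String),
}

/// Accumulate streamed text deltas into the final message body,
/// stopping when the message ends.
fn collect_text(events: Vec<StreamEvent>) -> String {
    let mut text = String::new();
    for event in events {
        match event {
            StreamEvent::TextDelta(delta) => text.push_str(&delta),
            StreamEvent::MessageStop => break,
            _ => {} // MessageStart, ToolUse, and Error handled elsewhere
        }
    }
    text
}
```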


Request Format

Cohere uses a straightforward messages array with role-based formatting:

{
  "model": "command-r-plus",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 4096,
  "temperature": 0.3
}

Role Mapping

| Generic Role | Cohere Role |
|---|---|
| User | user |
| Assistant | assistant |
| System | system |
| Tool Result | tool |
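The mapping is a one-to-one rename; a sketch (the generic Role enum here is hypothetical, the crate's actual type may differ):

```rust
// Hypothetical generic role enum; the crate's actual type may differ.
enum Role {
    User,
    Assistant,
    System,
    ToolResult,
}

/// Map a generic role onto the Cohere role string, per the table above.
fn cohere_role(role: &Role) -> &'static str {
    match role {
        Role::User => "user",
        Role::Assistant => "assistant",
        Role::System => "system",
        Role::ToolResult => "tool",
    }
}
```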

Tool Use Format

Tools are sent in the function format:

{
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }
  }]
}

Tool choice options:

| Value | Description |
|---|---|
| (default) | Auto - model decides |
| "required" | Must use at least one tool |
| "none" | Cannot use tools |
| {"type": "function", "function": {"name": "..."}} | Force a specific tool |

Tool Calls in Responses

When the model requests tool calls, the response includes:

{
  "message": {
    "role": "assistant",
    "tool_calls": [{
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"San Francisco\"}"
      }
    }]
  }
}

Tool Results

Tool results are sent with the special tool role:

{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "72F and sunny in San Francisco"
}
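For illustration, the message above can be assembled by hand; a real implementation would use serde so the content is properly JSON-escaped:

```rust
/// Render a tool-result message in the shape shown above.
/// Illustrative only: the content is not JSON-escaped here.
fn tool_result_json(tool_call_id: &str, content: &str) -> String {
    format!(
        r#"{{"role":"tool","tool_call_id":"{}","content":"{}"}}"#,
        tool_call_id, content
    )
}
```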

Response Format

Cohere responses use the v2 Chat API format:

{
  "message": {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help?"}]
  },
  "finish_reason": "COMPLETE"
}

The provider parses the message.content array and extracts text and tool call content.


Environment Variables

Configure Cohere via environment variables:

| Variable | Description | Default |
|---|---|---|
| COHERE_API_KEY | API key (required) | None |
| COHERE_MODEL | Model identifier | command-r-plus |
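The model default can be sketched as a pure function over the (possibly absent) environment value; in real code the argument would come from std::env::var("COHERE_MODEL").ok():

```rust
/// Resolve the model identifier: use the COHERE_MODEL value when set,
/// otherwise fall back to the documented default.
fn resolve_model(env_value: Option<String>) -> String {
    env_value.unwrap_or_else(|| "command-r-plus".to_string())
}
```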

Available Models

Common Cohere model identifiers:

| Model | Context | Description |
|---|---|---|
| command-r-plus | 128K | Most capable model |
| command-r | 128K | Balanced performance |
| command-r-plus-08-2024 | 128K | August 2024 snapshot |
| command-r-08-2024 | 128K | August 2024 snapshot |

YAML Configuration

Configure in your agent’s config file:

providers:
  - provider: cohere
    api_key: co-...
    model: command-r-plus
    system_prompt: "You are a helpful coding assistant."

default_provider: cohere

Complete Example

use agent_air::{AgentAir, AgentConfig};

struct MyConfig;

impl AgentConfig for MyConfig {
    fn config_path(&self) -> &str { ".myagent/config.yaml" }
    fn default_system_prompt(&self) -> &str { "You are helpful." }
    fn log_prefix(&self) -> &str { "myagent" }
    fn name(&self) -> &str { "MyAgent" }
}

fn main() -> std::io::Result<()> {
    let mut agent = AgentAir::new(&MyConfig)?;

    // Configuration is loaded automatically from:
    // 1. ~/.myagent/config.yaml (if exists)
    // 2. COHERE_API_KEY environment variable (fallback)

    agent.run()
}

Programmatic Configuration

For direct configuration without files:

use agent_air::controller::{LLMSessionConfig, LLMController};

let config = LLMSessionConfig::cohere(
    std::env::var("COHERE_API_KEY").expect("API key required"),
    "command-r-plus"
)
.with_system_prompt("You are a helpful assistant.")
.with_max_tokens(4096)
.with_temperature(0.3);

// Create session with this config (inside an async function,
// since create_session is awaited)
let controller = LLMController::new(None);
let session_id = controller.create_session(config).await?;

Error Handling

Cohere API errors are converted to LlmError:

pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}

Common error codes:

| Error Code | Description |
|---|---|
| COHERE_ERROR | General API error |
| PARSE_ERROR | Response parsing failed |
| INVALID_REQUEST | Malformed request |
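One plausible mapping from an HTTP status to these codes is sketched below; the provider's actual classification logic is not shown in this page, so treat this as an assumption:

```rust
/// Hypothetical mapping of an HTTP status onto the error codes above.
/// PARSE_ERROR is raised locally when decoding fails, not from a status.
fn error_code_for_status(status: u16) -> &'static str {
    match status {
        400 => "INVALID_REQUEST",
        _ => "COHERE_ERROR",
    }
}
```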

RAG and Grounding

Cohere’s Command-R models are optimized for retrieval-augmented generation. While the provider supports tool calling for RAG workflows, native Cohere RAG features (connectors, documents) require direct API access.

For RAG workflows, implement a retrieval tool:

use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

struct RetrieveTool {
    vector_db: Arc<VectorDatabase>,
}

impl Executable for RetrieveTool {
    fn name(&self) -> &str { "search_knowledge_base" }
    fn description(&self) -> &str { "Search the knowledge base for relevant information" }
    fn input_schema(&self) -> &str {
        r#"{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}"#
    }

    fn execute(
        &self,
        _ctx: ToolContext,
        input: HashMap<String, serde_json::Value>,
    ) -> Pin<Box<dyn Future<Output = Result<String, String>> + Send>> {
        let db = self.vector_db.clone();
        let query = input.get("query")
            .and_then(|v| v.as_str())
            .unwrap_or("")
            .to_string();

        Box::pin(async move {
            // Convert the search error into the tool's String error type.
            let results = db.search(&query, 5).await.map_err(|e| e.to_string())?;
            Ok(format_search_results(&results))
        })
    }
}

Comparison with Other Providers

| Feature | Cohere | Anthropic | OpenAI |
|---|---|---|---|
| Max context | 128K | 200K | 128K |
| Streaming | Supported | Supported | Supported |
| System message | In messages | Dedicated field | In messages |
| Tool format | Function wrapper | Direct tool | Function wrapper |
| RAG optimization | Yes | No | No |