Cohere Provider
The Cohere provider enables integration with Cohere’s Command models through the Chat API v2. It supports both synchronous and streaming message completion with full tool calling capabilities. Command-R models are designed for enterprise use cases with strong performance on retrieval-augmented generation (RAG) and multi-step tool use.
The provider handles Cohere-specific API formats, including the message structure with tool calls and the streaming event format. Tool results are sent with the special tool role as required by Cohere’s API.
CohereProvider Struct
The provider is defined in src/client/providers/cohere/mod.rs:
pub struct CohereProvider {
    api_key: String,
    model: String,
}

impl CohereProvider {
    pub fn new(api_key: String, model: String) -> Self {
        Self { api_key, model }
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}
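As a usage sketch, the provider can be constructed directly (the struct is reproduced from above so the example stands alone; the key shown is a placeholder):

```rust
// Reproduced from the definition above so this example is self-contained.
pub struct CohereProvider {
    api_key: String,
    model: String,
}

impl CohereProvider {
    pub fn new(api_key: String, model: String) -> Self {
        Self { api_key, model }
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}

// Construct a provider for command-r-plus with a placeholder key.
fn example_provider() -> CohereProvider {
    CohereProvider::new("co-...".to_string(), "command-r-plus".to_string())
}
```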
API Configuration
The Cohere provider uses the following API settings:
| Setting | Value |
|---|---|
| Endpoint | https://api.cohere.com/v2/chat |
| Content-Type | application/json |
Request headers:
Content-Type: application/json
Authorization: Bearer <api_key>
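As an illustrative sketch (a hypothetical helper, not the crate's actual implementation), the endpoint and headers from the table above can be assembled as plain key/value pairs:

```rust
// Endpoint and header names follow the API configuration table above.
const COHERE_ENDPOINT: &str = "https://api.cohere.com/v2/chat";

// Build the header list for a Cohere v2 chat request.
fn request_headers(api_key: &str) -> Vec<(String, String)> {
    vec![
        ("Content-Type".to_string(), "application/json".to_string()),
        ("Authorization".to_string(), format!("Bearer {}", api_key)),
    ]
}
```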
LLMSessionConfig Builder
Create a Cohere session configuration using the cohere() builder method:
use agent_air::controller::LLMSessionConfig;
let config = LLMSessionConfig::cohere("co-...", "command-r-plus");
The cohere() method sets these defaults:
| Option | Default Value |
|---|---|
| max_tokens | 4096 |
| streaming | true |
| context_limit | 128,000 |
| compaction | Threshold (default) |
Builder Methods
Customize the configuration using builder methods:
let config = LLMSessionConfig::cohere("co-...", "command-r-plus")
    .with_max_tokens(4096)
    .with_system_prompt("You are a helpful assistant.")
    .with_temperature(0.3)
    .with_streaming(true)
    .with_context_limit(128_000);
Available methods:
| Method | Description |
|---|---|
| with_max_tokens(u32) | Set maximum response tokens |
| with_system_prompt(impl Into<String>) | Set the system prompt |
| with_temperature(f32) | Set sampling temperature |
| with_streaming(bool) | Enable or disable streaming |
| with_context_limit(i32) | Set context window size |
| with_threshold_compaction(config) | Configure compaction |
| without_compaction() | Disable compaction |
Streaming Support
The Cohere provider fully supports streaming. When streaming is enabled, responses arrive incrementally as StreamEvent values:
pub enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: Value },
    MessageStop,
    Error(String),
}
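To illustrate how a consumer might fold these events into a final response, here is a hedged sketch. It redefines a simplified StreamEvent locally (with the tool input as a raw JSON string instead of a serde_json::Value) so the example stands alone:

```rust
// Simplified local mirror of StreamEvent; in the real enum the tool
// input is a serde_json::Value rather than a raw String.
enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: String },
    MessageStop,
    Error(String),
}

// Accumulate text deltas until MessageStop, bailing out on errors.
fn collect_text(events: Vec<StreamEvent>) -> Result<String, String> {
    let mut text = String::new();
    for event in events {
        match event {
            StreamEvent::TextDelta(chunk) => text.push_str(&chunk),
            StreamEvent::Error(message) => return Err(message),
            StreamEvent::MessageStop => break,
            _ => {} // MessageStart and ToolUse would be handled elsewhere
        }
    }
    Ok(text)
}
```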
When streaming is enabled on the session, the provider sets stream: true in the request body.
Request Format
Cohere uses a straightforward messages array with role-based formatting:
{
  "model": "command-r-plus",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 4096,
  "temperature": 0.3
}
Role Mapping
| Generic Role | Cohere Role |
|---|---|
| User | user |
| Assistant | assistant |
| System | system |
| Tool Result | tool |
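The mapping above can be sketched as a simple conversion function (the Role enum here is a stand-in for the crate's generic role type):

```rust
// Stand-in for the crate's generic role type.
enum Role {
    User,
    Assistant,
    System,
    ToolResult,
}

// Map a generic role onto the role string Cohere's v2 Chat API expects.
fn cohere_role(role: &Role) -> &'static str {
    match role {
        Role::User => "user",
        Role::Assistant => "assistant",
        Role::System => "system",
        Role::ToolResult => "tool",
    }
}
```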
Tool Use Format
Tools are declared using the function wrapper format:
{
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }
  }]
}
Tool choice options:
| Value | Description |
|---|---|
| (default) | Auto - model decides |
| "required" | Must use at least one tool |
| "none" | Cannot use tools |
| {"type": "function", "function": {"name": "..."}} | Force specific tool |
Tool Calls in Responses
When the model requests tool calls, the response includes a tool_calls array. Note that arguments is a JSON-encoded string, not a nested object, so it must be parsed separately:
{
  "message": {
    "role": "assistant",
    "tool_calls": [{
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\":\"San Francisco\"}"
      }
    }]
  }
}
Tool Results
Tool results are sent with the special tool role:
{
  "role": "tool",
  "tool_call_id": "call_abc123",
  "content": "72F and sunny in San Francisco"
}
Response Format
Cohere responses use the v2 Chat API format:
{
  "message": {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help?"}]
  },
  "finish_reason": "COMPLETE"
}
The provider parses the message.content array and extracts text and tool call content.
Environment Variables
Configure Cohere via environment variables:
| Variable | Description | Default |
|---|---|---|
| COHERE_API_KEY | API key (required) | None |
| COHERE_MODEL | Model identifier | command-r-plus |
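Reading these variables with the documented fallback might look like this (a hypothetical helper, not the crate's actual loader; the resolution logic is split out so it can be exercised without touching the process environment):

```rust
use std::env;

// COHERE_API_KEY has no default and is required; COHERE_MODEL falls
// back to command-r-plus, matching the table above.
fn resolve(api_key: Option<String>, model: Option<String>) -> Result<(String, String), String> {
    let api_key = api_key.ok_or_else(|| "COHERE_API_KEY is required".to_string())?;
    let model = model.unwrap_or_else(|| "command-r-plus".to_string());
    Ok((api_key, model))
}

// Thin wrapper that reads the actual environment.
fn cohere_settings() -> Result<(String, String), String> {
    resolve(env::var("COHERE_API_KEY").ok(), env::var("COHERE_MODEL").ok())
}
```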
Available Models
Common Cohere model identifiers:
| Model | Context | Description |
|---|---|---|
| command-r-plus | 128K | Most capable model |
| command-r | 128K | Balanced performance |
| command-r-plus-08-2024 | 128K | August 2024 snapshot |
| command-r-08-2024 | 128K | August 2024 snapshot |
YAML Configuration
Configure in your agent’s config file:
providers:
  - provider: cohere
    api_key: co-...
    model: command-r-plus
    system_prompt: "You are a helpful coding assistant."
default_provider: cohere
Complete Example
use agent_air::{AgentAir, AgentConfig};

struct MyConfig;

impl AgentConfig for MyConfig {
    fn config_path(&self) -> &str { ".myagent/config.yaml" }
    fn default_system_prompt(&self) -> &str { "You are helpful." }
    fn log_prefix(&self) -> &str { "myagent" }
    fn name(&self) -> &str { "MyAgent" }
}

fn main() -> std::io::Result<()> {
    let mut agent = AgentAir::new(&MyConfig)?;
    // Configuration is loaded automatically from:
    // 1. ~/.myagent/config.yaml (if exists)
    // 2. COHERE_API_KEY environment variable (fallback)
    agent.run()
}
Programmatic Configuration
For direct configuration without files:
use agent_air::controller::{LLMSessionConfig, LLMController};
let config = LLMSessionConfig::cohere(
    std::env::var("COHERE_API_KEY").expect("API key required"),
    "command-r-plus",
)
.with_system_prompt("You are a helpful assistant.")
.with_max_tokens(4096)
.with_temperature(0.3);

// Create a session with this config (from within an async context)
let controller = LLMController::new(None);
let session_id = controller.create_session(config).await?;
Error Handling
Cohere API errors are converted to LlmError:
pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}
Common error codes:
| Error Code | Description |
|---|---|
| COHERE_ERROR | General API error |
| PARSE_ERROR | Response parsing failed |
| INVALID_REQUEST | Malformed request |
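A hedged sketch of how an HTTP failure might be folded into LlmError (the struct is reproduced from above for self-containment; the status-to-code mapping is illustrative, not the provider's exact logic):

```rust
// Mirrors the LlmError struct above.
pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}

// Illustrative mapping from an HTTP status to the error codes listed
// above; the real provider's classification may differ.
fn classify_http_error(status: u16, body: &str) -> LlmError {
    let code = match status {
        400 | 422 => "INVALID_REQUEST",
        _ => "COHERE_ERROR",
    };
    LlmError {
        error_code: code.to_string(),
        error_message: format!("HTTP {}: {}", status, body),
    }
}
```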
RAG and Grounding
Cohere’s Command-R models are optimized for retrieval-augmented generation. While the provider supports tool calling for RAG workflows, native Cohere RAG features (connectors, documents) require direct API access.
For RAG workflows, implement a retrieval tool:
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

struct RetrieveTool {
    vector_db: Arc<VectorDatabase>,
}

impl Executable for RetrieveTool {
    fn name(&self) -> &str { "search_knowledge_base" }

    fn description(&self) -> &str { "Search the knowledge base for relevant information" }

    fn input_schema(&self) -> &str {
        r#"{"type":"object","properties":{"query":{"type":"string"}},"required":["query"]}"#
    }

    fn execute(
        &self,
        _ctx: ToolContext,
        input: HashMap<String, serde_json::Value>,
    ) -> Pin<Box<dyn Future<Output = Result<String, String>> + Send>> {
        let db = self.vector_db.clone();
        let query = input.get("query")
            .and_then(|v| v.as_str())
            .unwrap_or("")
            .to_string();
        Box::pin(async move {
            let results = db.search(&query, 5).await?;
            Ok(format_search_results(&results))
        })
    }
}
Comparison with Other Providers
| Feature | Cohere | Anthropic | OpenAI |
|---|---|---|---|
| Max context | 128K | 200K | 128K |
| Streaming | Supported | Supported | Supported |
| System message | In messages | Dedicated field | In messages |
| Tool format | Function wrapper | Direct tool | Function wrapper |
| RAG optimization | Yes | No | No |
