Gemini Provider

The Gemini provider enables integration with Google’s Gemini models through the Generative Language API. It supports both synchronous and streaming message completion with full tool calling capabilities. Gemini offers extended context windows of up to 1–2 million tokens depending on the model, plus unique features such as grounding with web search and safety ratings.

The provider handles Gemini-specific API formats, including the content/parts structure, function declarations, and response metadata extraction. Safety filters and grounding metadata are automatically parsed and made available through the response.


GeminiProvider Struct

The provider is defined in src/client/providers/gemini/mod.rs:

pub struct GeminiProvider {
    api_key: String,
    model: String,
}

impl GeminiProvider {
    pub fn new(api_key: String, model: String) -> Self {
        Self { api_key, model }
    }

    pub fn model(&self) -> &str {
        &self.model
    }
}

API Configuration

The Gemini provider uses the following API settings:

Setting        Value
Base URL       https://generativelanguage.googleapis.com/v1beta/models
Generate       /{model}:generateContent
Stream         /{model}:streamGenerateContent?alt=sse
Content-Type   application/json

Request headers:

Content-Type: application/json
x-goog-api-key: <api_key>
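
The endpoint URLs in the table above can be assembled from the model name. A minimal standard-library sketch (the constants and helper names mirror the table and are illustrative, not the crate's actual API):

```rust
const BASE_URL: &str = "https://generativelanguage.googleapis.com/v1beta/models";

/// Build the non-streaming generateContent URL for a model.
fn generate_url(model: &str) -> String {
    format!("{BASE_URL}/{model}:generateContent")
}

/// Build the streaming URL; `alt=sse` switches the response to Server-Sent Events.
fn stream_url(model: &str) -> String {
    format!("{BASE_URL}/{model}:streamGenerateContent?alt=sse")
}

fn main() {
    println!("{}", generate_url("gemini-2.5-pro"));
    println!("{}", stream_url("gemini-2.5-pro"));
}
```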

LLMSessionConfig Builder

Create a Gemini session configuration using the google() builder method:

use agent_air::controller::LLMSessionConfig;

let config = LLMSessionConfig::google("AIza...", "gemini-2.5-pro");

The google() method sets these defaults:

Option           Default Value
max_tokens       4096
streaming        true
context_limit    1,000,000
compaction       Threshold (default)

Builder Methods

Customize the configuration using builder methods:

let config = LLMSessionConfig::google("AIza...", "gemini-2.5-pro")
    .with_max_tokens(8192)
    .with_system_prompt("You are a helpful assistant.")
    .with_temperature(0.7)
    .with_streaming(true)
    .with_context_limit(1_000_000);

Available methods:

Method                                   Description
with_max_tokens(u32)                     Set maximum response tokens
with_system_prompt(impl Into<String>)    Set the system prompt
with_temperature(f32)                    Set sampling temperature
with_streaming(bool)                     Enable or disable streaming
with_context_limit(i32)                  Set context window size
with_threshold_compaction(config)        Configure compaction
without_compaction()                     Disable compaction

Streaming Support

The Gemini provider fully supports streaming via Server-Sent Events (SSE). When streaming is enabled, responses arrive incrementally as StreamEvent values:

pub enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: Value },
    MessageStop,
    Error(String),
}

The streaming endpoint appends ?alt=sse to enable SSE format.
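
A sketch of draining such an event stream into the final response text, assuming the enum above (reproduced standalone here, with `serde_json::Value` simplified to `String` so the example needs no external crates):

```rust
// Standalone reproduction of the event type; the real enum uses
// serde_json::Value for the tool `input`.
enum StreamEvent {
    MessageStart { message_id: String, model: String },
    TextDelta(String),
    ToolUse { id: String, name: String, input: String },
    MessageStop,
    Error(String),
}

/// Accumulate text deltas until MessageStop; fail fast on Error.
fn collect_text(events: impl IntoIterator<Item = StreamEvent>) -> Result<String, String> {
    let mut out = String::new();
    for event in events {
        match event {
            StreamEvent::TextDelta(chunk) => out.push_str(&chunk),
            StreamEvent::MessageStop => break,
            StreamEvent::Error(e) => return Err(e),
            _ => {} // MessageStart / ToolUse would be handled elsewhere
        }
    }
    Ok(out)
}

fn main() {
    let events = vec![
        StreamEvent::MessageStart { message_id: "m1".into(), model: "gemini-2.5-pro".into() },
        StreamEvent::TextDelta("Hello, ".into()),
        StreamEvent::TextDelta("world".into()),
        StreamEvent::MessageStop,
    ];
    println!("{}", collect_text(events).unwrap());
}
```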


Request Format

Gemini represents the conversation as a contents[] array, where each entry carries a parts[] array. The provider automatically handles format conversion:

{
  "systemInstruction": {
    "parts": [{"text": "You are a helpful assistant."}]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "Hello"}]
    }
  ],
  "generationConfig": {
    "maxOutputTokens": 4096,
    "temperature": 0.7
  }
}

Role Mapping

Generic Role    Gemini Role
User            user
Assistant       model
System          systemInstruction (separate field)
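
The mapping above can be expressed as a small conversion function. The `Role` enum is a hypothetical stand-in for the crate's generic role type; system messages never reach this function because they are lifted into the separate systemInstruction field:

```rust
/// Hypothetical generic role type (system messages are handled separately).
enum Role {
    User,
    Assistant,
}

/// Map a generic role onto Gemini's role strings.
fn to_gemini_role(role: &Role) -> &'static str {
    match role {
        Role::User => "user",
        Role::Assistant => "model",
    }
}

fn main() {
    println!("{}", to_gemini_role(&Role::Assistant));
}
```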

Tool Use Format

Tools are sent as function declarations:

{
  "tools": [{
    "functionDeclarations": [{
      "name": "get_weather",
      "description": "Get weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string"}
        },
        "required": ["location"]
      }
    }]
  }],
  "toolConfig": {
    "functionCallingConfig": {
      "mode": "AUTO"
    }
  }
}

Tool choice modes:

Mode    Description
AUTO    Model decides whether to use tools
ANY     Model must use at least one tool
NONE    Model cannot use tools
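
These modes map naturally onto a small enum. This is an illustrative helper, not the crate's actual type:

```rust
/// Hypothetical mirror of Gemini's functionCallingConfig.mode values.
enum FunctionCallingMode {
    Auto,
    Any,
    None,
}

/// Serialize a mode to the string Gemini expects in toolConfig.
fn mode_str(mode: &FunctionCallingMode) -> &'static str {
    match mode {
        FunctionCallingMode::Auto => "AUTO",
        FunctionCallingMode::Any => "ANY",
        FunctionCallingMode::None => "NONE",
    }
}

fn main() {
    println!("{}", mode_str(&FunctionCallingMode::Auto));
}
```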

Tool Results

Gemini matches tool results by function name, not by unique ID. The provider handles this automatically by using the function name as the tool use ID:

{
  "role": "user",
  "parts": [{
    "functionResponse": {
      "name": "get_weather",
      "response": {"result": "72F and sunny"}
    }
  }]
}
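
Building that functionResponse part can be sketched with plain string formatting. A real implementation would use a JSON library; here the response payload is assumed to be an already-serialized JSON object, and the helper name is illustrative:

```rust
/// Build a functionResponse part keyed by function name, as Gemini expects.
/// `response_json` is assumed to be pre-serialized JSON (no escaping is done here).
fn function_response_part(name: &str, response_json: &str) -> String {
    format!(r#"{{"functionResponse":{{"name":"{name}","response":{response_json}}}}}"#)
}

fn main() {
    let part = function_response_part("get_weather", r#"{"result":"72F and sunny"}"#);
    println!("{part}");
}
```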

Response Metadata

Gemini responses include additional metadata that the provider extracts:

Safety Ratings

Every response includes safety ratings for content categories:

pub struct SafetyRating {
    pub category: String,      // e.g., "HARM_CATEGORY_HARASSMENT"
    pub probability: String,   // e.g., "NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"
    pub blocked: bool,
}

Access safety ratings from the response:

if let Some(metadata) = message.response_metadata {
    if let Some(ratings) = metadata.safety_ratings {
        for rating in ratings {
            println!("{}: {} (blocked: {})",
                rating.category,
                rating.probability,
                rating.blocked
            );
        }
    }
}

Grounding Metadata

When grounding is enabled, responses include web search citations:

pub struct GroundingMetadata {
    pub web_search_queries: Vec<String>,
    pub grounding_chunks: Vec<GroundingChunk>,
    pub grounding_supports: Vec<GroundingSupport>,
}

pub struct GroundingChunk {
    pub source_type: String,  // "web"
    pub uri: Option<String>,
    pub title: Option<String>,
}
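
A sketch of turning that metadata into human-readable citation lines, with the chunk type reproduced standalone from the definition above (the rendering format is an illustrative choice):

```rust
// Standalone reproduction of the grounding chunk type above.
struct GroundingChunk {
    source_type: String,
    uri: Option<String>,
    title: Option<String>,
}

/// Render each web chunk as "title <uri>", skipping chunks without a URI.
fn citations(chunks: &[GroundingChunk]) -> Vec<String> {
    chunks
        .iter()
        .filter(|c| c.source_type == "web")
        .filter_map(|c| {
            c.uri.as_ref().map(|uri| {
                format!("{} <{}>", c.title.as_deref().unwrap_or("untitled"), uri)
            })
        })
        .collect()
}

fn main() {
    let chunks = vec![GroundingChunk {
        source_type: "web".into(),
        uri: Some("https://example.com".into()),
        title: Some("Example".into()),
    }];
    for line in citations(&chunks) {
        println!("{line}");
    }
}
```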

Environment Variables

Configure Gemini via environment variables:

Variable          Description           Default
GOOGLE_API_KEY    API key (required)    None
GOOGLE_MODEL      Model identifier      gemini-2.5-pro
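
Resolving these variables with the documented default can be sketched with the standard library (helper names are illustrative):

```rust
use std::env;

/// Resolve the model name, falling back to the documented default.
fn resolve_model(from_env: Option<String>) -> String {
    from_env.unwrap_or_else(|| "gemini-2.5-pro".to_string())
}

/// GOOGLE_API_KEY has no default; surface a clear error when missing.
fn resolve_api_key() -> Result<String, String> {
    env::var("GOOGLE_API_KEY").map_err(|_| "GOOGLE_API_KEY is not set".to_string())
}

fn main() {
    let model = resolve_model(env::var("GOOGLE_MODEL").ok());
    println!("model: {model}");
    if let Err(e) = resolve_api_key() {
        eprintln!("{e}");
    }
}
```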

Available Models

Common Gemini model identifiers:

Model               Context    Description
gemini-2.5-pro      1M         Latest and most capable
gemini-2.5-flash    1M         Fast, cost-effective
gemini-1.5-pro      2M         Previous generation pro
gemini-1.5-flash    1M         Previous generation flash

YAML Configuration

Configure in your agent’s config file:

providers:
  - provider: google
    api_key: AIza...
    model: gemini-2.5-pro
    system_prompt: "You are a helpful coding assistant."

default_provider: google

Complete Example

use agent_air::{AgentAir, AgentConfig};

struct MyConfig;

impl AgentConfig for MyConfig {
    fn config_path(&self) -> &str { ".myagent/config.yaml" }
    fn default_system_prompt(&self) -> &str { "You are helpful." }
    fn log_prefix(&self) -> &str { "myagent" }
    fn name(&self) -> &str { "MyAgent" }
}

fn main() -> std::io::Result<()> {
    let mut agent = AgentAir::new(&MyConfig)?;

    // Configuration is loaded automatically from:
    // 1. ~/.myagent/config.yaml (if exists)
    // 2. GOOGLE_API_KEY environment variable (fallback)

    agent.run()
}

Programmatic Configuration

For direct configuration without files:

use agent_air::controller::{LLMSessionConfig, LLMController};

let config = LLMSessionConfig::google(
    std::env::var("GOOGLE_API_KEY").expect("API key required"),
    "gemini-2.5-pro"
)
.with_system_prompt("You are a helpful assistant.")
.with_max_tokens(8192)
.with_streaming(true);

// Create a session with this config (run inside an async context,
// since create_session is awaited)
let controller = LLMController::new(None);
let session_id = controller.create_session(config).await?;

Error Handling

Gemini API errors are converted to LlmError:

pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}

Common error patterns:

Error Code          Description
GEMINI_ERROR_400    Invalid request (bad API key, malformed request)
GEMINI_ERROR_429    Rate limit exceeded
PROMPT_BLOCKED      Prompt blocked by safety filters
CONTENT_BLOCKED     Response blocked by safety filters
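
A sketch of branching on these codes for retry logic, with LlmError reproduced from the definition above. The retry policy itself is an illustrative assumption, not the crate's behavior:

```rust
// Standalone reproduction of the error type above.
struct LlmError {
    error_code: String,
    error_message: String,
}

/// Illustrative policy: only rate limits are worth retrying; safety
/// blocks and malformed requests will fail the same way on a retry.
fn is_retryable(err: &LlmError) -> bool {
    matches!(err.error_code.as_str(), "GEMINI_ERROR_429")
}

fn main() {
    let err = LlmError {
        error_code: "GEMINI_ERROR_429".into(),
        error_message: "Rate limit exceeded".into(),
    };
    println!("retryable: {} ({})", is_retryable(&err), err.error_message);
}
```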

Safety Filter Errors

When content is blocked by safety filters, the error message includes details:

// Error message format:
// "Prompt blocked by Gemini safety filters. Reason: SAFETY.
//  Safety concerns: HARM_CATEGORY_SEXUALLY_EXPLICIT: HIGH"

Comparison with Other Providers

Feature               Gemini               Anthropic       OpenAI
Max context           1-2M                 200K            128K
Streaming             Supported            Supported       Supported
System message        systemInstruction    system field    In messages array
Tool ID matching      By name              By unique ID    By unique ID
Safety ratings        Yes                  No              No
Grounding/Citations   Yes                  No              No