Client Overview

This page documents the LLMClient, the core abstraction for communicating with LLM providers. The client provides a provider-agnostic interface that enables seamless switching between different LLM APIs.

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    Application Code                              │
│  LLMSession / StatelessExecutor                                 │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                    LLMClient                                     │
│  - Provider-agnostic API                                        │
│  - Message sending (streaming and non-streaming)                │
│  - Tool call handling                                           │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
                    ┌─────────────┴─────────────┐
                    │                           │
                    ▼                           ▼
┌───────────────────────────────┐ ┌───────────────────────────────┐
│         HttpClient            │ │        LlmProvider            │
│  - HTTP/TLS handling          │ │  - Request formatting         │
│  - Retry logic                │ │  - Response parsing           │
│  - Connection pooling         │ │  - Provider-specific logic    │
└───────────────────────────────┘ └───────────────────────────────┘

LLMClient Structure

The client wraps an HTTP client and a provider implementation:

pub struct LLMClient {
    http_client: HttpClient,
    provider: Box<dyn LlmProvider + Send + Sync>,
}

Field          Type                   Purpose
http_client    HttpClient             Handles HTTP requests with TLS and retry
provider       Box<dyn LlmProvider>   Provider-specific request/response handling

Client Creation

Clients are created with a boxed provider implementation:

impl LLMClient {
    pub fn new(provider: Box<dyn LlmProvider + Send + Sync>) -> Result<Self, LlmError> {
        Ok(Self {
            http_client: HttpClient::new()?,
            provider,
        })
    }
}

The constructor initializes the HTTP client with TLS configuration. Any TLS initialization errors are returned immediately.

Factory Pattern

Sessions and executors create clients using a factory pattern:

fn create_llm_client(config: &LLMSessionConfig) -> Result<LLMClient, LlmError> {
    match config.provider {
        LLMProvider::Anthropic => {
            let provider = AnthropicProvider::new(
                config.api_key.clone(),
                config.model.clone()
            );
            LLMClient::new(Box::new(provider))
        }
        LLMProvider::OpenAI => {
            let provider = OpenAIProvider::new(
                config.api_key.clone(),
                config.model.clone()
            );
            LLMClient::new(Box::new(provider))
        }
    }
}

Sending Messages

Non-Streaming

impl LLMClient {
    pub async fn send_message(
        &self,
        messages: &[Message],
        options: &MessageOptions,
    ) -> Result<Message, LlmError> {
        self.provider
            .send_msg(&self.http_client, messages, options)
            .await
    }
}

Streaming

impl LLMClient {
    pub async fn send_message_stream(
        &self,
        messages: &[Message],
        options: &MessageOptions,
    ) -> Result<impl Stream<Item = Result<StreamEvent, LlmError>>, LlmError> {
        self.provider
            .send_msg_stream(&self.http_client, messages, options)
            .await
    }
}

MessageOptions

Request options control LLM behavior:

pub struct MessageOptions {
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub model: Option<String>,
    pub tools: Option<Vec<Tool>>,
    pub tool_choice: Option<ToolChoice>,
    pub stop_sequences: Option<Vec<String>>,
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
    pub metadata: Option<Metadata>,
}
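
Every field is optional, so callers typically set only a few fields and leave the rest at their defaults. A minimal sketch, assuming MessageOptions derives Default (the usage example later on this page relies on struct-update syntax, which requires it):

```rust
// Sketch only: a trimmed-down MessageOptions with Default derived,
// mirroring how the real struct is used with `..Default::default()`.
// Remaining optional fields are elided.
#[derive(Default, Debug, PartialEq)]
pub struct MessageOptions {
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub model: Option<String>,
}

pub fn example_options() -> MessageOptions {
    MessageOptions {
        max_tokens: Some(512),
        temperature: Some(0.2),
        ..Default::default() // every unset field stays None
    }
}
```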

Option Descriptions

Option           Type           Description
temperature      f32            Sampling temperature (0.0-1.0)
max_tokens       u32            Maximum output tokens
model            String         Model override
tools            Vec<Tool>      Available tools for function calling
tool_choice      ToolChoice     Tool usage control
stop_sequences   Vec<String>    Custom stop sequences
top_p            f32            Nucleus sampling parameter
top_k            u32            Top-K sampling parameter
metadata         Metadata       User ID for tracking

Tool Choice

Control how tools are used:

pub enum ToolChoice {
    Auto,        // LLM decides when to use tools
    Any,         // LLM must use a tool
    Tool(String), // LLM must use specific tool
    None,        // Disable tool use
}
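
To make the variants concrete, here is a sketch of how a provider implementation might serialize ToolChoice into an Anthropic-style request field. The exact wire format is provider-specific; the JSON shape below is an assumption, not the library's actual output:

```rust
pub enum ToolChoice {
    Auto,         // LLM decides when to use tools
    Any,          // LLM must use a tool
    Tool(String), // LLM must use specific tool
    None,         // Disable tool use
}

// Hypothetical mapping to an Anthropic-style `tool_choice` JSON object;
// real providers may use different field names entirely.
pub fn tool_choice_json(choice: &ToolChoice) -> String {
    match choice {
        ToolChoice::Auto => r#"{"type":"auto"}"#.to_string(),
        ToolChoice::Any => r#"{"type":"any"}"#.to_string(),
        ToolChoice::Tool(name) => format!(r#"{{"type":"tool","name":"{}"}}"#, name),
        ToolChoice::None => r#"{"type":"none"}"#.to_string(),
    }
}
```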

Message Types

Messages represent conversation history:

pub struct Message {
    pub role: Role,
    pub content: Vec<Content>,
}

pub enum Role {
    System,
    User,
    Assistant,
}

Content Types

pub enum Content {
    Text(String),
    Image(ImageSource),
    ToolUse(ToolUse),
    ToolResult(ToolResult),
}
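
The usage example later on this page calls Message::system and Message::user, which are not shown above. A plausible sketch of those convenience constructors (the signatures are assumed; only the Text content variant is reproduced so the snippet stands alone):

```rust
#[derive(Debug, PartialEq)]
pub enum Role { System, User, Assistant }

// Only the Text variant is reproduced here; see the full enum above.
#[derive(Debug, PartialEq)]
pub enum Content { Text(String) }

pub struct Message {
    pub role: Role,
    pub content: Vec<Content>,
}

impl Message {
    // Assumed convenience constructor: single text block with the System role.
    pub fn system(text: impl Into<String>) -> Self {
        Self { role: Role::System, content: vec![Content::Text(text.into())] }
    }
    // Assumed convenience constructor: single text block with the User role.
    pub fn user(text: impl Into<String>) -> Self {
        Self { role: Role::User, content: vec![Content::Text(text.into())] }
    }
}
```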

Tool Definitions

Define tools for function calling:

pub struct Tool {
    pub name: String,
    pub description: String,
    pub input_schema: String,  // JSON Schema
}

impl Tool {
    pub fn new(
        name: impl Into<String>,
        description: impl Into<String>,
        input_schema: impl Into<String>,
    ) -> Self {
        Self {
            name: name.into(),
            description: description.into(),
            input_schema: input_schema.into(),
        }
    }
}
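
For example, a weather-lookup tool might be declared like this. The tool name and schema are illustrative only; the Tool definition is reproduced from above so the snippet compiles standalone:

```rust
pub struct Tool {
    pub name: String,
    pub description: String,
    pub input_schema: String, // JSON Schema
}

impl Tool {
    pub fn new(
        name: impl Into<String>,
        description: impl Into<String>,
        input_schema: impl Into<String>,
    ) -> Self {
        Self {
            name: name.into(),
            description: description.into(),
            input_schema: input_schema.into(),
        }
    }
}

// Hypothetical tool definition: one required string parameter, "city".
pub fn weather_tool() -> Tool {
    Tool::new(
        "get_weather",
        "Get the current weather for a city",
        r#"{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}"#,
    )
}
```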

Usage Example

Complete example of client usage:

// Create provider and client
let provider = AnthropicProvider::new(api_key, "claude-sonnet-4-20250514".to_string());
let client = LLMClient::new(Box::new(provider))?;

// Build messages
let messages = vec![
    Message::system("You are a helpful assistant."),
    Message::user("What is 2 + 2?"),
];

// Configure options
let options = MessageOptions {
    max_tokens: Some(1024),
    temperature: Some(0.7),
    ..Default::default()
};

// Send message
let response = client.send_message(&messages, &options).await?;

// Process response
for content in response.content {
    match content {
        Content::Text(text) => println!("{}", text),
        Content::ToolUse(tool) => {
            println!("Tool call: {} with {}", tool.name, tool.input);
        }
        _ => {}
    }
}

Streaming Example

let mut stream = client.send_message_stream(&messages, &options).await?;

while let Some(event) = stream.next().await {
    match event? {
        StreamEvent::TextDelta { text, .. } => {
            print!("{}", text);
        }
        StreamEvent::MessageDelta { usage, .. } => {
            if let Some(usage) = usage {
                println!("\nTokens: {}", usage.output_tokens);
            }
        }
        StreamEvent::MessageStop => break,
        _ => {}
    }
}

Thread Safety

The client is designed for concurrent use:

  • All LLMClient methods take &self, so a single instance can be shared across threads (e.g. behind an Arc)
  • LlmProvider is Send + Sync (safe to share across threads)
  • HttpClient uses connection pooling internally
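
As a concrete illustration of the sharing pattern, here is a sketch using a stub in place of LLMClient (the stub type and its send method are hypothetical; the point is that &self methods plus Send + Sync allow one shared instance behind an Arc):

```rust
use std::sync::Arc;
use std::thread;

// Stub standing in for LLMClient: all methods take &self, no mutable state.
struct StubClient;

impl StubClient {
    fn send(&self, msg: &str) -> String {
        format!("echo: {}", msg)
    }
}

// Fan a shared client out to several threads; each thread gets its own
// Arc handle to the same underlying instance.
pub fn fan_out() -> Vec<String> {
    let client = Arc::new(StubClient);
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let c = Arc::clone(&client);
            thread::spawn(move || c.send(&format!("msg {}", i)))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```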

Error Handling

Client operations return LlmError:

pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}

Common error codes:

  • TLS_INIT_FAILED - TLS initialization error
  • HTTP_REQUEST_FAILED - Network request failed
  • PARSE_ERROR - Response parsing failed
  • RATE_LIMIT_EXHAUSTED - All retries exhausted
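
Callers can branch on error_code to decide how to react. A sketch (the helper below is an application-side assumption, not part of the library):

```rust
pub struct LlmError {
    pub error_code: String,
    pub error_message: String,
}

// Hypothetical helper mapping the error codes listed above to short
// human-readable summaries; unknown codes fall through to a default.
pub fn summarize(err: &LlmError) -> &'static str {
    match err.error_code.as_str() {
        "TLS_INIT_FAILED" => "TLS initialization error",
        "HTTP_REQUEST_FAILED" => "network request failed",
        "PARSE_ERROR" => "response parsing failed",
        "RATE_LIMIT_EXHAUSTED" => "all retries exhausted",
        _ => "unknown error",
    }
}
```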

Next Steps