Context Management

This page documents how conversation context is stored, accessed, and manipulated within LLMSession. Context management handles message storage, turn tracking, and conversation lifecycle operations.

Conversation Storage

Conversations are stored in a thread-safe, copy-on-write structure:

pub struct LLMSession {
    conversation: RwLock<Arc<Vec<Message>>>,
    // ...
}

Design Rationale

Component	Purpose
`RwLock`	Allows concurrent reads, exclusive writes
`Arc`	Enables cheap cloning for consumers
`Vec<Message>`	Ordered message history

This design provides:

Concurrent reads: multiple consumers can hold the read lock at once, and a cloned snapshot is read without holding any lock
Copy-on-write semantics using Arc::make_mut()
O(1) cloning for snapshot operations
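
The copy-on-write behavior can be sketched in isolation. This is a minimal illustration, not the session's actual code: it assumes `String` in place of `Message` and the blocking `std::sync::RwLock` in place of the async lock, and `History`, `snapshot`, `push`, and `snapshot_demo` are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the session's conversation field.
/// `String` replaces `Message`; std's blocking `RwLock` replaces the async lock.
pub struct History {
    conversation: RwLock<Arc<Vec<String>>>,
}

impl History {
    pub fn new() -> Self {
        Self { conversation: RwLock::new(Arc::new(Vec::new())) }
    }

    /// O(1): clones the `Arc` pointer, not the vector behind it.
    pub fn snapshot(&self) -> Arc<Vec<String>> {
        let guard = self.conversation.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Appends under the write lock. `Arc::make_mut` copies the vector
    /// only when a snapshot still shares it.
    pub fn push(&self, message: &str) {
        let mut guard = self.conversation.write().unwrap();
        Arc::make_mut(&mut *guard).push(message.to_string());
    }
}

/// Returns (snapshot length, live length) after a write that follows a snapshot.
pub fn snapshot_demo() -> (usize, usize) {
    let history = History::new();
    history.push("hello");
    let snapshot = history.snapshot(); // shares the vector with the session
    history.push("world");             // copies the vector; the snapshot is untouched
    (snapshot.len(), history.snapshot().len())
}
```

Calling `snapshot_demo()` yields `(1, 2)`: the snapshot taken before the second write still holds one message, while the live history holds two.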

Message Types

The Message enum represents conversation entries:

pub enum Message {
    User(UserMessage),
    Assistant(AssistantMessage),
}

UserMessage

User messages contain the input sent to the LLM:

pub struct UserMessage {
    pub id: String,
    pub session_id: i64,
    pub turn_id: TurnId,
    pub created_at: DateTime<Utc>,
    pub content: Vec<ContentBlock>,
}

AssistantMessage

Assistant messages contain the LLM response:

pub struct AssistantMessage {
    pub id: String,
    pub session_id: i64,
    pub turn_id: TurnId,
    pub created_at: DateTime<Utc>,
    pub content: Vec<ContentBlock>,

    // Response metadata
    pub model_id: String,
    pub provider_id: String,
    pub input_tokens: i64,
    pub output_tokens: i64,
    pub cache_read_tokens: i64,
    pub cache_write_tokens: i64,

    // Completion state
    pub completed_at: Option<DateTime<Utc>>,
    pub finish_reason: Option<String>,
    pub error: Option<String>,
}

Content Blocks

Messages contain one or more content blocks:

pub enum ContentBlock {
    Text(TextBlock),
    ToolUse(ToolUseBlock),
    ToolResult(ToolResultBlock),
}

TextBlock

Plain text content:

pub struct TextBlock {
    pub text: String,
}

ToolUseBlock

Tool invocation from the LLM:

pub struct ToolUseBlock {
    pub id: String,
    pub name: String,
    pub input: HashMap<String, Value>,
}

ToolResultBlock

Result returned to the LLM:

pub struct ToolResultBlock {
    pub tool_use_id: String,
    pub content: String,
    pub is_error: bool,
    pub compact_summary: Option<String>,
}

The compact_summary field stores a pre-computed summary for use during compaction.

Turn Tracking

Turn IDs group related messages within a conversation turn:

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TurnId {
    pub owner: String,  // "u" for user, "a" for assistant
    pub number: i64,    // Turn number
}

Turn ID Format

Turn IDs display as strings like “u1”, “a1”, “u2”, “a2”:

impl Display for TurnId {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "{}{}", self.owner, self.number)
    }
}

Turn Counter

The TurnCounter generates sequential turn IDs:

pub struct TurnCounter {
    current: AtomicI64,
}

impl TurnCounter {
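    pub fn next_user(&self) -> TurnId {
        // fetch_add returns the value before the increment, so add 1 to get
        // the number of the turn being opened (assuming the counter starts at 0).
        let number = self.current.fetch_add(1, Ordering::SeqCst) + 1;
        TurnId { owner: "u".to_string(), number }
    }

    pub fn next_assistant(&self) -> TurnId {
        // Reuse the current turn's number so the reply pairs with its user message.
        let number = self.current.load(Ordering::SeqCst);
        TurnId { owner: "a".to_string(), number }
    }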
}
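
To see how the counter and the `Display` format fit together, here is a self-contained sketch. It assumes the counter starts at zero and that an assistant reply shares the number of the user turn it answers; `TurnCounter::new` and `two_turns` are hypothetical additions for illustration.

```rust
use std::fmt;
use std::sync::atomic::{AtomicI64, Ordering};

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TurnId {
    pub owner: String,
    pub number: i64,
}

impl fmt::Display for TurnId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}{}", self.owner, self.number)
    }
}

pub struct TurnCounter {
    current: AtomicI64,
}

impl TurnCounter {
    /// Assumption: the counter starts at zero, so the first turn is number 1.
    pub fn new() -> Self {
        Self { current: AtomicI64::new(0) }
    }

    /// Opens a new turn. `fetch_add` returns the previous value, hence `+ 1`.
    pub fn next_user(&self) -> TurnId {
        let number = self.current.fetch_add(1, Ordering::SeqCst) + 1;
        TurnId { owner: "u".to_string(), number }
    }

    /// Reuses the number of the turn opened by the latest `next_user` call.
    pub fn next_assistant(&self) -> TurnId {
        let number = self.current.load(Ordering::SeqCst);
        TurnId { owner: "a".to_string(), number }
    }
}

/// Produces the display strings for two consecutive turns.
pub fn two_turns() -> Vec<String> {
    let counter = TurnCounter::new();
    vec![
        counter.next_user().to_string(),
        counter.next_assistant().to_string(),
        counter.next_user().to_string(),
        counter.next_assistant().to_string(),
    ]
}
```

Under these assumptions `two_turns()` returns `["u1", "a1", "u2", "a2"]`, matching the format described above.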

Message Grouping

Turn IDs enable:

Grouping a user message with its assistant response
Linking tool calls with their results
Selective removal during interrupts
Turn counting for compaction decisions
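
As an illustration of the grouping use case, the sketch below buckets messages by turn number. `Entry`, `group_by_turn`, and `sample_history` are hypothetical names, and `Entry` is a deliberately simplified stand-in for `Message`.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a message: owner prefix ("u" or "a"),
/// turn number, and a text payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub owner: &'static str,
    pub number: i64,
    pub text: &'static str,
}

/// Groups entries by turn number, keeping insertion order inside each turn.
/// A `BTreeMap` keeps the turns themselves sorted by number.
pub fn group_by_turn(entries: &[Entry]) -> BTreeMap<i64, Vec<&Entry>> {
    let mut groups: BTreeMap<i64, Vec<&Entry>> = BTreeMap::new();
    for entry in entries {
        groups.entry(entry.number).or_default().push(entry);
    }
    groups
}

/// A small two-turn history; the first turn includes a tool result.
pub fn sample_history() -> Vec<Entry> {
    vec![
        Entry { owner: "u", number: 1, text: "list files" },
        Entry { owner: "a", number: 1, text: "calling ls" },
        Entry { owner: "u", number: 1, text: "tool result: a.txt" },
        Entry { owner: "u", number: 2, text: "thanks" },
        Entry { owner: "a", number: 2, text: "you're welcome" },
    ]
}
```

Grouping `sample_history()` yields two turns: turn 1 with three entries and turn 2 with two.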

Adding Messages

Messages are appended during request processing:

async fn add_user_message(&self, content: Vec<ContentBlock>, turn_id: TurnId) {
    let message = Message::User(UserMessage {
        id: Uuid::new_v4().to_string(),
        session_id: self.id(),
        turn_id,
        created_at: Utc::now(),
        content,
    });

    let mut guard = self.conversation.write().await;
    Arc::make_mut(&mut *guard).push(message);
}

Tool results are stored similarly:

async fn add_tool_result(
    &self,
    tool_use_id: String,
    content: String,
    is_error: bool,
    compact_summary: Option<String>,
    turn_id: TurnId,
) {
    let block = ContentBlock::ToolResult(ToolResultBlock {
        tool_use_id,
        content,
        is_error,
        compact_summary,
    });

    let message = Message::User(UserMessage {
        id: Uuid::new_v4().to_string(),
        session_id: self.id(),
        turn_id,
        created_at: Utc::now(),
        content: vec![block],
    });

    let mut guard = self.conversation.write().await;
    Arc::make_mut(&mut *guard).push(message);
}

Reading Conversation

Access conversation history for display or processing:

impl LLMSession {
    pub async fn conversation(&self) -> Arc<Vec<Message>> {
        self.conversation.read().await.clone()
    }

    pub async fn conversation_len(&self) -> usize {
        self.conversation.read().await.len()
    }
}

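The Arc clone is O(1) and provides a consistent snapshot: later writes go through Arc::make_mut(), which copies the vector rather than mutating the shared one, so a snapshot never changes after it is taken.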

Counting Turns

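Count the unique turn IDs in the conversation:

use std::collections::HashSet;

async fn count_turns(&self) -> usize {
    let conversation = self.conversation.read().await;

    // A set counts each turn ID once even when a turn's messages are not
    // adjacent in history; Vec::dedup only removes consecutive duplicates.
    let turn_ids: HashSet<&TurnId> = conversation
        .iter()
        .map(|msg| msg.turn_id())
        .collect();

    turn_ids.len()
}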

Clearing Conversation

Reset conversation state:

pub async fn clear_conversation(&self) {
    // Clear message history
    *self.conversation.write().await = Arc::new(Vec::new());

    // Reset token counters
    self.current_input_tokens.store(0, Ordering::SeqCst);
    self.current_output_tokens.store(0, Ordering::SeqCst);

    // Clear compact summaries
    self.compact_summaries.write().await.clear();
}

This operation:

Replaces conversation with empty vector
Resets token tracking to zero
Clears stored compaction summaries

Interrupt Handling

Interrupts remove messages from the current turn:

pub async fn interrupt(&self) {
    // Cancel current request
    if let Some(cancel) = self.current_cancel.lock().await.take() {
        cancel.cancel();
    }

    // Remove messages from current turn
    let turn_id = self.current_turn_id.read().await.clone();
    if let Some(turn_id) = turn_id {
        let mut guard = self.conversation.write().await;
        Arc::make_mut(&mut *guard).retain(|msg| msg.turn_id() != &turn_id);
    }

    // Clear turn tracking
    *self.current_turn_id.write().await = None;
}

This ensures partial responses are not left in history.
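
The removal step can be tried on its own. The following is a minimal sketch with simplified types; `Msg`, `msg`, `remove_turn`, and `interrupt_demo` are hypothetical, and it assumes the partial assistant reply is the message tagged with the interrupted turn ID.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TurnId {
    pub owner: String,
    pub number: i64,
}

/// Simplified stand-in for `Message`.
#[derive(Debug, Clone)]
pub struct Msg {
    pub turn_id: TurnId,
    pub text: String,
}

fn msg(owner: &str, number: i64, text: &str) -> Msg {
    Msg {
        turn_id: TurnId { owner: owner.to_string(), number },
        text: text.to_string(),
    }
}

/// Drops every message tagged with `interrupted`.
/// `Arc::make_mut` copies the vector first if a snapshot still shares it.
pub fn remove_turn(history: &mut Arc<Vec<Msg>>, interrupted: &TurnId) {
    Arc::make_mut(history).retain(|m| &m.turn_id != interrupted);
}

/// Returns the remaining texts and the length of a snapshot taken beforehand.
pub fn interrupt_demo() -> (Vec<String>, usize) {
    let mut history = Arc::new(vec![
        msg("u", 1, "hi"),
        msg("a", 1, "hello"),
        msg("u", 2, "explain"),
        msg("a", 2, "partial..."),
    ]);
    let snapshot = Arc::clone(&history); // e.g. a UI holding the old view
    let interrupted = TurnId { owner: "a".to_string(), number: 2 };
    remove_turn(&mut history, &interrupted);
    let remaining = history.iter().map(|m| m.text.clone()).collect();
    (remaining, snapshot.len())
}
```

Here the partial reply is removed from the live history, while the earlier snapshot still sees all four messages.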

Compact Summaries Storage

Pre-computed summaries for tool results are stored in a map keyed by tool-use ID:

pub struct LLMSession {
    compact_summaries: RwLock<HashMap<String, String>>,
    // ...
}

async fn store_compact_summaries(&self, summaries: &HashMap<String, String>) {
    let mut guard = self.compact_summaries.write().await;
    for (tool_use_id, summary) in summaries {
        guard.insert(tool_use_id.clone(), summary.clone());
    }
}

These summaries are used during compaction to replace verbose tool results.
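
How a stored summary might stand in for a verbose result can be sketched as a simple lookup. This is an assumption about the substitution step, not the actual compaction algorithm; `compacted_text` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Mirrors the fields of the tool result block that matter here.
pub struct ToolResultBlock {
    pub tool_use_id: String,
    pub content: String,
    pub is_error: bool,
}

/// Returns the stored summary for this result if one exists,
/// otherwise falls back to the full content.
pub fn compacted_text<'a>(
    block: &'a ToolResultBlock,
    summaries: &'a HashMap<String, String>,
) -> &'a str {
    summaries
        .get(&block.tool_use_id)
        .map(String::as_str)
        .unwrap_or(block.content.as_str())
}
```

A result whose `tool_use_id` has an entry in the map is rendered as its summary; any other result keeps its original content.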

Message Accessors

Common message access patterns:

impl Message {
    pub fn turn_id(&self) -> &TurnId {
        match self {
            Message::User(m) => &m.turn_id,
            Message::Assistant(m) => &m.turn_id,
        }
    }

    pub fn content(&self) -> &[ContentBlock] {
        match self {
            Message::User(m) => &m.content,
            Message::Assistant(m) => &m.content,
        }
    }

    pub fn is_user(&self) -> bool {
        matches!(self, Message::User(_))
    }

    pub fn is_assistant(&self) -> bool {
        matches!(self, Message::Assistant(_))
    }
}
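
Accessors like `content()` make it easy to scan blocks across messages. As one hedged example, the sketch below finds tool calls that have no result yet. The `ContentBlock` here is a flattened stand-in for the real enum, each inner `Vec` plays the role of one message's `content()`, and `unanswered_tool_calls` is a hypothetical helper.

```rust
use std::collections::HashSet;

/// Flattened stand-in for the document's `ContentBlock` enum.
pub enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String },
    ToolResult { tool_use_id: String, content: String },
}

/// Returns the ids of tool calls that have no matching result yet,
/// in the order the calls appear. Each inner slice stands in for one
/// message's `content()`.
pub fn unanswered_tool_calls(messages: &[Vec<ContentBlock>]) -> Vec<&str> {
    // First pass: collect every tool_use_id that already has a result.
    let answered: HashSet<&str> = messages
        .iter()
        .flatten()
        .filter_map(|block| match block {
            ContentBlock::ToolResult { tool_use_id, .. } => Some(tool_use_id.as_str()),
            _ => None,
        })
        .collect();

    // Second pass: keep tool calls whose id is not in that set.
    messages
        .iter()
        .flatten()
        .filter_map(|block| match block {
            ContentBlock::ToolUse { id, .. } if !answered.contains(id.as_str()) => {
                Some(id.as_str())
            }
            _ => None,
        })
        .collect()
}
```

A check like this could run before building the next request, since a tool call without a result usually means the turn is still in progress.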

Building LLM Requests

The conversation is converted to the provider's message format for API calls:

async fn build_messages(&self) -> Vec<ProviderMessage> {
    let conversation = self.conversation.read().await;

    conversation
        .iter()
        .map(|msg| match msg {
            Message::User(m) => ProviderMessage::user(convert_content(&m.content)),
            Message::Assistant(m) => ProviderMessage::assistant(convert_content(&m.content)),
        })
        .collect()
}

Thread Safety Patterns

Context management uses appropriate synchronization:

Operation	Lock Type	Duration
Read conversation	`RwLock::read()`	Short
Add message	`RwLock::write()`	Short
Clear conversation	`RwLock::write()`	Short
Interrupt	`RwLock::write()`	Short

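Write locks are held only for the duration of the mutation. Arc::make_mut() provides the copy-on-write semantics: it mutates the vector in place when no snapshot shares it, and clones the vector first when one does.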

Next Steps

Compaction Algorithm - Automatic context reduction
Token Tracking - Token counting details
Session Lifecycle - Session state machine
