Chapter 6
Model Configuration
Configure multiple LLM providers and implement intelligent model routing for optimal performance and cost.
Adding LLM Providers
OpenAI
```yaml
models:
  openai:
    api_key: "${OPENAI_API_KEY}"
    models:
      - gpt-4-turbo
      - gpt-3.5-turbo
    base_url: "https://api.openai.com/v1"
```
OpenAI Codex (openai-responses)
For Codex models, use the openai-responses provider instead of the regular openai chat endpoint:
```yaml
models:
  openai-responses:
    api_key: "${OPENAI_API_KEY}"
    models:
      - gpt-5.3-codex
      - gpt-5.2-codex
    base_url: "https://api.openai.com/v1"
```
Anthropic Claude
```yaml
models:
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      - claude-3-opus
      - claude-3-sonnet
```
Local Models
```yaml
models:
  ollama:
    base_url: "http://localhost:11434"
    models:
      - llama2
      - codellama
```
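The `${OPENAI_API_KEY}`-style placeholders in the configs above are typically expanded from environment variables when the config is loaded. A minimal sketch of that expansion step, operating on an already-parsed config dict (the helper name `expand_env` is illustrative, not part of any specific library):

```python
import os
import re

def expand_env(value):
    """Recursively expand ${VAR} placeholders in a parsed config.

    Strings containing ${VAR} are replaced with the value of the VAR
    environment variable (empty string if unset); dicts and lists are
    walked recursively; other values pass through unchanged.
    """
    if isinstance(value, dict):
        return {k: expand_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v) for v in value]
    if isinstance(value, str):
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: os.environ.get(m.group(1), ""), value)
    return value

# Example: a fragment of the OpenAI provider config from above.
os.environ["OPENAI_API_KEY"] = "sk-test"
config = {"models": {"openai": {"api_key": "${OPENAI_API_KEY}",
                                "models": ["gpt-4-turbo", "gpt-3.5-turbo"]}}}
expanded = expand_env(config)
print(expanded["models"]["openai"]["api_key"])  # sk-test
```

Keeping keys out of the config file itself means the same file can be committed to version control and reused across environments.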
Model Routing
Route tasks to appropriate models based on complexity:
```yaml
routing:
  rules:
    - pattern: "code_review|refactor"
      model: "gpt-4-turbo"
    - pattern: "simple_query|translation"
      model: "gpt-3.5-turbo"
  default: "claude-3-sonnet"
```
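The routing config above reads as first-match-wins regex rules with a default. A minimal sketch of how such a router might evaluate a task type (rule and model names mirror the config; the function is illustrative):

```python
import re

# Rules mirror the routing config: (compiled pattern, target model),
# checked in order; first match wins.
RULES = [
    (re.compile(r"code_review|refactor"), "gpt-4-turbo"),
    (re.compile(r"simple_query|translation"), "gpt-3.5-turbo"),
]
DEFAULT_MODEL = "claude-3-sonnet"

def route(task_type: str) -> str:
    """Return the model for a task type, falling back to the default."""
    for pattern, model in RULES:
        if pattern.search(task_type):
            return model
    return DEFAULT_MODEL

print(route("code_review"))  # gpt-4-turbo
print(route("summarize"))    # claude-3-sonnet (no rule matched)
```

Evaluating rules in order keeps the behavior predictable when patterns overlap: put the more specific (or more expensive) rules first.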
Fallback Chains
Ensure reliability with automatic fallbacks:
```yaml
fallback:
  chains:
    - ["gpt-4-turbo", "claude-3-opus", "gpt-3.5-turbo"]
  retry_attempts: 3
  retry_delay: 2000  # milliseconds
```
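One way to implement this chain is to retry each model up to `retry_attempts` times before moving to the next. A hedged sketch, where `call_model` stands in for a hypothetical provider call that raises on failure:

```python
import time

def call_with_fallback(prompt, chain, call_model,
                       retry_attempts=3, retry_delay=2.0):
    """Try each model in the chain in order, retrying on failure.

    `call_model(model, prompt)` is a placeholder for a real provider
    call; it is expected to raise an exception on error. `retry_delay`
    is in seconds here (the config above uses milliseconds).
    """
    last_error = None
    for model in chain:
        for _ in range(retry_attempts):
            try:
                return call_model(model, prompt)
            except Exception as exc:
                last_error = exc
                time.sleep(retry_delay)
        # all attempts for this model failed; fall through to the next
    raise RuntimeError("all models in the fallback chain failed") from last_error

# Demo with a fake provider that times out on the first model.
def flaky(model, prompt):
    if model == "gpt-4-turbo":
        raise TimeoutError("upstream timeout")
    return f"{model}: ok"

print(call_with_fallback("hi", ["gpt-4-turbo", "claude-3-opus"],
                         flaky, retry_delay=0))  # claude-3-opus: ok
```

In production you would usually retry only on transient errors (timeouts, rate limits) and fail over immediately on permanent ones such as invalid requests.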
Cost Control
Set budget limits and track usage:
```yaml
cost_control:
  daily_limit: 10.00     # USD, total across all users
  user_quota: 1.00       # USD per user per day
  alert_threshold: 0.8   # alert when 80% of the daily limit is spent
```
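A minimal sketch of how these limits might be enforced at request time. The `CostTracker` class and its `record` method are illustrative, not part of any specific library; assumptions match the config keys above (global daily limit, per-user quota, alert at a fraction of the limit):

```python
class CostTracker:
    """Track spend against a daily limit and per-user quotas."""

    def __init__(self, daily_limit=10.00, user_quota=1.00,
                 alert_threshold=0.8):
        self.daily_limit = daily_limit        # USD, all users combined
        self.user_quota = user_quota          # USD per user
        self.alert_threshold = alert_threshold
        self.total = 0.0
        self.per_user = {}

    def record(self, user: str, cost: float) -> bool:
        """Record a request's cost; return False if it would exceed a limit."""
        user_total = self.per_user.get(user, 0.0)
        if (self.total + cost > self.daily_limit
                or user_total + cost > self.user_quota):
            return False
        self.total += cost
        self.per_user[user] = user_total + cost
        if self.total >= self.alert_threshold * self.daily_limit:
            print("alert: spending has crossed the alert threshold")
        return True

tracker = CostTracker()
print(tracker.record("alice", 0.50))  # True
print(tracker.record("alice", 0.60))  # False (would exceed $1.00 user quota)
```

A real deployment would also need to reset these counters daily and persist them somewhere shared, since requests may be served by multiple processes.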