Providers

Providers are the bridge between Goose and AI models. They abstract different LLM APIs behind a common interface, enabling Goose to work with 25+ AI services, including Anthropic, OpenAI, and locally hosted models.

What is a Provider?

A provider in Goose is a component that:
  • Connects to an AI model service (cloud or local)
  • Translates Goose’s conversation format to the model’s API format
  • Handles authentication and API keys
  • Streams responses back to the agent
  • Manages tool calling protocols
  • Tracks token usage and costs
// Core provider trait (simplified)
#[async_trait]
pub trait Provider: Send + Sync {
    /// Send a completion request to the model
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>>;
    
    /// List available models
    fn list_models(&self) -> Vec<ModelInfo>;
    
    /// Generate a session name from conversation
    async fn generate_session_name(
        &self,
        messages: &[Message],
    ) -> Result<String>;
}

Built-in Providers

Goose includes native support for many popular AI providers:

Cloud Providers

Anthropic

Claude models (Sonnet, Opus, Haiku)
  • Native tool calling
  • Prompt caching
  • Extended context windows

OpenAI

GPT models (GPT-4o, GPT-4, GPT-3.5)
  • Function calling
  • Vision support
  • Structured outputs

Google

Gemini models via Vertex AI
  • Multi-modal support
  • Large context windows
  • OAuth authentication

AWS Bedrock

Multiple model families
  • Anthropic Claude
  • Meta Llama
  • AWS credentials

Local & Open Source

Ollama

Local model execution
  • Privacy-first
  • No API keys required
  • Qwen, Llama, Mistral, etc.

LiteLLM

Unified gateway to 100+ models
  • Consistent API across providers
  • Load balancing
  • Fallback handling

OpenAI Compatible

Custom OpenAI-compatible servers
  • vLLM, LocalAI, etc.
  • Self-hosted models
  • Custom endpoints

Local Inference

Direct local model execution
  • llama.cpp integration
  • GGUF model support
  • CPU/GPU acceleration

Enterprise Providers

  • Azure OpenAI: Enterprise OpenAI deployment
  • Databricks: Databricks model serving
  • Snowflake Cortex: Snowflake’s AI models
  • GitHub Copilot: GitHub’s code models
  • OpenRouter: Multi-provider routing
  • Venice.ai: Privacy-focused inference

Provider Architecture

Configuration

Via Environment Variables

The simplest way to configure a provider:
# Anthropic
export GOOSE_PROVIDER=anthropic
export GOOSE_MODEL=claude-sonnet-4-20250514
export ANTHROPIC_API_KEY=sk-ant-...

# OpenAI
export GOOSE_PROVIDER=openai
export GOOSE_MODEL=gpt-4o
export OPENAI_API_KEY=sk-...

# Ollama (local)
export GOOSE_PROVIDER=ollama
export GOOSE_MODEL=qwen3-coder:latest
export OLLAMA_HOST=http://localhost:11434

Via Configuration File

# ~/.config/goose/config.yaml
GOOSE_PROVIDER: anthropic
GOOSE_MODEL: claude-sonnet-4-20250514
API keys are stored separately, in the system keyring or a secrets file:
# ~/.config/goose/secrets.yaml (if GOOSE_DISABLE_KEYRING=1)
ANTHROPIC_API_KEY: sk-ant-...

Via Recipe

# recipe.yaml
title: GPT-4o Code Review
settings:
  goose_provider: openai
  goose_model: gpt-4o
  temperature: 0.2

Provider Implementation

Example: Anthropic Provider

// From crates/goose/src/providers/anthropic.rs
pub struct AnthropicProvider {
    api_key: String,
    base_url: String,
    http_client: reqwest::Client,
}

#[async_trait]
impl Provider for AnthropicProvider {
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>> {
        // Convert to Anthropic API format
        let request = self.build_request(
            system,
            messages,
            tools,
        )?;
        
        // Make streaming API call
        let response = self.http_client
            .post(&format!("{}/v1/messages", self.base_url))
            .header("x-api-key", &self.api_key)
            .header("anthropic-version", "2023-06-01")
            .json(&request)
            .send()
            .await?;
        
        // Stream and parse chunks
        let stream = response
            .bytes_stream()
            .map(|chunk| self.parse_chunk(chunk))
            .boxed();
        
        Ok(stream)
    }
    
    fn list_models(&self) -> Vec<ModelInfo> {
        vec![
            ModelInfo::with_cost(
                "claude-sonnet-4-20250514",
                200_000,  // context window
                0.000003, // input cost per token
                0.000015, // output cost per token
            ),
            // ... other models
        ]
    }
    // generate_session_name omitted for brevity
}

Tool Calling Translation

Different providers have different tool calling formats. Goose translates between them:
// Anthropic format
{
  "tools": [{
    "name": "read_file",
    "description": "Read a file",
    "input_schema": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      }
    }
  }]
}

// OpenAI format
{
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Read a file",
      "parameters": {
        "type": "object",
        "properties": {
          "path": {"type": "string"}
        }
      }
    }
  }]
}
Goose providers handle these translations automatically.
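
As a concrete illustration, the translation amounts to reshaping the same three fields; a minimal sketch using a simplified Tool struct (illustrative, not Goose’s actual types):
use serde_json::{json, Value};

// Simplified stand-in for a tool definition
struct Tool {
    name: String,
    description: String,
    input_schema: Value,
}

// Anthropic: name, description, and input_schema at the top level
fn to_anthropic(tool: &Tool) -> Value {
    json!({
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.input_schema,
    })
}

// OpenAI: same fields nested under "function", schema renamed "parameters"
fn to_openai(tool: &Tool) -> Value {
    json!({
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.input_schema,
        }
    })
}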

Model Capabilities

Providers expose model capabilities through ModelInfo:
pub struct ModelInfo {
    pub name: String,
    pub context_limit: usize,              // Max tokens
    pub input_token_cost: Option<f64>,     // Cost per input token
    pub output_token_cost: Option<f64>,    // Cost per output token
    pub supports_cache_control: Option<bool>,  // Prompt caching
}
Goose uses this information to:
  • Manage context windows
  • Estimate costs
  • Enable/disable features (like prompt caching)
  • Choose appropriate models for subagents

Custom Providers

You can add custom providers without modifying Goose’s code:

1. Declarative Provider (JSON)

For APIs that already match a supported engine (OpenAI-compatible in this example):
// ~/.config/goose/custom_providers/my-provider.json
{
  "name": "my_provider",
  "engine": "openai",
  "display_name": "My Custom LLM",
  "description": "Internal LLM endpoint",
  "api_key_env": "MY_PROVIDER_API_KEY",
  "base_url": "https://llm.company.internal/v1",
  "models": [
    {
      "name": "company-llm-v1",
      "context_limit": 32768,
      "input_token_cost": 0.000001,
      "output_token_cost": 0.000002
    }
  ],
  "supports_streaming": true,
  "requires_auth": true
}
Supported engines:
  • openai: OpenAI-compatible API
  • anthropic: Anthropic-compatible API
  • ollama: Ollama-compatible API
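
After creating the file, the declarative provider is selected like any built-in one, using the name and api_key_env values from the JSON above:
# Select the custom declarative provider
export GOOSE_PROVIDER=my_provider
export GOOSE_MODEL=company-llm-v1
export MY_PROVIDER_API_KEY=...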

2. Code-Based Provider (Rust)

For custom protocols:
// 1. Create a new file: crates/goose/src/providers/my_provider.rs

// (Message and Tool assumed re-exported from the base module)
use super::base::{Provider, ModelInfo, ProviderMessage, Message, Tool};
use async_trait::async_trait;
use anyhow::Result;
use futures::stream::BoxStream;

pub struct MyProvider {
    api_key: String,
    endpoint: String,
}

impl MyProvider {
    pub fn new(api_key: String, endpoint: String) -> Self {
        Self { api_key, endpoint }
    }
}

#[async_trait]
impl Provider for MyProvider {
    async fn complete(
        &self,
        system: String,
        messages: Vec<Message>,
        tools: Vec<Tool>,
    ) -> Result<BoxStream<ProviderMessage>> {
        // Your implementation here
        todo!()
    }
    
    fn list_models(&self) -> Vec<ModelInfo> {
        vec![ModelInfo::new("my-model-v1", 8192)]
    }
    
    // The trait shown earlier also requires generate_session_name;
    // a fixed name is enough for a stub
    async fn generate_session_name(
        &self,
        _messages: &[Message],
    ) -> Result<String> {
        Ok("my-provider-session".to_string())
    }
}

// 2. Register in crates/goose/src/providers/init.rs
pub fn create_provider(config: &Config) -> Result<Box<dyn Provider>> {
    match config.provider.as_str() {
        "anthropic" => Ok(Box::new(AnthropicProvider::new(config)?)),
        "openai" => Ok(Box::new(OpenAIProvider::new(config)?)),
        "my_provider" => Ok(Box::new(MyProvider::new(
            config.get_secret("MY_PROVIDER_API_KEY")?,
            config.get("MY_PROVIDER_ENDPOINT")?,
        )?)),
        // ...
    }
}

Provider Selection

Goose determines which provider to use via configuration precedence:
  1. Subagent settings (highest priority)
    subagent(settings: {provider: "openai", model: "gpt-4o-mini"})
    
  2. Recipe settings
    settings:
      goose_provider: anthropic
      goose_model: claude-sonnet-4-20250514
    
  3. Environment variables
    GOOSE_PROVIDER=ollama
    GOOSE_MODEL=qwen3-coder:latest
    
  4. Config file
    GOOSE_PROVIDER: anthropic
    
  5. Default (Anthropic Claude)
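
Conceptually, this is a first-match-wins fallback chain; an illustrative sketch (not Goose’s actual code):
// Returns the first configured provider name, falling back to the default
fn resolve_provider<'a>(
    subagent: Option<&'a str>,
    recipe: Option<&'a str>,
    env_var: Option<&'a str>,
    config_file: Option<&'a str>,
) -> &'a str {
    subagent
        .or(recipe)
        .or(env_var)
        .or(config_file)
        .unwrap_or("anthropic") // built-in default
}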

Streaming

All providers support streaming responses:
pub enum ProviderMessage {
    Text(String),              // Text chunk
    ToolUse(ToolRequest),      // Tool call request
    Thinking(String),          // Model reasoning (if supported)
    Usage(TokenUsage),         // Token counts
    Done,                      // Stream complete
}

// Agent consumes stream:
let mut stream = provider.complete(system, messages, tools).await?;
while let Some(msg) = stream.next().await {
    match msg {
        ProviderMessage::Text(text) => {
            // Stream to user immediately
            send_to_user(text).await?;
        }
        ProviderMessage::ToolUse(tool) => {
            // Execute tool
            execute_tool(tool).await?;
        }
        ProviderMessage::Done => break,
        _ => {} // Thinking and Usage are handled elsewhere
    }
}

Token Usage Tracking

Providers report token usage for cost estimation:
pub struct TokenUsage {
    pub input_tokens: usize,
    pub output_tokens: usize,
    pub cache_read_tokens: Option<usize>,   // For prompt caching
    pub cache_write_tokens: Option<usize>,
}

// Stored in session
session.accumulated_input_tokens += usage.input_tokens;
session.accumulated_output_tokens += usage.output_tokens;

// Calculate cost (per-token costs are Option<f64> on ModelInfo)
let cost = usage.input_tokens as f64 * model_info.input_token_cost.unwrap_or(0.0)
         + usage.output_tokens as f64 * model_info.output_token_cost.unwrap_or(0.0);

Error Handling

Providers return standardized errors:
pub enum ProviderError {
    AuthenticationError(String),   // Invalid API key
    RateLimitError(String),        // Rate limit hit
    InvalidRequestError(String),   // Bad request
    ModelNotFoundError(String),    // Unknown model
    NetworkError(String),          // Connection issues
    StreamError(String),           // Streaming failure
    // ...
}
The agent’s retry manager handles transient errors automatically:
loop {
    match provider.complete(...).await {
        Ok(stream) => return Ok(stream),
        Err(ProviderError::RateLimitError(_)) => {
            // Wait and retry
            sleep(backoff.next_delay()).await;
        }
        Err(ProviderError::NetworkError(_)) => {
            // Retry with backoff
            sleep(backoff.next_delay()).await;
        }
        Err(e) => return Err(e), // Don't retry auth errors, etc.
    }
}
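
The backoff above can be a simple exponential helper; a minimal sketch with illustrative names, mirroring the retry settings (initial delay, multiplier, max delay) shown under Best Practices:
use std::time::Duration;

// Exponential backoff with a cap: the delay grows by `factor` per retry
struct Backoff {
    delay: Duration,
    max: Duration,
    factor: f64,
}

impl Backoff {
    fn next_delay(&mut self) -> Duration {
        let current = self.delay;
        self.delay = std::cmp::min(self.delay.mul_f64(self.factor), self.max);
        current
    }
}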

Multi-Provider Workflows

You can use different providers for different tasks:
title: Multi-Provider Analysis
instructions: |
  Use different models for different tasks:
  - GPT-4o-mini for simple file operations
  - Claude Sonnet for complex analysis
  - Local Ollama for privacy-sensitive data

settings:
  goose_provider: anthropic
  goose_model: claude-sonnet-4-20250514

prompt: |
  # Use cheap model for file listing
  subagent(
    instructions: "List all Python files",
    settings: {provider: "openai", model: "gpt-4o-mini"}
  )
  
  # Use powerful model for analysis
  subagent(
    instructions: "Analyze the architecture for security issues",
    settings: {provider: "anthropic", model: "claude-sonnet-4-20250514"}
  )
  
  # Use local model for sensitive data
  subagent(
    instructions: "Process customer data locally",
    settings: {provider: "ollama", model: "qwen3-coder:latest"}
  )

Provider Comparison

| Provider        | Tool Calling     | Streaming | Vision            | Local | Cost   |
|-----------------|------------------|-----------|-------------------|-------|--------|
| Anthropic       | Native           | Yes       | Yes (Claude 3.5+) | No    | $$$    |
| OpenAI          | Function calling | Yes       | Yes (GPT-4V)      | No    | $$$    |
| Ollama          | Via toolshim     | Yes       | Some models       | Yes   | Free   |
| Google Gemini   | Native           | Yes       | Yes               | No    | $$     |
| AWS Bedrock     | Model-dependent  | Yes       | Model-dependent   | No    | $$     |
| LiteLLM         | Pass-through     | Yes       | Model-dependent   | No    | Varies |
| Local Inference | Via toolshim     | Yes       | No                | Yes   | Free   |

Tool Calling Approaches

Native: The provider’s API has built-in tool calling support
  • Anthropic, OpenAI, Google Gemini
  • Best accuracy and performance

Toolshim: Goose adds tool calling via system prompts (sketched below)
  • Ollama, local models
  • Works, but less reliable
  • Good for experimentation
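
To make the toolshim concrete, here is a rough sketch of the idea (prompt wording and parsing are illustrative, not Goose’s actual shim): the tools are described in the system prompt, and the model is asked to answer with JSON that can be parsed back into a tool call.
// Rough toolshim sketch; assumes a Tool type with name and description
fn build_shim_prompt(tools: &[Tool]) -> String {
    let tool_list = tools
        .iter()
        .map(|t| format!("- {}: {}", t.name, t.description))
        .collect::<Vec<_>>()
        .join("\n");
    format!(
        "You can use these tools:\n{tool_list}\n\
         To call a tool, reply ONLY with JSON of the form \
         {{\"tool\": \"<name>\", \"args\": {{ ... }}}}"
    )
}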

Best Practices

Match the model to the task:
  • Simple tasks: use cheaper/faster models (GPT-4o-mini, Claude Haiku)
  • Complex reasoning: use powerful models (Claude Sonnet, GPT-4o)
  • Code generation: use code-specialized models (Qwen Coder, Claude)
  • Privacy-sensitive work: use local models (Ollama)
Manage context windows proactively:
// Check model context before sending
let token_count = count_tokens(&messages);
let model_info = provider
    .list_models()
    .into_iter()
    .find(|m| m.name == model_name)
    .ok_or_else(|| anyhow::anyhow!("unknown model: {model_name}"))?;

if token_count as f64 > model_info.context_limit as f64 * 0.75 {
    // Compact messages to stay under the limit
    messages = compact_messages(messages);
}
Tune retry behavior for transient failures:
# Configure retry behavior
retry:
  max_retries: 3
  initial_delay_ms: 1000
  max_delay_ms: 30000
  backoff_multiplier: 2.0
Anthropic and some other providers support caching system prompts:
// Automatically enabled for supported models;
// significantly reduces cost for repeated requests
if model_info.supports_cache_control.unwrap_or(false) {
    // System prompt is cached: only pay full price for new user messages
}

Troubleshooting

Common Issues

“Authentication failed”
# Check API key is set
echo $ANTHROPIC_API_KEY

# Verify in config
goose configure get ANTHROPIC_API_KEY
“Model not found”
# List available models
goose configure list-models

# Check provider documentation for exact model names
“Rate limit exceeded”
  • Wait and retry (automatic)
  • Upgrade API tier
  • Use multiple API keys with load balancing (via LiteLLM)
“Context length exceeded”
# Reduce max_turns or enable aggressive compaction
settings:
  max_turns: 20  # Limit conversation length

Next Steps

Extensions

Learn about the tools providers can use

Recipes

Configure providers in recipes

Configuration

Advanced provider configuration

Custom Distributions

Bundle custom providers