
Overview

The Databricks provider connects to models hosted on Databricks AI Gateway, including Claude models and Meta Llama models. It supports both token-based and OAuth authentication. Source: crates/goose/src/providers/databricks.rs

Configuration

Environment Variables

DATABRICKS_HOST (string, required)
  Your Databricks workspace URL (e.g., https://your-workspace.cloud.databricks.com)
DATABRICKS_TOKEN (string, optional)
  Personal access token; optional if using OAuth
DATABRICKS_MAX_RETRIES (number, default: 3)
  Maximum number of retry attempts
DATABRICKS_INITIAL_RETRY_INTERVAL_MS (number, default: 1000)
  Initial retry interval in milliseconds
DATABRICKS_BACKOFF_MULTIPLIER (number, default: 2.0)
  Multiplier for exponential backoff
DATABRICKS_MAX_RETRY_INTERVAL_MS (number, default: 60000)
  Maximum retry interval in milliseconds
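The retry-related variables above follow a common read-or-default pattern; a minimal sketch of how they could be parsed (the env_or helper is ours for illustration, not goose's API):

```rust
use std::env;

/// Read a numeric environment variable, falling back to a default
/// when it is unset or unparsable (illustrative helper).
fn env_or<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    let max_retries: usize = env_or("DATABRICKS_MAX_RETRIES", 3);
    let initial_ms: u64 = env_or("DATABRICKS_INITIAL_RETRY_INTERVAL_MS", 1000);
    let multiplier: f64 = env_or("DATABRICKS_BACKOFF_MULTIPLIER", 2.0);
    let max_ms: u64 = env_or("DATABRICKS_MAX_RETRY_INTERVAL_MS", 60_000);
    println!("{max_retries} {initial_ms} {multiplier} {max_ms}");
}
```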

Setup

# Configure using the CLI
goose configure

# Or set environment variables
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="dapi..."

Authentication

The provider supports two authentication methods:

1. Token Authentication

Use a personal access token:
export DATABRICKS_TOKEN="dapi1234567890abcdef"

2. OAuth Authentication

If no token is provided, the provider automatically uses OAuth device code flow:
# Only set the host
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"

# When you run goose, you'll be prompted to authenticate via browser
goose session start
The OAuth flow:
  1. Displays a device code and URL
  2. Opens your browser to authenticate
  3. Caches the OAuth token for future use
  4. Automatically refreshes expired tokens
Default OAuth configuration:
  • Client ID: databricks-cli
  • Redirect URL: http://localhost
  • Scopes: all-apis, offline_access
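These defaults correspond to the OAuth variant of the provider's auth enum (shown under Implementation Details); a self-contained sketch with the enum redeclared locally so it compiles on its own:

```rust
// Local redeclaration of the provider's auth enum, for illustration only.
pub enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}

/// Build the default OAuth configuration described above
/// (hypothetical helper name).
fn default_oauth(host: &str) -> DatabricksAuth {
    DatabricksAuth::OAuth {
        host: host.to_string(),
        client_id: "databricks-cli".to_string(),
        redirect_url: "http://localhost".to_string(),
        scopes: vec!["all-apis".to_string(), "offline_access".to_string()],
    }
}

fn main() {
    let auth = default_oauth("https://your-workspace.cloud.databricks.com");
    if let DatabricksAuth::OAuth { client_id, scopes, .. } = auth {
        assert_eq!(client_id, "databricks-cli");
        assert_eq!(scopes, vec!["all-apis", "offline_access"]);
    }
}
```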

Supported Models

Claude Models

  • databricks-claude-sonnet-4 (default) - Claude Sonnet on Databricks
  • databricks-claude-sonnet-4-5 - Latest Sonnet
  • databricks-claude-haiku-4-5 (fast model) - Low-latency Claude model used by complete_fast

Meta Llama Models

  • databricks-meta-llama-3-3-70b-instruct - Llama 3.3 70B
  • databricks-meta-llama-3-1-405b-instruct - Llama 3.1 405B
Documentation: https://docs.databricks.com/en/generative-ai/external-models/

Usage

Basic Usage

use goose::providers::create;
use goose::model::ModelConfig;

// Create with default model
let model_config = ModelConfig::new("databricks-claude-sonnet-4")?;
let provider = create("databricks", model_config, vec![]).await?;

// Stream a response
let messages = vec![Message::user().with_text("Hello!")];
let stream = provider.stream(
    &provider.get_model_config(),
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;

Custom Configuration

let model_config = ModelConfig::new("databricks-meta-llama-3-3-70b-instruct")?
    .with_temperature(0.7)
    .with_max_tokens(2048);

let provider = create("databricks", model_config, vec![]).await?;

Using Fast Models

// Automatically tries databricks-claude-haiku-4-5 with 0 retries
let (response, usage) = provider.complete_fast(
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;

Advanced Features

Embeddings

The Databricks provider supports text embeddings:
if provider.supports_embeddings() {
    let texts = vec![
        "Hello world".to_string(),
        "Databricks embeddings".to_string(),
    ];
    
    let embeddings = provider.create_embeddings(
        "session-123",
        texts,
    ).await?;
    
    println!("Generated {} embeddings", embeddings.len());
}
Embedding endpoint: serving-endpoints/text-embedding-3-small/invocations

Retry Configuration

Configure retry behavior:
export DATABRICKS_MAX_RETRIES="5"
export DATABRICKS_INITIAL_RETRY_INTERVAL_MS="2000"
export DATABRICKS_BACKOFF_MULTIPLIER="2.5"
export DATABRICKS_MAX_RETRY_INTERVAL_MS="120000"
Fast models use a different retry strategy:
  • Max retries: 0 (fail fast)
  • No exponential backoff

Model Endpoints

The provider automatically routes to the correct endpoint:
// Regular models
POST /serving-endpoints/{model_name}/invocations

// Codex models (if the model name contains "codex")
POST /serving-endpoints/responses

// Embeddings
POST /serving-endpoints/text-embedding-3-small/invocations
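The routing rule above can be sketched as a pure function (the helper name and the "codex" example model name are ours for illustration):

```rust
/// Pick the serving-endpoint path for a model name, mirroring the
/// routing rules described above (illustrative helper, not goose API).
fn endpoint_path(model: &str) -> String {
    if model.contains("codex") {
        "serving-endpoints/responses".to_string()
    } else {
        format!("serving-endpoints/{model}/invocations")
    }
}

fn main() {
    assert_eq!(
        endpoint_path("databricks-claude-sonnet-4"),
        "serving-endpoints/databricks-claude-sonnet-4/invocations"
    );
    // Any model name containing "codex" is routed to the responses endpoint.
    assert_eq!(endpoint_path("my-codex-model"), "serving-endpoints/responses");
}
```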

Implementation Details

Provider Metadata

impl ProviderDef for DatabricksProvider {
    fn metadata() -> ProviderMetadata {
        ProviderMetadata::new(
            "databricks",
            "Databricks",
            "Models on Databricks AI Gateway",
            "databricks-claude-sonnet-4",
            DATABRICKS_KNOWN_MODELS.to_vec(),
            "https://docs.databricks.com/en/generative-ai/external-models/",
            vec![
                ConfigKey::new("DATABRICKS_HOST", true, false, None, true),
                ConfigKey::new("DATABRICKS_TOKEN", false, true, None, true),
            ],
        )
    }
}

Authentication Flow

pub enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}
The provider resolves the auth token at request time:
impl AuthProvider for DatabricksAuthProvider {
    async fn get_auth_header(&self) -> Result<(String, String)> {
        let token = match &self.auth {
            DatabricksAuth::Token(token) => token.clone(),
            DatabricksAuth::OAuth { host, client_id, redirect_url, scopes } => {
                oauth::get_oauth_token_async(host, client_id, redirect_url, scopes).await?
            }
        };
        Ok(("Authorization".to_string(), format!("Bearer {}", token)))
    }
}

API Format

Requests use the OpenAI-compatible chat format but omit the model field:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": true
}
Note: The model is specified in the endpoint URL, not the request body.

Retry Strategy

pub struct RetryConfig {
    pub max_retries: usize,
    pub initial_interval_ms: u64,
    pub backoff_multiplier: f64,
    pub max_interval_ms: u64,
}

// Default config
RetryConfig {
    max_retries: 3,
    initial_interval_ms: 1000,
    backoff_multiplier: 2.0,
    max_interval_ms: 60000,
}

// Fast model config (fail fast)
RetryConfig {
    max_retries: 0,
    initial_interval_ms: 0,
    backoff_multiplier: 1.0,
    max_interval_ms: 0,
}

Fetching Available Models

// Get all serving endpoints
let models = provider.fetch_supported_models().await?;

// Queries: GET /api/2.0/serving-endpoints
// Returns endpoint names that can be used as model names
Example response:
{
  "endpoints": [
    {
      "name": "databricks-claude-sonnet-4-5",
      "creator": "user@example.com",
      "creation_timestamp": 1234567890,
      "config": { ... }
    }
  ]
}

Error Handling

match provider.stream(...).await {
    Ok(stream) => { /* handle stream */ },
    Err(ProviderError::Authentication(msg)) => {
        eprintln!("Auth failed: {}", msg);
        eprintln!("Try running: goose configure");
    },
    Err(ProviderError::RateLimited { retry_after }) => {
        eprintln!("Rate limited; retry after {:?}", retry_after);
    },
    Err(e) => eprintln!("Error: {}", e),
}

Programmatic Configuration

use goose::providers::databricks::DatabricksProvider;

let provider = DatabricksProvider::from_params(
    "https://your-workspace.cloud.databricks.com".to_string(),
    "dapi1234567890abcdef".to_string(),
    model_config,
)?;

OAuth Token Management

Tokens are cached in the system keyring:
  • Service: databricks_oauth
  • Account: {host}_access_token and {host}_refresh_token
Tokens are automatically refreshed when expired.
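The cache layout above can be illustrated with the account-name convention and a simple expiry check; this is a standalone sketch (struct and helper names are ours), not the oauth module's actual code:

```rust
use std::time::{Duration, SystemTime};

const SERVICE: &str = "databricks_oauth";

/// Keyring account names for a workspace host, per the convention above.
fn account_names(host: &str) -> (String, String) {
    (format!("{host}_access_token"), format!("{host}_refresh_token"))
}

/// Illustrative cached-token record with an expiry check.
struct CachedToken {
    expires_at: SystemTime,
}

impl CachedToken {
    fn is_expired(&self) -> bool {
        SystemTime::now() >= self.expires_at
    }
}

fn main() {
    let (access, refresh) = account_names("https://your-workspace.cloud.databricks.com");
    assert!(access.ends_with("_access_token"));
    assert!(refresh.ends_with("_refresh_token"));

    // A token issued for one hour is not yet expired.
    let token = CachedToken {
        expires_at: SystemTime::now() + Duration::from_secs(3600),
    };
    assert!(!token.is_expired());
    println!("service: {SERVICE}, access account: {access}");
}
```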

See Also