Overview
The Databricks provider connects to models hosted on Databricks AI Gateway, including Claude models and Meta Llama models. It supports both token-based and OAuth authentication.
Source: `crates/goose/src/providers/databricks.rs`
Configuration
Environment Variables
- `DATABRICKS_HOST`: Your Databricks workspace URL (e.g., `https://your-workspace.cloud.databricks.com`)
- `DATABRICKS_TOKEN`: Personal access token (optional if using OAuth)
- `DATABRICKS_MAX_RETRIES`: Maximum number of retry attempts
- `DATABRICKS_INITIAL_RETRY_INTERVAL_MS`: Initial retry interval in milliseconds
- `DATABRICKS_BACKOFF_MULTIPLIER`: Multiplier for exponential backoff
- `DATABRICKS_MAX_RETRY_INTERVAL_MS`: Maximum retry interval in milliseconds
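These variables can be read with simple fallbacks to the defaults. A minimal, self-contained sketch (the `env_or` helper is hypothetical, not the provider's actual config code):

```rust
use std::env;

// Hypothetical helper: read an environment variable and parse it,
// falling back to a default when the variable is unset or unparsable.
fn env_or<T: std::str::FromStr>(key: &str, default: T) -> T {
    env::var(key).ok().and_then(|v| v.parse().ok()).unwrap_or(default)
}

fn main() {
    // Defaults here match the "Retry Strategy" section below.
    let max_retries: usize = env_or("DATABRICKS_MAX_RETRIES", 3);
    let backoff: f64 = env_or("DATABRICKS_BACKOFF_MULTIPLIER", 2.0);
    println!("retries={max_retries} backoff={backoff}");
}
```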
Setup
```shell
# Configure using the CLI
goose configure

# Or set environment variables
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="dapi..."
```
Authentication
The provider supports two authentication methods:
1. Token Authentication
Use a personal access token:
```shell
export DATABRICKS_TOKEN="dapi1234567890abcdef"
```
2. OAuth Authentication
If no token is provided, the provider automatically uses OAuth device code flow:
```shell
# Only set the host
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"

# When you run goose, you'll be prompted to authenticate via browser
goose session start
```
The OAuth flow:
- Displays a device code and URL
- Opens your browser to authenticate
- Caches the OAuth token for future use
- Automatically refreshes expired tokens
Default OAuth configuration:
- Client ID: `databricks-cli`
- Redirect URL: `http://localhost`
- Scopes: `all-apis`, `offline_access`
Supported Models
Claude Models
- `databricks-claude-sonnet-4` (default): Claude Sonnet on Databricks
- `databricks-claude-sonnet-4-5`: Latest Claude Sonnet
- `databricks-claude-haiku-4-5` (fast model): Fast Claude model

Llama Models
- `databricks-meta-llama-3-3-70b-instruct`: Llama 3.3 70B
- `databricks-meta-llama-3-1-405b-instruct`: Llama 3.1 405B
Documentation: https://docs.databricks.com/en/generative-ai/external-models/
Usage
Basic Usage
```rust
use goose::providers::create;
use goose::model::ModelConfig;
use goose::message::Message;

// Create with the default model
let model_config = ModelConfig::new("databricks-claude-sonnet-4")?;
let provider = create("databricks", model_config, vec![]).await?;

// Stream a response
let messages = vec![Message::user().with_text("Hello!")];
let stream = provider.stream(
    &provider.get_model_config(),
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;
```
Custom Configuration
```rust
let model_config = ModelConfig::new("databricks-meta-llama-3-3-70b-instruct")?
    .with_temperature(0.7)
    .with_max_tokens(2048);

let provider = create("databricks", model_config, vec![]).await?;
```
Using Fast Models
```rust
// Automatically tries databricks-claude-haiku-4-5 with 0 retries
let (response, usage) = provider.complete_fast(
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;
```
Advanced Features
Embeddings
The Databricks provider supports text embeddings:
```rust
if provider.supports_embeddings() {
    let texts = vec![
        "Hello world".to_string(),
        "Databricks embeddings".to_string(),
    ];

    let embeddings = provider.create_embeddings(
        "session-123",
        texts,
    ).await?;

    println!("Generated {} embeddings", embeddings.len());
}
```
Embedding endpoint: `serving-endpoints/text-embedding-3-small/invocations`
Retry Configuration
Configure retry behavior:
```shell
export DATABRICKS_MAX_RETRIES="5"
export DATABRICKS_INITIAL_RETRY_INTERVAL_MS="2000"
export DATABRICKS_BACKOFF_MULTIPLIER="2.5"
export DATABRICKS_MAX_RETRY_INTERVAL_MS="120000"
```
Fast models use a different retry strategy:
- Max retries: 0 (fail fast)
- No exponential backoff
Model Endpoints
The provider automatically routes to the correct endpoint:
```text
// Regular models
POST /serving-endpoints/{model_name}/invocations

// Codex models (if the model name contains "codex")
POST /serving-endpoints/responses

// Embeddings
POST /serving-endpoints/text-embedding-3-small/invocations
```
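The routing rule above can be sketched as a pure function over the model name (a simplified illustration, not the provider's actual implementation):

```rust
// Simplified sketch of the endpoint-routing rule described above:
// "codex" models go to the responses endpoint, everything else to
// the per-model invocations endpoint.
fn invocation_path(model: &str) -> String {
    if model.contains("codex") {
        "/serving-endpoints/responses".to_string()
    } else {
        format!("/serving-endpoints/{model}/invocations")
    }
}

fn main() {
    println!("{}", invocation_path("databricks-claude-sonnet-4"));
    println!("{}", invocation_path("my-codex-model"));
}
```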
Implementation Details
```rust
impl ProviderDef for DatabricksProvider {
    fn metadata() -> ProviderMetadata {
        ProviderMetadata::new(
            "databricks",
            "Databricks",
            "Models on Databricks AI Gateway",
            "databricks-claude-sonnet-4",
            DATABRICKS_KNOWN_MODELS.to_vec(),
            "https://docs.databricks.com/en/generative-ai/external-models/",
            vec![
                ConfigKey::new("DATABRICKS_HOST", true, false, None, true),
                ConfigKey::new("DATABRICKS_TOKEN", false, true, None, true),
            ],
        )
    }
}
```
Authentication Flow
```rust
pub enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}
```
The provider dynamically gets the auth token:
```rust
impl AuthProvider for DatabricksAuthProvider {
    async fn get_auth_header(&self) -> Result<(String, String)> {
        let token = match &self.auth {
            DatabricksAuth::Token(token) => token.clone(),
            DatabricksAuth::OAuth { host, client_id, redirect_url, scopes } => {
                oauth::get_oauth_token_async(host, client_id, redirect_url, scopes).await?
            }
        };
        Ok(("Authorization".to_string(), format!("Bearer {}", token)))
    }
}
```
Requests use the OpenAI-compatible format, but without the `model` field:

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": true
}
```
Note: The model is specified in the endpoint URL, not the request body.
Retry Strategy
```rust
pub struct RetryConfig {
    pub max_retries: usize,
    pub initial_interval_ms: u64,
    pub backoff_multiplier: f64,
    pub max_interval_ms: u64,
}

// Default config
RetryConfig {
    max_retries: 3,
    initial_interval_ms: 1000,
    backoff_multiplier: 2.0,
    max_interval_ms: 60000,
}

// Fast model config (fail fast)
RetryConfig {
    max_retries: 0,
    initial_interval_ms: 0,
    backoff_multiplier: 1.0,
    max_interval_ms: 0,
}
```
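Given these fields, the delay before each retry follows a capped geometric progression: the initial interval is multiplied by the backoff factor on every attempt, and never exceeds the maximum. A self-contained sketch of that computation (illustrative, not the crate's actual retry code):

```rust
struct RetryConfig {
    max_retries: usize,
    initial_interval_ms: u64,
    backoff_multiplier: f64,
    max_interval_ms: u64,
}

// Delay before retry number `attempt` (0-based), capped at max_interval_ms.
fn delay_ms(cfg: &RetryConfig, attempt: u32) -> u64 {
    let raw = cfg.initial_interval_ms as f64 * cfg.backoff_multiplier.powi(attempt as i32);
    (raw as u64).min(cfg.max_interval_ms)
}

fn main() {
    // The default config from above: 1000 ms, 2000 ms, 4000 ms, ...
    let cfg = RetryConfig {
        max_retries: 3,
        initial_interval_ms: 1000,
        backoff_multiplier: 2.0,
        max_interval_ms: 60_000,
    };
    for attempt in 0..cfg.max_retries as u32 {
        println!("attempt {attempt}: wait {} ms", delay_ms(&cfg, attempt));
    }
}
```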
Fetching Available Models
```rust
// Get all serving endpoints
let models = provider.fetch_supported_models().await?;

// Queries: GET /api/2.0/serving-endpoints
// Returns endpoint names that can be used as model names
```
Example response:
```json
{
  "endpoints": [
    {
      "name": "databricks-claude-sonnet-4-5",
      "creator": "user@example.com",
      "creation_timestamp": 1234567890,
      "config": { ... }
    }
  ]
}
```
Error Handling
```rust
match provider.stream(...).await {
    Ok(stream) => { /* handle the stream */ },
    Err(ProviderError::Authentication(msg)) => {
        eprintln!("Auth failed: {}", msg);
        eprintln!("Try running: goose configure");
    },
    Err(ProviderError::RateLimited { retry_after }) => {
        eprintln!("Rate limited; retry after {:?}", retry_after);
    },
    Err(e) => eprintln!("Error: {}", e),
}
```
Programmatic Configuration
```rust
use goose::providers::databricks::DatabricksProvider;

let provider = DatabricksProvider::from_params(
    "https://your-workspace.cloud.databricks.com".to_string(),
    "dapi1234567890abcdef".to_string(),
    model_config,
)?;
```
OAuth Token Management
Tokens are cached in the system keyring:
- Service: `databricks_oauth`
- Account: `{host}_access_token` and `{host}_refresh_token`

Tokens are automatically refreshed when they expire.
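The per-host account naming described above can be sketched as (illustrative only; the actual cache code lives in the provider's OAuth module):

```rust
// Sketch of the keyring account names documented above:
// one access-token entry and one refresh-token entry per host,
// both under the "databricks_oauth" service.
fn keyring_accounts(host: &str) -> (String, String) {
    (
        format!("{host}_access_token"),
        format!("{host}_refresh_token"),
    )
}

fn main() {
    let (access, refresh) = keyring_accounts("https://my-workspace.cloud.databricks.com");
    println!("{access}");
    println!("{refresh}");
}
```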
See Also