# Databricks Integration
Databricks provides a unified analytics platform with MLflow for ML experiment tracking. LangGuard integrates with Databricks to import MLflow traces and Unity Catalog metadata.
## Overview
The Databricks integration enables LangGuard to:
- Import MLflow traces from your Databricks workspace
- Sync Unity Catalog entities (catalogs, schemas, tables)
- Track ML experiments and runs
- Apply governance policies to ML operations
## Prerequisites
- A Databricks workspace
- Personal Access Token (PAT) with appropriate permissions
- Python 3.9+ with MLflow 3.0+ installed (for trace sync)
## Architecture
LangGuard uses a Python bridge for Databricks MLflow integration:
```
LangGuard Server
      │
      ▼
Python Bridge (subprocess)
      │
      ▼
MLflow Python SDK
      │
      ▼
Databricks MLflow API
```
This enables full MLflow SDK compatibility and proper authentication handling.
## Getting Your API Token

### Create Personal Access Token
- Log in to your Databricks workspace
- Click your profile icon → User Settings
- Navigate to Access Tokens tab
- Click Generate New Token
- Enter a description (e.g., "LangGuard Integration")
- Set expiration (recommended: 90 days)
- Click Generate
- Copy the token immediately (it won't be shown again)
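Before wiring the token into LangGuard, you can smoke-test it directly against the workspace's SCIM `Me` endpoint. This is a hedged sketch, not part of LangGuard; the environment variable names are illustrative:

```python
# Smoke-test a freshly created PAT against the SCIM "Me" endpoint.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://dbc-xxx.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]  # the PAT you just copied

resp = requests.get(
    f"{host}/api/2.0/preview/scim/v2/Me",
    headers={"Authorization": f"Bearer {token}"},
    timeout=15,
)
resp.raise_for_status()  # a 401 here means the token is bad or expired
print("Token belongs to:", resp.json().get("userName"))
```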
### Required Permissions
Your token needs these permissions:
| Permission | Required For |
|---|---|
| CAN_VIEW on workspace | List experiments and runs |
| CAN_READ on MLflow experiments | Read trace data |
| USE CATALOG | Unity Catalog access (optional) |
| SELECT on tables | Query catalog metadata (optional) |
## Setup

### Step 1: Prepare Python Environment
LangGuard requires Python with MLflow installed:
```bash
# Create virtual environment
python3 -m venv .venv

# Activate (Linux/Mac)
source .venv/bin/activate

# Install dependencies
pip install -r server/python/requirements.txt
```
Verify installation:
```bash
python3 -c "import mlflow; print(mlflow.__version__)"
# Should print: 3.0.0 or higher
```
### Step 2: Add Integration
- Navigate to Settings > Integrations in LangGuard
- Click Add Integration
- Select Databricks
### Step 3: Configure Credentials
Enter your credentials:
| Field | Description | Required |
|---|---|---|
| Name | Friendly name (e.g., "Production Databricks") | Yes |
| Host URL | Workspace URL (e.g., https://dbc-xxx.cloud.databricks.com) | Yes |
| API Token | Personal Access Token | Yes |
| Warehouse ID | SQL Warehouse ID | No* |
*Warehouse ID is only required for Unity Catalog SQL operations, not for trace sync.
### Step 4: Test Connection

- Click Test Connection
- LangGuard validates:
  - Python environment
  - MLflow SDK availability
  - Databricks authentication
- Verify success message
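If Test Connection fails, the same three checks can be reproduced by hand to see which layer is broken. A sketch; it assumes `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are exported, which is the MLflow SDK's own convention (LangGuard itself is configured with `DATABRICKS_API_KEY`, per Configuration Options below):

```python
# Reproduce the Test Connection checks manually.
import mlflow

print("MLflow version:", mlflow.__version__)  # SDK availability

mlflow.set_tracking_uri("databricks")  # reads DATABRICKS_HOST / DATABRICKS_TOKEN
experiments = mlflow.search_experiments(max_results=1)  # fails fast on bad credentials
print("Auth OK; sample experiment:", experiments[0].name if experiments else "(none)")
```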
### Step 5: Configure Sync
Configure what to sync:
| Setting | Description |
|---|---|
| Sync Traces | Import MLflow traces |
| Sync Catalog | Import Unity Catalog entities |
| Experiments | Specific experiments or all |
### Step 6: Save
Click Save to complete setup.
## Configuration Options

### Environment Variables
```bash
# Required
DATABRICKS_HOST=https://dbc-xxx.cloud.databricks.com
DATABRICKS_API_KEY=dapi-your-token-here

# Optional
DATABRICKS_WAREHOUSE_ID=your-warehouse-id

# Python bridge settings
PYTHON_PATH=.venv/bin/python
DATABRICKS_PYTHON_TIMEOUT=180000
DATABRICKS_VALIDATION_TIMEOUT=15000
```
### Multiple Workspaces
Connect multiple Databricks workspaces:
- Add separate integrations for each workspace
- Use descriptive names (e.g., "Databricks - Production", "Databricks - Dev")
- Configure different credentials for each
## What Gets Synced

### MLflow Traces
MLflow traces are converted to LangGuard format:
| MLflow Field | LangGuard Field |
|---|---|
| request_id | externalId |
| timestamp_ms | timestamp |
| execution_time_ms | duration |
| status | status |
| request | input |
| response | output |
| request_metadata | metadata |
| tags | tags |
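As a rough sketch, the mapping above amounts to the following. The input field names come from MLflow's `TraceInfo`; the output dict mirrors the table, and LangGuard's actual transform may differ:

```python
# Illustrative version of the MLflow -> LangGuard field mapping above.
def to_langguard_trace(info, request, response):
    return {
        "externalId": info.request_id,
        "timestamp": info.timestamp_ms,
        "duration": info.execution_time_ms,
        "status": str(info.status),
        "input": request,
        "output": response,
        "metadata": dict(info.request_metadata or {}),
        "tags": dict(info.tags or {}),
    }
```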
### Span Information
Each trace includes detailed span data:
- LLM calls with model info
- Tool invocations
- Retrieval operations
- Custom spans
### Unity Catalog Entities
When catalog sync is enabled:
| Entity Type | Synced Data |
|---|---|
| Catalogs | Name, owner, comment |
| Schemas | Name, catalog, properties |
| Tables | Name, columns, stats |
| Columns | Name, type, nullable, comment |
## Agent Detection
LangGuard detects agents from MLflow trace metadata:
```python
# Include agent info in your MLflow traces
import mlflow

with mlflow.start_span(name="agent_operation") as span:
    span.set_attributes({
        "agent.name": "CustomerService",
        "agent.version": "2.0.0",
        "agent.type": "RAG",
    })
```
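On the consuming side, detection can be as simple as scanning span attributes on a fetched trace. A hedged sketch, assuming an MLflow `Trace` object as returned by `MlflowClient.get_trace` or `search_traces`; LangGuard's actual detection logic may be more involved:

```python
# Find the first span that carries agent.* attributes on an MLflow trace.
def find_agent(trace):
    for span in trace.data.spans:
        attrs = span.attributes or {}
        if "agent.name" in attrs:
            return {
                "name": attrs["agent.name"],
                "version": attrs.get("agent.version"),
                "type": attrs.get("agent.type"),
            }
    return None  # no agent metadata on any span
```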
## Sync Behavior

### Trace Sync
- Lists recent experiments (or specified experiments)
- Queries for traces since last sync
- Fetches full trace details including spans
- Transforms to LangGuard format
- Applies policy evaluation
- Stores in database
### Catalog Sync
- Lists accessible catalogs
- Traverses catalog → schema → table hierarchy
- Fetches column metadata
- Upserts to Data Catalog
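The traversal can be reproduced outside LangGuard with the `databricks-sql-connector` package. This is a sketch under the assumption that `DATABRICKS_HOST`, `DATABRICKS_WAREHOUSE_ID`, and `DATABRICKS_TOKEN` are exported; LangGuard's bridge may implement it differently:

```python
# Walk catalogs -> schemas against a SQL warehouse; tables and columns
# follow the same SHOW/DESCRIBE pattern.
import os

from databricks import sql  # pip install databricks-sql-connector

host = os.environ["DATABRICKS_HOST"].replace("https://", "")
warehouse_id = os.environ["DATABRICKS_WAREHOUSE_ID"]

with sql.connect(
    server_hostname=host,
    http_path=f"/sql/1.0/warehouses/{warehouse_id}",
    access_token=os.environ["DATABRICKS_TOKEN"],
) as conn, conn.cursor() as cur:
    cur.execute("SHOW CATALOGS")
    catalogs = [row[0] for row in cur.fetchall()]
    for catalog in catalogs:
        cur.execute(f"SHOW SCHEMAS IN `{catalog}`")
        for row in cur.fetchall():
            print(f"{catalog}.{row[0]}")
```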
## Performance
For large workspaces:
- Traces are fetched in batches (default: 100)
- Parallel processing where possible
- Incremental sync minimizes data transfer
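Batching maps onto MLflow's paged trace-search API. A hedged sketch of an incremental, batched fetch; the experiment ID and watermark handling are placeholders, and the exact `search_traces` parameters vary a little across MLflow versions:

```python
# Fetch traces in pages of 100 and keep only those newer than the last sync.
from mlflow.client import MlflowClient

client = MlflowClient(tracking_uri="databricks")
last_sync_ms = 0   # watermark persisted from the previous sync run
page_token = None

while True:
    page = client.search_traces(
        experiment_ids=["<experiment-id>"],  # placeholder
        max_results=100,                     # the default batch size above
        page_token=page_token,
    )
    # field name per the mapping table above; newer MLflow versions may rename it
    fresh = [t for t in page if t.info.timestamp_ms > last_sync_ms]
    # ... transform and store `fresh` here ...
    page_token = page.token
    if not page_token:
        break
```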
## Troubleshooting

### Python Environment Issues

Error: `MLflow SDK not available`
Solutions:
- Ensure Python 3.9+ is installed
- Install MLflow: `pip install "mlflow>=3.0.0"`
- Set `PYTHON_PATH` to the correct Python executable
- Verify: `python3 -c "import mlflow; print(mlflow.__version__)"`
### Authentication Failed

Error: `401 Unauthorized`
Solutions:
- Verify the token hasn't expired
- Check the token has the required permissions
- Ensure the host URL is correct (include `https://`)
- Generate a new token if needed
### Timeout Errors

Error: `DATABRICKS_PYTHON_TIMEOUT exceeded`
Solutions:
- Increase the timeout: `DATABRICKS_PYTHON_TIMEOUT=300000`
- Reduce the batch size
- Check network latency to Databricks
- Verify the SQL warehouse is running (for catalog sync)
### No Traces Found
Possible causes:
- MLflow tracing not enabled in your code
- Experiments don't have traces (only runs)
- Time range doesn't include trace timestamps
- Experiments are archived
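For the first cause, note that runs alone don't produce traces; tracing has to be turned on in the application code. A minimal way to emit one trace, using the `@mlflow.trace` decorator available in recent MLflow versions (the experiment path is a placeholder):

```python
# Emit a single trace so the sync has something to find.
import mlflow

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/you@example.com/langguard-demo")  # placeholder path

@mlflow.trace
def answer(question: str) -> str:
    return f"echo: {question}"

answer("hello")  # creates one trace in the experiment
```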
### SQL Warehouse Issues

Error: `Warehouse ID required for catalog operations`
Solutions:
- Create a SQL warehouse in Databricks
- Add `DATABRICKS_WAREHOUSE_ID` to the configuration
- Ensure the warehouse is running
- Verify the token has warehouse access
## Python Bridge Details

### How It Works
LangGuard spawns Python subprocesses to interact with MLflow:
```
Node.js Server
      │
      │ spawn python3
      ▼
Python Process
      │
      │ import mlflow
      │ mlflow.set_tracking_uri(...)
      │ mlflow.search_traces(...)
      ▼
JSON Response → Node.js
```
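A minimal bridge-style script looks roughly like this. It is illustrative only, not LangGuard's actual bridge: it reads a JSON command on stdin and writes a JSON result to stdout for the Node.js parent to parse:

```python
# Toy bridge: JSON command in on stdin, JSON result out on stdout.
import json
import sys

import mlflow

def main() -> None:
    cmd = json.loads(sys.stdin.read())  # e.g. {"op": "list_experiments"}
    mlflow.set_tracking_uri("databricks")
    if cmd.get("op") == "list_experiments":
        exps = mlflow.search_experiments(max_results=10)
        result = [{"id": e.experiment_id, "name": e.name} for e in exps]
    else:
        result = {"error": f"unknown op: {cmd.get('op')}"}
    json.dump(result, sys.stdout)

if __name__ == "__main__":
    main()
```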
### Configuration
```bash
# Path to Python with MLflow
PYTHON_PATH=.venv/bin/python

# Timeout for API operations (default: 3 minutes)
DATABRICKS_PYTHON_TIMEOUT=180000

# Timeout for startup validation (default: 15 seconds)
DATABRICKS_VALIDATION_TIMEOUT=15000
```
### Debugging
Enable verbose Python logging:
```bash
LOG_LEVEL=debug npm run dev
```
Check Python bridge logs in server output.
## Next Steps
- Data Catalog - Browse synced entities
- Policies - Apply governance to ML traces
- Troubleshooting - More debugging help