MCP Integration
archmax exposes your semantic models to AI agents through the Model Context Protocol (MCP). Agents discover available models, browse datasets and fields, and run scoped SQL queries.
While semantic models are stored internally as OSI YAML files, the MCP tools never return raw YAML. Instead, models are converted on-the-fly into a compressed markdown digest that uses 3–5× fewer tokens than the equivalent YAML while preserving all semantically relevant context. See Semantic Models — How Agents See Your Models for details on the digest format.
Endpoints
Each project has two MCP endpoints:
| Endpoint | Reads from | Use case |
|---|---|---|
| POST /mcp/<project-slug>/mcp | Published build (build/) | Production: give this to external AI agents |
| POST /mcp/<project-slug>/test/mcp | Source files (src/), assembled on-the-fly | Development: test changes immediately without publishing |
The production endpoint serves the last published version of your semantic models. Changes you make in the editor are not visible to production agents until you click Publish.
The test endpoint assembles the model from source files on every request, so it always reflects your latest saved changes. The built-in Playground and batch test runner use this endpoint. You can also point an external agent at the test endpoint during development to iterate quickly.
Both endpoints accept JSON-RPC requests with tools/list and tools/call methods, and both require the same Bearer token authentication.
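
As a rough illustration, the request shape can be sketched in Python using only the standard library. The host `https://your-server`, the slug `your-project`, the token value, and the helper names are placeholders, not part of archmax. The envelope follows the general JSON-RPC 2.0 convention (`jsonrpc`, `id`, `method`, `params`); a real MCP client library may add an initialization handshake or extra headers that this sketch leaves out.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://your-server"  # placeholder host
PROJECT = "your-project"          # placeholder project slug


def endpoint_url(project_slug: str, test: bool = False) -> str:
    """Return the production or test MCP endpoint URL for a project."""
    suffix = "test/mcp" if test else "mcp"
    return f"{BASE_URL}/mcp/{project_slug}/{suffix}"


def build_request(url: str, token: str, method: str,
                  params: Optional[dict] = None,
                  request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC POST request with Bearer auth."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same payload, two targets: the published build and the latest saved sources.
prod_req = build_request(endpoint_url(PROJECT), "sk-your-token", "tools/list")
test_req = build_request(endpoint_url(PROJECT, test=True), "sk-your-token", "tools/list")
```

Passing either request to `urllib.request.urlopen` would send it. Only the URL differs between production and test, so switching an agent between them is a one-line change.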
Authentication
All MCP requests require a Bearer token:

    Authorization: Bearer <your-mcp-token>

Tokens are created in the admin UI under MCP Access. Each token has:
- Scopes: which semantic models the token can access
- Expiry: optional expiration date
Available Tools
| Tool | Description |
|---|---|
| list_semantic_models | List semantic models the token has access to |
| get_semantic_model | Get an overview of a model with datasets, relationships, and metrics |
| get_datasets | Get fields for one or more datasets with types, examples, enums, and instructions |
| execute_query | Run a read-only SQL query scoped to a semantic model’s VIEWs. Returns a storedQueryId by default. |
| execute_stored_query | Re-execute a previously stored query by ID, optionally with different parameters |
| request_improvement | Submit an improvement request for a semantic model |
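
A typical agent session moves from discovery to querying. The sketch below builds the corresponding `tools/call` payloads in that order. The tool names come from the table above, but every argument key (`model`, `datasets`, `sql`) and the model name `sales` are hypothetical; the actual input schemas are whatever `tools/list` reports for your project.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in a JSON-RPC tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument keys below are assumptions for illustration only.
workflow = [
    tool_call("list_semantic_models", {}),
    tool_call("get_semantic_model", {"model": "sales"}),
    tool_call("get_datasets", {"model": "sales", "datasets": ["orders", "customers"]}),
    tool_call("execute_query", {"model": "sales", "sql": 'SELECT COUNT(*) FROM "orders"'}),
]

for request in workflow:
    print(json.dumps(request))
```

Each step narrows the context: the agent first learns which models exist, then which datasets a model contains, then which fields it can reference in SQL.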
Query Execution
The execute_query tool lets agents run SQL against your data through sandboxed VIEWs. Instead of accessing raw tables, agents write SQL with bare dataset names, and the correct scoped schema is resolved automatically via DuckDB’s search_path:

    SELECT o.total_amount, c.name
    FROM "orders" o
    JOIN "customers" c ON o.customer_id = c.customer_id
    WHERE o.created_at > '2024-01-01'
    LIMIT 100

Agents should not add schema or catalog prefixes. The search_path ensures that "orders" resolves to the correct scoped VIEW for the requested model.
Security
- Queries are validated to only allow SELECT, WITH, EXPLAIN, and DESCRIBE statements
- Raw catalog references (e.g., shopify.public.orders) are rejected; use bare dataset names
- Explicit _scope_ prefixes are rejected; names resolve automatically
- Each query runs with DuckDB security hardening: external access disabled, resource limits, SET statements blocked
- Results are capped at 1,000 rows with a 30-second timeout
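
An agent or client can catch the most common rejections before sending a query. The following is a rough client-side approximation of the first three rules, not archmax's validator: the regular expressions, the `precheck` helper, and the interpretation of a `_scope_` prefix as an identifier starting with `_scope_` are all assumptions made for illustration. The server remains the authority on what is accepted.

```python
import re

ALLOWED_STATEMENTS = ("SELECT", "WITH", "EXPLAIN", "DESCRIBE")

# One identifier part, either double-quoted or bare.
_PART = r'(?:"[^"]+"|[A-Za-z_]\w*)'
# Three-part dotted name such as shopify.public.orders.
CATALOG_REF = re.compile(rf"{_PART}\.{_PART}\.{_PART}")
# Identifier that begins with _scope_ (optionally double-quoted).
SCOPE_PREFIX = re.compile(r'(?<!\w)"?_scope_', re.IGNORECASE)
# Single-quoted SQL string literal, with '' as an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def precheck(sql: str) -> list[str]:
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    # Blank out string literals so their contents cannot trigger false matches.
    stripped = STRING_LITERAL.sub("''", sql).strip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    if first_word not in ALLOWED_STATEMENTS:
        problems.append("statement type not allowed")
    if CATALOG_REF.search(stripped):
        problems.append("raw catalog reference; use bare dataset names")
    if SCOPE_PREFIX.search(stripped):
        problems.append("explicit _scope_ prefix; names resolve automatically")
    return problems


print(precheck('SELECT * FROM shopify.public.orders'))
```

Two-part names are deliberately not flagged, because a regular expression cannot tell a schema-qualified table (`public.orders`) from an alias-qualified column (`o.total_amount`).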
Client Configuration Examples
Claude Desktop
Add to your Claude Desktop MCP config:

    {
      "mcpServers": {
        "archmax": {
          "url": "https://your-server/mcp/your-project/mcp",
          "headers": {
            "Authorization": "Bearer sk-your-token"
          }
        }
      }
    }

Cursor
Add to your .cursor/mcp.json:

    {
      "mcpServers": {
        "archmax": {
          "url": "https://your-server/mcp/your-project/mcp",
          "headers": {
            "Authorization": "Bearer sk-your-token"
          }
        }
      }
    }

Rate Limiting
MCP requests are rate-limited per client IP. The default is 120 requests per 60-second window, configurable via MCP_RATE_LIMIT_MAX.
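
To make the per-IP semantics concrete, here is a minimal fixed-window counter with the default numbers. It is only a model of the behavior described above: whether archmax uses a fixed window, a sliding window, or another scheme is not stated here, and the class and parameter names are invented for this sketch.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most max_requests per window_seconds for each client IP."""

    def __init__(self, max_requests: int = 120, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows: Dict[str, Tuple[float, int]] = {}  # ip -> (window start, count)

    def allow(self, client_ip: str) -> bool:
        """Record one request and report whether it fits in the current window."""
        now = self.clock()
        start, used = self._windows.get(client_ip, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # the previous window expired; start a new one
        if used >= self.max_requests:
            self._windows[client_ip] = (start, used)
            return False
        self._windows[client_ip] = (start, used + 1)
        return True


limiter = FixedWindowLimiter()
print(limiter.allow("203.0.113.7"))
```

Because the limit is keyed on client IP rather than on the token, several agents behind one NAT or proxy share a single budget. Spacing requests out, or backing off when a request is refused, keeps them under the limit.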