---
layout: 'page'
uri: '/configuration/ai-chat'
position: 3
slug: 'configuration-ai-chat'
parent: 'configuration'
navTitle: 'AI chat'
title: 'AI chat configuration'
description: 'Configure the AI chat assistant — a grounded, multi-agent chat with your documentation that answers with verified citations. Full reference for all DOCUMAN_CHAT_* environment variables.'
---
# AI chat configuration
Documan ships an optional **AI chat assistant** — a "chat with your documentation"
experience that answers visitor questions grounded in your pages, with verified
citations back to the source. A lead agent works over the same documentation tools
your [MCP server](/ai-integration/mcp-setup) exposes, and can escalate a broad,
multi-part question to several research sub-agents running in parallel. For a feature
overview, see [AI chat](/ai-integration/ai-chat).
The assistant is **off by default** and uses its **own Anthropic API key**,
completely independent of the OpenAI key used for [semantic search](/configuration/semantic-search).
All settings are environment variables, and any change requires a server restart to
take effect.
## Enabling the assistant
Three variables are **required** to turn the chat on. Until all three are set, the
chat endpoints stay disabled and the UI shows nothing.
### DOCUMAN_CHAT_ENABLED
**Required.** Master switch. Set to `true` to enable the chat assistant.
```bash
DOCUMAN_CHAT_ENABLED=true
```
### DOCUMAN_CHAT_ANTHROPIC_API_KEY
**Required.** A dedicated Anthropic API key for the chat. This is **separate** from
`DOCUMAN_OPENAI_API_KEY`, which is used only for embeddings and search; the chat
never uses that key.
```bash
DOCUMAN_CHAT_ANTHROPIC_API_KEY=sk-ant-...
```
### DOCUMAN_CHAT_DEFAULT_MODEL
**Required.** The fallback Anthropic model used for any agent role not overridden
in `DOCUMAN_CHAT_AGENTS_JSON`.
```bash
DOCUMAN_CHAT_DEFAULT_MODEL=claude-sonnet-4-6
```
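Because a missing value leaves the chat silently disabled, a small pre-flight check can save a debugging round. The following is a minimal sketch, not part of Documan: `check_chat_env` is a hypothetical helper that only confirms the three required variables described above are present in the environment.

```bash
# Hypothetical pre-flight helper (not shipped with Documan).
# Reports each problem on stderr and returns non-zero if any required value is missing.
check_chat_env() {
  status=0
  if [ "${DOCUMAN_CHAT_ENABLED:-}" != "true" ]; then
    echo "DOCUMAN_CHAT_ENABLED must be set to true" >&2
    status=1
  fi
  if [ -z "${DOCUMAN_CHAT_ANTHROPIC_API_KEY:-}" ]; then
    echo "DOCUMAN_CHAT_ANTHROPIC_API_KEY is not set" >&2
    status=1
  fi
  if [ -z "${DOCUMAN_CHAT_DEFAULT_MODEL:-}" ]; then
    echo "DOCUMAN_CHAT_DEFAULT_MODEL is not set" >&2
    status=1
  fi
  return "$status"
}
```

Call `check_chat_env` in the same shell or startup script that launches the server, and stop the launch if it returns non-zero.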
## Models and reasoning
### DOCUMAN_CHAT_DEFAULT_EFFORT
**Default:** `high`
Fallback reasoning effort, one of `off`, `low`, `medium`, `high`, `xhigh`, `max`.
Higher effort means deeper reasoning at higher cost and latency.
### DOCUMAN_CHAT_AGENTS_JSON
**Default:** none (every role uses the defaults above)
Per-role model configuration as a JSON map of `role -> {model, effort, maxTokens,
maxIterations}`. The roles are `lead` (the orchestrator) and `research` (the
sub-agents); a `default` entry applies to any role you omit.
A common setup runs a strong lead and a cheaper research tier:
```bash
DOCUMAN_CHAT_AGENTS_JSON='{"lead":{"model":"claude-opus-4-8","effort":"high","maxTokens":32000,"maxIterations":8},"research":{"model":"claude-haiku-4-5","effort":"low","maxTokens":8000,"maxIterations":4}}'
```
Research fan-out costs roughly 10–15× the tokens of a single answer, so a cheaper
research model is recommended.
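The `default` entry mentioned above can stand in for any role you leave out. The sketch below overrides only `lead` and lets `research` fall through to `default`; the `maxTokens` and `maxIterations` values are illustrative assumptions, not recommended settings. The optional check assumes `python3` is on the `PATH` and only verifies that the value is well-formed JSON, not that Documan accepts it.

```bash
# Single quotes keep the JSON's double quotes intact when a shell reads this line.
DOCUMAN_CHAT_AGENTS_JSON='{"default":{"model":"claude-sonnet-4-6","effort":"medium","maxTokens":16000,"maxIterations":6},"lead":{"model":"claude-opus-4-8","effort":"high","maxTokens":32000,"maxIterations":8}}'

# Optional sanity check: json.tool exits non-zero on malformed JSON.
if command -v python3 >/dev/null 2>&1; then
  printf '%s' "$DOCUMAN_CHAT_AGENTS_JSON" | python3 -m json.tool >/dev/null \
    && echo "agents JSON parses"
fi
```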
## Endpoints
By default the chat talks to the public Anthropic API. You can point it at a proxy,
gateway, or a self-hosted Anthropic-compatible endpoint.
### DOCUMAN_CHAT_ANTHROPIC_BASE_URL
**Default:** `https://api.anthropic.com/v1`
Base URL for the Anthropic Messages API. Override it to route chat traffic through
a proxy or a compatible gateway.
```bash
DOCUMAN_CHAT_ANTHROPIC_BASE_URL=https://your-gateway.example/anthropic/v1
```
The embeddings endpoint used for semantic search has its own override,
`DOCUMAN_OPENAI_BASE_URL` — see [Semantic search](/configuration/semantic-search).
### DOCUMAN_CHAT_ANTHROPIC_VERSION
**Default:** `2023-06-01`
The `anthropic-version` header. Override only if a newer API version is required.
### DOCUMAN_CHAT_ANTHROPIC_BETA
**Default:** none
Optional `anthropic-beta` header value(s) for enabling Anthropic beta features.
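When routing through a proxy or gateway, it can help to confirm the route by hand before restarting the server. The sketch below sends one tiny Messages API request with `curl` and prints only the HTTP status. It assumes the server appends `/messages` to the base URL, which this page does not state, so check that against your gateway's routing. `chat_smoke_test` is a hypothetical helper name.

```bash
# Fall back to the documented defaults when the overrides are not set.
BASE_URL="${DOCUMAN_CHAT_ANTHROPIC_BASE_URL:-https://api.anthropic.com/v1}"
API_VERSION="${DOCUMAN_CHAT_ANTHROPIC_VERSION:-2023-06-01}"

# Hypothetical smoke test: one minimal request, printing only the HTTP status code.
# Assumption: the Messages endpoint lives at <base URL>/messages.
chat_smoke_test() {
  curl -sS -o /dev/null -w '%{http_code}\n' "${BASE_URL%/}/messages" \
    -H "x-api-key: ${DOCUMAN_CHAT_ANTHROPIC_API_KEY}" \
    -H "anthropic-version: ${API_VERSION}" \
    -H "content-type: application/json" \
    -d "{\"model\":\"${DOCUMAN_CHAT_DEFAULT_MODEL}\",\"max_tokens\":16,\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}]}"
}
```

Run `chat_smoke_test` with the chat variables exported. A `200` suggests the key, model, and route are all accepted; a `401` usually points at the key, and a `404` at the path or model name.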
## System prompt
### DOCUMAN_CHAT_SYSTEM_PROMPT_PATH
**Default:** none (a built-in prompt is used)
Path to an optional Markdown file with your own system prompt — set the assistant's
persona, tone, and scope (for example, "only answer questions about Product X;
politely decline anything else"). Your prompt is **prepended** to a
server-controlled instruction block that enforces grounding and citations, so a
custom prompt can never weaken those guarantees.
```bash
DOCUMAN_CHAT_SYSTEM_PROMPT_PATH=/etc/documan/chat-prompt.md
```
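As an illustration of what such a file might contain, the sketch below writes a short prompt built around the "Product X" example above. The headings and wording are assumptions, not a required format, and `PROMPT_FILE` is a hypothetical helper variable for this snippet only.

```bash
# Hypothetical location; use whatever path you set in DOCUMAN_CHAT_SYSTEM_PROMPT_PATH.
PROMPT_FILE="${PROMPT_FILE:-./chat-prompt.md}"

# The quoted 'EOF' stops the shell from expanding anything inside the prompt text.
cat > "$PROMPT_FILE" <<'EOF'
# Persona
You are the documentation assistant for Product X.

# Scope
Only answer questions about Product X. Politely decline anything else.

# Tone
Be concise and friendly. Prefer short answers that point to the relevant page.
EOF

export DOCUMAN_CHAT_SYSTEM_PROMPT_PATH="$PROMPT_FILE"
```

Keep the file limited to persona, tone, and scope; grounding and citation rules come from the server-controlled block and do not need to be restated.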
## Limits and budgets
These settings cap cost and concurrency and limit abuse on a public endpoint. The
defaults are safe for a low-traffic docs site; raise or lower them to fit your
traffic and budget.
| Variable | Default | Purpose |
|---|---|---|
| `DOCUMAN_CHAT_MAX_SUBAGENTS` | `3` | Max research sub-agents per run (`0` disables fan-out — single-agent mode). |
| `DOCUMAN_CHAT_MAX_CONCURRENT` | `4` | Global cap on concurrent in-flight LLM calls. |
| `DOCUMAN_CHAT_RATE_LIMIT_PER_MIN` | `10` | Chat requests allowed per minute, per visitor IP. |
| `DOCUMAN_CHAT_RUN_TIMEOUT_SECONDS` | `90` | Hard wall-clock timeout for a single run. |
| `DOCUMAN_CHAT_TOKEN_BUDGET` | `120000` | Hard token budget for a single run. |
| `DOCUMAN_CHAT_DAILY_TOKEN_BUDGET` | `5000000` | Aggregate daily token budget across all runs — a kill-switch that temporarily disables chat once exceeded. |
| `DOCUMAN_CHAT_MAX_INPUT_CHARS` | `4000` | Maximum length of a single visitor question. |
**Example:**
```bash
DOCUMAN_CHAT_MAX_SUBAGENTS=3
DOCUMAN_CHAT_DAILY_TOKEN_BUDGET=2000000
```
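To get a feel for how the per-run and daily budgets interact, a rough worst-case estimate is the daily budget divided by the per-run budget. The sketch below does that arithmetic; it assumes both budgets count the same kind of tokens, which this page does not spell out, and the variable names other than the `DOCUMAN_CHAT_*` ones are illustrative.

```bash
# Fall back to the documented defaults when the variables are not set.
DAILY_BUDGET="${DOCUMAN_CHAT_DAILY_TOKEN_BUDGET:-5000000}"
RUN_BUDGET="${DOCUMAN_CHAT_TOKEN_BUDGET:-120000}"

# Integer division: how many runs that each exhaust the per-run budget fit into one day.
WORST_CASE_RUNS=$(( DAILY_BUDGET / RUN_BUDGET ))
echo "worst-case runs per day: $WORST_CASE_RUNS"
```

With the defaults this prints `41`; lowering the daily budget to `2000000` as in the example above gives `16`. Most runs should use far less than the per-run cap, so real capacity is typically higher.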
## Security notes
- The chat uses a **separate Anthropic key** and never shares the OpenAI embeddings key.
- Answers are **grounded**: the assistant may only cite pages it actually read, and citation titles come from the server, never the model.
- Hidden and unpublished pages are **never** served to the assistant.
- On a public, anonymous endpoint the **daily token budget** is the backstop against runaway cost — keep it set.
## Related configuration
- [General](/configuration/general) — project name, paths, port, AI discovery surfaces, and license key
- [Semantic search](/configuration/semantic-search) — OpenAI embeddings for `vectorize` and semantic search