Configuration Reference

docsfy is configured through environment variables. Values can be set directly in the shell environment or loaded from a .env file in the working directory.

Loading Configuration

docsfy uses Pydantic Settings to manage configuration. On startup, settings are resolved in the following order (highest priority first):

Environment variables set in the current shell
.env file in the current working directory
Default values defined in the application

Unknown environment variables are silently ignored.

# src/docsfy/config.py
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )

    ai_provider: str = "claude"
    ai_model: str = "claude-opus-4-6[1m]"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"

Settings are cached at first access via @lru_cache, so all callers within a single process see the same configuration.

Quick Start

Copy the example file and edit it with your credentials:

cp .env.example .env

A minimal .env for the Claude provider:

AI_PROVIDER=claude
ANTHROPIC_API_KEY=sk-ant-...

A minimal .env for the Gemini provider:

AI_PROVIDER=gemini
AI_MODEL=gemini-2.5-pro
GEMINI_API_KEY=AIza...

Core Settings

`AI_PROVIDER`

The AI backend used for documentation generation.


Type	`string`
Default	`claude`
Valid values	`claude`, `gemini`, `cursor`

Each provider requires its own CLI tool to be installed (handled automatically in the Docker image) and its own authentication credentials.

AI_PROVIDER=claude

Per-request overrides are also supported via the API:

{
  "repo_url": "https://github.com/org/repo",
  "ai_provider": "gemini"
}

`AI_MODEL`

The specific model identifier passed to the AI CLI.


Type	`string`
Default	`claude-opus-4-6[1m]`

The [1m] suffix in the default value indicates a 1-million-token context window. Model identifiers are provider-specific; use whatever model string your chosen provider accepts.

# Claude
AI_MODEL=claude-opus-4-6[1m]

# Gemini
AI_MODEL=gemini-2.5-pro

# Cursor
AI_MODEL=cursor-latest

`AI_CLI_TIMEOUT`

Maximum time in seconds to wait for a single AI CLI invocation to complete.


Type	`integer`
Default	`60`
Constraint	Must be greater than `0`

This timeout applies independently to each AI call -- both the planning phase and each individual page generation. Setting a value of 0 or below will cause a validation error at startup.

AI_CLI_TIMEOUT=120

Tip: For large repositories or complex documentation plans, consider increasing this value. Each page is generated as a separate AI call, so the timeout does not need to cover the entire generation run.

`LOG_LEVEL`

Controls the verbosity of application logging.


Type	`string`
Default	`INFO`
Valid values	`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`

LOG_LEVEL=DEBUG

`DATA_DIR`

Root directory for all persistent data, including the SQLite database and generated documentation sites.


Type	`string` (filesystem path)
Default	`/data`

docsfy creates the following structure under DATA_DIR:

/data/
├── docsfy.db              # SQLite database
└── projects/
    └── <project-name>/
        ├── plan.json      # Documentation plan
        ├── cache/pages/   # Cached markdown per page
        └── site/          # Rendered static site

The paths are resolved at import time in storage.py:

DB_PATH = Path(os.getenv("DATA_DIR", "/data")) / "docsfy.db"
DATA_DIR = Path(os.getenv("DATA_DIR", "/data"))
PROJECTS_DIR = DATA_DIR / "projects"

Warning: The DATA_DIR directory must be writable by the application process. When running in Docker, mount a volume to this path (see Docker Configuration).

Provider Authentication

Each AI provider requires its own credentials. Only the credentials for your chosen AI_PROVIDER need to be set.

Claude (Anthropic)

Claude supports two authentication methods. Use one of the following:

Option 1: API Key

Authenticate directly with an Anthropic API key.

Variable	Description
`ANTHROPIC_API_KEY`	Your Anthropic API key

AI_PROVIDER=claude
ANTHROPIC_API_KEY=sk-ant-api03-...

Option 2: Vertex AI

Use Claude through Google Cloud's Vertex AI platform.

Variable	Description
`CLAUDE_CODE_USE_VERTEX`	Set to `1` to enable Vertex AI
`CLOUD_ML_REGION`	Google Cloud region (e.g., `us-central1`)
`ANTHROPIC_VERTEX_PROJECT_ID`	Your Google Cloud project ID

AI_PROVIDER=claude
CLAUDE_CODE_USE_VERTEX=1
CLOUD_ML_REGION=us-central1
ANTHROPIC_VERTEX_PROJECT_ID=my-gcp-project

Note: When using Vertex AI, the ANTHROPIC_API_KEY variable is not required. Authentication is handled through Google Cloud credentials (e.g., Application Default Credentials).

Gemini (Google)

Variable	Description
`GEMINI_API_KEY`	Your Google Gemini API key

AI_PROVIDER=gemini
AI_MODEL=gemini-2.5-pro
GEMINI_API_KEY=AIza...

Cursor

Variable	Description
`CURSOR_API_KEY`	Your Cursor API key

AI_PROVIDER=cursor
CURSOR_API_KEY=cur-...

Server Settings

These variables control the uvicorn server and are read directly via os.getenv() in main.py. They are not part of the Pydantic Settings class and cannot be set in the .env file loaded by Pydantic.

`HOST`

Network interface to bind the server to.


Type	`string`
Default	`0.0.0.0`

HOST=127.0.0.1

`PORT`

Port number for the HTTP server.


Type	`integer`
Default	`8000`

PORT=3000

`DEBUG`

Enables uvicorn's auto-reload mode for development.


Type	`string`
Default	`false`
Activates when	Set to `true` (case-insensitive)

DEBUG=true

The reload check in main.py:

reload = os.getenv("DEBUG", "").lower() == "true"

Warning: Do not enable DEBUG in production. Auto-reload watches the filesystem for changes and restarts the server process, which is unsuitable for serving real traffic.

Per-Request Overrides

The /api/generate endpoint accepts optional fields that override the global configuration for a single request. This allows different projects to use different providers or models without changing server-wide settings.

# src/docsfy/models.py
class GenerateRequest(BaseModel):
    ai_provider: Literal["claude", "gemini", "cursor"] | None = None
    ai_model: str | None = None
    ai_cli_timeout: int | None = Field(default=None, gt=0)

When a field is None (omitted from the request), the global setting is used:

# src/docsfy/main.py
ai_provider = request.ai_provider or settings.ai_provider
ai_model = request.ai_model or settings.ai_model
ai_cli_timeout = request.ai_cli_timeout or settings.ai_cli_timeout

Example API call with overrides:

curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/org/repo",
    "ai_provider": "gemini",
    "ai_model": "gemini-2.5-pro",
    "ai_cli_timeout": 180
  }'

Docker Configuration

The Docker image pre-installs all three provider CLIs (Claude Code, Cursor Agent, and Gemini CLI) and exposes port 8000.

docker-compose.yaml

services:
  docsfy:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    volumes:
      - ./data:/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

Key points:

env_file: .env -- loads your .env file into the container environment.
./data:/data -- maps a local data/ directory to the container's /data path, matching the DATA_DIR default. This persists the database and generated sites across container restarts.

Environment in the Dockerfile

The Dockerfile sets two environment variables for runtime:

ENV PATH="/home/appuser/.local/bin:/home/appuser/.npm-global/bin:${PATH}"
ENV HOME="/home/appuser"

PATH ensures the Claude, Cursor, and Gemini CLI binaries are discoverable.
HOME is set explicitly for OpenShift compatibility, where containers run as a random UID without a passwd entry.

Complete `.env.example`

# AI Configuration
AI_PROVIDER=claude
# [1m] = 1 million token context window, this is a valid model identifier
AI_MODEL=claude-opus-4-6[1m]
AI_CLI_TIMEOUT=60

# Claude - Option 1: API Key
# ANTHROPIC_API_KEY=

# Claude - Option 2: Vertex AI
# CLAUDE_CODE_USE_VERTEX=1
# CLOUD_ML_REGION=
# ANTHROPIC_VERTEX_PROJECT_ID=

# Gemini
# GEMINI_API_KEY=

# Cursor
# CURSOR_API_KEY=

# Logging
LOG_LEVEL=INFO

Reference Table

Variable	Default	Type	Required	Description
`AI_PROVIDER`	`claude`	`string`	No	AI backend (`claude`, `gemini`, `cursor`)
`AI_MODEL`	`claude-opus-4-6[1m]`	`string`	No	Provider-specific model identifier
`AI_CLI_TIMEOUT`	`60`	`int` (> 0)	No	Seconds before an AI call times out
`LOG_LEVEL`	`INFO`	`string`	No	Logging verbosity
`DATA_DIR`	`/data`	`path`	No	Root directory for database and projects
`ANTHROPIC_API_KEY`	--	`string`	For Claude (API)	Anthropic API key
`CLAUDE_CODE_USE_VERTEX`	--	`flag`	For Claude (Vertex)	Set to `1` to use Vertex AI
`CLOUD_ML_REGION`	--	`string`	For Claude (Vertex)	Google Cloud region
`ANTHROPIC_VERTEX_PROJECT_ID`	--	`string`	For Claude (Vertex)	Google Cloud project ID
`GEMINI_API_KEY`	--	`string`	For Gemini	Google Gemini API key
`CURSOR_API_KEY`	--	`string`	For Cursor	Cursor API key
`HOST`	`0.0.0.0`	`string`	No	Server bind address
`PORT`	`8000`	`int`	No	Server port
`DEBUG`	`false`	`string`	No	Enable auto-reload (`true`/`false`)