Repository Structure
docsfy is organized as a src-layout Python service with server-rendered UI, static site rendering utilities, and a focused async test suite.
Top-Level Layout
docsfy/
├── src/docsfy/ # Application package
│ ├── __init__.py
│ ├── main.py # FastAPI app, auth middleware, API routes
│ ├── config.py # Environment-backed settings
│ ├── models.py # Pydantic request/plan models
│ ├── storage.py # SQLite + filesystem storage + user/session auth
│ ├── repository.py # git clone/diff helpers
│ ├── ai_client.py # AI CLI wrapper re-exports
│ ├── prompts.py # Planner/page/incremental prompt builders
│ ├── json_parser.py # Robust JSON extraction from AI output
│ ├── generator.py # Planner + page generation orchestration
│ ├── renderer.py # Markdown-to-HTML rendering + asset/site output
│ ├── templates/ # Jinja templates (app UI + generated docs pages)
│ └── static/ # Frontend assets copied into generated docs
├── tests/ # Unit + integration tests
├── docs/plans/ # Design/implementation planning docs
├── test-plans/ # End-to-end/manual UI test plan
├── pyproject.toml # Packaging, deps, pytest config, script entrypoint
├── uv.lock # Locked dependency graph
├── tox.toml # Local test task runner
├── Dockerfile # Multi-stage runtime image
├── docker-compose.yaml # Local container orchestration
├── .env.example # Environment variable template
├── .pre-commit-config.yaml # Lint/type/security hooks
├── .flake8 # Flake8 plugin settings
├── .gitleaks.toml # Secret scanning config
├── .gitignore
└── OWNERS
Source Modules (src/docsfy)
API entrypoint and route wiring
main.py defines app startup, authentication middleware, API endpoints, and the end-to-end generation lifecycle.
app = FastAPI(
    title="docsfy",
    description="AI-powered documentation generator",
    version="0.1.0",
    lifespan=lifespan,
)
app.add_middleware(AuthMiddleware)
@app.get("/health")
async def health() -> dict[str, str]:
    return {"status": "ok"}
@app.post("/api/generate", status_code=202)
async def generate(request: Request, gen_request: GenerateRequest) -> dict[str, str]:
    ...  # body omitted in this excerpt
Key responsibilities:
- request auth (Bearer token or docsfy_session cookie)
- project/variant ownership checks
- generation task scheduling and abort logic
- docs serving (/docs/...) and archive download endpoints
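The first responsibility above, accepting either a Bearer token or the docsfy_session cookie, can be illustrated with a small stdlib-only sketch. The helper name `extract_credential`, the lookup order, and the lowercase header key are assumptions made for illustration; the actual AuthMiddleware may resolve and validate credentials differently.

```python
from typing import Mapping, Optional

SESSION_COOKIE = "docsfy_session"  # cookie name taken from the list above


def extract_credential(
    headers: Mapping[str, str], cookies: Mapping[str, str]
) -> Optional[tuple[str, str]]:
    """Return (kind, value) for the first credential found, else None.

    Hypothetical helper: assumes header keys are already lowercased and
    that a Bearer token takes precedence over the session cookie.
    """
    auth = headers.get("authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return ("bearer", token.strip())
    session = cookies.get(SESSION_COOKIE, "")
    if session:
        return ("session", session)
    return None
```

A caller would still need to check the returned value against stored keys or sessions; this sketch only covers where the credential is read from.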
Settings and request models
- config.py centralizes runtime settings (ADMIN_KEY, AI_PROVIDER, AI_MODEL, DATA_DIR, cookie security, timeout).
- models.py validates generation input (repo_url vs repo_path) and doc-plan schemas (DocPlan, NavGroup, DocPage).
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        extra="ignore",
    )
    admin_key: str = ""
    ai_provider: str = "claude"
    ai_model: str = "claude-opus-4-6[1m]"
    ai_cli_timeout: int = Field(default=60, gt=0)
    log_level: str = "INFO"
    data_dir: str = "/data"
    secure_cookies: bool = True
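The Settings excerpt covers config.py. For models.py, the repo_url vs repo_path check could look roughly like the stdlib sketch below. `GenerateInput` is a hypothetical stand-in for the real Pydantic model, and the "exactly one of the two" rule is an assumption; the source only states that the two fields are validated against each other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerateInput:
    """Hypothetical stand-in for the GenerateRequest model in models.py."""

    repo_url: Optional[str] = None
    repo_path: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: exactly one source (remote URL or local path) is required.
        if bool(self.repo_url) == bool(self.repo_path):
            raise ValueError("provide exactly one of repo_url or repo_path")
```

Constructing `GenerateInput(repo_url="https://example.com/repo.git")` succeeds, while passing both fields or neither raises `ValueError`.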
Generation pipeline modules
- prompts.py: prompt construction for planner, page generation, and incremental page selection.
- ai_client.py: re-exports provider/runtime helpers from ai-cli-runner.
- json_parser.py: resilient parsing from noisy AI output.
- generator.py: planning + page generation, cache support, bounded concurrency.
- repository.py: git clone and changed-file detection for incremental behavior.
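To make "resilient parsing from noisy AI output" concrete, here is a minimal sketch of one common approach: scan for each `{` and let `json.JSONDecoder.raw_decode` try to parse from there. The function name `extract_first_json_object` is hypothetical, and this is not necessarily the strategy json_parser.py uses.

```python
import json
from typing import Any, Optional


def extract_first_json_object(text: str) -> Optional[Any]:
    """Return the first decodable JSON object embedded in text, else None."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # ignores whatever trails it (prose, code fences, etc.).
            value, _end = decoder.raw_decode(text, pos)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next candidate.
            pos = text.find("{", pos + 1)
    return None
```

This tolerates leading chatter and trailing text around the payload, at the cost of rescanning when the output contains stray braces.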
success, output = await call_ai_cli(
    prompt=prompt,
    cwd=repo_path,
    ai_provider=ai_provider,
    ai_model=ai_model,
    ai_cli_timeout=ai_cli_timeout,
    cli_flags=cli_flags,
)
results = await run_parallel_with_limit(
    coroutines, max_concurrency=MAX_CONCURRENT_PAGES
)
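The bounded concurrency used for page generation can be sketched with an `asyncio.Semaphore`. The function `run_with_limit` below is a hypothetical illustration, not the `run_parallel_with_limit` helper itself, whose error handling and return shape are not shown in the source.

```python
import asyncio
from typing import Awaitable, Iterable, TypeVar

T = TypeVar("T")


async def run_with_limit(
    coros: Iterable[Awaitable[T]], max_concurrency: int
) -> list[T]:
    """Await all coroutines, running at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(coro: Awaitable[T]) -> T:
        # Each coroutine waits for a free slot before it starts running.
        async with semaphore:
            return await coro

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(c) for c in coros))
```

In this sketch, an exception in one coroutine propagates out of `gather`; a production helper might instead collect failures per page so that one bad page does not abort the whole run.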
Persistence and runtime pathing
storage.py owns both database schema/migrations and output path conventions.
DB_PATH = Path(os.getenv("DATA_DIR", "/data")) / "docsfy.db"
DATA_DIR = Path(os.getenv("DATA_DIR", "/data"))
PROJECTS_DIR = DATA_DIR / "projects"
CREATE TABLE IF NOT EXISTS projects (
    name TEXT NOT NULL,
    ai_provider TEXT NOT NULL DEFAULT '',
    ai_model TEXT NOT NULL DEFAULT '',
    owner TEXT NOT NULL DEFAULT '',
    repo_url TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'generating',
    ...
    PRIMARY KEY (name, ai_provider, ai_model, owner)
)
def get_project_dir(
    name: str, ai_provider: str = "", ai_model: str = "", owner: str = ""
) -> Path:
    ...
    safe_owner = _validate_owner(owner)
    return PROJECTS_DIR / safe_owner / _validate_name(name) / ai_provider / ai_model
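Because user-supplied values become directory names, each path segment needs to be checked before it is joined. The sketch below shows one way this could work. The allow-list pattern, `safe_segment`, and `variant_dir` are all hypothetical; the real `_validate_owner` and `_validate_name` are not shown in the source and may apply different rules (the excerpt validates only owner and name, whereas this sketch checks all four segments).

```python
import re
from pathlib import Path

PROJECTS_DIR = Path("/data") / "projects"  # default DATA_DIR from the excerpt above

# Assumed allow-list: starts alphanumeric, then letters, digits, and . _ - [ ]
# (brackets are allowed so model names like "claude-opus-4-6[1m]" pass).
_SEGMENT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._\-\[\]]*")


def safe_segment(value: str) -> str:
    """Reject anything that could escape the projects directory."""
    if not _SEGMENT_RE.fullmatch(value) or ".." in value:
        raise ValueError(f"unsafe path segment: {value!r}")
    return value


def variant_dir(owner: str, name: str, ai_provider: str, ai_model: str) -> Path:
    """Compose the per-variant directory from validated segments."""
    parts = (owner, name, ai_provider, ai_model)
    return PROJECTS_DIR.joinpath(*(safe_segment(p) for p in parts))
```

Two variants of the same project (for example, different models) resolve to sibling directories, which mirrors the composite primary key in the projects table.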
Templates and Static Assets
Jinja templates (src/docsfy/templates)
- App UI pages: dashboard.html, status.html, login.html, admin.html
- Generated docs pages: index.html, page.html
- Shared partials: _theme.html, _sidebar.html, _modal.html
Generated docs templates explicitly load the packaged static assets:
<script src="assets/theme.js"></script>
<script src="assets/search.js"></script>
<script src="assets/copy.js"></script>
<script src="assets/callouts.js"></script>
<script src="assets/scrollspy.js"></script>
<script src="assets/codelabels.js"></script>
<script src="assets/github.js"></script>
Static frontend assets (src/docsfy/static)
- style.css: full docs theme (layout, typography, callouts, TOC, search modal)
- theme.js: dark/light theme toggle + persistence
- search.js: Cmd/Ctrl+K modal search using search-index.json
- copy.js: code block copy buttons
- callouts.js: transforms blockquotes (Note, Warning, Tip, etc.) into callouts
- scrollspy.js: active heading sync in TOC
- codelabels.js: inferred language badges on code blocks
- github.js: optional GitHub stars badge hydration
renderer.py copies these files to the generated site output and emits search/LLM artifacts:
if STATIC_DIR.exists():
    for static_file in STATIC_DIR.iterdir():
        if static_file.is_file():
            shutil.copy2(static_file, assets_dir / static_file.name)
(output_dir / "search-index.json").write_text(
    json.dumps(search_index), encoding="utf-8"
)
(output_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")
(output_dir / "llms-full.txt").write_text(llms_full_txt, encoding="utf-8")
Tests (tests/)
The suite is split by module/feature area:
- test_main.py: API route behavior and generation endpoint lifecycle
- test_auth.py: login/session flows, role permissions (admin, user, viewer)
- test_storage.py: DB CRUD, migrations, key/session management, ACL behavior
- test_repository.py: clone/local SHA/diff helpers
- test_generator.py: planner/page generation and incremental planner handling
- test_renderer.py: markdown rendering and HTML sanitization behavior
- test_config.py, test_models.py, test_json_parser.py, test_prompts.py, test_ai_client.py: focused unit tests
- test_dashboard.py: dashboard page rendering behavior
- test_integration.py: mocked full flow (generate -> serve -> download -> delete)
Example integration assertion flow:
response = await client.get("/api/status")
assert response.status_code == 200
projects = response.json()["projects"]
assert len(projects) == 1
assert projects[0]["status"] == "ready"
response = await client.get("/docs/test-repo/claude/opus/index.html")
assert response.status_code == 200
Runtime and Configuration Files
Python packaging and app entrypoint
[project]
name = "docsfy"
requires-python = ">=3.12"
dependencies = [
    "ai-cli-runner",
    "fastapi",
    "uvicorn",
    "pydantic-settings",
    "python-simple-logger",
    "aiosqlite",
    "jinja2",
    "markdown",
    "pygments",
    "python-multipart>=0.0.22",
]
[project.scripts]
docsfy = "docsfy.main:run"
Container/runtime config
services:
  docsfy:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    volumes:
      - ./data:/data
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
ENTRYPOINT ["uv", "run", "--no-sync", "uvicorn", "docsfy.main:app", "--host", "0.0.0.0", "--port", "8000"]
Environment template
ADMIN_KEY=your-secure-admin-key-here-min-16-chars
AI_PROVIDER=claude
AI_MODEL=claude-opus-4-6[1m]
AI_CLI_TIMEOUT=60
LOG_LEVEL=INFO
# SECURE_COOKIES=false
Local quality/security tooling
# tox.toml
[env.unittests]
deps = ["uv"]
commands = [["uv", "run", "--extra", "dev", "pytest", "-n", "auto", "tests"]]
# .pre-commit-config.yaml (excerpt)
- repo: https://github.com/astral-sh/ruff-pre-commit
  hooks:
    - id: ruff
    - id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
  hooks:
    - id: mypy
# .gitignore (excerpt)
# Data
data/
.dev/data/
Warning: Runtime state (/data, SQLite DB, generated sites/cache) is intentionally untracked; do not commit generated project output.
CI/CD and Contributor Workflow
Note: No hosted pipeline definitions (for example .github/workflows/) are currently checked into this repository.
Quality gates are still defined and reproducible locally via tox, pytest, pre-commit, and secret scanning configs (.gitleaks.toml, detect-secrets hook).
Tip: Before opening a PR, run uv run --extra dev pytest -n auto tests and pre-commit run --all-files.
Runtime Output Layout (Generated, Not Source-Controlled)
Based on storage.py + renderer.py, generation outputs are stored under owner/provider/model-specific paths:
/data/
├── docsfy.db
└── projects/
    └── {owner}/
        └── {project}/
            └── {ai_provider}/{ai_model}/
                ├── plan.json
                ├── cache/pages/*.md
                └── site/
                    ├── .nojekyll
                    ├── index.html
                    ├── *.html
                    ├── *.md
                    ├── search-index.json
                    ├── llms.txt
                    ├── llms-full.txt
                    └── assets/*
This separation is important for multi-user and multi-variant isolation (name + provider + model + owner).