# Core Concepts

docsfy organizes generated documentation around six core entities:

- Project: a repository identity (derived name + metadata).
- Variant: one generated output for a specific AI provider/model.
- Owner: the authenticated user who owns that project/variant namespace.
- Role: authorization level (`admin`, `user`, `viewer`).
- Session: login state via secure cookie and DB-backed expiry.
- Generated artifacts: cached markdown and rendered static site files.

> **Note:** In docsfy, project names are repository-centric, but storage and access are owner-scoped to avoid cross-user collisions.

## 1) Projects

A generation request must include exactly one source (`repo_url` or `repo_path`); `project_name` is derived from that source.

```10:30:src/docsfy/models.py
class GenerateRequest(BaseModel):
    repo_url: str | None = Field(
        default=None, description="Git repository URL (HTTPS or SSH)"
    )
    repo_path: str | None = Field(default=None, description="Local git repository path")
    ai_provider: Literal["claude", "gemini", "cursor"] | None = None
    ai_model: str | None = None
    ai_cli_timeout: int | None = Field(default=None, gt=0)
    force: bool = Field(
        default=False, description="Force full regeneration, ignoring cache"
    )

    @model_validator(mode="after")
    def validate_source(self) -> GenerateRequest:
        if not self.repo_url and not self.repo_path:
            msg = "Either 'repo_url' or 'repo_path' must be provided"
            raise ValueError(msg)
        if self.repo_url and self.repo_path:
            msg = "Provide either 'repo_url' or 'repo_path', not both"
            raise ValueError(msg)
        return self
```

```55:64:src/docsfy/models.py
@property
def project_name(self) -> str:
    if self.repo_url:
        name = self.repo_url.rstrip("/").split("/")[-1]
        if name.endswith(".git"):
            name = name[:-4]
        return name
    if self.repo_path:
        return Path(self.repo_path).resolve().name
    return "unknown"
```
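
To make the derivation rule concrete, here is a minimal standalone sketch of the same logic. The function name `derive_project_name` and the sample URLs and paths are hypothetical; only the string handling mirrors the `project_name` property shown above.

```python
from __future__ import annotations

from pathlib import Path


def derive_project_name(repo_url: str | None = None, repo_path: str | None = None) -> str:
    """Standalone copy of the derivation logic (hypothetical helper name)."""
    if repo_url:
        # Last path segment of the URL, minus a trailing slash and ".git" suffix.
        name = repo_url.rstrip("/").split("/")[-1]
        return name[:-4] if name.endswith(".git") else name
    if repo_path:
        # Directory name of the resolved local path.
        return Path(repo_path).resolve().name
    return "unknown"


# Illustrative inputs only; none of these repositories are from docsfy.
https_name = derive_project_name(repo_url="https://github.com/acme/widgets.git")
ssh_name = derive_project_name(repo_url="git@github.com:acme/widgets.git")
slash_name = derive_project_name(repo_url="https://github.com/acme/widgets/")
local_name = derive_project_name(repo_path="/srv/repos/widgets")
```

All four inputs reduce to the same final segment, so two different repositories whose sources end in the same segment would share a project name. That is why the name alone is not used as the storage key.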

Projects are tracked in SQLite with generation metadata (status, commit SHA, page count, plan JSON, timestamps).

```56:73:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS projects (
    name TEXT NOT NULL,
    ai_provider TEXT NOT NULL DEFAULT '',
    ai_model TEXT NOT NULL DEFAULT '',
    owner TEXT NOT NULL DEFAULT '',
    repo_url TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'generating',
    current_stage TEXT,
    last_commit_sha TEXT,
    last_generated TEXT,
    page_count INTEGER DEFAULT 0,
    error_message TEXT,
    plan_json TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (name, ai_provider, ai_model, owner)
)
```

## 2) Variants

A **variant** is one `(project, provider, model, owner)` tuple.  
This is the real unit of generation, status, deletion, serving, and download.

```282:290:src/docsfy/storage.py
"""INSERT INTO projects (name, ai_provider, ai_model, owner, repo_url, status, updated_at)
   VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
   ON CONFLICT(name, ai_provider, ai_model, owner) DO UPDATE SET
   repo_url = excluded.repo_url,
   status = excluded.status,
   error_message = NULL,
   current_stage = NULL,
   updated_at = CURRENT_TIMESTAMP""",
(name, ai_provider, ai_model, owner, repo_url, status),
```
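
The effect of the composite key can be sketched with an in-memory SQLite database. This is an illustration only: the table is trimmed to the columns needed here, the model names and URL are made up, and it assumes the bundled SQLite is 3.24 or newer (required for `ON CONFLICT ... DO UPDATE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down version of the projects table: same composite primary key.
conn.execute(
    """CREATE TABLE projects (
        name TEXT NOT NULL,
        ai_provider TEXT NOT NULL DEFAULT '',
        ai_model TEXT NOT NULL DEFAULT '',
        owner TEXT NOT NULL DEFAULT '',
        repo_url TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'generating',
        PRIMARY KEY (name, ai_provider, ai_model, owner)
    )"""
)

UPSERT = """INSERT INTO projects (name, ai_provider, ai_model, owner, repo_url, status)
   VALUES (?, ?, ?, ?, ?, ?)
   ON CONFLICT(name, ai_provider, ai_model, owner) DO UPDATE SET
   repo_url = excluded.repo_url,
   status = excluded.status"""

url = "https://example.com/acme/widgets"  # hypothetical repository URL
rows = [
    ("widgets", "claude", "model-a", "alice", url, "generating"),
    ("widgets", "claude", "model-a", "alice", url, "ready"),       # same key: updates the row
    ("widgets", "gemini", "model-b", "alice", url, "generating"),  # new provider/model: new variant
    ("widgets", "claude", "model-a", "bob", url, "generating"),    # new owner: new variant
]
for row in rows:
    conn.execute(UPSERT, row)

variants = conn.execute(
    "SELECT owner, ai_provider, ai_model, status FROM projects "
    "ORDER BY owner, ai_provider"
).fetchall()
```

Four writes leave three rows: the second write updates Alice's existing variant instead of creating a duplicate, while a different provider/model or a different owner each produce a separate variant.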

Variant-specific API/docs routes are explicit:

```1019:1041:src/docsfy/main.py
@app.get("/api/projects/{name}/{provider}/{model}")
async def get_variant_details(
    request: Request,
    name: str,
    provider: str,
    model: str,
) -> dict[str, str | int | None]:
    name = _validate_project_name(name)
    project = await _resolve_project(
        request, name, ai_provider=provider, ai_model=model
    )

    return project


@app.delete("/api/projects/{name}/{provider}/{model}")
async def delete_variant(
    request: Request,
    name: str,
    provider: str,
    model: str,
) -> dict[str, str]:
```

```1379:1386:src/docsfy/main.py
@app.get("/docs/{project}/{provider}/{model}/{path:path}")
async def serve_variant_docs(
    request: Request,
    project: str,
    provider: str,
    model: str,
    path: str = "index.html",
) -> FileResponse:
```

## 3) Owners

Owner is set from the authenticated username at generation time:

```457:484:src/docsfy/main.py
project_name = gen_request.project_name
owner = request.state.username

if ai_provider not in ("claude", "gemini", "cursor"):
    raise HTTPException(
        status_code=400,
        detail=f"Invalid AI provider: '{ai_provider}'. Must be claude, gemini, or cursor.",
    )
if not ai_model:
    raise HTTPException(status_code=400, detail="AI model must be specified.")

# Fix 6: Use lock to prevent race condition between check and add
gen_key = f"{owner}/{project_name}/{ai_provider}/{ai_model}"
async with _gen_lock:
    if gen_key in _generating:
        raise HTTPException(
            status_code=409,
            detail=f"Variant '{project_name}/{ai_provider}/{ai_model}' is already being generated",
        )

    await save_project(
        name=project_name,
        repo_url=gen_request.repo_url or gen_request.repo_path or "",
        status="generating",
        ai_provider=ai_provider,
        ai_model=ai_model,
        owner=owner,
    )
```

Owner is also part of filesystem layout:

```501:519:src/docsfy/storage.py
def get_project_dir(
    name: str, ai_provider: str = "", ai_model: str = "", owner: str = ""
) -> Path:
    if not ai_provider or not ai_model:
        msg = "ai_provider and ai_model are required for project directory paths"
        raise ValueError(msg)
    # Sanitize path segments to prevent traversal
    for segment_name, segment in [("ai_provider", ai_provider), ("ai_model", ai_model)]:
        if (
            "/" in segment
            or "\\" in segment
            or ".." in segment
            or segment.startswith(".")
        ):
            msg = f"Invalid {segment_name}: '{segment}'"
            raise ValueError(msg)
    safe_owner = _validate_owner(owner)
    return PROJECTS_DIR / safe_owner / _validate_name(name) / ai_provider / ai_model
```
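
The resulting layout can be illustrated with a small standalone sketch. Everything named here is an assumption for illustration: `check_segment` and `variant_dir` are hypothetical helpers, the root `/data/projects` is a made-up value for `PROJECTS_DIR`, and the single check applied to every segment is stricter than the excerpt above, which delegates owner and name validation to helpers not shown.

```python
from pathlib import PurePosixPath


def check_segment(label: str, segment: str) -> str:
    """Reject empty segments and anything that could escape the directory tree."""
    if (
        not segment
        or "/" in segment
        or "\\" in segment
        or ".." in segment
        or segment.startswith(".")
    ):
        raise ValueError(f"Invalid {label}: '{segment}'")
    return segment


def variant_dir(
    root: PurePosixPath, owner: str, name: str, ai_provider: str, ai_model: str
) -> PurePosixPath:
    """Build root/owner/name/provider/model from validated segments."""
    segments = [
        ("owner", owner),
        ("name", name),
        ("ai_provider", ai_provider),
        ("ai_model", ai_model),
    ]
    return root.joinpath(*(check_segment(label, value) for label, value in segments))


root = PurePosixPath("/data/projects")  # assumed root, not taken from docsfy
alice_dir = variant_dir(root, "alice", "widgets", "claude", "model-a")
bob_dir = variant_dir(root, "bob", "widgets", "claude", "model-a")
```

Because the owner is the first segment under the root, Alice's and Bob's `widgets` variants land in different directories even though name, provider, and model are identical.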

Cross-owner sharing is controlled through the `project_access` table and scoped by `(project_name, project_owner, username)`.

```237:243:src/docsfy/storage.py
CREATE TABLE IF NOT EXISTS project_access (
    project_name TEXT NOT NULL,
    project_owner TEXT NOT NULL DEFAULT '',
    username TEXT NOT NULL,
    PRIMARY KEY (project_name, project_owner, username)
)
```

> **Warning:** For admin users, if multiple owners have the same variant `(name/provider/model)`, owner is ambiguous and some variant routes return `409` until disambiguated.

```241:246:src/docsfy/main.py
if len(distinct_owners) > 1:
    raise HTTPException(
        status_code=409,
        detail="Multiple owners found for this variant, please specify owner",
    )
```
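
The ambiguity check can be sketched in isolation as follows. This is not docsfy's resolver: `pick_variant`, `AmbiguousOwnerError`, and the optional `owner` argument are hypothetical, and the exception stands in for the HTTP `409` response. How a caller actually supplies the owner is not shown in the excerpts here.

```python
from __future__ import annotations


class AmbiguousOwnerError(Exception):
    """Stand-in for the HTTP 409 response in this sketch."""


def pick_variant(
    rows: list[dict[str, str]], owner: str | None = None
) -> dict[str, str] | None:
    """Return the single matching variant row, or raise if owners are ambiguous."""
    if owner is not None:
        # An explicit owner narrows the candidates before the ambiguity check.
        rows = [row for row in rows if row["owner"] == owner]
    distinct_owners = {row["owner"] for row in rows}
    if len(distinct_owners) > 1:
        raise AmbiguousOwnerError(
            "Multiple owners found for this variant, please specify owner"
        )
    return rows[0] if rows else None


# Two owners hold the same name/provider/model variant (illustrative data).
rows = [
    {"name": "widgets", "ai_provider": "claude", "ai_model": "model-a", "owner": "alice"},
    {"name": "widgets", "ai_provider": "claude", "ai_model": "model-a", "owner": "bob"},
]
bob_variant = pick_variant(rows, owner="bob")
```

Calling `pick_variant(rows)` without an owner raises `AmbiguousOwnerError`, while passing `owner="bob"` selects exactly one row.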

## 4) Roles

docsfy defines three roles:

- `admin`: full access, including user and access management endpoints.
- `user`: read/write project operations (generate, abort, delete) within accessible scope.
- `viewer`: read-only access (dashboard/docs/download/status), no write operations.

```609:623:src/docsfy/storage.py
VALID_ROLES = frozenset({"admin", "user", "viewer"})


async def create_user(username: str, role: str = "user") -> tuple[str, str]:
    """Create a user and return (username, raw_api_key)."""
    if username.lower() == "admin":
        msg = "Username 'admin' is reserved"
        raise ValueError(msg)
    if not re.match(r"^[a-zA-Z0-9][a-zA-Z0-9._-]{1,49}$", username):
        msg = f"Invalid username: '{username}'. Must be 2-50 alphanumeric characters, dots, hyphens, underscores."
        raise ValueError(msg)
    if role not in VALID_ROLES:
        msg = f"Invalid role: '{role}'. Must be admin, user, or viewer."
        raise ValueError(msg)
```

```185:191:src/docsfy/main.py
def _require_write_access(request: Request) -> None:
    """Raise 403 if user is a viewer (read-only)."""
    if request.state.role not in ("admin", "user"):
        raise HTTPException(
            status_code=403,
            detail="Write access required.",
        )
```
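
As a rough summary of the three roles, the permission model can be written as a lookup table. The action names (`view`, `write`, `manage_users`) and the `is_allowed` helper are hypothetical groupings chosen for this sketch; docsfy enforces these rules per endpoint rather than through a table like this.

```python
# Which roles may perform each kind of action (hypothetical action names).
PERMISSIONS: dict[str, frozenset[str]] = {
    "view": frozenset({"admin", "user", "viewer"}),  # dashboard, docs, download, status
    "write": frozenset({"admin", "user"}),           # generate, abort, delete
    "manage_users": frozenset({"admin"}),            # user and access management
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role may perform the action; unknown actions are denied."""
    return role in PERMISSIONS.get(action, frozenset())


viewer_can_view = is_allowed("viewer", "view")
viewer_can_write = is_allowed("viewer", "write")
user_can_manage = is_allowed("user", "manage_users")
```

Denying unknown actions by default keeps the sketch fail-closed: a typo in an action name results in a refusal rather than accidental access.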

## 5) Sessions

Authentication supports both:

- `Authorization: Bearer ...` (admin key or user API key)
- `docsfy_session` cookie (browser login flow)

```122:137:src/docsfy/main.py
# 1. Check Authorization header (API clients)
auth_header = request.headers.get("authorization", "")
if auth_header.startswith("Bearer "):
    token = auth_header[7:]
    if token == settings.admin_key:
        is_admin = True
        username = "admin"
    else:
        user = await get_user_by_key(token)

# 2. Check session cookie (browser) -- opaque session token
if not user and not is_admin:
    session_token = request.cookies.get("docsfy_session")
    if session_token:
        session = await get_session(session_token)
```

Sessions are opaque tokens, hashed at rest, and expire after 8 hours.

```21:23:src/docsfy/storage.py
SESSION_TTL_SECONDS = 28800  # 8 hours
SESSION_TTL_HOURS = SESSION_TTL_SECONDS // 3600
```

```686:713:src/docsfy/storage.py
async def create_session(
    username: str, is_admin: bool = False, ttl_hours: int = SESSION_TTL_HOURS
) -> str:
    """Create an opaque session token."""
    token = secrets.token_urlsafe(32)
    token_hash = _hash_session_token(token)
    expires_at = datetime.now(timezone.utc) + timedelta(hours=ttl_hours)
    expires_str = expires_at.strftime("%Y-%m-%d %H:%M:%S")
    async with aiosqlite.connect(DB_PATH) as db:
        await db.execute(
            "INSERT INTO sessions (token, username, is_admin, expires_at) VALUES (?, ?, ?, ?)",
            (token_hash, username, 1 if is_admin else 0, expires_str),
        )
        await db.commit()
    return token
```
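
The "hashed at rest" and expiry behavior can be sketched with an in-memory store. Several things here are assumptions: the body of `_hash_session_token` is not shown, so SHA-256 is a guess; `SessionStore` and its dictionary replace the SQLite `sessions` table; and the `now` parameter exists only to make expiry easy to demonstrate.

```python
from __future__ import annotations

import hashlib
import secrets
from datetime import datetime, timedelta, timezone

SESSION_TTL_SECONDS = 28800  # 8 hours, as in the excerpt above


def hash_token(token: str) -> str:
    """Assumed hashing scheme: hex-encoded SHA-256 of the raw token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


class SessionStore:
    """In-memory stand-in for the sessions table (hypothetical class)."""

    def __init__(self) -> None:
        # Maps token hash -> (username, expiry); the raw token is never stored.
        self.rows: dict[str, tuple[str, datetime]] = {}

    def create(self, username: str, now: datetime | None = None) -> str:
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        expires_at = now + timedelta(seconds=SESSION_TTL_SECONDS)
        self.rows[hash_token(token)] = (username, expires_at)
        return token  # only the caller (the browser cookie) holds the raw value

    def get(self, token: str, now: datetime | None = None) -> str | None:
        now = now or datetime.now(timezone.utc)
        row = self.rows.get(hash_token(token))
        if row is None or row[1] <= now:
            return None  # unknown or expired session
        return row[0]


store = SessionStore()
start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
token = store.create("alice", now=start)
during = store.get(token, now=start + timedelta(hours=1))
after = store.get(token, now=start + timedelta(hours=8, seconds=1))
```

Storing only the hash means that a leaked copy of the session table does not directly reveal usable cookie values.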

```297:304:src/docsfy/main.py
response.set_cookie(
    "docsfy_session",
    session_token,
    httponly=True,
    samesite="strict",
    secure=settings.secure_cookies,
    max_age=SESSION_TTL_SECONDS,
)
```

> **Tip:** Keep `SECURE_COOKIES` enabled in production. Only set it to `false` for local HTTP development.

```27:28:.env.example
# Set to false for local HTTP development
SECURE_COOKIES=false
```

## 6) Generated Artifacts

Each completed variant writes structured outputs under owner/project/provider/model:

- `plan.json` (navigation plan used for rendering and status UI)
- `cache/pages/*.md` (cached AI markdown for incremental regeneration)
- `site/` (served static docs)

Site generation includes HTML, markdown copies, search index, and LLM-friendly files:

```223:290:src/docsfy/renderer.py
# Prevent GitHub Pages from running Jekyll
(output_dir / ".nojekyll").touch()

project_name: str = plan.get("project_name", "Documentation")
tagline: str = plan.get("tagline", "")
navigation: list[dict[str, Any]] = plan.get("navigation", [])
repo_url: str = plan.get("repo_url", "")

# ...
(output_dir / "index.html").write_text(index_html, encoding="utf-8")

# ...
(output_dir / f"{slug}.html").write_text(page_html, encoding="utf-8")
(output_dir / f"{slug}.md").write_text(md_content, encoding="utf-8")

search_index = _build_search_index(valid_pages, plan)
(output_dir / "search-index.json").write_text(
    json.dumps(search_index), encoding="utf-8"
)

# Generate llms.txt files
llms_txt = _build_llms_txt(plan)
(output_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")

llms_full_txt = _build_llms_full_txt(plan, valid_pages)
(output_dir / "llms-full.txt").write_text(llms_full_txt, encoding="utf-8")
```
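
Based only on the writes visible in this excerpt, the files in `site/` for a given set of page slugs can be enumerated as below. The helper `expected_site_files` and the sample slugs are hypothetical, and the elided parts of the renderer may well write additional files (for example static assets) that this sketch does not list.

```python
def expected_site_files(slugs: list[str]) -> list[str]:
    """List the site files implied by the renderer excerpt for the given page slugs."""
    # Files written once per site.
    files = [".nojekyll", "index.html", "search-index.json", "llms.txt", "llms-full.txt"]
    # Each page gets a rendered HTML file and a markdown copy.
    for slug in slugs:
        files.append(f"{slug}.html")
        files.append(f"{slug}.md")
    return sorted(files)


site_files = expected_site_files(["getting-started", "configuration"])
```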

The orchestration layer persists the plan and final status:

```998:1015:src/docsfy/main.py
site_dir = get_project_site_dir(project_name, ai_provider, ai_model, owner)
render_site(plan=plan, pages=pages, output_dir=site_dir)

project_dir = get_project_dir(project_name, ai_provider, ai_model, owner)
(project_dir / "plan.json").write_text(json.dumps(plan, indent=2), encoding="utf-8")

page_count = len(pages)
await update_project_status(
    project_name,
    ai_provider,
    ai_model,
    status="ready",
    owner=owner,
    current_stage=None,
    last_commit_sha=commit_sha,
    page_count=page_count,
    plan_json=json.dumps(plan),
)
```

Persistent storage is typically mounted to `/data`:

```1:10:docker-compose.yaml
services:
  docsfy:
    build: .
    ports:
      - "8000:8000"
    env_file: .env
    volumes:
      - ./data:/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
```

## 7) CI/CD and Quality Gate Context

This repository currently has no checked-in `.github` workflow directory, but quality checks are still codified through tooling that can run both locally and in CI:

```1:7:tox.toml
skipsdist = true

envlist = ["unittests"]

[env.unittests]
deps = ["uv"]
commands = [["uv", "run", "--extra", "dev", "pytest", "-n", "auto", "tests"]]
```

```43:61:.pre-commit-config.yaml
- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.15.2
  hooks:
    - id: ruff
    - id: ruff-format

- repo: https://github.com/gitleaks/gitleaks
  rev: v8.30.0
  hooks:
    - id: gitleaks

- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v1.19.1
  hooks:
    - id: mypy
```

In practice, these concepts fit together as:

  1. Authenticated user (owner + role) submits generation request.
  2. Request creates/updates a project variant.
  3. Background pipeline plans, generates, renders artifacts.
  4. Session-scoped or bearer-scoped access controls who can view/manage each variant.
  5. Static artifacts are served directly or downloaded as `.tar.gz`.
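
For the last step, packaging a rendered site directory as a `.tar.gz` can be sketched with the standard library. This is not docsfy's download handler, which is not shown here: `archive_site`, the archive root name, and the sample file are hypothetical.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def archive_site(site_dir: Path, arcname: str) -> bytes:
    """Pack a site directory into an in-memory gzip-compressed tarball."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        # arcname replaces the on-disk path so no server paths leak into the archive.
        tar.add(site_dir, arcname=arcname)
    return buffer.getvalue()


# Build a throwaway site directory to demonstrate the round trip.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site"
    site.mkdir()
    (site / "index.html").write_text("<h1>widgets</h1>", encoding="utf-8")
    payload = archive_site(site, arcname="widgets-claude-model-a")

# Read the archive back to confirm its contents.
with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as tar:
    members = sorted(tar.getnames())
```

Building the archive in memory suits small sites; for large outputs, writing to a temporary file and streaming it would avoid holding the whole tarball in RAM.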