# github-webhook-server > FastAPI webhook server for automating GitHub repository settings, pull request workflows, checks, releases, and log analysis. --- Source: introduction.md # Introduction `github-webhook-server` is a self-hosted FastAPI service that receives GitHub webhooks and turns them into repository and pull request automation. If you maintain several repositories and want one place to manage reviewer assignment, labels, checks, merge rules, cherry-picks, and release behavior, this is what the server is built for. You configure it once, connect repositories to it, and it applies the same workflow consistently across your GitHub organization. It is not just a passive webhook receiver. On startup, it reads a central `config.yaml`, applies repository settings and labels, updates protected branch rules, resets stale in-progress checks, and creates or updates webhooks for every configured repository. After that, each incoming event is routed to the right handler for PRs, reviews, comments, checks, status updates, and tag pushes. > **Note:** The webhook endpoint returns `200 OK` as soon as the payload is validated, then processes the event in the background. That keeps GitHub deliveries from timing out while the server clones repositories, runs checks, builds containers, or performs cherry-picks. ## What This Server Is For This project is a good fit for: - Teams maintaining multiple GitHub repositories and wanting one place to define automation. - Platform, release, or DevOps engineers who want consistent labels, branch protection, and PR policy across repos. - Projects that use `OWNERS` files and want reviewer and approver rules enforced automatically. - Maintainers who want user-facing PR commands such as `/retest`, `/approve`, `/cherry-pick`, and `/build-and-push-container`. ## What It Automates ### Across repositories At the repository level, the server can: - Create or update GitHub webhooks for the events you configure per repository. 
- Apply repository defaults such as delete-on-merge and auto-merge support. - Create standard labels and colors, including review labels, merge-state labels, size labels, and cherry-pick labels. - Configure protected branches and required status checks from your central configuration. - Support optional release behavior such as package publishing, container builds, and Slack notifications. ### On pull requests For pull requests, the server acts like a shared workflow layer. It can: - Post a welcome comment when a PR opens or becomes ready for review. - Create a tracking issue for a new PR and close it automatically when the PR is closed or merged. - Assign reviewers from `OWNERS` files, including path-specific `OWNERS` files inside the repository. - Add labels for PR size, target branch, merge conflicts, rebase-needed state, verification, hold/WIP state, review status, auto-merge, and cherry-pick requests. - Queue and run built-in checks such as `tox`, `pre-commit`, `build-container`, `python-module-install`, and `conventional-title`. - Run user-defined `custom-check-runs`, with optional checks that do not have to block merges. - Calculate a `can-be-merged` check from approvals, status checks, blocker labels, mergeability, unresolved review conversations, and any extra required labels you configured. - Auto-merge the PR when the `automerge` label is present and the `can-be-merged` check succeeds. 
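The `can-be-merged` calculation described above combines several independent signals. As a rough, hedged sketch (hypothetical names and label strings, not the server's actual implementation), the decision might look like this:

```python
from dataclasses import dataclass, field


@dataclass
class PRState:
    """Snapshot of the signals the merge check weighs (illustrative only)."""

    approvals_ok: bool             # required OWNERS approvals collected
    required_checks_passed: bool   # protected-branch status checks are green
    labels: set[str] = field(default_factory=set)
    mergeable: bool = True         # no merge conflicts reported by GitHub
    unresolved_threads: int = 0    # unresolved review conversations


# Hypothetical blocker-label names; the real server derives these from config.
BLOCKER_LABELS = {"hold", "wip", "do-not-merge"}


def can_be_merged(state: PRState, extra_required_labels: set[str] = frozenset()) -> bool:
    """Recompute merge eligibility from current state; nothing is cached."""
    if state.labels & BLOCKER_LABELS:
        return False
    # Any extra required labels you configured must all be present.
    if not extra_required_labels <= state.labels:
        return False
    return (
        state.approvals_ok
        and state.required_checks_passed
        and state.mergeable
        and state.unresolved_threads == 0
    )
```

The important property, which the real handler shares, is that the result is derived from the PR's current state each time a relevant event arrives, rather than being carried forward from an earlier event.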
The core PR setup is explicit in the handler: ```779:857:webhook_server/libs/handlers/pull_request_handler.py async def process_opened_or_synchronize_pull_request(self, pull_request: PullRequest) -> None: # Stage 1: Initial setup and check queue tasks setup_tasks: list[Coroutine[Any, Any, Any]] = [] setup_tasks.append(self.owners_file_handler.assign_reviewers(pull_request=pull_request)) setup_tasks.append( self.labels_handler._add_label( pull_request=pull_request, label=f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}", ) ) setup_tasks.append(self.label_pull_request_by_merge_state(pull_request=pull_request)) setup_tasks.append(self.check_run_handler.set_check_queued(name=CAN_BE_MERGED_STR)) # ... queue tox / pre-commit / python-module-install / build-container / verified / size ... ci_tasks.append(self.runner_handler.run_tox(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_pre_commit(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_install_python_module(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_build_container(pull_request=pull_request)) ``` In this repository's end-to-end tests, a normal PR is expected to end up with successful `build-container`, `pre-commit`, `python-module-install`, and `tox` checks; a queued `verified`; a failing `can-be-merged` until approval and policy requirements are satisfied; and labels such as `size/M` and `branch-main`. ### From comments and reviews Contributors and maintainers can control automation directly from PR comments. 
In addition to label-driven commands such as `/wip`, `/hold`, `/verified`, `/lgtm`, `/approve`, and `/automerge`, the comment handler supports a set of built-in workflow commands: ```154:202:webhook_server/libs/handlers/issue_comment_handler.py available_commands: list[str] = [ COMMAND_RETEST_STR, COMMAND_REPROCESS_STR, COMMAND_CHERRY_PICK_STR, COMMAND_ASSIGN_REVIEWERS_STR, COMMAND_CHECK_CAN_MERGE_STR, BUILD_AND_PUSH_CONTAINER_STR, COMMAND_ASSIGN_REVIEWER_STR, COMMAND_ADD_ALLOWED_USER_STR, COMMAND_REGENERATE_WELCOME_STR, COMMAND_TEST_ORACLE_STR, ] # ... if _command not in available_commands + list(USER_LABELS_DICT.keys()): self.logger.debug(f"{self.log_prefix} Command {command} is not supported.") return ``` In practice, that means users can do things like: - `/assign-reviewers` or `/assign-reviewer @username` - `/retest tox`, `/retest pre-commit`, or `/retest all` - `/reprocess` to rebuild the whole PR workflow - `/check-can-merge` to force a mergeability recalculation - `/build-and-push-container` to publish a PR image on demand - `/cherry-pick ` to queue or perform backports - `/test-oracle` to request AI-generated test recommendations when configured - `/regenerate-welcome` to refresh the onboarding comment Reviews matter too. The server tracks review state with labels such as `approved-*`, `lgtm-*`, `changes-requested-*`, and `commented-*`, and it also understands `/approve` when it appears inside a review body. > **Note:** In this project, `/approve` and `/lgtm` are part of the merge logic, not just convenient comments. The server converts them into labels and uses those labels when deciding whether `can-be-merged` should pass. ### On tags, releases, and backports The automation is not limited to PRs. 
On tag pushes, the server can: - Build a Python distribution with `uv build` - Validate and upload it to PyPI with `twine` - Build and push release container images when `container.release: true` is set - Send Slack notifications for successful publish or push operations On merged PRs, it can also: - Detect `cherry-pick-` labels - Create cherry-pick branches and PRs automatically - Optionally use AI to resolve cherry-pick conflicts - Mark AI-resolved cherry-picks for manual verification instead of auto-verifying them ### Optional AI-assisted features The server also includes optional AI integrations: - `test-oracle` connects to an external service that analyzes a PR and recommends which tests to run. - `ai-features` can suggest or auto-fix PR titles to match your `conventional-title` rules. - The same `ai-features` block can enable AI-assisted cherry-pick conflict resolution. ## Configuration Model The configuration model is layered so you can set organization-wide defaults without losing per-repository flexibility. Settings are resolved in this order: ```132:153:webhook_server/libs/config.py def get_value(self, value: str, return_on_none: Any = None, extra_dict: dict[str, Any] | None = None) -> Any: """ Get value from config Supports dot notation for nested values (e.g., "docker.username", "pypi.token") Order of getting value: 1. Local repository file (.github-webhook-server.yaml) 2. Repository level global config file (config.yaml) 3. 
Root level global config file (config.yaml) """ if extra_dict: result = self._get_nested_value(value, extra_dict) if result is not None: return result for scope in (self.repository_data, self.root_data): result = self._get_nested_value(value, scope) if result is not None: return result ``` That gives you three useful layers: - Root-level defaults in the central `config.yaml` - Per-repository overrides inside the `repositories` map in that same file - Repository-local overrides in `.github-webhook-server.yaml` > **Tip:** Keep shared policy in the central `config.yaml`, then use `.github-webhook-server.yaml` only for repositories that truly need exceptions. A real example from `examples/config.yaml` shows the kind of repository-level behavior you can enable: ```139:183:examples/config.yaml repositories: my-repository: name: my-org/my-repository log-level: DEBUG # Override global log-level for repository log-file: my-repository.log # Override global log-file for repository slack-webhook-url: # Send notification to slack on several operations verified-job: true events: # To listen to all events do not send events - push - pull_request - pull_request_review - pull_request_review_thread - issue_comment - check_run - status tox: main: all # Run all tests in tox.ini when pull request parent branch is main dev: testenv1,testenv2 # Run testenv1 and testenv2 tests in tox.ini when pull request parent branch is dev pre-commit: true # Run pre-commit check protected-branches: dev: [] main: # set [] in order to set all defaults run included include-runs: - "pre-commit.ci - pr" - "WIP" exclude-runs: - "SonarCloud Code Analysis" container: username: password: repository: tag: release: true # Push image to registry on new release with release as the tag ``` At the top level, the example configuration also includes sections such as `labels`, `pr-size-thresholds`, `branch-protection`, `test-oracle`, and `ai-features`, so one server can apply different automation profiles to different 
repositories without duplicating everything. A few especially important settings to know early: - `webhook-ip` must be a full URL, including the `/webhook_server` path. - `webhook-secret` enables GitHub signature verification. - `allow-commands-on-draft-prs` controls whether slash commands are blocked or allowed on draft PRs. - `conventional-title` validates PR titles against a Conventional Commits-style pattern. - `set-auto-merge-prs` and `auto-verified-and-merged-users` control automatic merge behavior. - `custom-check-runs` lets you add your own shell commands as first-class check runs. ## OWNERS-Driven Reviews Reviewer and approver logic is path-aware. The server reads `OWNERS` files from the cloned repository, matches them against the files changed in the PR, and requests the right reviewers automatically. The root `OWNERS` file in this repository uses the expected YAML shape: ```1:6:OWNERS approvers: - myakove - rnetser reviewers: - myakove - rnetser ``` Subdirectories can have their own `OWNERS` files too. When a PR touches files under those paths, the server uses those path-specific approvers and reviewers. If a path-level `OWNERS` file sets `root-approvers: false`, root approvers are not automatically required for that area. ## Operational Notes The server also writes structured webhook logs and can expose an optional internal log viewer and log APIs for troubleshooting PR flow, status checks, and failures. > **Warning:** If you enable the optional log viewer, keep it on a trusted network. The project treats those endpoints as internal operational tooling, not a public-facing dashboard. Taken together, `github-webhook-server` is best understood as a shared automation layer for GitHub: contributors interact with simple PR comments and labels, while maintainers get consistent policy, repeatable release automation, and one place to operate everything. 
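Before moving on to the architecture internals, here is a concrete sketch of the path-specific `OWNERS` behavior described above. A subdirectory `OWNERS` file that supplies its own reviewers and opts out of root approvers might look like this (the directory and usernames are hypothetical):

```yaml
# docs/OWNERS (hypothetical path and users)
root-approvers: false   # do not require root approvers for files under docs/
approvers:
  - docs-maintainer
reviewers:
  - docs-maintainer
  - tech-writer
```

With a file like this in place, a PR that only touches files under `docs/` would be routed to these reviewers and approvers instead of requiring sign-off from the root `OWNERS` approvers.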
--- Source: architecture-and-event-flow.md # Architecture and Event Flow `github-webhook-server` is built around one simple idea: accept GitHub webhooks quickly, then do the real work asynchronously. As a user, that means GitHub gets a fast response, while pull request automation, release work, and logging continue in the background. ## At a Glance - The HTTP endpoint validates the request and returns `200 OK` immediately. - A background task creates a `GithubWebhook` object and routes the event to specialized handlers. - PR automation is split across focused components such as `PullRequestHandler`, `IssueCommentHandler`, `PullRequestReviewHandler`, `CheckRunHandler`, `OwnersFileHandler`, `LabelsHandler`, and `RunnerHandler`. - Each webhook gets one temporary base clone; checks and release actions run in isolated Git worktrees created from that clone. - Every webhook produces both normal text logs and structured JSON records, which can be searched later or viewed through the optional log viewer. ## Before the First Event Startup does more than launch the HTTP server. `entrypoint.py` first runs repository bootstrap logic, then starts Uvicorn with the configured worker count. ```23:45:webhook_server/utils/github_repository_and_webhook_settings.py async def repository_and_webhook_settings(webhook_secret: str | None = None) -> None: config = Config(logger=LOGGER) apis_dict: dict[str, dict[str, Any]] = {} ... await set_repositories_settings(config=config, apis_dict=apis_dict) set_all_in_progress_check_runs_to_queued(repo_config=config, apis_dict=apis_dict) create_webhook(config=config, apis_dict=apis_dict, secret=webhook_secret) ``` That startup pass does three important jobs: - It applies repository-side settings such as labels, branch protection, and related GitHub configuration. - It resets built-in check runs that were left in `in_progress` during a previous shutdown back to `queued`. 
- It creates or updates the GitHub webhook on each configured repository so GitHub actually sends the events listed in your config. `entrypoint.py` then starts the app with `workers=int(_max_workers)`, so worker-level parallelism is controlled by the root `max-workers` setting. > **Note:** The `events` list under each repository is operational, not just descriptive. Startup uses it to create or update the real GitHub webhook subscription. ## Webhook Intake Pipeline When GitHub calls `POST /webhook_server`, the server does only the minimum synchronous work required to prove the request is valid: read the body, verify the signature if configured, parse JSON, and check that the repository and event metadata are present. Once that passes, it returns `200 OK` and hands everything else to a background task. ```418:529:webhook_server/app.py # Return 200 immediately - all validation passed, we can process this webhook LOGGER.info(f"{log_context} Webhook validation passed, queuing for background processing") async def process_with_error_handling( _hook_data: dict[Any, Any], _headers: Headers, _delivery_id: str, _event_type: str ) -> None: # Create structured logging context at the VERY START repository_name = _hook_data.get("repository", {}).get("name", "unknown") repository_full_name = _hook_data.get("repository", {}).get("full_name", "unknown") ctx = create_context( hook_id=_delivery_id, event_type=_event_type, repository=repository_name, repository_full_name=repository_full_name, action=_hook_data.get("action"), sender=_hook_data.get("sender", {}).get("login"), ) ... try: _api: GithubWebhook = GithubWebhook(hook_data=_hook_data, headers=_headers, logger=_logger) try: await _api.process() finally: await _api.cleanup() ... 
finally: if ctx: ctx.completed_at = datetime.now(UTC) log_webhook_summary(ctx, _logger, _log_context) try: write_webhook_log(ctx) except Exception: _logger.exception(f"{_log_context} Failed to write webhook log") finally: clear_context() task = asyncio.create_task( process_with_error_handling( _hook_data=hook_data, _headers=request.headers, _delivery_id=delivery_id, _event_type=event_type, ) ) _background_tasks.add(task) task.add_done_callback(_background_tasks.discard) return JSONResponse( status_code=status.HTTP_200_OK, content={ "status": status.HTTP_200_OK, "message": "Webhook queued for processing", "delivery_id": delivery_id, "event_type": event_type, }, ) ``` In practice, the intake flow looks like this: 1. GitHub sends the event to `POST /webhook_server`. 2. The server optionally checks the source IP, verifies `x-hub-signature-256` when `webhook-secret` is set, parses the payload, and validates required fields. 3. The server returns a small JSON response containing `delivery_id` and `event_type`. 4. A background task creates the structured context, instantiates `GithubWebhook`, runs processing, performs cleanup, and always writes the final summary log. > **Note:** A `200 OK` means "accepted and queued", not "automation finished successfully". The `delivery_id` is the key you use to trace a specific webhook through the logs. For production deployments, the important security settings live near the top of the global config: `webhook-secret`, `verify-github-ips`, and `verify-cloudflare-ips`. ## Background Processing Model The background model is intentionally simple: - Uvicorn provides process-level concurrency. - Inside each worker, webhook processing is queued with `asyncio.create_task`. - Active tasks are tracked in memory and given up to 30 seconds to finish during shutdown before they are cancelled. - Local work such as Git, `tox`, `pre-commit`, `podman`, `gh`, and `twine` runs as subprocesses through `run_command()`. 
- PyGithub itself is synchronous, so the code regularly wraps blocking API calls and many property reads in `asyncio.to_thread()` to keep the event loop responsive. This project does not use Celery, Redis, or an external broker. The “queue” is the application process itself. > **Note:** Because the queue is in-process, recovery is operational rather than broker-based. If the server dies after GitHub already received `200 OK`, you recover with logs, GitHub redelivery, or the `/reprocess` command, not by checking a separate job system. The official container image is designed around that model. It includes the toolchain the server expects to run locally, including `pre-commit`, `tox`, `gh`, `podman`, `regctl`, and the supported AI CLIs. ## Handler Architecture `GithubWebhook.process()` is the router for the whole system. It resolves the event into either a tag flow or a pull-request-backed flow, enriches the structured context, and then dispatches to specialized handlers. At a high level, the routes are: - `pull_request`: initialize the PR, assign reviewers, queue and run checks, post the welcome message, create an issue if configured, and maintain merge-related labels. - `pull_request_review`: translate review state into labels and optionally treat `/approve` in a review body as an approval command. - `issue_comment`: parse slash commands such as `/retest`, `/assign-reviewers`, `/check-can-merge`, `/build-and-push-container`, `/cherry-pick`, `/reprocess`, and `/test-oracle`. - `check_run`: ignore non-terminal runs, react to completed checks, and optionally auto-merge when `can-be-merged` succeeds and the PR has `automerge`. - `status` and `pull_request_review_thread`: re-evaluate merge eligibility when a status reaches a terminal state or a review thread is resolved or unresolved. - `push`: handle tag releases; ordinary branch pushes are intentionally skipped. For a new or updated PR, the main handler is organized into two phases: setup first, then local CI/CD work. 
```779:864:webhook_server/libs/handlers/pull_request_handler.py async def process_opened_or_synchronize_pull_request(self, pull_request: PullRequest) -> None: if self.ctx: self.ctx.start_step("pr_workflow_setup") # Stage 1: Initial setup and check queue tasks setup_tasks: list[Coroutine[Any, Any, Any]] = [] setup_tasks.append(self.owners_file_handler.assign_reviewers(pull_request=pull_request)) setup_tasks.append( self.labels_handler._add_label( pull_request=pull_request, label=f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}", ) ) setup_tasks.append(self.label_pull_request_by_merge_state(pull_request=pull_request)) setup_tasks.append(self.check_run_handler.set_check_queued(name=CAN_BE_MERGED_STR)) ... self.logger.info(f"{self.log_prefix} Executing setup tasks") setup_results = await asyncio.gather(*setup_tasks, return_exceptions=True) ... if self.ctx: self.ctx.complete_step("pr_workflow_setup") # Stage 2: CI/CD execution tasks if self.ctx: self.ctx.start_step("pr_cicd_execution") ci_tasks: list[Coroutine[Any, Any, Any]] = [] ci_tasks.append(self.runner_handler.run_tox(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_pre_commit(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_install_python_module(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_build_container(pull_request=pull_request)) ... self.logger.info(f"{self.log_prefix} Executing CI/CD tasks") ci_results = await asyncio.gather(*ci_tasks, return_exceptions=True) ... if self.ctx: self.ctx.complete_step("pr_cicd_execution") ``` A few architectural choices are worth knowing: - PR automation is OWNERS-driven. `OwnersFileHandler` determines reviewers, approvers, and command permissions from repository files and the changed paths in the PR. - Merge eligibility is re-computed from current GitHub state rather than blindly trusting one earlier event. 
That is why `check_run`, `status`, and `pull_request_review_thread` all feed back into `check_if_can_be_merged()`. - Optional features such as custom check runs, conventional-title validation, AI suggestions, and test-oracle calls plug into the same handler flow rather than creating a separate architecture. On a typical new PR, the end-to-end suite expects the user-visible check state to look like this: - `build-container`, `pre-commit`, `python-module-install`, and `tox` complete successfully when those features are configured. - `verified` starts in `queued`. - `can-be-merged` is expected to fail until approval, labels, status checks, and conversation rules are satisfied. ## Repository Cloning and Worktrees The repository strategy is one of the most important architectural choices in this project. Instead of recloning the repository for every operation, each webhook gets one temporary base clone. That clone is reused for local file inspection, and separate Git worktrees are created on demand for isolated execution. The base clone is prepared once per webhook: ```262:393:webhook_server/libs/github_api.py async def _clone_repository( self, pull_request: PullRequest | None = None, checkout_ref: str | None = None, ) -> None: ... rc, _, err = await run_command( command=f"git clone {clone_url_with_token} {self.clone_repo_dir}", log_prefix=self.log_prefix, redact_secrets=[github_token], mask_sensitive=self.mask_sensitive, ) ... if pull_request: # Fetch the base branch first (needed for checkout) base_ref = await asyncio.to_thread(lambda: pull_request.base.ref) rc, _, err = await run_command( command=f"{git_cmd} fetch origin {base_ref}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) ... 
# Fetch only this specific PR's ref pr_number = await asyncio.to_thread(lambda: pull_request.number) rc, _, err = await run_command( command=f"{git_cmd} fetch origin +refs/pull/{pr_number}/head:refs/remotes/origin/pr/{pr_number}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) else: # For push events (tags only - branch pushes skip cloning) tag_name = checkout_ref.replace("refs/tags/", "") # type: ignore[union-attr] fetch_refspec = f"refs/tags/{tag_name}:refs/tags/{tag_name}" rc, _, _ = await run_command( command=f"{git_cmd} fetch origin {fetch_refspec}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) ... rc, _, err = await run_command( command=f"{git_cmd} checkout {checkout_target}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) self._repo_cloned = True self.logger.info(f"{self.log_prefix} Repository cloned to {self.clone_repo_dir} (ref: {checkout_target})") ``` That base clone is then used for repository-aware logic such as OWNERS parsing and changed-file detection. `OwnersFileHandler` even uses local `git diff` instead of the GitHub API for changed paths, which keeps rate-limit usage down. When the server needs an isolated execution checkout, it creates a worktree from the shared clone: ```71:164:webhook_server/libs/handlers/runner_handler.py @contextlib.asynccontextmanager async def _checkout_worktree( self, pull_request: PullRequest | None = None, is_merged: bool = False, checkout: str = "", tag_name: str = "", ) -> AsyncGenerator[tuple[bool, str, str, str]]: ... if checkout: checkout_target = checkout elif tag_name: checkout_target = tag_name elif is_merged and pull_request and base_ref is not None: checkout_target = base_ref elif pull_request and pr_number is not None: checkout_target = f"origin/pr/{pr_number}" ... rc, current_branch, _ = await run_command( command=f"git -C {repo_dir} rev-parse --abbrev-ref HEAD", log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) ... 
async with helpers_module.git_worktree_checkout( repo_dir=repo_dir, checkout=checkout_target, log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) as (success, worktree_path, out, err): result: tuple[bool, str, str, str] = (success, worktree_path, out, err) # Merge base branch if needed (for PR testing) if success and pull_request and not is_merged and not tag_name: git_cmd = f"git -C {worktree_path}" rc, out, err = await run_command( command=f"{git_cmd} merge origin/{merge_ref} -m 'Merge {merge_ref}'", log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) if not rc: result = (False, worktree_path, out, err) yield result ``` This design gives the server a few advantages: - The expensive `git clone` happens once per webhook, not once per check. - The base clone stays on a stable checkout that is good for reading `OWNERS` files and computing diffs. - Each execution path gets its own isolated workspace, which prevents one command from polluting another. - PR checks are run against a worktree that merges the current base branch into the PR checkout, so validation is closer to what GitHub would merge. - Tag-based release work can run against a tag worktree without disturbing PR-related state. Cloning is also deliberately avoided when it is not useful: - Branch pushes skip cloning entirely. - Tag pushes clone because release actions need a real checkout. - `check_run` events are ignored unless the action is `completed`. - A failed `can-be-merged` check run does not trigger another clone-and-recheck cycle. > **Tip:** This shared-clone-plus-worktree model is what lets the server run `tox`, `pre-commit`, Python packaging, container builds, `gh` commands, and AI-assisted flows locally without paying the cost of repeated full clones. ## Structured Logging Flow Every webhook carries a structured execution context from the moment background processing starts to the moment the final summary is written. 
The flow looks like this: 1. `create_context()` stores a `WebhookContext` in a `ContextVar`. 2. Handlers call `start_step()`, `complete_step()`, and `fail_step()` for major workflow stages such as `repo_clone`, `pr_workflow_setup`, `pr_cicd_execution`, `check_merge_eligibility`, and `push_handler`. 3. Normal log messages are still written, but `JsonLogHandler` also serializes them as JSON `log_entry` records and enriches them with webhook metadata from the current context. 4. At the end of processing, `write_webhook_log()` writes one `webhook_summary` record with timing, PR metadata, token usage, workflow steps, and overall success or failure. The summary writer stores those records as one JSON object per line in daily files: ```93:152:webhook_server/utils/structured_logger.py def write_log(self, context: WebhookContext) -> None: """Write webhook context as JSONL entry to date-based log file.""" completed_at = context.completed_at if context.completed_at else datetime.now(UTC) # Get context dict and update timing locally (without mutating context) context_dict = context.to_dict() context_dict["type"] = "webhook_summary" if "timing" in context_dict: context_dict["timing"]["completed_at"] = completed_at.isoformat() if context.started_at: duration_ms = int((completed_at - context.started_at).total_seconds() * 1000) context_dict["timing"]["duration_ms"] = duration_ms # Get log file path log_file = self._get_log_file_path(completed_at) # Serialize context to JSON (compact JSONL format - single line, no indentation) log_entry = json.dumps(context_dict, ensure_ascii=False) ... # Write JSON entry with single newline (JSONL format) os.write(temp_fd, f"{log_entry}\n".encode()) ... with open(log_file, "a") as log_fd: ... log_fd.write(data.decode("utf-8")) ``` For operators, the important outputs are: - Text logs for day-to-day reading. - `log_entry` JSON records for individual log messages. - `webhook_summary` JSON records for the complete end-to-end outcome of one delivery. 
- Daily files named `webhooks_YYYY-MM-DD.json` under `{data_dir}/logs`. If you enable `ENABLE_LOG_SERVER=true`, the application also exposes a log viewer and related APIs that read these same structured files for filtering, export, workflow-step drill-down, and live streaming. > **Warning:** Treat the log viewer as an internal operations surface. It is only mounted when `ENABLE_LOG_SERVER=true`, and it should be exposed only on a trusted network boundary. ## Configuration That Changes the Flow These root settings shape intake, logging, and bootstrap behavior: ```3:17:examples/config.yaml log-level: INFO # Set global log level, change take effect immediately without server restart log-file: webhook-server.log # Set global log file, change take effect immediately without server restart mcp-log-file: mcp_server.log # Set global MCP log file, change take effect immediately without server restart logs-server-log-file: logs_server.log # Set global Logs Server log file, change take effect immediately without server restart mask-sensitive-data: true # Mask sensitive data in logs (default: true). Set to false for debugging (NOT recommended in production) # Server configuration disable-ssl-warnings: true # Disable SSL warnings (useful in production to reduce log noise from SSL certificate issues) # ... 
webhook-ip: # Full URL with path (e.g., https://your-domain.com/webhook_server or https://smee.io/your-channel) ``` These repository settings determine which events are registered and what a PR or tag push actually does when it arrives: ```139:182:examples/config.yaml repositories: my-repository: name: my-org/my-repository log-level: DEBUG # Override global log-level for repository log-file: my-repository.log # Override global log-file for repository mask-sensitive-data: false # Override global setting - disable masking for debugging this specific repo (NOT recommended in production) slack-webhook-url: # Send notification to slack on several operations verified-job: true pypi: token: events: # To listen to all events do not send events - push - pull_request - pull_request_review - pull_request_review_thread - issue_comment - check_run - status tox: main: all # Run all tests in tox.ini when pull request parent branch is main dev: testenv1,testenv2 # Run testenv1 and testenv2 tests in tox.ini when pull request parent branch is dev pre-commit: true # Run pre-commit check protected-branches: dev: [] main: # set [] in order to set all defaults run included include-runs: - "pre-commit.ci - pr" - "WIP" exclude-runs: - "SonarCloud Code Analysis" container: username: password: repository: tag: release: true # Push image to registry on new release with release as the tag build-args: # build args to send to podman build command - my-build-arg1=1 - my-build-arg2=2 args: # args to send to podman build command - --format docker ``` A few configuration rules are especially important when you are reasoning about the event flow: - `repositories..events` controls what GitHub sends to the server after startup sync. - `tox`, `pre-commit`, `pypi`, `container`, `conventional-title`, and custom check-run settings decide which checks are queued and which local commands actually run. - `protected-branches` shapes the status-check list that `can-be-merged` evaluates against. 
---

# Architecture and Event Flow

`github-webhook-server` is built around one simple idea: accept GitHub webhooks quickly, then do the real work asynchronously. As a user, that means GitHub gets a fast response, while pull request automation, release work, and logging continue in the background.

## At a Glance

- The HTTP endpoint validates the request and returns `200 OK` immediately.
- A background task creates a `GithubWebhook` object and routes the event to specialized handlers.
- PR automation is split across focused components such as `PullRequestHandler`, `IssueCommentHandler`, `PullRequestReviewHandler`, `CheckRunHandler`, `OwnersFileHandler`, `LabelsHandler`, and `RunnerHandler`.
- Each webhook gets one temporary base clone; checks and release actions run in isolated Git worktrees created from that clone.
- Every webhook produces both normal text logs and structured JSON records, which can be searched later or viewed through the optional log viewer.

## Before the First Event

Startup does more than launch the HTTP server.
`entrypoint.py` first runs repository bootstrap logic, then starts Uvicorn with the configured worker count. ```23:45:webhook_server/utils/github_repository_and_webhook_settings.py async def repository_and_webhook_settings(webhook_secret: str | None = None) -> None: config = Config(logger=LOGGER) apis_dict: dict[str, dict[str, Any]] = {} ... await set_repositories_settings(config=config, apis_dict=apis_dict) set_all_in_progress_check_runs_to_queued(repo_config=config, apis_dict=apis_dict) create_webhook(config=config, apis_dict=apis_dict, secret=webhook_secret) ``` That startup pass does three important jobs: - It applies repository-side settings such as labels, branch protection, and related GitHub configuration. - It resets built-in check runs that were left in `in_progress` during a previous shutdown back to `queued`. - It creates or updates the GitHub webhook on each configured repository so GitHub actually sends the events listed in your config. `entrypoint.py` then starts the app with `workers=int(_max_workers)`, so worker-level parallelism is controlled by the root `max-workers` setting. > **Note:** The `events` list under each repository is operational, not just descriptive. Startup uses it to create or update the real GitHub webhook subscription. ## Webhook Intake Pipeline When GitHub calls `POST /webhook_server`, the server does only the minimum synchronous work required to prove the request is valid: read the body, verify the signature if configured, parse JSON, and check that the repository and event metadata are present. Once that passes, it returns `200 OK` and hands everything else to a background task. 
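The signature step in that synchronous validation can be sketched in a few lines. This is an illustrative helper, not the server's actual implementation; it follows GitHub's documented `x-hub-signature-256` scheme (`sha256=<hexdigest>` of the raw request body, keyed with the shared `webhook-secret`):

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Validate GitHub's x-hub-signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information during the comparison
    return hmac.compare_digest(expected, signature_header)
```

The comparison must run against the raw bytes before JSON parsing; any re-serialization of the payload would change the digest.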
```418:529:webhook_server/app.py # Return 200 immediately - all validation passed, we can process this webhook LOGGER.info(f"{log_context} Webhook validation passed, queuing for background processing") async def process_with_error_handling( _hook_data: dict[Any, Any], _headers: Headers, _delivery_id: str, _event_type: str ) -> None: # Create structured logging context at the VERY START repository_name = _hook_data.get("repository", {}).get("name", "unknown") repository_full_name = _hook_data.get("repository", {}).get("full_name", "unknown") ctx = create_context( hook_id=_delivery_id, event_type=_event_type, repository=repository_name, repository_full_name=repository_full_name, action=_hook_data.get("action"), sender=_hook_data.get("sender", {}).get("login"), ) ... try: _api: GithubWebhook = GithubWebhook(hook_data=_hook_data, headers=_headers, logger=_logger) try: await _api.process() finally: await _api.cleanup() ... finally: if ctx: ctx.completed_at = datetime.now(UTC) log_webhook_summary(ctx, _logger, _log_context) try: write_webhook_log(ctx) except Exception: _logger.exception(f"{_log_context} Failed to write webhook log") finally: clear_context() task = asyncio.create_task( process_with_error_handling( _hook_data=hook_data, _headers=request.headers, _delivery_id=delivery_id, _event_type=event_type, ) ) _background_tasks.add(task) task.add_done_callback(_background_tasks.discard) return JSONResponse( status_code=status.HTTP_200_OK, content={ "status": status.HTTP_200_OK, "message": "Webhook queued for processing", "delivery_id": delivery_id, "event_type": event_type, }, ) ``` In practice, the intake flow looks like this: 1. GitHub sends the event to `POST /webhook_server`. 2. The server optionally checks the source IP, verifies `x-hub-signature-256` when `webhook-secret` is set, parses the payload, and validates required fields. 3. The server returns a small JSON response containing `delivery_id` and `event_type`. 4. 
A background task creates the structured context, instantiates `GithubWebhook`, runs processing, performs cleanup, and always writes the final summary log. > **Note:** A `200 OK` means "accepted and queued", not "automation finished successfully". The `delivery_id` is the key you use to trace a specific webhook through the logs. For production deployments, the important security settings live near the top of the global config: `webhook-secret`, `verify-github-ips`, and `verify-cloudflare-ips`. ## Background Processing Model The background model is intentionally simple: - Uvicorn provides process-level concurrency. - Inside each worker, webhook processing is queued with `asyncio.create_task`. - Active tasks are tracked in memory and given up to 30 seconds to finish during shutdown before they are cancelled. - Local work such as Git, `tox`, `pre-commit`, `podman`, `gh`, and `twine` runs as subprocesses through `run_command()`. - PyGithub itself is synchronous, so the code regularly wraps blocking API calls and many property reads in `asyncio.to_thread()` to keep the event loop responsive. This project does not use Celery, Redis, or an external broker. The “queue” is the application process itself. > **Note:** Because the queue is in-process, recovery is operational rather than broker-based. If the server dies after GitHub already received `200 OK`, you recover with logs, GitHub redelivery, or the `/reprocess` command, not by checking a separate job system. The official container image is designed around that model. It includes the toolchain the server expects to run locally, including `pre-commit`, `tox`, `gh`, `podman`, `regctl`, and the supported AI CLIs. ## Handler Architecture `GithubWebhook.process()` is the router for the whole system. It resolves the event into either a tag flow or a pull-request-backed flow, enriches the structured context, and then dispatches to specialized handlers. 
At a high level, the routes are: - `pull_request`: initialize the PR, assign reviewers, queue and run checks, post the welcome message, create an issue if configured, and maintain merge-related labels. - `pull_request_review`: translate review state into labels and optionally treat `/approve` in a review body as an approval command. - `issue_comment`: parse slash commands such as `/retest`, `/assign-reviewers`, `/check-can-merge`, `/build-and-push-container`, `/cherry-pick`, `/reprocess`, and `/test-oracle`. - `check_run`: ignore non-terminal runs, react to completed checks, and optionally auto-merge when `can-be-merged` succeeds and the PR has `automerge`. - `status` and `pull_request_review_thread`: re-evaluate merge eligibility when a status reaches a terminal state or a review thread is resolved or unresolved. - `push`: handle tag releases; ordinary branch pushes are intentionally skipped. For a new or updated PR, the main handler is organized into two phases: setup first, then local CI/CD work. ```779:864:webhook_server/libs/handlers/pull_request_handler.py async def process_opened_or_synchronize_pull_request(self, pull_request: PullRequest) -> None: if self.ctx: self.ctx.start_step("pr_workflow_setup") # Stage 1: Initial setup and check queue tasks setup_tasks: list[Coroutine[Any, Any, Any]] = [] setup_tasks.append(self.owners_file_handler.assign_reviewers(pull_request=pull_request)) setup_tasks.append( self.labels_handler._add_label( pull_request=pull_request, label=f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}", ) ) setup_tasks.append(self.label_pull_request_by_merge_state(pull_request=pull_request)) setup_tasks.append(self.check_run_handler.set_check_queued(name=CAN_BE_MERGED_STR)) ... self.logger.info(f"{self.log_prefix} Executing setup tasks") setup_results = await asyncio.gather(*setup_tasks, return_exceptions=True) ... 
if self.ctx: self.ctx.complete_step("pr_workflow_setup") # Stage 2: CI/CD execution tasks if self.ctx: self.ctx.start_step("pr_cicd_execution") ci_tasks: list[Coroutine[Any, Any, Any]] = [] ci_tasks.append(self.runner_handler.run_tox(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_pre_commit(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_install_python_module(pull_request=pull_request)) ci_tasks.append(self.runner_handler.run_build_container(pull_request=pull_request)) ... self.logger.info(f"{self.log_prefix} Executing CI/CD tasks") ci_results = await asyncio.gather(*ci_tasks, return_exceptions=True) ... if self.ctx: self.ctx.complete_step("pr_cicd_execution") ``` A few architectural choices are worth knowing: - PR automation is OWNERS-driven. `OwnersFileHandler` determines reviewers, approvers, and command permissions from repository files and the changed paths in the PR. - Merge eligibility is re-computed from current GitHub state rather than blindly trusting one earlier event. That is why `check_run`, `status`, and `pull_request_review_thread` all feed back into `check_if_can_be_merged()`. - Optional features such as custom check runs, conventional-title validation, AI suggestions, and test-oracle calls plug into the same handler flow rather than creating a separate architecture. On a typical new PR, the end-to-end suite expects the user-visible check state to look like this: - `build-container`, `pre-commit`, `python-module-install`, and `tox` complete successfully when those features are configured. - `verified` starts in `queued`. - `can-be-merged` is expected to fail until approval, labels, status checks, and conversation rules are satisfied. ## Repository Cloning and Worktrees The repository strategy is one of the most important architectural choices in this project. Instead of recloning the repository for every operation, each webhook gets one temporary base clone. 
That clone is reused for local file inspection, and separate Git worktrees are created on demand for isolated execution. The base clone is prepared once per webhook: ```262:393:webhook_server/libs/github_api.py async def _clone_repository( self, pull_request: PullRequest | None = None, checkout_ref: str | None = None, ) -> None: ... rc, _, err = await run_command( command=f"git clone {clone_url_with_token} {self.clone_repo_dir}", log_prefix=self.log_prefix, redact_secrets=[github_token], mask_sensitive=self.mask_sensitive, ) ... if pull_request: # Fetch the base branch first (needed for checkout) base_ref = await asyncio.to_thread(lambda: pull_request.base.ref) rc, _, err = await run_command( command=f"{git_cmd} fetch origin {base_ref}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) ... # Fetch only this specific PR's ref pr_number = await asyncio.to_thread(lambda: pull_request.number) rc, _, err = await run_command( command=f"{git_cmd} fetch origin +refs/pull/{pr_number}/head:refs/remotes/origin/pr/{pr_number}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) else: # For push events (tags only - branch pushes skip cloning) tag_name = checkout_ref.replace("refs/tags/", "") # type: ignore[union-attr] fetch_refspec = f"refs/tags/{tag_name}:refs/tags/{tag_name}" rc, _, _ = await run_command( command=f"{git_cmd} fetch origin {fetch_refspec}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) ... rc, _, err = await run_command( command=f"{git_cmd} checkout {checkout_target}", log_prefix=self.log_prefix, mask_sensitive=self.mask_sensitive, ) self._repo_cloned = True self.logger.info(f"{self.log_prefix} Repository cloned to {self.clone_repo_dir} (ref: {checkout_target})") ``` That base clone is then used for repository-aware logic such as OWNERS parsing and changed-file detection. `OwnersFileHandler` even uses local `git diff` instead of the GitHub API for changed paths, which keeps rate-limit usage down. 
When the server needs an isolated execution checkout, it creates a worktree from the shared clone: ```71:164:webhook_server/libs/handlers/runner_handler.py @contextlib.asynccontextmanager async def _checkout_worktree( self, pull_request: PullRequest | None = None, is_merged: bool = False, checkout: str = "", tag_name: str = "", ) -> AsyncGenerator[tuple[bool, str, str, str]]: ... if checkout: checkout_target = checkout elif tag_name: checkout_target = tag_name elif is_merged and pull_request and base_ref is not None: checkout_target = base_ref elif pull_request and pr_number is not None: checkout_target = f"origin/pr/{pr_number}" ... rc, current_branch, _ = await run_command( command=f"git -C {repo_dir} rev-parse --abbrev-ref HEAD", log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) ... async with helpers_module.git_worktree_checkout( repo_dir=repo_dir, checkout=checkout_target, log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) as (success, worktree_path, out, err): result: tuple[bool, str, str, str] = (success, worktree_path, out, err) # Merge base branch if needed (for PR testing) if success and pull_request and not is_merged and not tag_name: git_cmd = f"git -C {worktree_path}" rc, out, err = await run_command( command=f"{git_cmd} merge origin/{merge_ref} -m 'Merge {merge_ref}'", log_prefix=self.log_prefix, mask_sensitive=self.github_webhook.mask_sensitive, ) if not rc: result = (False, worktree_path, out, err) yield result ``` This design gives the server a few advantages: - The expensive `git clone` happens once per webhook, not once per check. - The base clone stays on a stable checkout that is good for reading `OWNERS` files and computing diffs. - Each execution path gets its own isolated workspace, which prevents one command from polluting another. - PR checks are run against a worktree that merges the current base branch into the PR checkout, so validation is closer to what GitHub would merge. 
- Tag-based release work can run against a tag worktree without disturbing PR-related state. Cloning is also deliberately avoided when it is not useful: - Branch pushes skip cloning entirely. - Tag pushes clone because release actions need a real checkout. - `check_run` events are ignored unless the action is `completed`. - A failed `can-be-merged` check run does not trigger another clone-and-recheck cycle. > **Tip:** This shared-clone-plus-worktree model is what lets the server run `tox`, `pre-commit`, Python packaging, container builds, `gh` commands, and AI-assisted flows locally without paying the cost of repeated full clones. ## Structured Logging Flow Every webhook carries a structured execution context from the moment background processing starts to the moment the final summary is written. The flow looks like this: 1. `create_context()` stores a `WebhookContext` in a `ContextVar`. 2. Handlers call `start_step()`, `complete_step()`, and `fail_step()` for major workflow stages such as `repo_clone`, `pr_workflow_setup`, `pr_cicd_execution`, `check_merge_eligibility`, and `push_handler`. 3. Normal log messages are still written, but `JsonLogHandler` also serializes them as JSON `log_entry` records and enriches them with webhook metadata from the current context. 4. At the end of processing, `write_webhook_log()` writes one `webhook_summary` record with timing, PR metadata, token usage, workflow steps, and overall success or failure. 
The summary writer stores those records as one JSON object per line in daily files: ```93:152:webhook_server/utils/structured_logger.py def write_log(self, context: WebhookContext) -> None: """Write webhook context as JSONL entry to date-based log file.""" completed_at = context.completed_at if context.completed_at else datetime.now(UTC) # Get context dict and update timing locally (without mutating context) context_dict = context.to_dict() context_dict["type"] = "webhook_summary" if "timing" in context_dict: context_dict["timing"]["completed_at"] = completed_at.isoformat() if context.started_at: duration_ms = int((completed_at - context.started_at).total_seconds() * 1000) context_dict["timing"]["duration_ms"] = duration_ms # Get log file path log_file = self._get_log_file_path(completed_at) # Serialize context to JSON (compact JSONL format - single line, no indentation) log_entry = json.dumps(context_dict, ensure_ascii=False) ... # Write JSON entry with single newline (JSONL format) os.write(temp_fd, f"{log_entry}\n".encode()) ... with open(log_file, "a") as log_fd: ... log_fd.write(data.decode("utf-8")) ``` For operators, the important outputs are: - Text logs for day-to-day reading. - `log_entry` JSON records for individual log messages. - `webhook_summary` JSON records for the complete end-to-end outcome of one delivery. - Daily files named `webhooks_YYYY-MM-DD.json` under `{data_dir}/logs`. If you enable `ENABLE_LOG_SERVER=true`, the application also exposes a log viewer and related APIs that read these same structured files for filtering, export, workflow-step drill-down, and live streaming. > **Warning:** Treat the log viewer as an internal operations surface. It is only mounted when `ENABLE_LOG_SERVER=true`, and it should be exposed only on a trusted network boundary. 
## Configuration That Changes the Flow

These root settings shape intake, logging, and bootstrap behavior:

```3:17:examples/config.yaml
log-level: INFO  # Set global log level; changes take effect immediately without a server restart
log-file: webhook-server.log  # Set global log file; changes take effect immediately without a server restart
mcp-log-file: mcp_server.log  # Set global MCP log file; changes take effect immediately without a server restart
logs-server-log-file: logs_server.log  # Set global Logs Server log file; changes take effect immediately without a server restart
mask-sensitive-data: true  # Mask sensitive data in logs (default: true). Set to false for debugging (NOT recommended in production)

# Server configuration
disable-ssl-warnings: true  # Disable SSL warnings (useful in production to reduce log noise from SSL certificate issues)
# ...
webhook-ip:  # Full URL with path (e.g., https://your-domain.com/webhook_server or https://smee.io/your-channel)
```

These repository settings determine which events are registered and what a PR or tag push actually does when it arrives:

```139:182:examples/config.yaml
repositories:
  my-repository:
    name: my-org/my-repository
    log-level: DEBUG  # Override global log-level for repository
    log-file: my-repository.log  # Override global log-file for repository
    mask-sensitive-data: false  # Override global setting - disable masking for debugging this specific repo (NOT recommended in production)
    slack-webhook-url:  # Send notification to slack on several operations
    verified-job: true
    pypi:
      token:
    events:  # To listen to all events, do not set `events`
      - push
      - pull_request
      - pull_request_review
      - pull_request_review_thread
      - issue_comment
      - check_run
      - status
    tox:
      main: all  # Run all tests in tox.ini when pull request parent branch is main
      dev: testenv1,testenv2  # Run testenv1 and testenv2 tests in tox.ini when pull request parent branch is dev
    pre-commit: true  # Run pre-commit check
    protected-branches:
      dev: []
      main:  # set [] in order to set all defaults
        include-runs:
          - "pre-commit.ci - pr"
          - "WIP"
        exclude-runs:
          - "SonarCloud Code Analysis"
    container:
      username:
      password:
      repository:
      tag:
      release: true  # Push image to registry on new release with release as the tag
      build-args:  # build args to send to podman build command
        - my-build-arg1=1
        - my-build-arg2=2
      args:  # args to send to podman build command
        - --format docker
```

A few configuration rules are especially important when you are reasoning about the event flow:

- `repositories.<repository>.events` controls what GitHub sends to the server after startup sync.
- `tox`, `pre-commit`, `pypi`, `container`, `conventional-title`, and custom check-run settings decide which checks are queued and which local commands actually run.
- `protected-branches` shapes the status-check list that `can-be-merged` evaluates against.
- `mask-sensitive-data` controls whether secrets are scrubbed from text logs.
- `slack-webhook-url`, `test-oracle`, and AI features add side effects around the main PR pipeline, but they still fit into the same handler model.

> **Note:** Repository-local `.github-webhook-server.yaml` overrides matching values from the global `config.yaml`. That lets one server instance manage repositories with different PR rules, labels, checks, and release behavior without changing the intake architecture.

Put together, the architecture is straightforward: validate fast, process in the background, route by event type, work from one shared clone, isolate side effects in worktrees, and leave a structured trail behind for every delivery. That is what makes `github-webhook-server` feel responsive to GitHub while still doing substantial repository automation under the hood.

---

Source: installation.md

# Installation

`github-webhook-server` is configured around a small server data directory plus GitHub credentials.
A working install needs a Python `3.13.x` interpreter, `uv`, `git`, a reachable webhook URL, and GitHub credentials that can manage the repositories you configure. ## Runtime requirements The project pins Python exactly: ```45:45:pyproject.toml requires-python = "==3.13.*" ``` Install these tools for a normal source install: - `uv` - `git` Install these only if you use the matching features: - `podman` for `docker:` login and repository `container:` build/push automation - `gh` for automated cherry-pick PR creation - `claude`, `gemini`, or `cursor` CLI if you enable `ai-features` or `test-oracle` - Node.js and `npm` if you want to install the Gemini CLI locally > **Note:** The built-in tox, pre-commit, and twine flows are launched through `uv` and `uvx`, so you do not need to install those tools globally. ## Python and `uv` setup Once Python `3.13.x` and `uv` are available, install the project from the repository root: ```bash uv sync ``` Start the server with a data directory of your choice: ```bash WEBHOOK_SERVER_DATA_DIR=/path/to/data uv run entrypoint.py ``` The bind address, port, worker count, and webhook secret are read from `config.yaml`: ```13:16:entrypoint.py _ip_bind = _root_config.get("ip-bind", "0.0.0.0") _port = _root_config.get("port", 5000) _max_workers = _root_config.get("max-workers", 10) _webhook_secret = _root_config.get("webhook-secret") ``` > **Tip:** Put listener settings such as `ip-bind`, `port`, `max-workers`, and `webhook-secret` in `config.yaml`. The important environment variable for startup is `WEBHOOK_SERVER_DATA_DIR`. ## Prepare the data directory and config The server always looks for `config.yaml` inside the data directory. 
If `WEBHOOK_SERVER_DATA_DIR` is not set, it defaults to `/home/podman/data`: ```20:33:webhook_server/libs/config.py self.data_dir: str = os.environ.get("WEBHOOK_SERVER_DATA_DIR", "/home/podman/data") self.config_path: str = os.path.join(self.data_dir, "config.yaml") self.repository = repository self.exists() self.repositories_exists() ... if not os.path.isfile(self.config_path): raise FileNotFoundError(f"Config file {self.config_path} not found") ... if not self.root_data.get("repositories"): raise ValueError(f"Config {self.config_path} does not have `repositories`") ``` The GitHub App private key is also expected in the same directory, with this exact filename: ```413:418:webhook_server/utils/github_repository_settings.py with open(os.path.join(config_.data_dir, "webhook-server.private-key.pem")) as fd: private_key = fd.read() github_app_id: int = config_.root_data["github-app-id"] auth: AppAuth = Auth.AppAuth(app_id=github_app_id, private_key=private_key) ``` Create a directory like this before first start: ```text /path/to/data/ config.yaml webhook-server.private-key.pem logs/ ``` You only need to create `config.yaml` and `webhook-server.private-key.pem` yourself. The server creates the log directory and structured log files automatically: ```74:91:webhook_server/utils/structured_logger.py self.log_dir = Path(self.config.data_dir) / "logs" # Create log directory if it doesn't exist self.log_dir.mkdir(parents=True, exist_ok=True) ... 
date_str = date.strftime("%Y-%m-%d") return self.log_dir / f"webhooks_{date_str}.json" ``` Relative log filenames are stored under `/logs`: ```141:147:webhook_server/utils/helpers.py if log_file_name and not log_file_name.startswith("/"): log_file_path = os.path.join(config.data_dir, "logs") if not os.path.isdir(log_file_path): os.makedirs(log_file_path, exist_ok=True) return os.path.join(log_file_path, log_file_name) ``` Typical generated contents are: - `logs/webhook-server.log` - `logs/webhooks_YYYY-MM-DD.json` - `logs/mcp_server.log` if MCP is enabled - `logs/logs_server.log` if the log viewer is enabled - `log-colors.json` in the data directory root when repository colors are first assigned If you run the container image, mount your host data directory to `/home/podman/data`: ```5:6:examples/docker-compose.yaml volumes: - "./webhook_server_data_dir:/home/podman/data:Z" # Should include config.yaml and webhook-server.private-key.pem ``` ### GitHub credentials A working install needs both of these: - `github-app-id` in `config.yaml`, plus the matching private key in `webhook-server.private-key.pem` - one or more GitHub tokens in `github-tokens` From the shipped example config: ```12:17:examples/config.yaml github-app-id: 123456 # GitHub app id github-tokens: - - webhook-ip: # Full URL with path (e.g., https://your-domain.com/webhook_server or https://smee.io/your-channel) ``` Replace those placeholder values with your real credentials. 
The `repositories` section uses the short repository name as the map key, and the full `owner/repo` string inside `name`: ```139:142:examples/config.yaml repositories: my-repository: name: my-org/my-repository log-level: DEBUG # Override global log-level for repository ``` That means: - the map key (`my-repository`) should match GitHub’s `repository.name` - the `name` field must be the full `owner/repo` - at least one repository entry is required The server builds a client for every configured token and selects the one with the highest remaining rate limit: ```455:518:webhook_server/utils/helpers.py apis_and_tokens: list[tuple[github.Github, str]] = [] tokens = config.get_value(value="github-tokens") or [] for _token in tokens: apis_and_tokens.append((github.Github(auth=github.Auth.Token(_token)), _token)) # ... choose the token with the highest remaining rate limit ... if not _api_user or not api or not token: raise NoApiTokenError("Failed to get API with highest rate limit") ``` > **Warning:** A GitHub token alone is not enough. The server also reads `github-app-id` and `webhook-server.private-key.pem`, then requests the repository installation from GitHub. Make sure the GitHub App is installed on every repository listed in `config.yaml`. > **Note:** `webhook-secret` is optional in code, but strongly recommended in any real deployment. If you set it, the server verifies GitHub’s webhook signature before queueing work. > **Tip:** `webhook-ip` must be the full external URL GitHub can reach, including the `/webhook_server` path. For local testing, the example config explicitly allows a relay URL such as `https://smee.io/your-channel`. Startup is active, not passive. 
Before serving requests, the application syncs repository settings and creates or updates webhooks for every configured repository: ```43:45:webhook_server/utils/github_repository_and_webhook_settings.py await set_repositories_settings(config=config, apis_dict=apis_dict) set_all_in_progress_check_runs_to_queued(repo_config=config, apis_dict=apis_dict) create_webhook(config=config, apis_dict=apis_dict, secret=webhook_secret) ``` > **Warning:** Use credentials with enough permission to manage repository settings, branch protection, labels, hooks, and pull-request workflows. Read-only credentials are not enough for this server. ## Start and verify Before first start, validate the config file: ```bash uv run webhook_server/tests/test_schema_validator.py /path/to/data/config.yaml ``` Then start the server: ```bash WEBHOOK_SERVER_DATA_DIR=/path/to/data uv run entrypoint.py ``` Verify that the health endpoint responds: ```bash curl http://127.0.0.1:5000/webhook_server/healthcheck ``` A healthy server responds on `/webhook_server/healthcheck`, and if your credentials and `webhook-ip` are correct, startup will also sync repository settings and webhook configuration. > **Warning:** If you enable `ENABLE_LOG_SERVER=true`, treat `/logs` as a trusted-network-only interface. It is intended for internal use, not public internet exposure. --- Source: quick-start.md # Quick Start This guide gets `github-webhook-server` running with one repository. You will create a data directory, add a minimal `config.yaml`, place the GitHub App private key where the server expects it, start the app, and verify that it is alive. ## Before You Start You need: - Python `3.13` - `uv` - A GitHub App ID - The matching GitHub App private key in PEM format - At least one GitHub token the server can use for API calls - A repository where that GitHub App is installed > **Warning:** The server uses both `github-tokens` and GitHub App auth. 
The token pool is used for regular GitHub API calls, and `github-app-id` plus `webhook-server.private-key.pem` are used to authenticate as the app installation. ## 1. Create a Data Directory The server loads `config.yaml` from `WEBHOOK_SERVER_DATA_DIR`. If you do not set that variable, it defaults to `/home/podman/data`. ```bash export WEBHOOK_SERVER_DATA_DIR=/path/to/data mkdir -p "$WEBHOOK_SERVER_DATA_DIR" ``` Your directory should look like this: ```text /path/to/data/ ├── config.yaml └── webhook-server.private-key.pem ``` ## 2. Create a Minimal `config.yaml` A minimal working config needs: - `github-app-id` - `github-tokens` - `webhook-ip` - At least one repository under `repositories` ```yaml # yaml-language-server: $schema=https://raw.githubusercontent.com/myk-org/github-webhook-server/refs/heads/main/webhook_server/config/schema.yaml github-app-id: 123456 github-tokens: - token1 webhook-ip: https://your-domain.com/webhook_server repositories: test-repo: name: org/test-repo ``` Replace `123456`, `token1`, `https://your-domain.com/webhook_server`, and `org/test-repo` with your real values. What each part means: - `github-app-id` is your GitHub App ID. - `github-tokens` is the token pool the server will choose from at startup. - `webhook-ip` is the public URL GitHub should call. - `repositories` is the list of repositories the server should manage. - `test-repo` is the short repository name. - `name` is the full `owner/repo` name. > **Warning:** The key under `repositories` should be the short repository name, such as `test-repo`, not the full `owner/repo`. The full name belongs in the nested `name` field. > **Warning:** `webhook-ip` should be the full webhook URL. In a normal deployment that means including `/webhook_server`, for example `https://your-domain.com/webhook_server`. > **Warning:** `localhost` is fine for the health check, but GitHub cannot deliver webhooks to `localhost`. Use a real public URL or a `smee.io` channel URL for webhook delivery. 
> **Note:** If you omit `events`, the server creates the webhook with `*`, which subscribes it to all events. > **Note:** You can list more than one token in `github-tokens`. The server checks them and selects the one with the highest remaining rate limit. If you want GitHub to sign webhook deliveries, add a shared secret: ```yaml webhook-secret: test-webhook-secret ``` > **Tip:** You do not need a repo-local `.github-webhook-server.yaml` file for a minimal setup. The global `config.yaml` is enough to get started. ## 3. Add the GitHub App Private Key Save the GitHub App private key as: `$WEBHOOK_SERVER_DATA_DIR/webhook-server.private-key.pem` The filename matters. The server loads that exact file from the data directory when it creates the GitHub App installation client. > **Warning:** The private key is not a replacement for `github-tokens`. You need both. > **Warning:** The matching GitHub App must be installed on every repository you add, or the server will not be able to fetch the repository installation. ## 4. Install Dependencies and Start the Server Install the project dependencies: ```bash uv sync ``` Start the server: ```bash WEBHOOK_SERVER_DATA_DIR=/path/to/data uv run entrypoint.py ``` By default, the server starts on `0.0.0.0:5000` with `10` workers. You can override that in `config.yaml` with: - `ip-bind` - `port` - `max-workers` > **Note:** On startup, the server applies repository settings, resets in-progress check runs to queued, and creates or updates GitHub webhooks for every repository in `config.yaml`. > **Tip:** Validate the file before starting the server with `uv run webhook_server/tests/test_schema_validator.py "$WEBHOOK_SERVER_DATA_DIR/config.yaml"`. ## 5. Verify the Health Endpoint Once the server is running, check the health endpoint: ```bash curl http://127.0.0.1:5000/webhook_server/healthcheck ``` You should get: ```json {"status":200,"message":"Alive"} ``` If you changed `port` in `config.yaml`, use that port instead of `5000`. 
This is the same endpoint the container health check uses. > **Note:** A healthy response means the web server is up. It does not confirm that GitHub can reach your public `webhook-ip` yet. At this point, the process is running and listening for webhook traffic on `/webhook_server`. If GitHub can reach the URL you set in `webhook-ip`, the server is ready to receive events. --- Source: docker-deployment.md # Docker and Container Deployment `github-webhook-server` ships with a container image that is built around Podman-in-container. That matters for deployment: this is not a thin FastAPI-only image. It is designed to run the webhook server itself and, when repository configuration enables it, run nested Podman commands for repository automation such as building and pushing images. ## The container image The top-level `Dockerfile` makes the intent clear: ```dockerfile FROM quay.io/podman/stable:v5 EXPOSE 5000 ENV USERNAME="podman" ENV HOME_DIR="/home/$USERNAME" ENV BIN_DIR="$HOME_DIR/.local/bin" ENV PATH="$PATH:$BIN_DIR:$HOME_DIR/.npm-global/bin" \ DATA_DIR="$HOME_DIR/data" \ APP_DIR="$HOME_DIR/github-webhook-server" ``` ```dockerfile USER $USERNAME WORKDIR $HOME_DIR ENV UV_PYTHON=python3.13 \ UV_COMPILE_BYTECODE=1 \ UV_NO_SYNC=1 \ UV_CACHE_DIR=${APP_DIR}/.cache \ PYTHONUNBUFFERED=1 HEALTHCHECK CMD curl --fail http://127.0.0.1:5000/webhook_server/healthcheck || exit 1 ENTRYPOINT ["tini", "--", "uv", "run", "entrypoint.py"] ``` The same `Dockerfile` also installs Podman tooling, `git`, `gh`, Node/NPM, `uv`, `tini`, and several other CLIs. In other words, the image is intentionally heavier than a typical Python web image because it needs to do more than serve HTTP. A few practical consequences: - The server listens on port `5000`. - It runs as the `podman` user inside the container. - It uses `tini`, which helps with signal handling and process cleanup. - The built-in health check calls `http://127.0.0.1:5000/webhook_server/healthcheck`. 
## Persistent data and volume mounts By default, the application reads its persistent state from `/home/podman/data`. That comes directly from the runtime configuration code: ```python self.data_dir: str = os.environ.get("WEBHOOK_SERVER_DATA_DIR", "/home/podman/data") self.config_path: str = os.path.join(self.data_dir, "config.yaml") ``` The GitHub App private key is also read from that same directory: ```python with open(os.path.join(config_.data_dir, "webhook-server.private-key.pem")) as fd: private_key = fd.read() ``` That means your persistent data mount needs to contain at least: - `config.yaml` - `webhook-server.private-key.pem` - `logs/` (created automatically if it does not exist) A good mental model is: | Container path | Purpose | Persist it? | | --- | --- | --- | | `/home/podman/data` | Main app data: config, GitHub App key, text logs, structured webhook logs | Yes | | `/tmp/storage-run-1000` | Nested Podman runtime/storage used by in-container Podman operations | Use a dedicated disposable mount | The structured webhook logs are written under `logs/` as daily files such as `webhooks_2026-03-18.json`. Text logs also live under `logs/`, using names from `config.yaml` such as `webhook-server.log`, `mcp_server.log`, and `logs_server.log`. > **Tip:** If you keep the default in-container path `/home/podman/data`, you do not need to set `WEBHOOK_SERVER_DATA_DIR`. Only set that environment variable if you intentionally mount the data directory somewhere else inside the container. > **Tip:** Keep the `:Z` suffix on the persistent bind mount on SELinux-enabled hosts. The checked-in example uses it so the container can read `config.yaml`, the private key, and log files correctly. 
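The path resolution quoted above, together with the daily structured-log naming, can be sketched briefly. `expected_paths` is a hypothetical helper written for illustration, not part of the server:

```python
import datetime
import os

# Sketch of how the persistent paths described above resolve.
# `expected_paths` is a hypothetical helper, not server code.
def expected_paths(env: dict[str, str]) -> dict[str, str]:
    # Same default as the runtime configuration code quoted above.
    data_dir = env.get("WEBHOOK_SERVER_DATA_DIR", "/home/podman/data")
    today = datetime.date.today().isoformat()  # e.g. "2026-03-18"
    return {
        "config": os.path.join(data_dir, "config.yaml"),
        "private_key": os.path.join(data_dir, "webhook-server.private-key.pem"),
        # Structured webhook logs are written as daily JSON files under logs/.
        "structured_log": os.path.join(data_dir, "logs", f"webhooks_{today}.json"),
    }

paths = expected_paths({})  # no override -> container default
print(paths["config"])  # /home/podman/data/config.yaml
```

If you mount the host directory at the default `/home/podman/data`, the empty-environment case above is exactly what the container uses.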
## The example Compose deployment The repository includes this example in `examples/docker-compose.yaml`: ```yaml services: github-webhook-server: container_name: github-webhook-server build: ghcr.io/myk-org/github-webhook-server:latest volumes: - "./webhook_server_data_dir:/home/podman/data:Z" # Should include config.yaml and webhook-server.private-key.pem # Mount temporary directories to prevent boot ID mismatch issues - "/tmp/podman-storage-${USER:-1000}:/tmp/storage-run-1000" environment: - PUID=1000 - PGID=1000 - TZ=Asia/Jerusalem - MAX_WORKERS=50 # Defaults to 10 if not set - WEBHOOK_SERVER_IP_BIND=0.0.0.0 # IP to listen - WEBHOOK_SERVER_PORT=5000 # Port to listen - WEBHOOK_SECRET= # If set verify hook is a valid hook from Github - VERIFY_GITHUB_IPS=1 # Verify hook request is from GitHub IPs - VERIFY_CLOUDFLARE_IPS=1 # Verify hook request is from Cloudflare IPs - ENABLE_LOG_SERVER=true # Enable log viewer endpoints (default: false) - ENABLE_MCP_SERVER=false # Enable MCP server for AI agent integration (default: false) ports: - "5000:5000" privileged: true restart: unless-stopped ``` What this example does: - It mounts a persistent host directory into `/home/podman/data`. - It mounts a second host directory into `/tmp/storage-run-1000` for nested Podman runtime state. - It publishes container port `5000`. - It runs the container in `privileged` mode. - It uses `restart: unless-stopped` for long-running deployments. > **Note:** The checked-in example puts the registry reference `ghcr.io/myk-org/github-webhook-server:latest` under the `build:` key. In standard Docker Compose semantics, a registry reference belongs under `image:`; use `build:` only when pointing at a local build context such as `.`. The important deployment details in the example are the volume mounts, port mapping, and `privileged: true`.
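Given the note about `build:` versus `image:`, a minimal corrected fragment would pull the published image instead. This is an illustrative adaptation of the checked-in example, not a file from the repository:

```yaml
services:
  github-webhook-server:
    # Registry reference goes under image:, not build:
    image: ghcr.io/myk-org/github-webhook-server:latest
    volumes:
      - "./webhook_server_data_dir:/home/podman/data:Z"
      - "/tmp/podman-storage-${USER:-1000}:/tmp/storage-run-1000"
    ports:
      - "5000:5000"
    privileged: true
    restart: unless-stopped
```

Start it with `docker compose up -d` (or `podman-compose up -d`) from the directory containing the file, after placing `config.yaml` and `webhook-server.private-key.pem` in `./webhook_server_data_dir`.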
## Health checks The application exposes a dedicated health endpoint: ```python @FASTAPI_APP.get(f"{APP_URL_ROOT_PATH}/healthcheck", operation_id="healthcheck") def healthcheck() -> dict[str, Any]: return {"status": requests.codes.ok, "message": "Alive"} ``` The image wires that into the container health check: ```dockerfile HEALTHCHECK CMD curl --fail http://127.0.0.1:5000/webhook_server/healthcheck || exit 1 ``` A healthy container means the web process is up and answering on port `5000`. It does not mean every webhook has been processed successfully. > **Note:** Webhook delivery handling is asynchronous. The main webhook endpoint returns `200 OK` after validation and queueing, so successful HTTP responses do not automatically mean that all downstream GitHub operations succeeded. For real troubleshooting, check the logs in the mounted `logs/` directory. ## What belongs in `config.yaml` Most deployment settings are read from the mounted `config.yaml`, not from environment variables. The checked-in example config shows the expected style: ```yaml log-level: INFO # Set global log level, change take effect immediately without server restart log-file: webhook-server.log # Set global log file, change take effect immediately without server restart mcp-log-file: mcp_server.log # Set global MCP log file, change take effect immediately without server restart logs-server-log-file: logs_server.log # Set global Logs Server log file, change take effect immediately without server restart mask-sensitive-data: true github-app-id: 123456 webhook-ip: # Full URL with path ``` If you use the server's container-build automation, the per-repository container settings also live in `config.yaml`: ```yaml repositories: my-repository: name: my-org/my-repository container: username: password: repository: tag: release: true build-args: - my-build-arg1=1 - my-build-arg2=2 args: - --format docker ``` For containerized deployments, put these runtime settings in `config.yaml`: - `webhook-ip` - 
`ip-bind` - `port` - `max-workers` - `webhook-secret` - `verify-github-ips` - `verify-cloudflare-ips` > **Warning:** The checked-in Compose example shows `MAX_WORKERS`, `WEBHOOK_SERVER_IP_BIND`, `WEBHOOK_SERVER_PORT`, `WEBHOOK_SECRET`, `VERIFY_GITHUB_IPS`, and `VERIFY_CLOUDFLARE_IPS` as environment variables, but the application code reads those values from `config.yaml` keys (`max-workers`, `ip-bind`, `port`, `webhook-secret`, `verify-github-ips`, and `verify-cloudflare-ips`). The environment variables consumed directly at runtime are `WEBHOOK_SERVER_DATA_DIR`, `ENABLE_LOG_SERVER`, and `ENABLE_MCP_SERVER`. The Podman cleanup script also reads `PUID`. `PGID` appears in the example, but the application code does not read it. > **Note:** `ENABLE_LOG_SERVER` and `ENABLE_MCP_SERVER` are enabled only when they are set to the literal string `true`. > **Note:** `webhook-ip` must be the external URL GitHub should call, and it must include the `/webhook_server` path. If you change `webhook-ip` or `webhook-secret`, restart the container so the startup webhook reconciliation can update GitHub with the new values. ## Startup behavior and operational caveats Container startup does more than launch Uvicorn. The entrypoint runs Podman cleanup and repository/webhook setup first: ```python if __name__ == "__main__": # Run Podman cleanup before starting the application run_podman_cleanup() result = asyncio.run(repository_and_webhook_settings(webhook_secret=_webhook_secret)) uvicorn.run( "webhook_server.app:FASTAPI_APP", host=_ip_bind, port=int(_port), workers=int(_max_workers), reload=False, ) ``` That leads to a few operational caveats that are worth planning for: - Startup depends on valid mounted configuration. If `config.yaml` or `webhook-server.private-key.pem` is missing, the container will not start cleanly. - Startup also depends on GitHub access. 
Before the server begins listening, it reconciles repository settings and creates or updates GitHub webhooks using the configured `webhook-ip`. - If `verify-github-ips` or `verify-cloudflare-ips` is enabled, the app fetches allowlists at startup. If verification is enabled but no valid networks can be loaded, startup fails closed for security. - The second volume mount is intentionally disposable. The cleanup script removes stale runtime directories under `/tmp/storage-run-${PUID}` and then prunes stopped containers, dangling images, unused volumes, and unused networks from the nested Podman environment. - Use a dedicated host path for that nested Podman mount. Do not point it at shared or important host storage. - The checked-in build path for repository image automation uses Podman inside the container and builds with `--network=host`. That is one reason the example deployment keeps `privileged: true`. > **Warning:** `ENABLE_LOG_SERVER=true` exposes `/logs`, `/logs/api/*`, and `/logs/ws` without authentication. `ENABLE_MCP_SERVER=true` exposes `/mcp` without authentication. Treat both as internal-only endpoints and place them behind a trusted network or an authenticated reverse proxy. > **Note:** The webhook receiver and health check live under `/webhook_server`, but the optional log viewer lives under `/logs` and the optional MCP endpoint lives under `/mcp`. If you deploy behind a reverse proxy or ingress, route those paths explicitly. > **Tip:** Plan for log retention. The structured webhook logs are written as daily `webhooks_YYYY-MM-DD.json` files, and the code documents them as unbounded in size. Text logs are safer to rotate, but the JSON webhook summaries still need external cleanup or retention policies on long-running deployments. --- Source: configuration-model.md # Configuration Model `github-webhook-server` has three potential configuration layers: 1. The root of the server's `config.yaml` 2. The matching `repositories.` entry inside `config.yaml` 3. 
An optional `.github-webhook-server.yaml` in the repository itself Not every setting participates in all three layers, but when a repository-scoped setting does, the server resolves it from most specific to least specific: repository-local file first, then the repo entry in `config.yaml`, then the root of `config.yaml`. ```132:153:webhook_server/libs/config.py def get_value(self, value: str, return_on_none: Any = None, extra_dict: dict[str, Any] | None = None) -> Any: """ Get value from config Supports dot notation for nested values (e.g., "docker.username", "pypi.token") Order of getting value: 1. Local repository file (.github-webhook-server.yaml) 2. Repository level global config file (config.yaml) 3. Root level global config file (config.yaml) """ if extra_dict: result = self._get_nested_value(value, extra_dict) if result is not None: return result for scope in (self.repository_data, self.root_data): result = self._get_nested_value(value, scope) if result is not None: return result return return_on_none ``` Think of the model like this: root `config.yaml` provides shared defaults, `repositories.` provides server-side exceptions for one repository, and `.github-webhook-server.yaml` lets a repository carry some of its own runtime behavior in version control. ## Where `config.yaml` Lives By default, the server reads `config.yaml` from `/home/podman/data/config.yaml`. Set `WEBHOOK_SERVER_DATA_DIR` if you want a different base directory. The Docker example mounts `./webhook_server_data_dir` into `/home/podman/data`, which is why that path is the default. > **Warning:** `config.yaml` is required, and `repositories:` must exist and be non-empty. Missing file or missing `repositories:` is a hard error. ## Server-Managed `config.yaml` Both the global defaults and the per-repository overrides live in the same file. Root keys apply to every repository unless a repo-specific entry overrides them. 
```3:190:examples/config.yaml log-level: INFO # Set global log level, change take effect immediately without server restart log-file: webhook-server.log # Set global log file, change take effect immediately without server restart github-app-id: 123456 # GitHub app id github-tokens: - - webhook-ip: # ... repositories: my-repository: name: my-org/my-repository log-level: DEBUG # Override global log-level for repository log-file: my-repository.log # Override global log-file for repository events: - push - pull_request - pull_request_review - pull_request_review_thread - issue_comment - check_run - status # ... github-tokens: # override GitHub tokens per repository - - ``` Use the root of `config.yaml` for shared or server-level values such as `github-app-id`, global `github-tokens`, `webhook-ip`, global `labels`, and other defaults you want every repository to inherit. Use `repositories.` for repo-specific settings that the server must know before it starts processing that repository. Common examples are `name`, `events`, and repo-specific `github-tokens`. > **Note:** The key under `repositories:` is the short repository name, while `name:` stores the full `owner/repo`. In the example above, `my-repository` is the lookup key and `my-org/my-repository` is the actual GitHub repository. Because lookup is by short name, avoid configuring two different repos that share the same short name. ## Repository-Managed `.github-webhook-server.yaml` Use `.github-webhook-server.yaml` when you want repository-owned behavior to live with the code and be reviewed in pull requests. This is a good fit for runtime settings such as `tox`, `pypi`, `container`, `pre-commit`, `conventional-title`, `ai-features`, `minimum-lgtm`, `create-issue-for-new-pr`, and label-related behavior. 
```118:162:examples/.github-webhook-server.yaml conventional-title: "feat,fix,build,chore,ci,docs,style,refactor,perf,test,revert" minimum-lgtm: 2 create-issue-for-new-pr: true # Create tracking issues for new PRs cherry-pick-assign-to-pr-author: true # Assign cherry-pick PRs to the original PR author # ... ai-features: ai-provider: "claude" # claude | gemini | cursor ai-model: "claude-opus-4-6[1m]" conventional-title: enabled: true mode: suggest timeout-minutes: 10 resolve-cherry-pick-conflicts-with-ai: enabled: true timeout-minutes: 10 ``` If the file is missing, the server simply falls back to `config.yaml`. If the file exists but contains invalid YAML, loading it fails instead of being silently ignored. The local file is not applied first thing at startup. The webhook runtime loads base config, selects the API token, and only then fetches `.github-webhook-server.yaml` and reapplies the supported repository settings. ```114:151:webhook_server/libs/github_api.py # Get config without .github-webhook-server.yaml data self._repo_data_from_config(repository_config={}) github_api, self.token, self.api_user = get_api_with_highest_rate_limit( config=self.config, repository_name=self.repository_name ) # ... # Once we have a repository, we can get the config from .github-webhook-server.yaml local_repository_config = self.config.repository_local_data( github_api=github_api, repository_full_name=self.repository_full_name ) # Call _repo_data_from_config() again to update self args from .github-webhook-server.yaml self._repo_data_from_config(repository_config=local_repository_config) ``` > **Warning:** `.github-webhook-server.yaml` is best thought of as a runtime-behavior layer, not a full replacement for `config.yaml`. Keep administrative settings such as `events`, repo tokens, logging, branch protection, draft-command rules, `pr-size-thresholds`, and `test-oracle` in `config.yaml`. 
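The layered lookup that `get_value` implements (local repository file, then the repo entry in `config.yaml`, then the root) can be modeled with a short sketch. This is a simplified illustration of the documented precedence, not the server's code; `lookup` and `get_nested` are hypothetical helpers:

```python
from typing import Any

def get_nested(dotted_key: str, scope: dict) -> Any:
    """Resolve dot notation such as 'container.tag' inside one scope."""
    value: Any = scope
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

def lookup(dotted_key: str, local: dict, repo: dict, root: dict, default: Any = None) -> Any:
    """Most specific scope wins; YAML null means 'not set here, keep falling back'."""
    for scope in (local, repo, root):
        result = get_nested(dotted_key, scope)
        if result is not None:
            return result
    return default

root = {"minimum-lgtm": 1, "container": {"tag": "latest"}}
repo = {"minimum-lgtm": None}             # null -> inherit from root
local = {"container": {"tag": "v1.2.3"}}  # whole object replaces; no deep merge

print(lookup("minimum-lgtm", local, repo, root))   # → 1
print(lookup("container.tag", local, repo, root))  # → v1.2.3
```

Note how `container` at the local level fully replaces the root object: that matches the documented behavior for most nested settings, where only `labels` gets special merge treatment.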
> **Note:** The repository-local file is fetched through GitHub's contents API without an explicit `ref`, so the default-branch version is the one the server sees. A config change in a pull request does not become active until that file reaches the default branch. ## Merge Rules The precedence chain is key-by-key, not file-by-file. In practice, that means: - If a key is missing at the repository-local level, lookup continues to the repo entry in `config.yaml`, then to the root. - If a higher-precedence key is present but set to YAML `null`, the server treats it as not set and keeps falling back. - For most nested objects, the higher-precedence object replaces the lower-precedence object instead of being recursively merged. - `labels` is the main special case: the server merges the top-level `labels` object, and then merges `labels.colors` again so you can override a few colors without redefining every color. A concrete example is in `examples/config.yaml`: the root `labels.colors.hold` is `red`, while the repo-specific `labels.colors.hold` is `purple`. For that repository, the effective `hold` color becomes `purple`, but the other global label colors still apply. The same merge behavior is used when `labels` comes from `.github-webhook-server.yaml`. > **Tip:** To inherit a lower-precedence value, omit the key entirely or set it to `null`. > **Tip:** When you override structured settings such as `container`, `branch-protection`, or `test-oracle`, restate every field you still need. Do not assume a deep merge unless that setting is explicitly documented as merged. ## Recommended Placement - Put server-wide defaults and startup-time settings in the root of `config.yaml`. - Put repo-specific server settings in `repositories.` inside `config.yaml`. - Put repository-owned runtime behavior in `.github-webhook-server.yaml` when you want config changes reviewed and versioned alongside the repository. 
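The `labels` special case from the merge rules above can be sketched as a two-level merge. This is an illustrative model of the documented behavior, not the server's implementation; `merge_labels` is a hypothetical helper:

```python
# Model of the `labels` merge described above: the top-level labels object
# is merged, then labels.colors is merged again, so a repo can override a
# few colors without redefining every color.
def merge_labels(lower: dict, higher: dict) -> dict:
    merged = {**lower, **higher}  # shallow merge of the labels object
    merged["colors"] = {
        **lower.get("colors", {}),   # start from the lower-precedence colors
        **higher.get("colors", {}),  # override only the redefined keys
    }
    return merged

root_labels = {
    "enabled-labels": ["hold", "verified"],
    "colors": {"hold": "red", "verified": "green"},
}
repo_labels = {"colors": {"hold": "purple"}}  # the hold-color example from examples/config.yaml

effective = merge_labels(root_labels, repo_labels)
print(effective["colors"])  # → {'hold': 'purple', 'verified': 'green'}
```

The repo override changes only `hold`; `verified` and `enabled-labels` still come from the root, matching the `examples/config.yaml` scenario described above.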
> **Tip:** For webhook-time repository behavior, changes are picked up on later webhook deliveries because the server re-reads `config.yaml` and re-fetches `.github-webhook-server.yaml` instead of keeping one permanently merged config in memory. --- Source: configuration-reference.md # Configuration Reference `github-webhook-server` reads its main configuration from `config.yaml` in the server data directory. In code, that directory defaults to `/home/podman/data`, so the default config path is `/home/podman/data/config.yaml`. Relative log file names are resolved under `/logs/`. The checked-in example file shows the top-level shape: ```3:21:examples/config.yaml log-level: INFO # Set global log level, change take effect immediately without server restart log-file: webhook-server.log # Set global log file, change take effect immediately without server restart mcp-log-file: mcp_server.log # Set global MCP log file, change take effect immediately without server restart logs-server-log-file: logs_server.log # Set global Logs Server log file, change take effect immediately without server restart mask-sensitive-data: true # Mask sensitive data in logs (default: true). 
Set to false for debugging (NOT recommended in production) # Server configuration disable-ssl-warnings: true # Disable SSL warnings (useful in production to reduce log noise from SSL certificate issues) github-app-id: 123456 # GitHub app id github-tokens: - - webhook-ip: # Full URL with path (e.g., https://your-domain.com/webhook_server or https://smee.io/your-channel) docker: # Used to pull images from docker.io username: password: ``` Repository-specific settings live under `repositories`: ```139:183:examples/config.yaml repositories: my-repository: name: my-org/my-repository log-level: DEBUG # Override global log-level for repository log-file: my-repository.log # Override global log-file for repository mask-sensitive-data: false # Override global setting - disable masking for debugging this specific repo (NOT recommended in production) slack-webhook-url: # Send notification to slack on several operations verified-job: true pypi: token: events: # To listen to all events do not send events - push - pull_request - pull_request_review - pull_request_review_thread - issue_comment - check_run - status tox: main: all # Run all tests in tox.ini when pull request parent branch is main dev: testenv1,testenv2 # Run testenv1 and testenv2 tests in tox.ini when pull request parent branch is dev pre-commit: true # Run pre-commit check protected-branches: dev: [] main: # set [] in order to set all defaults run included include-runs: - "pre-commit.ci - pr" - "WIP" exclude-runs: - "SonarCloud Code Analysis" container: username: password: repository: tag: release: true # Push image to registry on new release with release as the tag build-args: # build args to send to podman build command - my-build-arg1=1 - my-build-arg2=2 args: # args to send to podman build command - --format docker ``` > **Note:** In `repositories`, the map key is the short GitHub repository name, while `name` inside the block is the full `owner/repo`. > **Note:** This page lists keys in `config.yaml` form. 
The sample `.github-webhook-server.yaml` uses the same repository-level shape without the surrounding `repositories.` wrapper. > **Note:** Most repository settings replace the global value entirely. Two important exceptions are `branch-protection`, which is merged with global defaults, and `labels.colors`, where repository colors override only the keys you redefine. > **Warning:** Use exact branch names for `tox` and `protected-branches`, and use string values such as `all` or `testenv1,testenv2` for `tox`. The current runner/setup code looks up branches by exact key and builds the tox command from a string value. ## Global settings ### Logging and diagnostics - `log-level`: Global application log level. Allowed values are `INFO` and `DEBUG`. - `log-file`: Main webhook server log file. Relative names are written under `/logs/`; absolute paths are used as-is. - `mcp-log-file`: Separate log file for the optional MCP server. Default is `mcp_server.log`. - `logs-server-log-file`: Separate log file for the optional log viewer / logs server. Default is `logs_server.log`. - `mask-sensitive-data`: Enables log redaction. Default is `true`. When enabled, the logger masks secrets such as tokens, passwords, webhook secrets, Slack webhook URLs, and similar values. > **Warning:** `labels.colors` and `pr-size-thresholds.*.color` expect CSS3 color names such as `green`, `orange`, `royalblue`, and `darkred`. The label code converts those names to hex internally; hex strings are not the documented input format. ### Server, webhook, and security - `webhook-ip`: The public webhook URL that GitHub should call. Include the full path, for example `https://example.com/webhook_server`. - `ip-bind`: The bind address for the FastAPI / uvicorn server. If omitted, startup defaults to `0.0.0.0`. - `port`: The listening port. If omitted, startup defaults to `5000`. - `max-workers`: Uvicorn worker count. If omitted, startup defaults to `10`. 
- `webhook-secret`: Optional shared secret for GitHub webhook signature verification. When set, the server validates the incoming `x-hub-signature-256` header and uses the same secret when it creates GitHub webhooks. - `verify-github-ips`: If `true`, only accept webhook requests from GitHub’s published webhook IP ranges. - `verify-cloudflare-ips`: If `true`, also trust Cloudflare’s published IP ranges. This is useful when traffic reaches the server through Cloudflare. - `disable-ssl-warnings`: If `true`, suppress `urllib3` SSL warnings during runtime. > **Warning:** IP allowlist verification is fail-closed. If `verify-github-ips` and/or `verify-cloudflare-ips` are enabled but the allowlists cannot be loaded, the server aborts startup instead of accepting requests insecurely. ### GitHub authentication and shared defaults - `github-app-id`: GitHub App ID used for app-scoped repository management. In practice this goes with a `webhook-server.private-key.pem` file in the data directory and an installed GitHub App. - `github-tokens`: List of GitHub tokens used for normal API calls. The server checks all configured tokens and picks the one with the highest remaining rate limit. - `docker.username`: Docker Hub username used for the startup `podman login` step. - `docker.password`: Docker Hub password used for the startup `podman login` step. - `default-status-checks`: Extra check or status context names that should always be part of the generated branch-protection rules. Use exact GitHub context names. - `auto-verified-and-merged-users`: Global default list of users or bots whose PRs can be auto-verified and auto-merged when the other merge rules are satisfied. - `auto-verify-cherry-picked-prs`: Global default for automatic verification of cherry-picked PRs. Default is `true`. - `create-issue-for-new-pr`: Global default for creating a tracking issue when a new PR opens. Default is `true`. 
- `cherry-pick-assign-to-pr-author`: Global default for assigning cherry-pick PRs to the original PR author. Default is `true`. - `allow-commands-on-draft-prs`: Global default for user commands on draft PRs. Omit it to block commands on draft PRs. Set it to `[]` to allow all commands. Set it to a list such as `["build-and-push-container", "retest"]` to allow only those command names. > **Tip:** Repository-level `github-tokens` replace the global token list for that repository. During webhook processing, the server also adds the GitHub users behind the active API tokens to the auto-verified user list. ### Labels and PR size The sample config includes label and size settings like this: ```47:102:examples/config.yaml labels: # Optional: List of label categories to enable # If not set, all labels are enabled. If set, only listed categories are enabled. # Note: reviewed-by labels (approved-*, lgtm-*, etc.) are always enabled and cannot be disabled enabled-labels: - verified - hold - wip - needs-rebase - has-conflicts - can-be-merged - size - branch - cherry-pick - automerge # Optional: Custom colors for labels (CSS3 color names) colors: hold: red verified: green wip: orange needs-rebase: darkred has-conflicts: red can-be-merged: limegreen automerge: green # Dynamic label prefixes approved-: green lgtm-: yellowgreen changes-requested-: orange commented-: gold cherry-pick-: coral branch-: royalblue # Global PR size label configuration (optional) # Define custom categories based on total lines changed (additions + deletions) # threshold: positive integer or 'inf' for unbounded largest category # color: CSS3 color name (e.g., red, green, blue, lightgray, darkorange) # Infinity behavior: 'inf' ensures all PRs beyond largest finite threshold are captured # Always sorted last, regardless of definition order pr-size-thresholds: Tiny: threshold: 10 # PRs with 0-9 lines changed color: lightgray Small: threshold: 50 # PRs with 10-49 lines changed color: green Medium: threshold: 
150 # PRs with 50-149 lines changed color: orange Large: threshold: 300 # PRs with 150-299 lines changed color: red Massive: threshold: inf # PRs with 300+ lines changed (unbounded largest category) color: darkred # 'inf' means no upper limit - catches all PRs above 300 lines ``` - `labels.enabled-labels`: List of label categories to allow. Valid categories are `verified`, `hold`, `wip`, `needs-rebase`, `has-conflicts`, `can-be-merged`, `size`, `branch`, `cherry-pick`, and `automerge`. If omitted, all configurable categories are enabled. If set to `[]`, all configurable categories are disabled. Review-state labels such as `approved-*`, `lgtm-*`, `changes-requested-*`, and `commented-*` are always enabled. - `labels.colors`: Map of label names or dynamic label prefixes to CSS3 color names. Exact keys such as `hold` or `verified` affect one label. Prefix keys such as `approved-` or `branch-` affect any label that starts with that prefix. - `pr-size-thresholds.