Logging and Data Files

github-webhook-server keeps its persistent configuration, key material, and logs under a single data directory. By default that directory is /home/podman/data; set the WEBHOOK_SERVER_DATA_DIR environment variable to use a different location.

self.data_dir: str = os.environ.get("WEBHOOK_SERVER_DATA_DIR", "/home/podman/data")
self.config_path: str = os.path.join(self.data_dir, "config.yaml")
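
To see the override in isolation, the sketch below resolves the same two paths from a plain mapping instead of the real process environment. resolve_data_paths is a hypothetical helper written for this page, not a function from the project, and /srv/webhook-data is a made-up location.

```python
from __future__ import annotations

import os


def resolve_data_paths(env: dict[str, str]) -> tuple[str, str]:
    """Return (data_dir, config_path) using the documented default directory."""
    data_dir = env.get("WEBHOOK_SERVER_DATA_DIR", "/home/podman/data")
    return data_dir, os.path.join(data_dir, "config.yaml")


# No override: the default directory is used
print(resolve_data_paths({}))
# Override: config.yaml is expected inside the new directory
print(resolve_data_paths({"WEBHOOK_SERVER_DATA_DIR": "/srv/webhook-data"}))
```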

The example container setup mounts a host directory directly to that path and calls out the files that should already exist there:

volumes:
  - "./webhook_server_data_dir:/home/podman/data:Z" # Should include config.yaml and webhook-server.private-key.pem
  # Mount temporary directories to prevent boot ID mismatch issues
  - "/tmp/podman-storage-${USER:-1000}:/tmp/storage-run-1000"
environment:
  - PUID=1000
  - PGID=1000
  - TZ=Asia/Jerusalem
  - MAX_WORKERS=50 # Defaults to 10 if not set
  - WEBHOOK_SERVER_IP_BIND=0.0.0.0 # IP to listen
  - WEBHOOK_SERVER_PORT=5000 # Port to listen
  - WEBHOOK_SECRET=<secret> # If set verify hook is a valid hook from Github
  - VERIFY_GITHUB_IPS=1 # Verify hook request is from GitHub IPs
  - VERIFY_CLOUDFLARE_IPS=1 # Verify hook request is from Cloudflare IPs
  - ENABLE_LOG_SERVER=true # Enable log viewer endpoints (default: false)
  - ENABLE_MCP_SERVER=false # Enable MCP server for AI agent integration (default: false)

Data Directory Layout

Common files and directories you will see:

config.yaml: the main server configuration file.
webhook-server.private-key.pem: the GitHub App private key.
logs/: the main logging directory.
logs/<log-file>: the main human-readable application log.
logs/<log-file>.1, logs/<log-file>.2, and so on: rotated text logs.
logs/webhooks_YYYY-MM-DD.json: structured webhook log files, one file per UTC day.
logs/logs_server.log: dedicated log viewer log when the log viewer is enabled.
logs/mcp_server.log: dedicated MCP server log when MCP support is enabled.

The private key is read from the data directory root, not from logs/:

def get_repository_github_app_api(config_: Config, repository_name: str) -> Github | None:
    LOGGER.debug("Getting repositories GitHub app API")

    with open(os.path.join(config_.data_dir, "webhook-server.private-key.pem")) as fd:
        private_key = fd.read()

    github_app_id: int = config_.root_data["github-app-id"]

Note: The logs/ directory is created automatically when the server needs it.

Configure Log Files

The example config shows the main logging settings:

log-level: INFO # Set global log level; changes take effect immediately without a server restart
log-file: webhook-server.log # Set global log file; changes take effect immediately without a server restart
mcp-log-file: mcp_server.log # Set global MCP log file; changes take effect immediately without a server restart
logs-server-log-file: logs_server.log # Set global Logs Server log file; changes take effect immediately without a server restart
mask-sensitive-data: true # Mask sensitive data in logs (default: true). Set to false for debugging (NOT recommended in production)

You can also override the log level, text log file, and masking behavior for a specific repository:

repositories:
  my-repository:
    name: my-org/my-repository
    log-level: DEBUG # Override global log-level for repository
    log-file: my-repository.log # Override global log-file for repository
    mask-sensitive-data: false # Override global setting - disable masking for debugging this specific repo (NOT recommended in production)

Relative filenames are resolved inside <data-dir>/logs/. If you give an absolute path, the server uses it as-is:

def get_log_file_path(config: Config, log_file_name: str | None) -> str | None:
    """
    Resolve the full path for a log file using the configuration data directory.

    Args:
        config: Config object containing data_dir
        log_file_name: Name of the log file (e.g., "server.log")

    Returns:
        Full path to the log file, or None if log_file_name is None
    """
    if log_file_name and not log_file_name.startswith("/"):
        log_file_path = os.path.join(config.data_dir, "logs")

        if not os.path.isdir(log_file_path):
            os.makedirs(log_file_path, exist_ok=True)

        return os.path.join(log_file_path, log_file_name)

    return log_file_name
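
The resolution rule above can be tried without a Config object. The sketch below is a standalone variant that takes the data directory as a string; resolve_log_path and the sample paths are illustrative names, not part of the project.

```python
from __future__ import annotations

import os
import tempfile


def resolve_log_path(data_dir: str, log_file_name: str | None) -> str | None:
    """Standalone variant of the rule: relative names go under <data_dir>/logs/."""
    if log_file_name and not log_file_name.startswith("/"):
        logs_dir = os.path.join(data_dir, "logs")
        os.makedirs(logs_dir, exist_ok=True)  # logs/ is created on first use
        return os.path.join(logs_dir, log_file_name)
    # Names starting with "/", empty strings, and None are returned unchanged
    return log_file_name


with tempfile.TemporaryDirectory() as data_dir:
    print(resolve_log_path(data_dir, "my-repository.log"))            # under <data_dir>/logs/
    print(resolve_log_path(data_dir, "/var/log/webhook/custom.log"))  # used as-is
    print(resolve_log_path(data_dir, None))                           # None
```

Note that only the relative branch creates a directory. For an absolute path, the parent directory is not created by this function.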

Tip: Use repository-level overrides when you need extra visibility for one repository without changing logging for every repository.
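
One way to picture the override order is a lookup that checks the repository block first and falls back to the global key. This is a minimal sketch over a plain dictionary shaped like the YAML examples above; effective_setting is a hypothetical helper, and the project's own Config lookup may behave differently in edge cases.

```python
from __future__ import annotations

from typing import Any


def effective_setting(config: dict[str, Any], repository: str, key: str, default: Any = None) -> Any:
    """Prefer the repository-level value, then the global value, then the default."""
    repo_config = config.get("repositories", {}).get(repository, {})
    if key in repo_config:
        return repo_config[key]
    return config.get(key, default)


config = {
    "log-level": "INFO",
    "log-file": "webhook-server.log",
    "mask-sensitive-data": True,
    "repositories": {
        "my-repository": {
            "name": "my-org/my-repository",
            "log-level": "DEBUG",
            "log-file": "my-repository.log",
        },
        # Hypothetical second repository with no overrides
        "other-repository": {"name": "my-org/other-repository"},
    },
}

print(effective_setting(config, "my-repository", "log-level"))     # repository override
print(effective_setting(config, "other-repository", "log-level"))  # global value
```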

Rotating Text Logs

Text logs are the easiest place to read day-to-day activity. They use size-based rotation and are designed not to crash the server if rotated files have already been removed.

The project swaps in a safe rotating handler and sets a 10 MiB max file size for logger-managed text files:

# Patch simple_logger to use SafeRotatingFileHandler to prevent crashes
# when backup log files are missing during rollover
simple_logger.logger.RotatingFileHandler = SafeRotatingFileHandler

logger = get_logger(
    name=logger_cache_key,
    filename=log_file_path_resolved,
    level=log_level,
    file_max_bytes=1024 * 1024 * 10,
    mask_sensitive=mask_sensitive,
    mask_sensitive_patterns=mask_sensitive_patterns,
    console=True,  # Enable console output for docker logs with FORCE_COLOR support
)

During rollover, the handler works with the standard rotated filenames such as .1, .2, and .3, but suppresses file-operation errors so logging can continue:

if self.backupCount > 0:
    # Remove backup files that exceed backupCount, handle missing files
    for i in range(self.backupCount - 1, 0, -1):
        sfn = self.rotation_filename(f"{self.baseFilename}.{i}")
        dfn = self.rotation_filename(f"{self.baseFilename}.{i + 1}")
        if os.path.exists(sfn):
            try:
                if os.path.exists(dfn):
                    os.remove(dfn)
                os.rename(sfn, dfn)
            except FileNotFoundError:
                # File was deleted between exists check and operation - ignore
                pass
            except OSError:
                # Broad suppression intentional: logging must never crash.
                # See module docstring for full rationale.
                pass

    dfn = self.rotation_filename(f"{self.baseFilename}.1")
    try:
        if os.path.exists(dfn):
            os.remove(dfn)
    except FileNotFoundError:
        # File was deleted between exists check and remove - ignore
        pass
    except OSError:
        # Broad suppression intentional: logging must never crash.
        # See module docstring for full rationale.
        pass

    try:
        self.rotate(self.baseFilename, dfn)
    except FileNotFoundError:
        # Base file was deleted - just create a new one
        pass
    except OSError:
        # Broad suppression intentional: logging must never crash.
        # See module docstring for full rationale.
        pass

if not self.delay:
    try:
        self.stream = self._open()
    except OSError:
        # Cannot open new log file - leave stream as None.
        # FileHandler.emit() will attempt to open on next log entry.
        pass

In practice, if log-file is webhook-server.log, expect a current file like logs/webhook-server.log plus rotated siblings such as logs/webhook-server.log.1.
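
The naming scheme is easy to observe with the standard library's RotatingFileHandler, which the project's safe handler builds on. The sketch below uses a deliberately tiny maxBytes and a backupCount of 3; both values are assumptions chosen to force rollovers quickly, not the project's settings.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

with tempfile.TemporaryDirectory() as logs_dir:
    log_path = os.path.join(logs_dir, "webhook-server.log")
    # Toy limits so that a few dozen short lines trigger several rollovers
    handler = RotatingFileHandler(log_path, maxBytes=200, backupCount=3)

    logger = logging.getLogger("rotation-sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)

    for i in range(50):
        logger.info("delivery %03d processed", i)

    logger.removeHandler(handler)
    handler.close()

    rotated_files = sorted(os.listdir(logs_dir))
    print(rotated_files)
```

With these toy limits the listing should contain the current file plus the .1, .2, and .3 siblings. Older content is discarded once the backup count is reached.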

Structured JSONL Webhook Logs

The server also writes structured webhook data into daily files named webhooks_YYYY-MM-DD.json under logs/.

Despite the .json extension, these files use JSON Lines: one compact JSON object per line.

def _get_log_file_path(self, date: datetime | None = None) -> Path:
    """Get log file path for the specified date.

    Args:
        date: Date for the log file (defaults to current UTC date)

    Returns:
        Path to the log file (e.g., {log_dir}/webhooks_2026-01-05.json)
    """
    if date is None:
        date = datetime.now(UTC)
    date_str = date.strftime("%Y-%m-%d")
    return self.log_dir / f"webhooks_{date_str}.json"

def write_log(self, context: WebhookContext) -> None:
    """Write webhook context as JSONL entry to date-based log file.

    Writes a compact JSON entry (single line, no indentation) containing complete webhook execution context.
    Each entry is terminated by a newline character.
    Uses atomic write pattern (temp file + rename) with file locking for safety.

    Args:
        context: WebhookContext to serialize and write

    Note:
        Uses context.completed_at as source of truth, falls back to datetime.now(UTC)
    """
    # Prefer context.completed_at as source of truth, fall back to current time
    completed_at = context.completed_at if context.completed_at else datetime.now(UTC)

    # Get context dict and update timing locally (without mutating context)
    context_dict = context.to_dict()
    context_dict["type"] = "webhook_summary"
    if "timing" in context_dict:
        context_dict["timing"]["completed_at"] = completed_at.isoformat()
        if context.started_at:
            duration_ms = int((completed_at - context.started_at).total_seconds() * 1000)
            context_dict["timing"]["duration_ms"] = duration_ms

    # Get log file path
    log_file = self._get_log_file_path(completed_at)

    # Serialize context to JSON (compact JSONL format - single line, no indentation)
    log_entry = json.dumps(context_dict, ensure_ascii=False)

These files contain two important entry types:

webhook_summary: one end-of-webhook summary with timing, workflow steps, success state, and errors.
log_entry: individual log records enriched with webhook context when that context exists.
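
Because each line is an independent JSON object, these files should be read line by line rather than with a single json.load call. The following is a minimal reader sketch; read_jsonl is a hypothetical helper and the two sample lines are invented for the demonstration.

```python
from __future__ import annotations

import json
import tempfile
from collections import Counter
from pathlib import Path
from typing import Any, Iterator


def read_jsonl(path: Path) -> Iterator[dict[str, Any]]:
    """Yield one dict per line, skipping blank lines and lines that are not JSON objects."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # for example, a partially written line
            if isinstance(entry, dict):
                yield entry


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "webhooks_2026-01-05.json"
    sample.write_text(
        '{"type": "log_entry", "level": "INFO", "message": "Processing webhook"}\n'
        '{"type": "webhook_summary", "hook_id": "example-hook", "success": true}\n',
        encoding="utf-8",
    )
    # Count entries by their "type" field
    counts = Counter(entry.get("type") for entry in read_jsonl(sample))
    print(counts)
```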

A log_entry record is built like this:

message = record.getMessage()
message = _ANSI_ESCAPE_RE.sub("", message)

exc_text: str | None = None
if record.exc_info and record.exc_info[0] is not None:
    exc_text = "".join(traceback.format_exception(*record.exc_info))

entry: dict[str, object] = {
    "type": "log_entry",
    "timestamp": datetime.fromtimestamp(record.created, tz=UTC).isoformat(),
    "level": record.levelname,
    "logger_name": record.name,
    "message": message,
}

if exc_text:
    entry["exc_info"] = exc_text

# Enrich with webhook context when available
ctx = get_context()
if ctx is not None:
    entry["hook_id"] = ctx.hook_id
    entry["event_type"] = ctx.event_type
    entry["repository"] = ctx.repository
    entry["pr_number"] = ctx.pr_number
    entry["api_user"] = ctx.api_user
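
To show how such a record could end up in a daily file, here is a minimal logging.Handler sketch. SketchJsonLinesHandler is a hypothetical class, the ANSI pattern is an assumption that only strips color codes, and the write is a plain append with no locking, so it illustrates the record shape rather than the project's JsonLogHandler.

```python
from __future__ import annotations

import json
import logging
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Assumption: only SGR color sequences need stripping
_ANSI_ESCAPE_RE = re.compile(r"\x1b\[[0-9;]*m")


class SketchJsonLinesHandler(logging.Handler):
    """Hypothetical handler that appends one compact JSON object per record."""

    def __init__(self, log_dir: Path) -> None:
        super().__init__()
        self.log_dir = log_dir

    def emit(self, record: logging.LogRecord) -> None:
        try:
            created = datetime.fromtimestamp(record.created, tz=timezone.utc)
            entry = {
                "type": "log_entry",
                "timestamp": created.isoformat(),
                "level": record.levelname,
                "logger_name": record.name,
                "message": _ANSI_ESCAPE_RE.sub("", record.getMessage()),
            }
            # Daily file name derived from the record's UTC date
            path = self.log_dir / f"webhooks_{created:%Y-%m-%d}.json"
            with path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
        except Exception:
            self.handleError(record)  # never let logging raise


with tempfile.TemporaryDirectory() as tmp:
    logger = logging.getLogger("jsonl-sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = SketchJsonLinesHandler(Path(tmp))
    logger.addHandler(handler)
    logger.info("\x1b[32mPR %s labeled\x1b[0m", 42)
    logger.removeHandler(handler)

    written = []
    for log_file in Path(tmp).glob("webhooks_*.json"):
        for line in log_file.read_text(encoding="utf-8").splitlines():
            written.append(json.loads(line))
    print(written[0]["message"])
```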

A webhook_summary carries the higher-level fields you usually want when debugging a delivery:

return {
    "hook_id": self.hook_id,
    "level": self._derive_level(),
    "status": self._derive_status(),
    "event_type": self.event_type,
    "action": self.action,
    "sender": self.sender,
    "repository": self.repository,
    "repository_full_name": self.repository_full_name,
    "pr": {
        "number": self.pr_number,
        "title": self.pr_title,
        "author": self.pr_author,
    }
    if self.pr_number
    else None,
    "api_user": self.api_user,
    "timing": {
        "started_at": self.started_at.isoformat(),
        "completed_at": (self.completed_at.isoformat() if self.completed_at else None),
        "duration_ms": int((self.completed_at - self.started_at).total_seconds() * 1000)
        if self.completed_at
        else None,
    },
    "workflow_steps": self.workflow_steps,
    "token_spend": self.token_spend,
    "initial_rate_limit": self.initial_rate_limit,
    "final_rate_limit": self.final_rate_limit,
    "success": self.success,
    "error": self.error,
    "summary": self._build_summary(),
}

Note: webhooks_*.json is date-split, not size-rotated. The server creates a new file each UTC day, but it does not roll these files over by size. If you keep logs for a long time, plan your own retention or archival policy.
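
A retention job can rely on the date embedded in each filename. The sketch below is one possible approach, not something the project ships: expired_webhook_logs is a hypothetical helper, and the 14-day window is an arbitrary example value.

```python
from __future__ import annotations

import re
import tempfile
from datetime import date, timedelta
from pathlib import Path

_NAME_RE = re.compile(r"^webhooks_(\d{4})-(\d{2})-(\d{2})\.json$")


def expired_webhook_logs(log_dir: Path, keep_days: int, today: date) -> list[Path]:
    """Return daily files whose filename date is more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired: list[Path] = []
    for path in sorted(log_dir.glob("webhooks_*.json")):
        match = _NAME_RE.match(path.name)
        if not match:
            continue  # unexpected name, leave it alone
        try:
            file_day = date(*map(int, match.groups()))
        except ValueError:
            continue  # digits that do not form a valid date
        if file_day < cutoff:
            expired.append(path)
    return expired


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    for name in ("webhooks_2026-01-01.json", "webhooks_2026-01-20.json", "webhook-server.log"):
        (log_dir / name).touch()

    to_delete = expired_webhook_logs(log_dir, keep_days=14, today=date(2026, 1, 21))
    expired_names = [p.name for p in to_delete]
    print(expired_names)
    for path in to_delete:
        path.unlink()  # or move to an archive location instead
```

Since the files are split by UTC day, pass a UTC-based date for today when running this from a scheduler in another timezone.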

A practical detail: the server always tries to write a structured summary at the end of webhook processing, even after failures. If you need the most reliable delivery-level record, start with webhooks_*.json.
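
When starting from the structured files, a first pass is often to pull out the summaries of deliveries that did not succeed. The sketch below filters on the type and success fields described above; failed_summaries is a hypothetical helper, and the sample lines, including the string form of error, are invented for illustration.

```python
from __future__ import annotations

import json
from typing import Any, Iterable


def failed_summaries(lines: Iterable[str]) -> list[dict[str, Any]]:
    """Return webhook_summary entries whose success field is false or missing."""
    failures: list[dict[str, Any]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict) and entry.get("type") == "webhook_summary" and not entry.get("success"):
            failures.append(entry)
    return failures


sample_lines = [
    '{"type": "webhook_summary", "hook_id": "hook-1", "success": true, "error": null, "timing": {"duration_ms": 120}}',
    '{"type": "log_entry", "hook_id": "hook-2", "level": "ERROR", "message": "step failed"}',
    '{"type": "webhook_summary", "hook_id": "hook-2", "success": false, "error": "step failed", "timing": {"duration_ms": 950}}',
]

for summary in failed_summaries(sample_lines):
    print(summary["hook_id"], summary["timing"]["duration_ms"], summary["error"])
```

The hook_id from a failed summary can then be used to find the matching log_entry lines in the same file.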

How Masking Works

Masking is enabled by default with mask-sensitive-data: true. The logger treats common secret and credential patterns as sensitive:

mask_sensitive_patterns: list[str] = [
    # Passwords and secrets
    "container_repository_password",
    "password",
    "secret",
    # Tokens and API keys
    "token",
    "apikey",
    "api_key",
    "github_token",
    "GITHUB_TOKEN",
    "pypi",
    # Authentication credentials
    "username",
    "login",
    "-u",
    "-p",
    "--username",
    "--password",
    "--creds",
    # Private keys and sensitive IDs
    "private_key",
    "private-key",
    "webhook_secret",
    "webhook-secret",
    "github-app-id",
    # Slack webhooks (contain sensitive URLs)
    "slack-webhook-url",
    "slack_webhook_url",
    "webhook-url",
    "webhook_url",
]

In practice, this means:

Tokens, passwords, webhook secrets, and similar values are masked in log output by default.
Command helpers redact explicitly supplied secrets before writing command lines, stdout, or stderr to the logs.
Repository-level mask-sensitive-data can override the global setting for one repository.

Warning: Setting mask-sensitive-data: false can expose credentials in your logs. Use it only for short-lived debugging in a controlled environment.
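
For intuition only, the sketch below shows a generic way key-based masking can work: find a sensitive key followed by = or : and replace the value. It does not reproduce how the project's logging library applies mask_sensitive_patterns; mask_values, the key subset, and the sample strings are all assumptions made for this example.

```python
from __future__ import annotations

import re

# A small subset of the pattern list above, used as plain key names
SENSITIVE_KEYS = ["password", "secret", "token", "api_key"]


def mask_values(line: str, keys: list[str]) -> str:
    """Replace the value that follows 'key=' or 'key:' with asterisks."""
    # Longest keys first so longer names win over shorter prefixes
    alternation = "|".join(re.escape(k) for k in sorted(keys, key=len, reverse=True))
    pattern = re.compile(rf"\b({alternation})(\s*[=:]\s*)(\S+)", flags=re.IGNORECASE)
    return pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}*****", line)


print(mask_values("login ok token=ghp_example123 user=bot", SENSITIVE_KEYS))
print(mask_values("password: hunter2", SENSITIVE_KEYS))
```

A real implementation has to handle many more shapes (quoted values, command-line flags such as -p, JSON payloads), which is why the pattern list above is broader than this sketch.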

Log Separation

This project intentionally separates logs by purpose.

The main text log is for readable application activity.
webhooks_*.json is for structured webhook diagnostics and analysis.
logs_server.log is for the log viewer itself.
mcp_server.log is for the optional MCP server.

That separation is enforced in the logger setup. The structured JSON handler is attached only to the default webhook/application logger, not to infrastructure loggers that are created with an explicit filename:

# Attach JsonLogHandler for writing log records to the webhook JSONL file.
# Only attach when:
# - A log file path is configured (skip console-only loggers)
# - The logger is for the main webhook log (log_file_name not explicitly set)
#   Infrastructure loggers (mcp_server.log, logs_server.log) must NOT write
#   to webhooks_*.json because their entries lack webhook context (hook_id,
#   event_type, etc.) and pollute the webhook log with noise entries.
# - Only once per logger instance to avoid duplicate handlers.
# Uses _config.data_dir/logs (same directory as StructuredLogWriter) instead
# of deriving from the text log file path, which may differ for absolute paths.
if log_file_path_resolved and not log_file_name:
    log_dir = os.path.join(_config.data_dir, "logs")
    with _JSON_HANDLER_LOCK:
        if not any(isinstance(h, JsonLogHandler) and h.log_dir == Path(log_dir) for h in logger.handlers):
            logger.addHandler(
                JsonLogHandler(
                    log_dir=log_dir,
                    level=getattr(logging, log_level.upper(), logging.DEBUG),
                )
            )

That last comment matters: even if you point log-file at an absolute path somewhere else, the structured webhooks_*.json files still stay under <data-dir>/logs/.

The log viewer gets its own dedicated logger:

if _log_viewer_controller_singleton is None:
    # Use global LOGGER for config operations
    config = Config(logger=LOGGER)
    logs_server_log_file = config.get_value("logs-server-log-file", return_on_none="logs_server.log")

    # Create dedicated logger for log viewer
    log_viewer_logger = get_logger_with_params(log_file_name=logs_server_log_file)
    _log_viewer_controller_singleton = LogViewerController(logger=log_viewer_logger)

The same pattern is used for MCP logging during startup:

# Configure MCP logging separation
if MCP_SERVER_ENABLED:
    mcp_log_file = root_config.get("mcp-log-file", "mcp_server.log")

    # Use get_logger_with_params to reuse existing logging configuration logic
    # (rotation, sensitive data masking, formatting)
    # This returns a logger configured for the specific file
    mcp_file_logger = get_logger_with_params(log_file_name=mcp_log_file)

    # Add the configured handler to the actual MCP logger and stop propagation
    # This ensures MCP logs go ONLY to mcp_server.log and not webhook-server.log
    if mcp_file_logger.handlers:
        for handler in mcp_file_logger.handlers:
            mcp_logger.addHandler(handler)

        mcp_logger.propagate = False
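
The effect of adding a dedicated handler and turning off propagation can be checked with the standard logging module alone. In the sketch below, ListHandler and the logger names are made up for the demonstration; the two in-memory handlers stand in for the main text log and mcp_server.log.

```python
from __future__ import annotations

import logging


class ListHandler(logging.Handler):
    """Collects messages in memory so the routing is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


main_handler = ListHandler()  # stands in for the main text log
mcp_handler = ListHandler()   # stands in for mcp_server.log

parent = logging.getLogger("sketch_app")
parent.setLevel(logging.INFO)
parent.propagate = False  # keep the demo away from the root logger
parent.addHandler(main_handler)

mcp_logger = logging.getLogger("sketch_app.mcp")
mcp_logger.addHandler(mcp_handler)
mcp_logger.propagate = False  # records stop here instead of reaching the parent

parent.info("webhook received")
mcp_logger.info("mcp tool call")

print(main_handler.messages)
print(mcp_handler.messages)
```

If propagate were left at its default of True, the second message would appear in both lists.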

Log Viewer Files

When enabled, the log viewer reads the same files from <data-dir>/logs/; it does not build a separate database.

It scans current text logs, rotated text logs, and structured webhook files:

# Find all log files including rotated ones and JSON files
log_files: list[Path] = []
log_files.extend(log_dir.glob("*.log"))
log_files.extend(log_dir.glob("*.log.*"))
log_files.extend(log_dir.glob("webhooks_*.json"))

# Sort log files to prioritize JSON webhook files first (primary data source),
# then other files by modification time (newest first)
# This ensures webhook data is displayed before internal log files
def sort_key(f: Path) -> tuple[int, float]:
    is_json_webhook = f.suffix == ".json" and f.name.startswith("webhooks_")
    # JSON webhook files: (0, -mtime) - highest priority, newest first
    # Other files: (1, -mtime) - lower priority, newest first
    return (0 if is_json_webhook else 1, -f.stat().st_mtime)

...
async with aiofiles.open(log_file, encoding="utf-8") as f:
    # Use appropriate parser based on file type
    if log_file.suffix == ".json":
        # JSONL files: one compact JSON object per line
        # Process both "log_entry" and "webhook_summary" entries
        # Skip infrastructure logger entries that lack webhook context
        async for line in f:
            entry = self.log_parser.parse_json_log_entry(line)
            if entry and not LogViewerController._is_infrastructure_noise(entry):
                buffer.append(entry)
    else:
        # Text log files: parse line by line
        # Skip infrastructure logger entries that lack webhook context
        async for line in f:
            entry = self.log_parser.parse_log_entry(line)
            if entry and not LogViewerController._is_infrastructure_noise(entry):
                buffer.append(entry)
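
The discovery and ordering step above can be tried in isolation. This sketch reuses the same glob patterns and sort key; discover_log_files is a hypothetical wrapper, the sorted call is an assumption about how the key is applied, and the modification times are set explicitly so the result is deterministic.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def discover_log_files(log_dir: Path) -> list[Path]:
    """Collect text logs, rotated logs, and webhook JSONL files, webhook files first."""
    files: list[Path] = []
    files.extend(log_dir.glob("*.log"))
    files.extend(log_dir.glob("*.log.*"))
    files.extend(log_dir.glob("webhooks_*.json"))

    def sort_key(f: Path) -> tuple[int, float]:
        is_json_webhook = f.suffix == ".json" and f.name.startswith("webhooks_")
        # Group 0 for webhook files, group 1 for the rest; newest first in each group
        return (0 if is_json_webhook else 1, -f.stat().st_mtime)

    return sorted(files, key=sort_key)


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    # Explicit mtimes (seconds since the epoch) keep the ordering predictable
    for name, mtime in [
        ("webhook-server.log.1", 1_000),
        ("webhook-server.log", 3_000),
        ("webhooks_2026-01-04.json", 2_000),
        ("webhooks_2026-01-05.json", 4_000),
    ]:
        path = log_dir / name
        path.touch()
        os.utime(path, (mtime, mtime))

    ordered = [p.name for p in discover_log_files(log_dir)]
    print(ordered)
```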

It also filters out known infrastructure noise when those entries have no webhook context:

@staticmethod
def _is_infrastructure_noise(entry: LogEntry) -> bool:
    """Check if a log entry is infrastructure noise that should be excluded.

    Infrastructure loggers (MCP server, log viewer) produce high-frequency
    entries without webhook context. These are filtered out to prevent them
    from drowning actual webhook processing entries in unfiltered queries.

    Only excludes entries that have NO webhook context (hook_id is None),
    preserving any infrastructure log that happens to correlate with a webhook.

    Args:
        entry: LogEntry to check

    Returns:
        True if the entry is infrastructure noise and should be excluded

    """
    return entry.logger_name in LogViewerController._INFRASTRUCTURE_LOGGERS and entry.hook_id is None

What this means in practice:

The log viewer can show current and rotated text logs together with structured webhook logs.
Structured JSON files are the primary source for webhook summaries, workflow steps, and export data.
Text logs provide the detailed line-by-line context that summary records intentionally do not include.
Exports are streamed to the client on demand; they are not written back into the data directory as extra files.

Warning: The project does not add application-level authentication to the /logs endpoints. Treat the log viewer as an internal tool and protect it with trusted network placement or a reverse proxy that adds authentication.