Testing and Quality
This repository uses a local-first quality model. Unit tests run with pytest, tox provides a thin wrapper around that suite, and repository-wide checks run through pre-commit hooks. On top of that, the Claude Code configuration in this repo wires in runtime guardrails such as rule-enforcer.py and git-protection.py.
The result is a mix of fast automated checks and explicit policy tests. The Python test suite focuses on the myk_claude_tools CLI, hook scripts, release helpers, review automation, and the SQLite analytics layer.
Quick Start
# Run the unit tests directly
uv run --group tests pytest tests
# Run a focused test module while you work
uv run --group tests pytest tests/test_git_protection.py
# Run the same suite through tox
tox
# Run formatting, typing, linting, Markdown, and secret scans
prek run --all-files
Note:
pyproject.tomldeclaresrequires-python = ">=3.10", and most commands in this repository useuvbecause bothtox.tomland the runtime hook policy are built around it.Tip: In the Claude Code environment configured by this repo,
scripts/rule-enforcer.pyintentionally blocks directpython,pip, andpre-commitcommands. Useuv,uvx, andprekthere instead.
Quality Stack at a Glance
| Area | Source of truth | Purpose |
|---|---|---|
| Unit tests | tox.toml, tests/ |
Runs the Python test suite with pytest |
| Linting and formatting | pyproject.toml, .pre-commit-config.yaml |
Ruff lint + Ruff format |
| Type checking | pyproject.toml, .pre-commit-config.yaml |
mypy |
| Extra Python bug check | .flake8, .pre-commit-config.yaml |
Flake8 M511 via flake8-mutable |
| Markdown quality | .markdownlint.yaml, .pre-commit-config.yaml |
markdownlint |
| Secret scanning | .pre-commit-config.yaml |
detect-secrets, gitleaks, and detect-private-key |
| Runtime guardrails | settings.json, scripts/ |
Blocks unsafe commands and protected-branch writes |
Pytest and tox
pytest is the actual test runner. tox is configured as a lightweight wrapper around it, not as a build matrix or packaging pipeline. The checked-in tox config has a single environment named unittests, and skipsdist = true tells tox not to build the package first.
From tox.toml:
skipsdist = true
envlist = ["unittests"]
[env.unittests]
description = "Run pytest tests"
deps = ["uv"]
commands = [["uv", "run", "--group", "tests", "pytest", "tests"]]
From pyproject.toml:
[dependency-groups]
tests = ["pytest>=9.0.2"]
What this means in practice:
- If you want the fastest path, run
uv run --group tests pytest tests. - If you prefer tox,
toxruns the same suite through theunittestsenvironment. - The test suite is designed to be fast and deterministic, with heavy use of mocked Git, GitHub CLI, GraphQL, subprocess, and SQLite interactions.
Ruff, mypy, and Flake8
Ruff
Ruff is the main Python linter and formatter. The repository enables autofix-friendly behavior, grouped output, and preview mode.
From pyproject.toml:
[tool.ruff]
preview = true
line-length = 120
fix = true
output-format = "grouped"
[tool.ruff.lint]
select = ["E", "F", "W", "I", "B", "UP", "PLC0415", "ARG", "RUF059"]
[tool.ruff.lint.per-file-ignores]
"myk_claude_tools/*/commands.py" = ["PLC0415"]
What to expect:
120characters is the Python line-length target.fix = truemeans Ruff is configured for autofix-friendly runs.- The selected rules cover core linting, import ordering, bug-prone patterns, Python modernization, unused arguments, and a few repository-specific checks.
PLC0415is ignored forcommands.pyfiles because the CLI command modules intentionally use lazy imports for faster startup.
Tip: Because
preview = trueis enabled, keeping Ruff reasonably up to date matters. Very old Ruff versions may not match the repository’s expected behavior.
mypy
Mypy is configured more strictly than its defaults, especially around untyped or partially typed functions.
From pyproject.toml:
[tool.mypy]
check_untyped_defs = true
disallow_any_generics = false
disallow_incomplete_defs = true
disallow_untyped_defs = true
no_implicit_optional = true
show_error_codes = true
warn_unused_ignores = true
strict_equality = true
extra_checks = true
warn_unused_configs = true
warn_redundant_casts = true
In plain English, that means:
- New Python code is expected to be typed, not left as “we’ll add types later.”
- Even when a function is untyped, mypy still checks its body.
- Implicit
Optionalbehavior is not allowed. - Redundant casts, unused ignores, and unused config are treated as real issues.
Flake8
Flake8 is deliberately narrow here. It is not duplicating Ruff’s full lint surface. Instead, it is reserved for M511, provided by flake8-mutable, to catch mutable default argument bugs.
From .flake8:
[flake8]
select=M511
exclude =
doc,
.tox,
.git,
.yml,
Pipfile.*,
docs/*,
.cache/*
That is why you will see both Ruff and Flake8 in the hook stack: Ruff does the heavy lifting, and Flake8 adds one focused rule the project still wants enforced.
Markdown Quality
The repository contains a lot of Markdown: plugin commands, agent definitions, rules, and user-facing docs. markdownlint is part of the default hook set, with a small adjustment to keep documentation writing practical.
From .markdownlint.yaml:
default: true
MD013:
line_length: 180
code_blocks: false
tables: false
This means:
- All default markdownlint rules are on.
- Long prose lines are allowed up to
180characters. - Code fences and tables are exempt from the line-length rule, which avoids awkward wrapping of examples and comparison tables.
Secret Scanning and Repository Hygiene
Secret scanning is not a one-tool checkbox here. The pre-commit stack combines several layers:
detect-private-keyfrompre-commit-hookscatches obvious key material.detect-secretslooks for likely secret patterns and high-risk text.gitleaksadds a second secret-scanning engine with different signatures.- Generic hygiene hooks also catch merge conflicts, large files, mixed line endings, debug statements, trailing whitespace, and TOML syntax errors.
From .pre-commit-config.yaml:
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: check-added-large-files
- id: check-merge-conflict
- id: detect-private-key
- id: debug-statements
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-toml
- repo: https://github.com/Yelp/detect-secrets
rev: v1.5.0
hooks:
- id: detect-secrets
- repo: https://github.com/gitleaks/gitleaks
rev: v8.30.0
hooks:
- id: gitleaks
The project also handles test-only false positives carefully. Instead of weakening secret scanners globally, tests allowlist specific fake SHA-looking values inline when needed.
From tests/test_store_reviews_to_db.py:
assert result == "abc1234567890abcdef" # pragma: allowlist secret
Note: This is a good pattern to copy. If a test needs a fake token, hash, or key-shaped string, allowlist that exact line instead of broadly disabling the scanner.
Pre-commit Workflow
The checked-in hook stack is the main “one command” quality gate. In addition to the hooks above, it also runs Ruff, Ruff format, mypy, Flake8, and markdownlint.
Relevant parts of .pre-commit-config.yaml:
ci:
autofix_prs: false
autoupdate_commit_msg: "ci: [pre-commit.ci] pre-commit autoupdate"
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.14.14
hooks:
- id: ruff
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.19.1
hooks:
- id: mypy
additional_dependencies:
- pytest
- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.44.0
hooks:
- id: markdownlint
args: [--config, .markdownlint.yaml]
types: [markdown]
Two practical takeaways:
- The project expects contributors to fix issues locally rather than relying on automatic bot-generated fix PRs.
- The hook stack is intentionally broad. If
prek run --all-filespasses, you have exercised the linting, formatting, typing, Markdown, and secret-scanning layers in one go.
Hook-Enforced Quality in the Claude Code Environment
This repository does not rely only on conventional lint and test tools. It also encodes runtime guardrails in Claude Code hook scripts, and those scripts are heavily tested.
From settings.json:
"PreToolUse": [
{
"matcher": "TodoWrite|Bash",
"hooks": [
{
"type": "command",
"command": "uv run ~/.claude/scripts/rule-enforcer.py"
}
]
},
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "uv run ~/.claude/scripts/git-protection.py"
}
]
}
]
rule-enforcer.py
rule-enforcer.py keeps the environment consistent by steering Python and pre-commit execution toward the approved toolchain.
From scripts/rule-enforcer.py:
def is_forbidden_python_command(command: str) -> bool:
cmd = command.strip().lower()
# Allow uv/uvx commands
if cmd.startswith(("uv ", "uvx ")):
return False
# Block direct python/pip
forbidden = ("python ", "python3 ", "pip ", "pip3 ")
return cmd.startswith(forbidden)
Tests around this script verify that it:
- Blocks direct
python,python3,pip,pip3, andpre-commitcommands. - Allows
uv,uvx, andprek. - Handles whitespace and mixed case correctly.
- Fails open on malformed hook input so a broken hook payload does not brick the entire session.
git-protection.py
git-protection.py protects branch hygiene. It blocks commits and pushes in cases that are easy to regret later.
From scripts/git-protection.py:
# Allow amend on unpushed commits
if is_amend_with_unpushed_commits(command):
return False, None
The tests around git-protection.py lock down behavior such as:
- Rejecting commits on
mainormaster. - Rejecting commits and pushes to branches that are already merged.
- Rejecting work on branches whose pull requests are already merged.
- Allowing
git commit --amendwhen the branch is ahead of its remote. - Handling detached HEAD, orphan branches, and GitHub API errors explicitly.
- Failing closed when it cannot safely decide, so an unsafe commit or push does not slip through.
Warning: These hook scripts are part of the quality story in this repo. Passing
pytestalone does not mean you have exercised the runtime guardrails unless the relevant hook tests are also green.
What the Test Suite Actually Guarantees
The tests in tests/ are not generic smoke tests. They encode concrete policies and edge cases for the project’s most important moving parts.
Review automation is expected to be safe and repeatable
The review pipeline tests cover fetching, parsing, replying, storing, and querying review data. That includes:
- GitHub GraphQL and REST pagination behavior.
- Parsing CodeRabbit outside-diff, nitpick, and duplicate comments.
- Posting replies idempotently, so already-posted entries are not posted again.
- Retrying a failed resolve without double-posting a reply.
- Grouping body-only comments into consolidated PR comments and chunking them to stay below GitHub size limits.
- Different resolution rules for human versus AI-generated review comments.
From myk_claude_tools/reviews/post.py:
# Determine if we should resolve this thread (MUST be before resolve_only_retry check)
should_resolve = True
if category == "human" and status != "addressed":
should_resolve = False
That one branch summarizes a real project policy: human review threads stay open unless they were actually addressed, while AI-originated review items can be auto-resolved after the reply policy is applied.
Review database access is deliberately conservative
The SQLite layer is tested for append-only storage, schema migration, and read-only analytics access. The database helper does not allow arbitrary write SQL through its query interface.
From myk_claude_tools/db/query.py:
if not sql_upper.startswith(("SELECT", "WITH")):
raise ValueError("Only SELECT/CTE queries are allowed for safety")
Tests also verify that the query layer rejects:
INSERT,UPDATE,DELETE,DROP,ALTER,ATTACH, andPRAGMA- Multi-statement SQL
- Dangerous keywords hidden inside otherwise “read-like” queries
The storage layer is also tested to create .claude/data/ with 0700 permissions, migrate older databases forward, and append new review snapshots instead of rewriting earlier ones.
Version tooling is section-aware and atomic
The release helpers are also well-covered. The tests check that version detection and bumping:
- Find versions in
pyproject.toml,package.json,setup.cfg,Cargo.toml, Gradle files, and Python__version__files - Skip excluded directories such as
.venv,node_modules,.tox, and build caches - Only update the correct section, not every
versionkey in a file - Reject invalid version strings
- Fail safely when a requested file filter does not match
- Use atomic writes so partial edits do not leak into the repository on error
These are especially useful guarantees if you use the repository’s release helpers as part of a scripted workflow.
Hook behavior is tested as behavior, not just as syntax
The hook test modules do more than import the scripts and call one happy path. They cover:
- Regex-based Git subcommand detection, including false-positive cases
- JSON hook I/O behavior
- Whitespace and case handling
- Timeout and subprocess failure paths
- The intentional difference between fail-open and fail-closed hooks
Note: Most of these tests are unit tests with mocked subprocess and GitHub responses. That is intentional. The goal is to make policy-heavy behavior easy to test without depending on live GitHub state or a specific local Git history.
Tip: If you want the exact contract for a subsystem, the clearest starting points are
tests/test_rule_enforcer.py,tests/test_git_protection.py,tests/test_post_review_replies.py,tests/test_store_reviews_to_db.py, andtests/test_review_db.py.
CI/CD Status
Warning: No
.github/workflows/directory is committed in this repository.
From the repository contents, the quality gates you can verify are:
pytestfor unit teststoxas a wrapper around the unit test suite- pre-commit hooks for linting, typing, formatting, Markdown, and secret scanning
- Claude Code hook scripts for runtime enforcement
.pre-commit-config.yaml does include a ci: section, but the source tree itself does not define a checked-in GitHub Actions pipeline. In practice, this means the repository is local-first: contributors are expected to run the checks themselves and keep them green before sharing changes.
Recommended Workflow
A practical workflow for contributors is:
- Run a targeted
pytestmodule while you work, such asuv run --group tests pytest tests/test_post_review_replies.py. - Run the full test suite with
uv run --group tests pytest tests. - Run
prek run --all-filesbefore committing. - If you use tox, make sure
toxstays green as a second path through the same unit-test suite.
That combination covers the repository’s two main quality layers: tested behavior and repository-wide hygiene.