Querying the Review Database
# Querying the Review Database
Analyze your accumulated code review history to spot recurring feedback, understand which review sources deliver the most value, and track how your team handles suggestions over time. These queries help you tune auto-skip rules and focus reviewer effort where it matters.
## Prerequisites
- `myk-pi-tools` installed (`uv tool install myk-pi-tools`)
- A review database at `<project-root>/.pi/data/reviews.db` (created automatically when you store completed reviews with `myk-pi-tools reviews store`)
## Quick Example
See how often comments from each review source get addressed:
```bash
myk-pi-tools db stats
source | total | addressed | not_addressed | skipped | addressed_rate
----------+-------+-----------+---------------+---------+---------------
coderabbit| 48 | 32 | 10 | 6 | 66.7%
human | 25 | 22 | 2 | 1 | 88.0%
qodo | 15 | 9 | 4 | 2 | 60.0%
This tells you at a glance which sources produce the most actionable feedback.
Viewing Statistics
Stats by source
The default view groups comments by their origin — human, qodo, or coderabbit:
myk-pi-tools db stats
This is equivalent to myk-pi-tools db stats --by-source.
Stats by reviewer
Switch to per-author breakdown to see which individual reviewers provide the most feedback:
myk-pi-tools db stats --by-reviewer
author | total | addressed | not_addressed | skipped
----------------+-------+-----------+---------------+--------
coderabbitai | 48 | 32 | 10 | 6
alice | 15 | 14 | 1 | 0
bob | 10 | 8 | 1 | 1
JSON output
Add --json to any command for machine-readable output:
myk-pi-tools db stats --by-source --json
Finding Recurring Patterns
Identify comments that keep appearing with similar wording — these are good candidates for auto-skip rules:
myk-pi-tools db patterns
path | occurrences | reason | body_sample
----------------+-------------+---------------------------+-----------------------------
src/api/auth.py | 4 | Style preference, ignored | Consider adding type hints...
src/utils.py | 3 | Not applicable to project | Add error handling for ed...
Raise the threshold to focus on the most persistent patterns:
myk-pi-tools db patterns --min 3
Tip: Patterns with high occurrence counts are strong signals that an auto-skip rule should be configured. The tool uses Jaccard word similarity (60% overlap) to cluster related comments, so minor wording variations still get grouped together.
Viewing Dismissed Comments
List all comments that were marked as not_addressed or skipped for a specific repository:
myk-pi-tools db dismissed --owner myk-org --repo my-project
path | line | status | reply | author
----------------+------+---------------+---------------------------+-----------
src/config.py | 42 | not_addressed | Style preference, ignored | coderabbitai
src/api/main.py | 15 | skipped | Auto-skipped: duplicate | qodo-code-review
This shows the dismissal reason alongside each comment, helping you understand why feedback was set aside.
Note: Comments with status
addressedonly appear here if they have a special type (outside_diff_comment,nitpick_comment, orduplicate_comment). Normal addressed comments are tracked by GitHub's thread resolution instead.
Finding Similar Comments
Check whether a new review comment matches something previously dismissed. This is the same similarity check that powers the auto-skip feature during review fetching.
Pipe a JSON object with path and body fields via stdin:
echo '{"path": "src/utils.py", "body": "Add error handling for edge cases"}' | \
myk-pi-tools db find-similar --owner myk-org --repo my-project
Found similar comment (similarity: 0.85):
Path: src/utils.py:23
Status: not_addressed
Reason: Not applicable — error cases handled upstream
Original body: Consider adding error handling for edge case...
Adjust the similarity threshold (default 0.6) to be more or less strict:
echo '{"path": "src/utils.py", "body": "Add error handling"}' | \
myk-pi-tools db find-similar --owner myk-org --repo my-project --threshold 0.8
Running Custom Queries
For ad-hoc exploration, run raw SQL against the database:
myk-pi-tools db query "SELECT path, COUNT(*) as cnt FROM comments WHERE status = 'skipped' GROUP BY path ORDER BY cnt DESC"
Useful queries
Top files by total comments:
myk-pi-tools db query "SELECT path, COUNT(*) as total FROM comments GROUP BY path ORDER BY total DESC LIMIT 10"
Comments by status breakdown:
myk-pi-tools db query "SELECT status, COUNT(*) as cnt FROM comments GROUP BY status"
Recent reviews with comment counts:
myk-pi-tools db query "SELECT r.owner, r.repo, r.pr_number, r.created_at, COUNT(c.id) as comments FROM reviews r JOIN comments c ON c.review_id = r.id GROUP BY r.id ORDER BY r.created_at DESC LIMIT 10"
High-priority unaddressed comments:
myk-pi-tools db query "SELECT c.path, c.line, c.body, c.author FROM comments c WHERE c.priority = 'HIGH' AND c.status = 'not_addressed'"
Common Table Expressions (CTEs) are supported:
myk-pi-tools db query "WITH skipped AS (SELECT path, body, skip_reason FROM comments WHERE status = 'skipped') SELECT path, COUNT(*) as cnt FROM skipped GROUP BY path"
Warning: Only
SELECTandWITH(CTE) statements are allowed. The database is opened in read-only mode, and modifying keywords (INSERT,UPDATE,DELETE,DROP,ALTER,CREATE,PRAGMA) are blocked.
Advanced Usage
Using a custom database path
All db subcommands accept --db-path to point at a specific database file instead of the auto-detected one:
myk-pi-tools db stats --db-path /path/to/other/reviews.db
By default, the tool auto-detects the database at <git-root>/.pi/data/reviews.db based on the current working directory.
Database schema reference
The database has two tables:
reviews — one row per review session:
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Primary key |
| pr_number | INTEGER | Pull request number |
| owner | TEXT | GitHub org or user |
| repo | TEXT | Repository name |
| commit_sha | TEXT | Git commit at review time |
| created_at | TEXT | ISO 8601 timestamp |
comments — one row per review comment:
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Primary key |
| review_id | INTEGER | Foreign key to reviews |
| source | TEXT | human, qodo, or coderabbit |
| author | TEXT | Reviewer username |
| path | TEXT | File path |
| line | INTEGER | Line number |
| body | TEXT | Comment text |
| priority | TEXT | HIGH, MEDIUM, or LOW |
| status | TEXT | pending, addressed, not_addressed, or skipped |
| reply | TEXT | Human response or dismissal reason |
| skip_reason | TEXT | Raw dismissal reason |
| type | TEXT | outside_diff_comment, nitpick_comment, duplicate_comment, or null |
| posted_at | TEXT | ISO 8601 timestamp |
| resolved_at | TEXT | ISO 8601 timestamp |
How auto-skip uses this data
When new reviews are fetched, the system automatically queries dismissed comments from the database and compares them against incoming feedback. If a new comment has 60% or greater word overlap with a previously dismissed comment on the same file path, it is auto-skipped with the original dismissal reason. This prevents the same low-value suggestions from resurfacing across PRs.
The find-similar command lets you test this matching logic manually before relying on it.
Querying from the orchestrator
Inside a pi session, use the /query-db slash command to run database queries conversationally:
/query-db stats --by-source
/query-db patterns --min 3
/query-db dismissed --owner myk-org --repo my-project
You can also ask natural language questions, and the orchestrator will compose the appropriate query.
Troubleshooting
- "Database not found" — No reviews have been stored yet. Run a full review cycle (
myk-pi-tools reviews fetch, thenmyk-pi-tools reviews store) to populate the database. - "Only SELECT/CTE queries are allowed" — The
querycommand blocks write operations. Use built-in subcommands (stats,patterns,dismissed) for standard analysis, or write aSELECTstatement for custom queries. - "Multiple SQL statements are not allowed" — The
querycommand only accepts a single statement. Remove any semicolons separating multiple queries and run them one at a time. - Stats show no data for a source — That review source hasn't produced any comments in stored reviews. Verify that reviews from that source exist by running
myk-pi-tools db query "SELECT DISTINCT source FROM comments".