PatchPatrol

Artifacts & Schema

Public contract details for PatchPatrol output artifacts and JSON payload structure.

Use this page when you want to integrate PatchPatrol output into automation, checkers, or any non-interactive workflow.

Source of truth: schema.json plus the current artifact writing behavior in patchpatrol/artifacts.py and patchpatrol/cli.py.

What PatchPatrol Writes

PatchPatrol writes files into AI_REVIEW_OUTPUT_DIR (default .ai-review).

| Artifact | Purpose | Stability posture |
| --- | --- | --- |
| ai-review.md | Human-readable summary for quick manual review and triage. | Stable, documented public artifact. |
| ai-review.html | Optional static printable presentation artifact for browser review or PDF export. | Optional additive artifact; enabled with AI_REVIEW_HTML_REPORT=true or ai-review run --html-report. |
| ai-review.json | Machine-readable FindingsReport payload for scripts and integrations. | Stable, documented public contract. |
| gl-code-quality-report.json | Optional Code Quality format mirror of findings. | Optional output; enabled by default via AI_REVIEW_CODE_QUALITY_EXPORT, including early exits that have no findings and therefore render as []. |

The HTML and markdown reports are human-facing renderings derived from the same validated payload. They can surface review context, summary, findings, coverage, and token usage sections, but ai-review.json remains the automation contract.
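
As a minimal sketch of consuming the automation contract, the following reads ai-review.json from the output directory. The helper name load_report is hypothetical and not part of PatchPatrol; only the AI_REVIEW_OUTPUT_DIR variable, its .ai-review default, and the file name come from this page.

```python
import json
import os
from pathlib import Path


def load_report(env=None):
    """Load ai-review.json from the PatchPatrol output directory.

    `load_report` is an illustrative helper, not a PatchPatrol API.
    """
    env = os.environ if env is None else env
    # Fall back to the documented default when the variable is unset.
    output_dir = Path(env.get("AI_REVIEW_OUTPUT_DIR", ".ai-review"))
    report_path = output_dir / "ai-review.json"
    with report_path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Passing a mapping as env keeps the helper easy to exercise in tests without touching the real process environment.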

Use the First review output page for quick interpretation.

Core JSON contract (stable)

ai-review.json validates against schema.json with three required top-level fields:

  • meta
  • summary
  • findings

The top-level object and each finding object reject unknown properties, so any payload you generate or modify before passing it downstream must use exactly the documented key names.
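
The top-level rule can be sketched as a small pre-check. This is an illustration, not PatchPatrol's own validator: the function name top_level_problems is hypothetical, and the sketch assumes the three required fields are also the only permitted top-level fields, which schema.json should confirm.

```python
REQUIRED_TOP_LEVEL = {"meta", "summary", "findings"}


def top_level_problems(payload):
    """Return human-readable problems with the top-level keys, if any."""
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]
    keys = set(payload)
    problems = [
        f"missing required field: {name}"
        for name in sorted(REQUIRED_TOP_LEVEL - keys)
    ]
    # Assumption: no optional top-level fields exist, so any extra key
    # would be rejected by the schema.
    problems += [
        f"unknown top-level field: {name}"
        for name in sorted(keys - REQUIRED_TOP_LEVEL)
    ]
    return problems
```

An empty list means the top-level shape looks right; full validation should still go through schema.json.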

meta

meta is required and includes at least these stable keys:

  • model
  • base_sha
  • head_sha
  • generated_at
  • diff_stats
  • limits

meta.diff_stats requires:

  • files
  • insertions
  • deletions

meta.limits is a dictionary. It always exists and carries runtime-applied limit values and execution diagnostics.
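
The required meta keys listed above could be checked along these lines. The function name meta_problems and the message wording are illustrative assumptions; the key names are the ones documented on this page.

```python
META_REQUIRED = (
    "model", "base_sha", "head_sha", "generated_at", "diff_stats", "limits",
)
DIFF_STATS_REQUIRED = ("files", "insertions", "deletions")


def meta_problems(meta):
    """List missing stable keys in a meta object (hypothetical helper)."""
    problems = [f"meta.{key} is missing" for key in META_REQUIRED if key not in meta]
    diff_stats = meta.get("diff_stats")
    if isinstance(diff_stats, dict):
        problems += [
            f"meta.diff_stats.{key} is missing"
            for key in DIFF_STATS_REQUIRED
            if key not in diff_stats
        ]
    # meta.limits is documented as a dictionary that always exists.
    if "limits" in meta and not isinstance(meta["limits"], dict):
        problems.append("meta.limits is not an object")
    return problems
```

The check deliberately ignores any other keys in meta, because the schema allows additional public metadata there.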

For structured-output provider behavior, meta.limits may also include compact diagnostics such as:

  • structured_output_review_call_count
  • structured_output_requested_modes
  • structured_output_effective_modes
  • structured_output_fallback_count
  • structured_output_reason_codes

These fields are optional and additive. They record output-mode decisions and fallback reason codes without persisting raw prompts or provider request payloads.

When repo-local .ai-review.yml review.rules are configured, meta.limits may also include compact counters such as:

  • prompt_review_rule_total_count
  • prompt_review_rule_matched_count
  • prompt_review_rule_included_count
  • prompt_review_rule_omitted_count

These counters make rule matching observable without persisting raw rule instructions in review artifacts.

When deterministic preflight hints are generated, meta.limits may include compact counters and reason codes such as:

  • prompt_preflight_hint_total_count
  • prompt_preflight_hint_matched_count
  • prompt_preflight_hint_included_count
  • prompt_preflight_hint_omitted_count
  • prompt_preflight_hint_reason_codes

These fields are optional and additive. They make hint matching observable without persisting raw hint evidence, raw prompt bodies, or provider chain-of-thought in artifacts. Preflight hints are prompt evidence only: they can point the provider at duplicate added code-like text, dependency additions, risky import additions, or missing test-counterpart signals, but they are not automatic findings or CI gates.
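
Because all three diagnostic families in meta.limits are optional, a consumer can group whatever is present by key prefix instead of expecting specific keys. The sketch below assumes only the documented prefixes; group_limit_diagnostics is a hypothetical helper, and the value types are passed through untouched because this page does not specify them.

```python
DIAGNOSTIC_PREFIXES = (
    "structured_output_",
    "prompt_review_rule_",
    "prompt_preflight_hint_",
)


def group_limit_diagnostics(limits):
    """Group optional meta.limits diagnostics by their documented prefix."""
    grouped = {prefix: {} for prefix in DIAGNOSTIC_PREFIXES}
    for key, value in limits.items():
        for prefix in DIAGNOSTIC_PREFIXES:
            if key.startswith(prefix):
                # Store the key without its prefix, e.g. "fallback_count".
                grouped[prefix][key[len(prefix):]] = value
                break
    return grouped
```

Keys that match no prefix, such as plain limit values, are simply left out of the grouping, and a family with no keys present yields an empty dictionary.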

summary

summary is required and includes these stable fields:

  • overall_risk
  • top_issues

overall_risk is one of low, medium, or high.

findings

findings is required and is an array of finding objects.

A finding currently requires:

  • severity
  • category
  • file
  • line_start
  • line_end
  • title
  • description
  • recommendation
  • confidence

suggested_patch and provenance are optional.

line_start and line_end are integer line numbers in the new (right-side) version of the file. Provider prompts may render prompt-only L<number>: labels on added and context lines to help the model anchor findings, but those labels are not persisted in ai-review.json, ai-review.md, ai-review.html, or GitLab exports.
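
A shape check for a single finding might look like the following sketch. The helper name finding_problems is hypothetical. It checks only what this page states (required keys, the two optional keys, rejection of unknown keys, integer line numbers) and makes no assumption about the type of confidence or other fields.

```python
FINDING_REQUIRED = {
    "severity", "category", "file", "line_start", "line_end",
    "title", "description", "recommendation", "confidence",
}
FINDING_OPTIONAL = {"suggested_patch", "provenance"}


def finding_problems(finding):
    """Return shape problems for one finding object (illustrative only)."""
    keys = set(finding)
    problems = [f"missing: {key}" for key in sorted(FINDING_REQUIRED - keys)]
    # Finding objects reject unknown properties.
    problems += [
        f"unknown: {key}"
        for key in sorted(keys - FINDING_REQUIRED - FINDING_OPTIONAL)
    ]
    for key in ("line_start", "line_end"):
        if key in finding:
            value = finding[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"not an integer: {key}")
    return problems
```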

Published enums and controlled values

The following enum sets are the reliable values currently emitted and validated:

  • finding.severity: blocker, high, medium, low, info
  • finding.category: security, correctness, reliability, performance, maintainability, testing, style
  • summary.overall_risk: low, medium, high
  • finding.provenance.origin: llm, semantic, security_tool, combined
  • finding.provenance.sources: llm, semantic, security_tool
  • meta.security_precheck.status: pass, fail, unavailable, error, skipped
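
The enum sets above translate directly into a lookup-based check. This is a sketch rather than a replacement for schema validation: enum_problems is a hypothetical name, and the code assumes finding.provenance.sources is a list of strings, which this page implies but does not state.

```python
SEVERITIES = {"blocker", "high", "medium", "low", "info"}
CATEGORIES = {
    "security", "correctness", "reliability", "performance",
    "maintainability", "testing", "style",
}
OVERALL_RISKS = {"low", "medium", "high"}
PROVENANCE_ORIGINS = {"llm", "semantic", "security_tool", "combined"}
PROVENANCE_SOURCES = {"llm", "semantic", "security_tool"}
SECURITY_PRECHECK_STATUSES = {"pass", "fail", "unavailable", "error", "skipped"}


def enum_problems(payload):
    """Return 'path: value' strings for values outside the published enums."""
    problems = []
    risk = payload.get("summary", {}).get("overall_risk")
    if risk not in OVERALL_RISKS:
        problems.append(f"summary.overall_risk: {risk!r}")
    for index, finding in enumerate(payload.get("findings", [])):
        where = f"findings[{index}]"
        if finding.get("severity") not in SEVERITIES:
            problems.append(f"{where}.severity: {finding.get('severity')!r}")
        if finding.get("category") not in CATEGORIES:
            problems.append(f"{where}.category: {finding.get('category')!r}")
        provenance = finding.get("provenance")
        if isinstance(provenance, dict):  # provenance is optional
            origin = provenance.get("origin")
            if origin is not None and origin not in PROVENANCE_ORIGINS:
                problems.append(f"{where}.provenance.origin: {origin!r}")
            # Assumption: sources is a list of source names.
            for source in provenance.get("sources") or []:
                if source not in PROVENANCE_SOURCES:
                    problems.append(f"{where}.provenance.sources: {source!r}")
    precheck = payload.get("meta", {}).get("security_precheck")
    if isinstance(precheck, dict) and "status" in precheck:
        if precheck["status"] not in SECURITY_PRECHECK_STATUSES:
            problems.append(f"meta.security_precheck.status: {precheck['status']!r}")
    return problems
```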

Runtime fields in meta you may see today

The schema allows additional public metadata in meta. The shipped runtime currently emits the following meta sections in the cases that support them:

  • meta.semantic_precheck
  • meta.security_precheck
  • meta.trust_gate
  • meta.provider_runtime
  • meta.feedback
  • meta.repository_overview
  • meta.usage_ledger
  • meta.performance_diagnostics

These are additive and may grow as execution behavior evolves.

meta.usage_ledger, when present, is interpretive provider-call metadata used for token usage tables. It has three levels: calls for all provider calls, by_phase for phase totals, and records for individual calls. If the trust gate blocks execution before provider review, no provider call is made; fail-fast artifacts can therefore show meta.usage_ledger.calls.call_count as 0 and zero prompt/completion/total tokens. In that same blocked state, findings: [] means the provider review did not run, not that the merge request was clean.
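
To avoid reading an empty findings array as a clean result, a consumer can look at the ledger's call count first. The sketch below uses only the meta.usage_ledger.calls.call_count path described above; the function names are hypothetical. A stricter check could also inspect meta.trust_gate, whose field layout is not described on this page.

```python
def provider_review_ran(payload):
    """Return True/False from the ledger call count, or None if unknown."""
    ledger = payload.get("meta", {}).get("usage_ledger")
    if not isinstance(ledger, dict):
        return None  # usage_ledger is optional
    call_count = (ledger.get("calls") or {}).get("call_count")
    if call_count is None:
        return None
    return call_count > 0


def describe_findings(payload):
    """Summarize findings without mistaking 'not run' for 'clean'."""
    ran = provider_review_ran(payload)
    count = len(payload.get("findings", []))
    if ran is False:
        return "provider review did not run"
    if ran is None:
        return f"{count} finding(s); provider call count unknown"
    return f"{count} finding(s) from provider review"
```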

Provider timing fields in meta.usage_ledger are additive. When the configured provider exposes native timings, records and totals may include these fields:

| Field | Meaning |
| --- | --- |
| prompt_tokens | Prompt/input tokens reported for the provider call. |
| completion_tokens | Generated/completion tokens reported for the provider call. |
| total_tokens | Prompt plus completion tokens when available. |
| wall_elapsed_seconds | PatchPatrol-measured wall time for the provider call or total. |
| provider_total_duration_seconds | Provider/server total runtime when exposed by the provider. |
| provider_load_duration_seconds | Provider-reported model load time, useful for cold-start diagnosis. |
| provider_prompt_eval_duration_seconds | Provider-reported prompt ingestion/evaluation time. |
| provider_generation_duration_seconds | Provider-reported generation/decode time. |
| total_tokens_per_second_wall | Total tokens divided by PatchPatrol wall time. Human reports label this Wall total tok/s. |
| prompt_tokens_per_second_provider | Prompt tokens divided by provider prompt-eval time. Human reports label this Prompt tok/s. |
| generation_tokens_per_second_provider | Completion tokens divided by provider generation time. Human reports label this Generation tok/s; this is the Ollama model decode-speed metric. |
| total_tokens_per_second_provider | Total tokens divided by provider total runtime. Human reports label this Provider total tok/s. |
| provider_wall_overhead_seconds | PatchPatrol wall time minus provider total runtime. Use this to spot Docker, network, wrapper, or orchestration overhead. |

For Ollama, provider-native durations come from nanosecond fields in the /api/chat response and are converted to seconds:

prompt_tok_s = prompt_eval_count / (prompt_eval_duration / 1e9)
generation_tok_s = eval_count / (eval_duration / 1e9)
provider_total_tok_s = total_tokens / (total_duration / 1e9)
wall_total_tok_s = total_tokens / PatchPatrol_wall_elapsed_seconds

Blended wall/provider total rates are not generation speed. A large prompt and short completion can make Wall total tok/s or Provider total tok/s much higher than Generation tok/s. Providers that expose only token counts leave provider-native prompt/generation rates as unavailable, rendered as n/a in human reports, while wall throughput may still be available.
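
The formulas above can be worked through in a short sketch. It is not PatchPatrol's implementation: ollama_rates and _rate are hypothetical names, and total_tokens is assumed to be prompt plus completion tokens as in the field table. The input keys are the nanosecond and count fields of an Ollama /api/chat response.

```python
NANOSECONDS_PER_SECOND = 1e9


def _rate(count, duration_ns):
    """Tokens per second, or None when either input is missing or zero-length."""
    if count is None or not duration_ns:
        return None
    return count / (duration_ns / NANOSECONDS_PER_SECOND)


def ollama_rates(response, wall_elapsed_seconds):
    """Derive the documented rate fields from an Ollama /api/chat response."""
    prompt_tokens = response.get("prompt_eval_count")
    completion_tokens = response.get("eval_count")
    total_tokens = None
    if prompt_tokens is not None and completion_tokens is not None:
        total_tokens = prompt_tokens + completion_tokens
    total_ns = response.get("total_duration")
    provider_seconds = total_ns / NANOSECONDS_PER_SECOND if total_ns else None
    wall_rate = None
    if total_tokens is not None and wall_elapsed_seconds:
        wall_rate = total_tokens / wall_elapsed_seconds
    overhead = None
    if provider_seconds is not None:
        overhead = wall_elapsed_seconds - provider_seconds
    return {
        "prompt_tokens_per_second_provider": _rate(
            prompt_tokens, response.get("prompt_eval_duration")),
        "generation_tokens_per_second_provider": _rate(
            completion_tokens, response.get("eval_duration")),
        "total_tokens_per_second_provider": _rate(total_tokens, total_ns),
        "total_tokens_per_second_wall": wall_rate,
        "provider_wall_overhead_seconds": overhead,
    }
```

With 1000 prompt tokens evaluated in 0.5 s, 100 completion tokens generated in 2 s, a provider total of 4 s, and a wall time of 5 s, the generation rate is 50 tok/s while the blended provider and wall rates are 275 and 220 tok/s, which illustrates why the blended figures should not be read as decode speed.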

meta.performance_diagnostics, when enabled with AI_REVIEW_ENABLE_PERFORMANCE_DIAGNOSTICS=true, mirrors the same usage values under observed.token_usage and duration values under observed.elapsed_seconds. The markdown and optional HTML reports render the same concepts in ## Token Usage and ## Performance Diagnostics.

Treat missing optional sections as normal and preserve unknown keys when forwarding or storing reports.
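
One way to follow that advice is to classify meta keys without ever removing any of them. In this sketch, classify_meta_sections is a hypothetical helper; the key lists are the stable and optional meta names documented on this page, and anything else is reported as unrecognized rather than dropped.

```python
STABLE_META_KEYS = {
    "model", "base_sha", "head_sha", "generated_at", "diff_stats", "limits",
}
OPTIONAL_META_SECTIONS = (
    "semantic_precheck", "security_precheck", "trust_gate", "provider_runtime",
    "feedback", "repository_overview", "usage_ledger", "performance_diagnostics",
)


def classify_meta_sections(meta):
    """Report which optional sections are present and which keys are new.

    The meta object itself is never modified, so unknown keys survive
    when the report is forwarded or stored.
    """
    present = [name for name in OPTIONAL_META_SECTIONS if meta.get(name) is not None]
    known = STABLE_META_KEYS | set(OPTIONAL_META_SECTIONS)
    unrecognized = sorted(key for key in meta if key not in known)
    return {"present": present, "unrecognized": unrecognized}
```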

Minimal valid payload example

{
  "meta": {
    "model": "deepseek-coder-v2:16b",
    "base_sha": "1111111",
    "head_sha": "2222222",
    "generated_at": "2026-02-26T12:00:00Z",
    "diff_stats": {
      "files": 0,
      "insertions": 0,
      "deletions": 0
    },
    "limits": {
      "max_diff_bytes": 32768,
      "max_files": 50
    }
  },
  "summary": {
    "overall_risk": "low",
    "top_issues": []
  },
  "findings": []
}

This structure is intentionally minimal but valid, which makes it suitable for parser bootstrap checks and contract tests.

Stability posture

  • ai-review.json is the artifact that automation should consume.
  • ai-review.md and optional ai-review.html are human-facing renderings derived from that validated payload.
  • schema.json is the authoritative definition of the ai-review.json contract.
  • Integrators should validate required fields and enums from schema.json and treat extra fields as optional forward-compatible extensions.
  • Any optional output sections should be handled defensively (null/absent checks first), because these sections are surfaced opportunistically.
