Public contract details for PatchPatrol output artifacts and JSON payload structure.

Artifacts & Schema Reference

Use this page when you want to integrate PatchPatrol output into automation, checkers, or any non-interactive workflow.

Source of truth: schema.json, plus the current artifact-writing behavior in patchpatrol/artifacts.py and patchpatrol/cli.py.

What PatchPatrol Writes

PatchPatrol writes files into AI_REVIEW_OUTPUT_DIR (default .ai-review).

Artifact	Purpose	Stability posture
`ai-review.md`	Human-readable summary for quick manual review and triage.	Stable, documented public artifact.
`ai-review.html`	Optional static printable presentation artifact for browser review or PDF export.	Optional additive artifact; enabled with `AI_REVIEW_HTML_REPORT=true` or `ai-review run --html-report`.
`ai-review.json`	Machine-readable `FindingsReport` payload for scripts and integrations.	Stable, documented public contract.
`gl-code-quality-report.json`	Optional mirror of the findings in GitLab Code Quality format.	Optional output, enabled by default and controlled by `AI_REVIEW_CODE_QUALITY_EXPORT`. It is also written on early exits that have no findings, where it renders as `[]`.

The HTML and markdown reports are human-facing renderings derived from the same validated payload. They can surface review context, summary, findings, coverage, and token usage sections, but ai-review.json remains the automation contract.

Use the First review output page for quick interpretation.

Core JSON contract (stable)

ai-review.json validates against schema.json with three required top-level fields:

meta
summary
findings

The top-level object and finding objects reject unknown properties, so any payload an integrator produces or forwards to downstream systems must use exactly the key names above and the nested fields described in the sections that follow.
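The top-level rule above can be sketched with the standard library alone. This is a minimal sketch, not PatchPatrol's own validator: `check_top_level` is a hypothetical helper name, and it assumes the three required fields are also the only top-level properties schema.json defines. schema.json remains the authority.

```python
# Assumption: the three required fields are the only top-level
# properties allowed; confirm against schema.json before relying on it.
REQUIRED_TOP_LEVEL = {"meta", "summary", "findings"}


def check_top_level(report: dict) -> list[str]:
    """Return human-readable problems with the top-level keys of a report."""
    keys = set(report)
    problems = [
        f"missing required field: {name}"
        for name in sorted(REQUIRED_TOP_LEVEL - keys)
    ]
    problems += [
        f"unknown top-level property: {name}"
        for name in sorted(keys - REQUIRED_TOP_LEVEL)
    ]
    return problems
```

An empty list means the top-level shape looks right; nested objects still need their own checks.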

`meta`

meta is required and includes at least these stable keys:

model
base_sha
head_sha
generated_at
diff_stats
limits

meta.diff_stats requires:

files
insertions
deletions

meta.limits is a dictionary that is always present. It carries the limit values applied at runtime, plus execution diagnostics.

For structured-output provider behavior, meta.limits may also include compact diagnostics such as:

structured_output_review_call_count
structured_output_requested_modes
structured_output_effective_modes
structured_output_fallback_count
structured_output_reason_codes

These fields are optional and additive. They record output-mode decisions and fallback reason codes without persisting raw prompts or provider request payloads.

When repo-local .ai-review.yml review.rules are configured, meta.limits may also include compact counters such as:

prompt_review_rule_total_count
prompt_review_rule_matched_count
prompt_review_rule_included_count
prompt_review_rule_omitted_count

These counters make rule matching observable without persisting raw rule instructions in review artifacts.

When deterministic preflight hints are generated, meta.limits may include compact counters and reason codes such as:

prompt_preflight_hint_total_count
prompt_preflight_hint_matched_count
prompt_preflight_hint_included_count
prompt_preflight_hint_omitted_count
prompt_preflight_hint_reason_codes

These fields are optional and additive. They make hint matching observable without persisting raw hint evidence, raw prompt bodies, or provider chain-of-thought in artifacts. Preflight hints are prompt evidence only: they can point the provider at duplicate added code-like text, dependency additions, risky import additions, or missing test-counterpart signals, but they are not automatic findings or CI gates.
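Because every diagnostic family in meta.limits is optional, a consumer can select keys by prefix and treat an empty result as normal. The following is a minimal sketch; `limits_diagnostics` is a hypothetical helper name, and the prefixes are simply the common stems of the key names listed above.

```python
def limits_diagnostics(report: dict, prefix: str) -> dict:
    """Return the meta.limits entries whose key starts with `prefix`.

    Absent keys are not an error: the diagnostic fields are optional
    and additive, so an empty dict is a valid result.
    """
    limits = report.get("meta", {}).get("limits", {})
    return {key: value for key, value in limits.items() if key.startswith(prefix)}


# Prefixes derived from the documented key names.
STRUCTURED_OUTPUT_PREFIX = "structured_output_"
REVIEW_RULE_PREFIX = "prompt_review_rule_"
PREFLIGHT_HINT_PREFIX = "prompt_preflight_hint_"
```

For example, `limits_diagnostics(report, REVIEW_RULE_PREFIX)` returns only the rule counters, or `{}` when no review.rules were configured.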

`summary`

summary is required and includes these stable fields:

overall_risk
top_issues

overall_risk is one of low, medium, or high.
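As an illustration of consuming overall_risk in automation, the sketch below compares it against a threshold. The ordering low < medium < high and the helper name `risk_at_least` are assumptions made for this example; PatchPatrol does not prescribe a gating policy here.

```python
# Assumed ordering for illustration: low < medium < high.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


def risk_at_least(report: dict, threshold: str) -> bool:
    """Return True when summary.overall_risk is at or above `threshold`."""
    risk = report["summary"]["overall_risk"]
    if risk not in RISK_ORDER:
        raise ValueError(f"unexpected overall_risk: {risk!r}")
    if threshold not in RISK_ORDER:
        raise ValueError(f"unexpected threshold: {threshold!r}")
    return RISK_ORDER[risk] >= RISK_ORDER[threshold]
```

A pipeline step could, for instance, fail the job when `risk_at_least(report, "high")` is true.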

`findings`

findings is required and is an array of finding objects.

A finding currently requires:

severity
category
file
line_start
line_end
title
description
recommendation
confidence

suggested_patch and provenance are optional.

line_start and line_end are ordinary integer line numbers in the new (right-side) version of the file. Provider prompts may render prompt-only L<number>: labels on added and context lines to help the model anchor findings, but those labels are not persisted in ai-review.json, ai-review.md, ai-review.html, or GitLab exports.

Published enums and controlled values

The following enum sets are the reliable values currently emitted and validated:

finding.severity: blocker, high, medium, low, info
finding.category: security, correctness, reliability, performance, maintainability, testing, style
summary.overall_risk: low, medium, high
finding.provenance.origin: llm, semantic, security_tool, combined
finding.provenance.sources: llm, semantic, security_tool
meta.security_precheck.status: pass, fail, unavailable, error, skipped
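The finding rules and enums above can be checked without third-party packages. This is a minimal sketch under stated assumptions: `validate_finding` is a hypothetical helper, the allowed keys are taken to be the nine required fields plus the two optional ones named on this page, and value types other than the line numbers are left to schema.json.

```python
SEVERITIES = {"blocker", "high", "medium", "low", "info"}
CATEGORIES = {
    "security", "correctness", "reliability", "performance",
    "maintainability", "testing", "style",
}
REQUIRED_KEYS = (
    "severity", "category", "file", "line_start", "line_end",
    "title", "description", "recommendation", "confidence",
)
# Assumption: these are the only optional finding properties.
OPTIONAL_KEYS = ("suggested_patch", "provenance")


def validate_finding(finding: dict) -> list[str]:
    """Return a list of contract problems for one finding object."""
    problems = [f"missing required field: {k}" for k in REQUIRED_KEYS if k not in finding]
    problems += [
        f"unknown property: {k}"
        for k in finding
        if k not in REQUIRED_KEYS and k not in OPTIONAL_KEYS
    ]
    if "severity" in finding and finding["severity"] not in SEVERITIES:
        problems.append(f"invalid severity: {finding['severity']!r}")
    if "category" in finding and finding["category"] not in CATEGORIES:
        problems.append(f"invalid category: {finding['category']!r}")
    for key in ("line_start", "line_end"):
        value = finding.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if key in finding and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{key} must be an integer")
    return problems
```

For full coverage, including provenance and confidence, validate against schema.json itself rather than a hand-written check like this one.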

Runtime fields in `meta` you may see today

The schema allows additional public metadata in meta. The shipped runtime currently emits the following meta sections when the corresponding feature applies:

meta.semantic_precheck
meta.security_precheck
meta.trust_gate
meta.provider_runtime
meta.feedback
meta.repository_overview
meta.usage_ledger
meta.performance_diagnostics

These are additive and may grow as execution behavior evolves.

meta.usage_ledger, when present, is interpretive provider-call metadata used for token usage tables. It has three levels: calls holds totals across all provider calls, by_phase holds per-phase totals, and records holds one entry per individual call. If the trust gate blocks execution before provider review, no provider call is made, so fail-fast artifacts can show meta.usage_ledger.calls.call_count as 0 with zero prompt, completion, and total tokens. In that blocked state, findings: [] means the provider review did not run, not that the merge request was clean.
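The distinction between "no findings" and "review did not run" can be made explicit in consumer code. The sketch below relies only on meta.usage_ledger.calls.call_count as described above; the function names and the returned labels are hypothetical, and a missing ledger is reported as unknown rather than guessed.

```python
from typing import Optional


def provider_review_ran(report: dict) -> Optional[bool]:
    """True/False when the ledger records a call count, None when it is absent."""
    ledger = report.get("meta", {}).get("usage_ledger")
    if not isinstance(ledger, dict):
        return None
    calls = ledger.get("calls")
    if not isinstance(calls, dict) or "call_count" not in calls:
        return None
    return calls["call_count"] > 0


def interpret_findings(report: dict) -> str:
    """Classify a report so an empty findings array is not misread as clean."""
    if report.get("findings"):
        return "findings_present"
    ran = provider_review_ran(report)
    if ran is False:
        return "review_did_not_run"
    if ran is True:
        return "no_findings_reported"
    return "unknown"
```

Treating `review_did_not_run` as a distinct outcome keeps a blocked run from passing silently in downstream checks.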

Provider timing fields in meta.usage_ledger are additive. When the configured provider exposes native timings, records and totals may include these fields:

Field	Meaning
`prompt_tokens`	Prompt/input tokens reported for the provider call.
`completion_tokens`	Generated/completion tokens reported for the provider call.
`total_tokens`	Prompt plus completion tokens when available.
`wall_elapsed_seconds`	PatchPatrol-measured wall time for the provider call or total.
`provider_total_duration_seconds`	Provider/server total runtime when exposed by the provider.
`provider_load_duration_seconds`	Provider-reported model load time, useful for cold-start diagnosis.
`provider_prompt_eval_duration_seconds`	Provider-reported prompt ingestion/evaluation time.
`provider_generation_duration_seconds`	Provider-reported generation/decode time.
`total_tokens_per_second_wall`	Total tokens divided by PatchPatrol wall time. Human reports label this `Wall total tok/s`.
`prompt_tokens_per_second_provider`	Prompt tokens divided by provider prompt-eval time. Human reports label this `Prompt tok/s`.
`generation_tokens_per_second_provider`	Completion tokens divided by provider generation time. Human reports label this `Generation tok/s`; this is the Ollama model decode-speed metric.
`total_tokens_per_second_provider`	Total tokens divided by provider total runtime. Human reports label this `Provider total tok/s`.
`provider_wall_overhead_seconds`	PatchPatrol wall time minus provider total runtime. Use this to spot Docker, network, wrapper, or orchestration overhead.

For Ollama, provider-native durations come from nanosecond fields in the /api/chat response and are converted to seconds:

prompt_tok_s = prompt_eval_count / (prompt_eval_duration / 1e9)
generation_tok_s = eval_count / (eval_duration / 1e9)
provider_total_tok_s = total_tokens / (total_duration / 1e9)
wall_total_tok_s = total_tokens / PatchPatrol_wall_elapsed_seconds

Blended wall/provider total rates are not generation speed. A large prompt and short completion can make Wall total tok/s or Provider total tok/s much higher than Generation tok/s. Providers that expose only token counts leave provider-native prompt/generation rates as unavailable, rendered as n/a in human reports, while wall throughput may still be available.
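The four formulas above can be worked through in a few lines. This is a sketch of the arithmetic only, not PatchPatrol's implementation: `ollama_rates` is a hypothetical helper, the input keys are the Ollama /api/chat timing fields named in the formulas, and the sample numbers are illustrative rather than measured.

```python
from typing import Optional

NANOSECONDS_PER_SECOND = 1e9


def rate(tokens: Optional[int], duration_ns: Optional[int]) -> Optional[float]:
    """Tokens per second, or None when the count or duration is unavailable."""
    if tokens is None or not duration_ns:
        return None
    return tokens / (duration_ns / NANOSECONDS_PER_SECOND)


def ollama_rates(response: dict, wall_elapsed_seconds: float) -> dict:
    """Compute the four throughput figures from Ollama timing fields."""
    prompt = response.get("prompt_eval_count")
    completion = response.get("eval_count")
    known = prompt is not None and completion is not None
    total = prompt + completion if known else None
    wall = total / wall_elapsed_seconds if known and wall_elapsed_seconds else None
    return {
        "prompt_tok_s": rate(prompt, response.get("prompt_eval_duration")),
        "generation_tok_s": rate(completion, response.get("eval_duration")),
        "provider_total_tok_s": rate(total, response.get("total_duration")),
        "wall_total_tok_s": wall,
    }


# Illustrative values: a large prompt and a short completion.
sample = {
    "prompt_eval_count": 4000, "prompt_eval_duration": 2_000_000_000,
    "eval_count": 200, "eval_duration": 10_000_000_000,
    "total_duration": 12_500_000_000,
}
rates = ollama_rates(sample, wall_elapsed_seconds=14.0)
```

With these illustrative inputs the generation rate is 20 tok/s while the provider total rate is 336 tok/s and the wall total rate is 300 tok/s, which shows why the blended totals should not be read as decode speed.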

meta.performance_diagnostics, when enabled with AI_REVIEW_ENABLE_PERFORMANCE_DIAGNOSTICS=true, mirrors the same usage values under observed.token_usage and duration values under observed.elapsed_seconds. The markdown and optional HTML reports render the same concepts in ## Token Usage and ## Performance Diagnostics.

Treat missing optional sections as normal and preserve unknown keys when forwarding or storing reports.

Minimal valid payload example

{
  "meta": {
    "model": "deepseek-coder-v2:16b",
    "base_sha": "1111111",
    "head_sha": "2222222",
    "generated_at": "2026-02-26T12:00:00Z",
    "diff_stats": {
      "files": 0,
      "insertions": 0,
      "deletions": 0
    },
    "limits": {
      "max_diff_bytes": 32768,
      "max_files": 50
    }
  },
  "summary": {
    "overall_risk": "low",
    "top_issues": []
  },
  "findings": []
}

That structure is intentionally minimal, but valid for parser bootstrap checks and contract tests.
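A parser bootstrap check along those lines might look like the following sketch. It embeds the minimal payload shown above and verifies the required meta and meta.diff_stats keys; `missing_meta_keys` is a hypothetical helper, not part of PatchPatrol.

```python
import json

MINIMAL_PAYLOAD = """
{
  "meta": {
    "model": "deepseek-coder-v2:16b",
    "base_sha": "1111111",
    "head_sha": "2222222",
    "generated_at": "2026-02-26T12:00:00Z",
    "diff_stats": {"files": 0, "insertions": 0, "deletions": 0},
    "limits": {"max_diff_bytes": 32768, "max_files": 50}
  },
  "summary": {"overall_risk": "low", "top_issues": []},
  "findings": []
}
"""

META_REQUIRED = ("model", "base_sha", "head_sha", "generated_at", "diff_stats", "limits")
DIFF_STATS_REQUIRED = ("files", "insertions", "deletions")


def missing_meta_keys(report: dict) -> list[str]:
    """List required meta keys (and diff_stats keys) absent from a report."""
    meta = report.get("meta", {})
    missing = [key for key in META_REQUIRED if key not in meta]
    diff_stats = meta.get("diff_stats")
    if isinstance(diff_stats, dict):
        missing += [f"diff_stats.{key}" for key in DIFF_STATS_REQUIRED if key not in diff_stats]
    return missing


report = json.loads(MINIMAL_PAYLOAD)
assert missing_meta_keys(report) == []
```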

Stability posture

ai-review.json is the automation contract for integrations.
ai-review.md and optional ai-review.html are human-facing renderings derived from that validated payload.
schema.json is the authoritative definition of the ai-review.json structure.
Integrators should validate required fields and enums from schema.json and treat extra fields as optional forward-compatible extensions.
Handle optional output sections defensively (check for null or absent values first), because they appear only when the corresponding feature ran.
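As one example of that defensive handling, the sketch below reads meta.security_precheck.status and tolerates a missing or null section. `security_precheck_status` is a hypothetical helper; the status values are the published enum for that field.

```python
from typing import Optional

SECURITY_PRECHECK_STATUSES = {"pass", "fail", "unavailable", "error", "skipped"}


def security_precheck_status(report: dict) -> Optional[str]:
    """Return the precheck status, or None when the section is absent or null.

    Values outside the published enum are also returned as None so that
    callers never branch on a status they do not recognize.
    """
    meta = report.get("meta") or {}
    section = meta.get("security_precheck")
    if not isinstance(section, dict):
        return None
    status = section.get("status")
    return status if status in SECURITY_PRECHECK_STATUSES else None
```

The same absent-or-null pattern applies to the other optional meta sections.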
