
Retrieval Quality Doc Priority Audit

  • Audit date: 2026-03-18 (Asia/Tbilisi)
  • Repository root: /home/standard/repo-rag-retrieval-home
  • Base origin/master: a86dc7c

Scope

This audit captures a retrieval-quality pass focused on full-corpus answer quality rather than DSPy plumbing. The change set does three things together:

  • demote meta and synthetic repo surfaces such as tests, training samples, audits, generated inventories, and summary overlays during live retrieval
  • add light lexical normalization and doc-seeking / code-seeking path heuristics so primary docs, source files, and headers win more often on "which file explains..." style questions
  • tighten the fairness-filtered benchmark corpus so repo-meta overlays such as README.AGENTS.md, FILES.md, env.md, TODO.MD, todo-backlog.yaml, AGENTS.md.d/, and generated exploratorium manifests do not contaminate the benchmark loop
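
The demotion described in the first bullet can be pictured as a path-based score penalty. The following is a minimal sketch, not the repository's implementation: the names `is_meta_surface` and `rerank`, the directory set, and the 0.5 penalty are all hypothetical, and only the file names in `META_FILE_NAMES` come from this audit.

```python
from pathlib import PurePosixPath

# Hypothetical directory names; the audit does not list the real patterns.
META_DIR_PARTS = {"tests", "audits", "training", "AGENTS.md.d"}
# File names taken from the benchmark-corpus bullet in this audit.
META_FILE_NAMES = {"README.AGENTS.md", "FILES.md", "env.md", "TODO.MD", "todo-backlog.yaml"}


def is_meta_surface(path: str) -> bool:
    """Return True when a repo-relative path looks like a meta or synthetic surface."""
    p = PurePosixPath(path)
    if p.name in META_FILE_NAMES:
        return True
    # Check only the directory components, not the file name itself.
    return any(part in META_DIR_PARTS for part in p.parts[:-1])


def rerank(hits, penalty=0.5):
    """Scale the score of meta surfaces by `penalty` (an assumed value), then sort descending.

    `hits` is a list of (path, score) pairs.
    """
    adjusted = [
        (path, score * penalty if is_meta_surface(path) else score)
        for path, score in hits
    ]
    return sorted(adjusted, key=lambda hit: hit[1], reverse=True)
```

With this sketch, a test file that initially outscores a primary doc drops below it once the penalty is applied, which is the ranking behavior the bullet describes.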

The user-visible regression guard for this work is now the full-corpus retrieval test slice in tests/test_retrieval.py. It asserts that, for each tracked repository question, none of the top four hits is a test file, training sample, audit note, or generated meta surface.
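
The shape of such a guard can be sketched as a helper that lists any meta surfaces among the first k hits. This is an illustration only: `looks_like_meta` and `offending_hits` are hypothetical names, and the prefix checks stand in for whatever classification tests/test_retrieval.py actually uses.

```python
def looks_like_meta(path: str) -> bool:
    """Hypothetical predicate; the real test may classify paths differently."""
    return path.startswith(("tests/", "audits/", "training/"))


def offending_hits(ranked_paths, k=4, is_meta=looks_like_meta):
    """Return the meta-surface paths that appear within the top `k` ranked hits."""
    return [path for path in list(ranked_paths)[:k] if is_meta(path)]
```

A regression test would then assert that `offending_hits(...)` is empty for every tracked question. Hits below rank four are ignored, matching the top-four contract stated above.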

Executed Commands

Executed successfully in this turn:

  • TMPDIR=/home/standard/.tmp uv run python -m compileall src tests
  • TMPDIR=/home/standard/.tmp uv run pytest tests/test_utilities.py tests/test_repository_rag_bdd.py
  • TMPDIR=/home/standard/.tmp uv run pytest tests/test_retrieval.py
  • TMPDIR=/home/standard/.tmp uv run pytest tests/test_retrieval.py tests/test_benchmarks_and_notebook_scaffolding.py
  • TMPDIR=/home/standard/.tmp uv run repo-rag smoke-test
  • TMPDIR=/home/standard/.tmp uv run repo-rag retrieval-eval --root . --top-k 4 --top-k-sweep 1,2,4,8 --minimum-pass-rate 1.0 --minimum-source-recall 1.0
  • TMPDIR=/home/standard/.tmp make hooks-install
  • TMPDIR=/home/standard/.tmp make verify-surfaces
  • TMPDIR=/home/standard/.tmp CARGO_TARGET_DIR=/home/standard/.cargo-target/repo-rag-retrieval-home cargo build --manifest-path rust-cli/Cargo.toml
  • TMPDIR=/home/standard/.tmp make quality

Results

  • uv run python -m compileall src tests: passed
  • focused utility + repository BDD slice: passed, 15 tests
  • retrieval regression slice: passed, 7 tests
  • retrieval + benchmark/notebook scaffold slice: passed, 19 tests
  • uv run repo-rag smoke-test: passed with
      • answer_contains_repository: true
      • mcp_candidate_count: 1
      • manifest_path: artifacts/azure/repo-rag-smoke.json
  • strict retrieval gate: passed with
      • pass_rate: 1.0
      • average_source_recall: 1.0
      • threshold_failures: []
  • make hooks-install: passed
  • make verify-surfaces: passed with issue_count: 0
  • Rust wrapper build: passed after moving both TMPDIR and CARGO_TARGET_DIR off /tmp
  • make quality: passed with 133 tests and 88.17% total coverage
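
For readers unfamiliar with the gate fields reported above, the following sketch shows one way pass_rate, average_source_recall, and threshold_failures could be derived. It rests on assumptions: recall is taken as the fraction of expected sources found in the top-k hits, and a question is taken to pass only when its recall is 1.0. The function names and the case dictionary layout are hypothetical, and repo-rag's actual definitions may differ.

```python
def source_recall(expected, retrieved):
    """Fraction of expected source paths present in the retrieved paths."""
    expected = set(expected)
    if not expected:
        return 1.0
    return len(expected & set(retrieved)) / len(expected)


def evaluate_gate(cases, top_k=4, minimum_pass_rate=1.0, minimum_source_recall=1.0):
    """Compute gate metrics over `cases`, each a dict with 'expected' and 'retrieved' lists."""
    if not cases:
        raise ValueError("at least one benchmark case is required")
    recalls = [source_recall(c["expected"], c["retrieved"][:top_k]) for c in cases]
    # Assumption: a case passes only when every expected source is retrieved.
    pass_rate = sum(1 for r in recalls if r == 1.0) / len(recalls)
    average = sum(recalls) / len(recalls)
    failures = []
    if pass_rate < minimum_pass_rate:
        failures.append("pass_rate")
    if average < minimum_source_recall:
        failures.append("average_source_recall")
    return {
        "pass_rate": pass_rate,
        "average_source_recall": average,
        "threshold_failures": failures,
    }
```

With both minimums at 1.0, as in the retrieval-eval command in this audit, a single missed source in any case is enough to populate threshold_failures under these assumed definitions.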

Notes

  • The host had /tmp at 100% usage during this turn. An initial Rust build attempt failed for that reason, not because of a source regression. Re-running with TMPDIR=/home/standard/.tmp and CARGO_TARGET_DIR=/home/standard/.cargo-target/repo-rag-retrieval-home passed cleanly.
  • The retrieval benchmark gate remains strict on pass rate and average source recall. This turn did not change those thresholds; it improved the live full-corpus ranking behavior and documented the narrower benchmark corpus contract.
  • Not exercised in this turn: live Azure endpoint probes, notebook batch execution, publication PDF build, and post-push GitHub Actions evidence.
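
The /tmp exhaustion noted above can be checked before a build starts. The sketch below uses the standard-library `shutil.disk_usage` call to pick the first candidate directory with enough free space. `pick_tmpdir` is a hypothetical helper, not part of the repository, and the free-space threshold is left to the caller.

```python
import shutil


def pick_tmpdir(candidates, min_free_bytes):
    """Return the first existing directory with at least `min_free_bytes` free, else None."""
    for path in candidates:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Missing or unreadable path: skip it and try the next candidate.
            continue
        if usage.free >= min_free_bytes:
            return path
    return None
```

A wrapper script could call `pick_tmpdir(["/tmp", "/home/standard/.tmp"], some_threshold)` and export the result as TMPDIR before invoking the build, rather than discovering the full disk from a failed compile.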