Skip to content

DSPy Training Path Audit

  • Audit date: 2026-03-18 (Asia/Tbilisi)
  • Repository root: /home/standard/dspy_rag_in_repo_docs_and_impl1_step1final
  • Git HEAD during verification: 7aca0a4ada051e4eca430d55a29c98e8fe4e1077

Scope

This audit captures step 1 of the repository roadmap: turning the DSPy path into a better-tested, documented, notebook-visible compile and reload workflow, then verifying the repository-native checks, the broad quality gate, and a live end-to-end DSPy compile plus saved-program reload.

Executed Commands

Executed successfully in this turn:

  • make hooks-install
  • uv run python -m compileall src tests
  • uv run pytest tests/test_utilities.py tests/test_repository_rag_bdd.py
  • uv run repo-rag verify-surfaces
  • uv run repo-rag smoke-test
  • cargo build --manifest-path rust-cli/Cargo.toml
  • uv run pytest tests/test_dspy_training.py tests/test_cli_and_dspy.py tests/test_verification.py tests/test_benchmarks_and_notebook_scaffolding.py tests/test_project_surfaces.py
  • make quality
  • set -a; . /home/standard/dspy_rag_in_repo_docs_and_impl1/.env; set +a; uv run repo-rag dspy-train --root . --run-name step1-smoke-final --optimizer bootstrapfewshot --max-bootstrapped-demos 1 --max-labeled-demos 1
  • set -a; . /home/standard/dspy_rag_in_repo_docs_and_impl1/.env; set +a; uv run repo-rag ask --root . --question "What does this repository research?" --use-dspy --dspy-program-path artifacts/dspy/step1-smoke-final/program.json

Notable Results

  • make hooks-install: passed and refreshed the managed pre-commit plus pre-push hooks
  • uv run python -m compileall src tests: passed
  • uv run pytest tests/test_utilities.py tests/test_repository_rag_bdd.py: passed, 8 tests
  • uv run repo-rag verify-surfaces: passed after trimming the new notebook code cell back to the repository 25-line limit
  • uv run repo-rag smoke-test: passed with answer_contains_repository: true, mcp_candidate_count: 1, and manifest_path: artifacts/azure/repo-rag-smoke.json
  • cargo build --manifest-path rust-cli/Cargo.toml: passed
  • focused DSPy and notebook-facing pytest slice: passed, 52 tests
  • make quality: passed with 87 tests and 87% total coverage
  • live repo-rag dspy-train smoke: passed and wrote artifacts/dspy/step1-smoke-final/program.json plus artifacts/dspy/step1-smoke-final/metadata.json
  • live repo-rag ask --use-dspy --dspy-program-path ...: passed and loaded the saved program

Current Verification Status

Configured and verified in this turn:

  • Compile checks: present and passed through uv run python -m compileall src tests
  • Utility and baseline pytest slice: present and passed through uv run pytest tests/test_utilities.py tests/test_repository_rag_bdd.py
  • Notebook and Makefile contract verification: present and passed through uv run repo-rag verify-surfaces
  • Repository smoke test: present and passed through uv run repo-rag smoke-test
  • Rust build: present and passed through cargo build --manifest-path rust-cli/Cargo.toml
  • Focused DSPy, notebook-scaffolding, verification, and surface tests: present and passed through the targeted uv run pytest ... slice above
  • Lint, notebook lint, mypy, basedpyright, repository-surface verification, complexity, pytest, and coverage: present and passed through make quality
  • Live DSPy compile and saved-program reload: present and passed through the two .env-backed uv run repo-rag ... commands above

Still absent or not exercised in this turn:

  • UI or browser tests: none found in the repository configuration
  • Full notebook execution batch: notebook lint and notebook-surface checks passed, but make notebook-report was not rerun end-to-end in this turn
  • Live Azure deployment or inference tests: not rerun in this turn
  • Post-push GitHub Actions evidence: not yet available before the push for this change set

Notes

  • The new DSPy tests cover LM config resolution from Azure/OpenAI env shapes, artifact path sanitization, paraphrase scoring in the repository answer metric, unsupported optimizer errors, and the runtime branches that skip or use compiled programs.
  • The training notebook now stays within the playbook contract while exposing the latest compiled DSPy artifact for inspection or reuse when LM configuration is available in-process.
  • The live step1-smoke-final compile succeeded, but its benchmark summary reported 0 passes out of 3 cases. That result confirms the next real bottleneck: retrieval quality under the DSPy layer, not the absence of a compile path.