Deterministic CC0 corpus

Million-Chunk Performance

Paired query latency, controlled indexing, full-system paths, and pinned public retrieval quality.

Warm p95

3.57x

53.77 ms -> 15.07 ms

Index throughput

21.96x

109006 chunks/s

Footprint

-57.0%

1.06 GiB -> 0.46 GiB

Acceptance

PASS

Latency, quality, footprint, recall, and documented indexing ceiling.

Full-system query paths

Path	Baseline p95 (ms)	Current p95 (ms)	Ratio (current / baseline)
Process cold	307.22	478.02	1.556
Warm distinct	112.57	58.29	0.518
Cache replay	27.85	31.96	1.148
Filtered	257.31	45.22	0.176
Warm CLI	151.27	138.99	0.919
Concurrent	164.74	99.17	0.602
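
The Ratio column is consistent with current p95 divided by baseline p95, so values below 1 are improvements and values above 1 are regressions. A minimal sketch that recomputes the column from the rows above and lists the paths that got slower; the names `ROWS`, `ratios`, and `regressions` are illustrative and not part of the benchmark tooling:

```python
# Rows copied from the full-system table: (path, baseline p95 ms, current p95 ms).
ROWS = [
    ("Process cold", 307.22, 478.02),
    ("Warm distinct", 112.57, 58.29),
    ("Cache replay", 27.85, 31.96),
    ("Filtered", 257.31, 45.22),
    ("Warm CLI", 151.27, 138.99),
    ("Concurrent", 164.74, 99.17),
]


def ratios(rows):
    """Return current / baseline p95 per path, rounded to three decimals."""
    return {path: round(current / baseline, 3) for path, baseline, current in rows}


def regressions(rows):
    """Return the paths whose current p95 is higher than the baseline p95."""
    return [path for path, baseline, current in rows if current > baseline]


if __name__ == "__main__":
    for path, ratio in ratios(ROWS).items():
        print(f"{path}\t{ratio:.3f}")
    print("Slower than baseline:", ", ".join(regressions(ROWS)))
```

On these numbers, two of the six paths (Process cold and Cache replay) are slower than baseline at p95; the other four are faster.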

Public retrieval quality

Metric	Baseline	Current	Delta
ndcg_at_10	0.2620 +/- 0.0002	0.2666 +/- 0.0016	+0.0046
mrr_at_10	0.2178 +/- 0.0004	0.2220 +/- 0.0014	+0.0042
precision_at_5	0.0561 +/- 0.0002	0.0601 +/- 0.0015	+0.0039
recall_at_20	0.5080 +/- 0.0024	0.4890 +/- 0.0008	-0.0190
no_hit_rate	0.0000 +/- 0.0000	0.0000 +/- 0.0000	+0.0000
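
One rough way to read the table is to compare each delta against the reported +/- values. The report does not define what +/- denotes, so the sketch below assumes it is a run-to-run spread and simply asks whether the absolute delta is larger than the two spreads added together. This is a coarse screen, not a significance test, and `METRICS` and `delta_exceeds_spread` are illustrative names:

```python
# Values copied from the table above:
# metric -> (baseline, baseline +/-, current, current +/-, published delta).
METRICS = {
    "ndcg_at_10": (0.2620, 0.0002, 0.2666, 0.0016, 0.0046),
    "mrr_at_10": (0.2178, 0.0004, 0.2220, 0.0014, 0.0042),
    "precision_at_5": (0.0561, 0.0002, 0.0601, 0.0015, 0.0039),
    "recall_at_20": (0.5080, 0.0024, 0.4890, 0.0008, -0.0190),
    "no_hit_rate": (0.0000, 0.0000, 0.0000, 0.0000, 0.0000),
}


def delta_exceeds_spread(baseline_pm, current_pm, delta):
    """True when |delta| is larger than the sum of both +/- values.

    Assumption: +/- is a run-to-run spread; the report does not say so.
    """
    return abs(delta) > baseline_pm + current_pm


if __name__ == "__main__":
    for name, (_, base_pm, _, cur_pm, delta) in METRICS.items():
        flag = delta_exceeds_spread(base_pm, cur_pm, delta)
        print(f"{name}\t{delta:+.4f}\t{'outside spread' if flag else 'within spread'}")
```

Under that assumption, every metric except no_hit_rate moves by more than the combined spread, including the drop in recall_at_20.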

Per-dataset nDCG@10

Dataset	Baseline	Current	Delta
codetrans-dl	0.2365 +/- 0.0005	0.2402 +/- 0.0033	+0.0037
codetrans-contest	0.4269 +/- 0.0000	0.4278 +/- 0.0001	+0.0009
cosqa	0.1464 +/- 0.0003	0.1566 +/- 0.0041	+0.0102
codefeedback-st	0.5243 +/- 0.0000	0.5108 +/- 0.0000	-0.0135

Method

The query run kept both daemons live and alternated request order between them, so neither binary was consistently measured first. The controlled indexing run used the same generated corpus for both binaries. The exact full-system run is retained separately because the shared host was heavily saturated.
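
The paired query measurement can be sketched as follows. This is an illustration of alternating request order between two live systems, not the harness used for these results: the callables standing in for the baseline and current daemons, the function names, and the nearest-rank p95 definition are all assumptions.

```python
import time


def p95(samples):
    """Nearest-rank 95th percentile (one common definition; assumed here)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def paired_latencies(queries, systems, clock=time.perf_counter):
    """Send every query to every system, flipping the order on each query.

    `systems` maps a label to a callable that issues one request; both
    callables are hypothetical stand-ins for the two live daemons.
    Returns per-system latency lists in milliseconds.
    """
    names = list(systems)
    latencies = {name: [] for name in names}
    for i, query in enumerate(queries):
        # Even-numbered queries go baseline-first, odd-numbered current-first,
        # so warm-up and cache effects are not charged to one side only.
        order = names if i % 2 == 0 else names[::-1]
        for name in order:
            start = clock()
            systems[name](query)
            latencies[name].append((clock() - start) * 1000.0)
    return latencies


if __name__ == "__main__":
    # Placeholder workloads; a real run would call the two daemons instead.
    demo = {
        "baseline": lambda q: sum(range(20000)),
        "current": lambda q: sum(range(5000)),
    }
    result = paired_latencies([f"query {n}" for n in range(200)], demo)
    for name, samples in result.items():
        print(f"{name}\tp95 = {p95(samples):.3f} ms")
```

Alternating the order per query keeps host-level drift, such as load on a shared machine, from biasing one binary more than the other.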