How it works

The living model

The corpus is a living, deduplicated model, not an append-only log. Every new entry goes through a write-or-merge step before it is allowed to become a row. This is what keeps repetition from bloating the corpus and what lets the model change its mind.

The write-or-merge logic

  1. Embed the candidate once (topic + position + reasoning). This one vector is reused for both the similarity search and, if we insert, the stored embedding — so an entry is embedded at most once.

  2. Find the nearest active entries — the top 3 closest public, non-superseded entries, closest first.

  3. Walk them closest-first. If a neighbour's similarity is below 0.83, stop: every remaining neighbour is farther still, so this is a genuinely new position and we insert.

  4. If similarity is at or above 0.83, we are in the "dedup band." Always ask the stance judge (a tiny LLM call comparing the two positions), because high similarity does NOT mean same stance: a contradiction sits on the same topic, so it embeds close. The judge returns one of:

       • same → MERGE (corroborate).
       • contradiction → insert the new entry AND mark the old one superseded (the principal changed their mind).
       • unrelated → false neighbour; try the next-nearest, and if none qualify, insert as new.
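
Steps 1 and 2 can be sketched as a filter followed by a ranked search. This is a minimal illustration, not the actual implementation: the Entry shape, the is_public flag, the nearest_active helper, and the use of cosine similarity are all assumptions made for the example (the text above only says "similarity").

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    # Hypothetical row shape; only superseded_by is named in the text above.
    id: str
    embedding: list[float]
    is_public: bool = True
    superseded_by: Optional[str] = None


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; assumed here as the similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest_active(candidate_vec: list[float], corpus: list[Entry], k: int = 3):
    """Return up to k (similarity, entry) pairs, closest first.

    Only public, non-superseded entries are considered. The same
    candidate_vec would later be stored if the entry is inserted,
    so the candidate is embedded only once.
    """
    active = [e for e in corpus if e.is_public and e.superseded_by is None]
    scored = [(cosine(candidate_vec, e.embedding), e) for e in active]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


corpus = [
    Entry("KE-0001", [1.0, 0.0]),
    Entry("KE-0002", [0.9, 0.1]),
    Entry("KE-0003", [0.0, 1.0]),
    Entry("KE-0004", [1.0, 0.0], superseded_by="KE-0001"),  # filtered out
    Entry("KE-0005", [1.0, 0.0], is_public=False),          # filtered out
    Entry("KE-0006", [0.5, 0.5]),
]
top = nearest_active([1.0, 0.0], corpus)
```

A real deployment would more likely push the filter and the ranking into the vector store's own query; the in-memory scan here is only to make the ordering and filtering explicit.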

Fail-safe when the judge is down: if the stance call errors, we fall back on similarity alone, merging only if it is at or above 0.93 (very close means strong duplicate evidence, so fail toward merge), otherwise skip and try the next neighbour.
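
The whole walk, including the fail-safe, fits in one small function. The sketch below assumes the neighbours arrive as (similarity, entry_id) pairs sorted closest first and that the stance judge is a callable that may raise; the names decide, Action, and the judge signature are hypothetical. Only the 0.83 and 0.93 thresholds and the three verdicts come from the description above.

```python
from enum import Enum
from typing import Callable, Optional, Sequence, Tuple

DEDUP_THRESHOLD = 0.83           # below this, the candidate is new
FALLBACK_MERGE_THRESHOLD = 0.93  # used only when the judge call fails


class Action(Enum):
    INSERT = "insert"
    MERGE = "merge"
    SUPERSEDE = "supersede"  # insert the new entry and mark the old one superseded


def decide(
    neighbours: Sequence[Tuple[float, str]],
    judge: Callable[[str], str],
) -> Tuple[Action, Optional[str]]:
    """Walk neighbours closest first and return (action, target_entry_id)."""
    for similarity, entry_id in neighbours:
        if similarity < DEDUP_THRESHOLD:
            # Sorted closest first, so every later neighbour is farther still.
            break
        try:
            verdict = judge(entry_id)
        except Exception:
            # Judge is down: fall back on similarity alone.
            if similarity >= FALLBACK_MERGE_THRESHOLD:
                return Action.MERGE, entry_id
            continue  # not close enough to merge blind; try the next neighbour
        if verdict == "same":
            return Action.MERGE, entry_id
        if verdict == "contradiction":
            return Action.SUPERSEDE, entry_id
        # "unrelated" (and, by assumption, any unknown verdict): false
        # neighbour, so move on to the next-nearest.
    return Action.INSERT, None


def broken_judge(entry_id: str) -> str:
    raise RuntimeError("stance judge unavailable")


verdicts = {"KE-0001": "unrelated", "KE-0002": "same"}
result = decide([(0.90, "KE-0001"), (0.85, "KE-0002")], verdicts.__getitem__)
```

In the usage line, the closest neighbour is judged unrelated, so the walk continues and the candidate merges into KE-0002. With broken_judge, a 0.95 neighbour still merges, while a 0.90 neighbour is skipped and the candidate is inserted as new.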

Why always run the judge in the band: an earlier version merged on embedding similarity alone and silently swallowed contradictions, because a reversed position was merged into the very entry it contradicted.

What a MERGE does (corroboration)

No new row is created. Instead the existing entry is strengthened:

  • corroboration_count += 1 — the principal said this again, so it is more real.
  • If the count reaches 3, confidence is promoted to high automatically.
  • Tags are unioned, source_date is refreshed to the newer date (keeps it fresh in ranking), and last_corroborated_at is stamped.
  • The change is logged (e.g. "KE-0007 corroborated (×3) from meeting").
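
The four effects of a merge can be sketched as a single in-place update. This is an illustration under assumptions: the Entry dataclass, the merge_into helper, and the "medium" starting confidence are hypothetical, and the field names simply follow the ones used in the list above.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional

PROMOTE_AT = 3  # corroboration count at which confidence becomes "high"


@dataclass
class Entry:
    id: str
    tags: set[str]
    source_date: date
    corroboration_count: int
    confidence: str = "medium"  # assumed starting level
    last_corroborated_at: Optional[datetime] = None


def merge_into(
    existing: Entry,
    new_tags: set[str],
    new_source_date: date,
    source: str,
    now: Optional[datetime] = None,
) -> str:
    """Strengthen an existing entry in place and return the log line."""
    existing.corroboration_count += 1
    # ">=" rather than "==" so the entry stays high on later merges.
    if existing.corroboration_count >= PROMOTE_AT:
        existing.confidence = "high"
    existing.tags |= set(new_tags)
    # Keep whichever date is newer, so ranking sees the entry as fresh.
    existing.source_date = max(existing.source_date, new_source_date)
    existing.last_corroborated_at = now or datetime.now(timezone.utc)
    return (
        f"{existing.id} corroborated "
        f"(×{existing.corroboration_count}) from {source}"
    )


entry = Entry(
    id="KE-0007",
    tags={"pricing"},
    source_date=date(2024, 1, 10),
    corroboration_count=2,
)
log_line = merge_into(entry, {"strategy"}, date(2024, 3, 2), "meeting")
```

No new Entry is constructed anywhere in merge_into, which is the point: the candidate's only lasting effect is on the row it corroborates.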

Why this matters: repetition used to create duplicate rows, the corpus bloated, and the same idea outvoted itself in search. Now repetition makes one entry stronger — which is exactly what "the principal keeps saying this" should mean.

What SUPERSESSION does

When the stance judge says two positions on the same question are opposite (the principal changed their mind), we do not edit or delete the old entry. We:

  • insert the new position as its own entry, and
  • set the old one's superseded_by to the new id.

Why not delete: the old view drops out of retrieval instantly (search filters on superseded_by IS NULL), but the history is preserved — you can always see what was previously believed and when it changed (logged with a timestamp). It is a soft replace, not an erase.
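
A soft replace of this kind can be sketched with SQLite. The schema below is an assumption made for the example (the entries and change_log tables, the position column, and the supersede and active_ids helpers are hypothetical); only the superseded_by column and the superseded_by IS NULL filter come from the description above.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entries (
        id            TEXT PRIMARY KEY,
        position      TEXT NOT NULL,
        superseded_by TEXT REFERENCES entries(id)
    );
    CREATE TABLE change_log (
        at      TEXT NOT NULL,
        message TEXT NOT NULL
    );
    """
)


def supersede(conn: sqlite3.Connection, old_id: str, new_id: str, new_position: str) -> None:
    """Insert the new position and point the old entry at it; nothing is deleted."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits all three statements together, or rolls them all back
        conn.execute(
            "INSERT INTO entries (id, position) VALUES (?, ?)",
            (new_id, new_position),
        )
        conn.execute(
            "UPDATE entries SET superseded_by = ? WHERE id = ?",
            (new_id, old_id),
        )
        conn.execute(
            "INSERT INTO change_log (at, message) VALUES (?, ?)",
            (now, f"{old_id} superseded by {new_id}"),
        )


def active_ids(conn: sqlite3.Connection) -> list[str]:
    """Ids visible to retrieval: entries that have not been superseded."""
    rows = conn.execute(
        "SELECT id FROM entries WHERE superseded_by IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


conn.execute(
    "INSERT INTO entries (id, position) VALUES (?, ?)",
    ("KE-0007", "Prefer weekly releases"),
)
conn.commit()
supersede(conn, "KE-0007", "KE-0042", "Prefer monthly releases")
```

After the call, active_ids returns only the new entry, while the old row is still in the table with superseded_by set, so the earlier position and the time of the change remain queryable.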