Reasoning patterns
Entries capture what the principal thinks. Patterns capture how they think — the repeatable decision and judgment moves.
These are the whole point of the extrapolation goal. When the team asks something the principal was never directly asked, there is no matching entry, so the agent reasons from these patterns instead. Patterns are discovered from real messages rather than drawn from a fixed list: reasoning moves can't be known in advance, so a hard-coded list would be both wrong and frozen.
Why "snowball"
A pattern is treated like a snowball that gets rounder, not bigger, as more observations roll in. It should accumulate reasoning, not fragment into dozens of near-synonyms. (An earlier version did exactly that, and half of its patterns were just rhetorical tics.)
The lifecycle of one observation
Each new entry hands the patterns engine a reasoning_pattern slug (if the extractor tagged one) and/or a raw pattern observation (a described move). Then:
- Tagged a known slug? Trust it and accumulate onto that pattern.
- Otherwise, embed the observation and find the nearest existing pattern. If similarity is at or above 0.72, ask a stance judge:
  - aligned: same move (maybe adds nuance). Accumulate: record the observation, and refine the one canonical template (an LLM distills a tighter one-liner of ≤25 words; if nothing new was added, it's left unchanged). The snowball gets rounder.
  - opposing: the same kind of decision resolved the OPPOSITE way, usually because context differs. Record a context variant (below). This is not a contradiction to resolve but a branch of the same umbrella move.
  - unrelated: false neighbour; fall through to mint.
- Mint a new provisional pattern, but first a classifier rejects pure communication style (metaphor, tone, storytelling, phrasing). Style is not a reasoning pattern.
Why 0.72 (lower than the 0.83 entry-dedup bar): short reasoning moves embed with more surface variance, so genuinely-similar moves land at lower cosine. A looser bar just means the LLM stance judge — the real discriminator — gets consulted more often. The stance judge fails safe to "aligned": the embedding already says they are close, and the disease being cured is fragmentation, so on uncertainty we accumulate rather than split.
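The routing described above can be sketched roughly as follows. This is a minimal illustration, not the actual engine: the names `Pattern`, `route`, `safe_stance`, `judge`, and `is_style` are hypothetical, the embedding is assumed to arrive as a plain list of floats, and the stance judge and style classifier are passed in as ordinary callables standing in for the LLM calls. Only the 0.72 threshold, the three stance labels, and the fail-safe to "aligned" come from the text.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional

SIMILARITY_THRESHOLD = 0.72  # from the text; looser than the 0.83 entry-dedup bar
STANCES = {"aligned", "opposing", "unrelated"}


@dataclass
class Pattern:  # hypothetical shape, for illustration only
    slug: str
    vector: list[float]
    observations: list[str] = field(default_factory=list)


def cosine(a: list[float], b: list[float]) -> float:
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def safe_stance(judge: Callable[[Pattern, str], str], pattern: Pattern, obs: str) -> str:
    """Ask the judge; on an error or an unexpected label, fail safe to 'aligned'."""
    try:
        verdict = judge(pattern, obs)
    except Exception:
        return "aligned"
    return verdict if verdict in STANCES else "aligned"


def route(
    observation: str,
    vector: Optional[list[float]],
    slug: Optional[str],
    patterns: dict[str, Pattern],
    judge: Callable[[Pattern, str], str],
    is_style: Callable[[str], bool],
) -> tuple[str, Optional[str]]:
    """Return (action, pattern_slug) for one observation."""
    # Step 1: a known slug from the extractor is trusted as-is.
    if slug is not None and slug in patterns:
        return ("accumulate", slug)
    # Step 2: nearest neighbour by embedding, then the stance judge decides.
    if vector is not None and patterns:
        nearest = max(patterns.values(), key=lambda p: cosine(vector, p.vector))
        if cosine(vector, nearest.vector) >= SIMILARITY_THRESHOLD:
            stance = safe_stance(judge, nearest, observation)
            if stance == "aligned":
                return ("accumulate", nearest.slug)
            if stance == "opposing":
                return ("variant", nearest.slug)
            # "unrelated": false neighbour, fall through to mint
    # Step 3: mint, unless the classifier says this is pure communication style.
    if is_style(observation):
        return ("reject_style", None)
    return ("mint", None)
```

In this sketch the threshold only decides whether the judge is consulted at all; the judge's verdict, not the cosine score, picks between accumulating, recording a variant, and minting.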
How variants are recorded
An opposing observation becomes a variant row with: the variant's own one-line template, plus scope, department, and context (any it cannot determine is null). A variant is keyed on the full (scope, department, context) triple, so "the same move under a different department" is a distinct variant, not an overwrite. If an observation matches an existing variant key, its template is snowball-refined (folded in), not clobbered.
Concrete example: the umbrella pattern is "push ownership down." In the general case: send people back to own it. But with other teams the move flips to "push it back, let it burn." That flip is stored as a variant with context "other teams" — not as a contradiction. At answer time, /retrieve returns the pattern with its variants, so the agent can pick the version that fits the asker's situation instead of blindly applying the canonical one.
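One way to picture the variant storage is a dictionary keyed on the full (scope, department, context) triple. The sketch below is an assumption-laden illustration: `PatternWithVariants`, `record_variant`, `template_for`, and `fold` are hypothetical names, `fold` is a trivial stand-in for the LLM refinement step, and the exact-match lookup simplifies what the agent does when it picks a fitting variant from the /retrieve result.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# (scope, department, context); any part that cannot be determined is None
VariantKey = tuple[Optional[str], Optional[str], Optional[str]]


def fold(old: str, new: str) -> str:
    """Placeholder for snowball refinement: keep old if nothing new was added."""
    return old if new == old else f"{old}; {new}"


@dataclass
class PatternWithVariants:  # hypothetical shape, for illustration only
    slug: str
    template: str  # the canonical one-liner
    variants: dict[VariantKey, str] = field(default_factory=dict)

    def record_variant(
        self,
        template: str,
        scope: Optional[str] = None,
        department: Optional[str] = None,
        context: Optional[str] = None,
        refine: Callable[[str, str], str] = fold,
    ) -> None:
        key = (scope, department, context)
        existing = self.variants.get(key)
        # Same key: fold the new wording in. New key: a distinct variant.
        self.variants[key] = template if existing is None else refine(existing, template)

    def template_for(
        self,
        scope: Optional[str] = None,
        department: Optional[str] = None,
        context: Optional[str] = None,
    ) -> str:
        """Exact-match lookup; falls back to the canonical template."""
        return self.variants.get((scope, department, context), self.template)


pattern = PatternWithVariants("push-ownership-down", "send people back to own it")
pattern.record_variant("push it back, let it burn", context="other teams")
```

Because the key is the whole triple, recording the same move under a different department adds a second row instead of overwriting the first.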
Confirmation and pruning
- A pattern is confirmed after just 2 real observations. Why so low: with semantic accumulation, a move made twice is real and should reach the answer layer, not starve in "provisional" forever. (The old threshold of 4 was unreachable in one pass, so learned patterns never surfaced.)
- Only active patterns (confirmed OR seeded) are shown to the extractor and served by /retrieve. Provisional ones (seen once) stay hidden: a single occurrence is a hypothesis, not a pattern.
- A provisional pattern not confirmed within 7 days is pruned (with its vector, observations, and variants), and any entry tagged with it has that tag cleared. This keeps the catalogue bounded.
- A periodic merge step clusters near-duplicate patterns into ≤14 canonical ones, re-points their entries/observations/variants to the survivor, sums their observation counts (confirming any that cross the threshold from the sum), and deletes flagged rhetorical ones. Seed patterns are never deleted, so the spine always survives.
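The confirmation, visibility, and pruning rules above can be sketched as a few small predicates. The names here (`PatternState`, `should_prune`, `merge_counts`) are hypothetical, and two details are assumptions rather than facts from the text: the 7-day clock is taken to start at the pattern's creation time, and a pattern is pruned only once strictly more than 7 days have passed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

CONFIRM_AFTER = 2                 # observations needed to confirm (from the text)
PRUNE_AFTER = timedelta(days=7)   # provisional lifetime (from the text)


@dataclass
class PatternState:  # hypothetical shape, for illustration only
    slug: str
    first_seen: datetime  # assumption: the prune clock starts at creation
    observation_count: int = 0
    seeded: bool = False

    @property
    def confirmed(self) -> bool:
        return self.observation_count >= CONFIRM_AFTER

    @property
    def active(self) -> bool:
        # Only active patterns are shown to the extractor and served by /retrieve.
        return self.confirmed or self.seeded


def should_prune(p: PatternState, now: datetime) -> bool:
    """Provisional patterns expire; confirmed and seeded ones never do."""
    return not p.active and now - p.first_seen > PRUNE_AFTER


def merge_counts(survivor: PatternState, duplicates: list[PatternState]) -> PatternState:
    """Sum observation counts onto the survivor; confirmation follows from the sum."""
    survivor.observation_count += sum(d.observation_count for d in duplicates)
    return survivor
```

Under these rules, two provisional near-duplicates with one observation each become a single confirmed pattern after a merge, since the summed count reaches the threshold.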
The 4 seeds
Four patterns are seeded from first principles so the extractor has slugs to tag from day one:
- ship-imperfect-iterate-with-signal
- push-ownership-down
- every-system-is-future-ai-training
- control-frame-let-data-talk
They are marked "seeded, pending confirmation" — hypotheses until real data backs them. Their authored wording is protected from refinement so accumulated observations can't drift the original phrasing.
A note on voice
Alongside the ontology (what the principal knows) and patterns (how they reason), a third artifact captures voice: a profile updated once per conversation, with a strong novelty gate and periodic consolidation so it stays small (it is injected into prompts, so bloat hurts). It is intentionally a charitable voice profile, not an uncharitable psych-eval; an earlier version read like the latter and was corrected.