0005 — decision capture (the Phase-4 learning substrate)
Status: IN PROGRESS (2026-06-22) — substrate implemented (FrameVerdict
Proposed/DecidedAt, the analysis cache, Profile.Fingerprint); the learner
that consumes it is Phase 4. Companion to 0001-krites.md
(master design, esp. §6, §11 Phase 4), 0002-interface-contracts.md
(R-CULL-*, R-VRD-*, R-GLOBAL-6), and 0004-face-eye.md.
Scope: persist, per frame, the labelled decision record — the signals the machine measured, the verdict it proposed, and the verdict Hailey settled on — so that the Phase-4 learning loop (0001 §11) has accumulated history to learn from. This spec builds the data substrate, not the learner. It is the cheap, early move that turns Phase 4 from "start collecting data" into "train on a year of real labelled examples."
1. Why now, when the learner is Phase 4
0001 §1 ("a taste that grows") and §11 (Phase 4) promise a system that adapts
to Hailey's keep/reject history and auto-tunes the cull profile rather than
leaving her to hand-twiddle every knob. As Phases 1–2 add tunable knobs
(EyeOpenSoft, the EAR anchors, sharpness floors, dedup distance), the payoff for
that promise grows — but a learner can only learn from data we have been
capturing.
Today krites captures the override flag (R-VRD-1) but not the structured
signals behind a decision, nor the machine's original proposal once a human
overrides it, and it carries no analysis over between runs. So if Phase 4 shipped tomorrow it
would start from zero history. Capturing the decision record now — long before
the learner exists — means the learner arrives to a populated dataset.
The capture is also independently useful before any learning: it backs the
analysis cache (R-GLOBAL-6, fast re-cull) and makes verdicts explainable
("the machine proposed reject because sharpness 38 < 50; you kept it").
2. What a decision is
The labelled example the learner needs, per frame:
| Part | Source | Purpose |
|---|---|---|
| Signals | cull.Signals at cull time | the features (sharpness, exposure, faceCount, eye-open) |
| Proposed verdict | the machine's Resolve result | the model's label |
| Final verdict | the effective verdict after any human override | the ground-truth label |
| Overridden? | Override flag | whether the human corrected the machine — the highest-signal events |
| Profile id | the cull profile's name (+ a content hash) | the thresholds in force, since they change over time |
| Decided at | timestamp of the last change | ordering + recency weighting |
A frame where the machine proposed reject and Hailey set keep is one labelled
correction; a frame she left alone is a (weaker) agreement signal. Both matter.
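As a rough illustration of the table above, one labelled example could be modelled as a single value. This is only a sketch: `Decision`, `Label`, the stand-in `Verdict` and `Signals` types, and the `"default"` profile name are hypothetical and are not the real `cull` package types.

```go
package main

import "fmt"

// Verdict stands in for cull.Verdict; the concrete values are assumptions.
type Verdict string

const (
	Keep   Verdict = "keep"
	Reject Verdict = "reject"
)

// Signals stands in for cull.Signals; the field names are illustrative.
type Signals struct {
	Sharpness float64
	Exposure  float64
	FaceCount int
	EyeOpen   float64
}

// Decision is one labelled example, with one field per row of the table.
type Decision struct {
	Signals    Signals
	Proposed   Verdict // what the machine said
	Final      Verdict // what the human settled on
	Overridden bool
	ProfileID  string // profile name plus content hash
	DecidedAt  string // RFC3339 timestamp
}

// Label reports which kind of training signal the decision carries:
// a correction when the human changed the machine's verdict,
// otherwise a (weaker) agreement.
func (d Decision) Label() string {
	if d.Overridden && d.Proposed != d.Final {
		return "correction"
	}
	return "agreement"
}

func main() {
	d := Decision{
		Signals:    Signals{Sharpness: 38},
		Proposed:   Reject,
		Final:      Keep,
		Overridden: true,
		ProfileID:  "default",
		DecidedAt:  "2026-06-22T10:00:00Z",
	}
	fmt.Println(d.Label()) // prints "correction"
}
```

The sketch treats an override that lands on the same verdict as an agreement; whether that case deserves its own label is left open here.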
3. Where it lives — per-shoot, krites stays stateless
0001 §3 keeps krites stateless: the shoot holds the state. §2 makes a
cross-shoot catalogue a non-goal (that's Lightroom's job). Decision capture
must honour both:
- R-CAP-1 (MUST) Decision records live in the shoot's .krites/ sidecar, per-shoot, non-destructive (R-ND-1/2). krites owns no global decision store.
- R-CAP-2 (MUST) The Phase-4 learner (future) aggregates history by being pointed at a set of shoots, reading each one's records — the union is the cross-shoot history, without krites owning a catalogue. Discovering/looping that set is the learner's concern, specced in Phase 4, not here.
This keeps the load-bearing statelessness while still enabling cross-shoot learning at train time.
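To make the "union of per-shoot records" idea concrete, here is a minimal sketch of how a future learner might aggregate history without any global store. Everything in it is an assumption for illustration: `Record`, `ReadShoot`, `Aggregate`, and the shoot paths are hypothetical, and the real Phase-4 design may look quite different.

```go
package main

import "fmt"

// Record is a placeholder for one frame's decision record.
type Record struct {
	Shoot string
	Frame string
}

// ReadShoot loads the records held in one shoot's .krites/ sidecar.
// How the files are parsed is deliberately left to the caller.
type ReadShoot func(shootDir string) ([]Record, error)

// Aggregate builds the cross-shoot history as the plain union of each
// shoot's records. It keeps no state between calls.
func Aggregate(shootDirs []string, read ReadShoot) ([]Record, error) {
	var all []Record
	for _, dir := range shootDirs {
		recs, err := read(dir)
		if err != nil {
			return nil, fmt.Errorf("reading %s: %w", dir, err)
		}
		all = append(all, recs...)
	}
	return all, nil
}

func main() {
	// A fake reader stands in for real sidecar parsing.
	fake := func(dir string) ([]Record, error) {
		return []Record{{Shoot: dir, Frame: "0001"}}, nil
	}
	all, err := Aggregate([]string{"shoots/a", "shoots/b"}, fake)
	fmt.Println(len(all), err) // prints "2 <nil>"
}
```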
4. The data model
Two cooperating records, both per-shoot:
4.1 Analysis cache — the signals (also serves R-GLOBAL-6)
- R-CAP-3 (MUST) Per frame, the structured cull.Signals are persisted under .krites/analysis/ keyed by frame + an analysis-version tag (bumped when a signal's computation changes, so stale rows are detectable). This is the feature vector — numbers, not the reasons-as-text we keep today.
- R-CAP-4 (SHOULD) cull --reanalyze recomputes and overwrites; without it, cached signals are reused (R-GLOBAL-6) and only the verdict is re-resolved.
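The reuse rule in R-CAP-3 and R-CAP-4 can be sketched as a lookup that misses whenever the row is stale or a re-analysis was requested. This is one possible reading, not the implemented layout: `Cache`, `Entry`, `Lookup`, `Store`, the `"v2"` version value, and the frame name are all hypothetical, and the version is stored in the row rather than in the key.

```go
package main

import "fmt"

// AnalysisVersion is bumped whenever a signal's computation changes.
// The value here is made up for the example.
const AnalysisVersion = "v2"

// Signals stands in for cull.Signals; the field names are illustrative.
type Signals struct {
	Sharpness float64
	EyeOpen   float64
}

// Entry is one cached row: the signals plus the version that produced them.
type Entry struct {
	Version string
	Signals Signals
}

// Cache maps a frame name to its cached analysis.
type Cache map[string]Entry

// Lookup returns cached signals only when they were computed under the
// current analysis version and no re-analysis was requested.
func (c Cache) Lookup(frame string, reanalyze bool) (Signals, bool) {
	e, ok := c[frame]
	if !ok || reanalyze || e.Version != AnalysisVersion {
		return Signals{}, false
	}
	return e.Signals, true
}

// Store overwrites the row for a frame with freshly computed signals.
func (c Cache) Store(frame string, s Signals) {
	c[frame] = Entry{Version: AnalysisVersion, Signals: s}
}

func main() {
	c := Cache{"IMG_0001": {Version: "v1", Signals: Signals{Sharpness: 38}}}

	_, hit := c.Lookup("IMG_0001", false)
	fmt.Println(hit) // false: the row was computed under an older version

	c.Store("IMG_0001", Signals{Sharpness: 41})
	_, hit = c.Lookup("IMG_0001", false)
	fmt.Println(hit) // true: the row is current and can be reused
}
```

On a miss the caller recomputes the signals and calls `Store`; on a hit only the verdict needs re-resolving.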
4.2 Enriched verdict — the labels
The verdict record gains the machine's proposal so a human override no longer erases what the machine thought:
```go
type FrameVerdict struct {
	Verdict  cull.Verdict // the EFFECTIVE verdict (machine, or human if overridden)
	Reasons  []string
	Rating   int
	Cluster  string
	Override bool
	// Proposed is the machine's own verdict from the last cull, preserved even
	// when a human overrides Verdict — so (Proposed, Verdict) is the labelled
	// (machine-said, human-said) pair the learner trains on (0005 §2).
	Proposed cull.Verdict `yaml:"proposed,omitempty"`
	// DecidedAt is when Verdict last changed (RFC3339); recency for the learner.
	DecidedAt string `yaml:"decided_at,omitempty"`
}
```
- R-CAP-5 (MUST) On cull, Proposed is set to the machine verdict and equals Verdict (no override yet). On a human override (verdict CLI / studio), Verdict changes and Override=true but Proposed is preserved — yielding the (Proposed → Verdict) correction pair.
- R-CAP-6 (MUST) The records stay human-readable YAML (0001 §3) and deterministic given the same inputs (R-GLOBAL-7) — except DecidedAt, the one intentionally time-varying field.
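The cull-then-override behaviour in R-CAP-5 might look roughly like the sketch below. It is an illustration under assumptions: `fromCull` and `applyOverride` are hypothetical stand-ins (the real entry points are pipeline.Cull and Verdicts.Override), the struct is trimmed to the fields that matter here, and the timestamp is passed in as a string so the logic never reads the clock.

```go
package main

import "fmt"

// Verdict stands in for cull.Verdict; the concrete values are assumptions.
type Verdict string

const (
	Keep   Verdict = "keep"
	Reject Verdict = "reject"
)

// FrameVerdict is a trimmed copy of the record, keeping only the
// fields involved in the override rule.
type FrameVerdict struct {
	Verdict   Verdict
	Override  bool
	Proposed  Verdict
	DecidedAt string // RFC3339, supplied by the caller
}

// fromCull records the machine's verdict: Proposed equals Verdict and
// no override has happened yet.
func fromCull(machine Verdict, at string) FrameVerdict {
	return FrameVerdict{Verdict: machine, Proposed: machine, DecidedAt: at}
}

// applyOverride applies a human decision. It changes Verdict, sets
// Override, stamps DecidedAt, and leaves Proposed untouched.
func applyOverride(fv FrameVerdict, human Verdict, at string) FrameVerdict {
	fv.Verdict = human
	fv.Override = true
	fv.DecidedAt = at
	return fv
}

func main() {
	fv := fromCull(Reject, "2026-06-22T10:00:00Z")
	ov := applyOverride(fv, Keep, "2026-06-23T09:00:00Z")
	fmt.Println(ov.Proposed, ov.Verdict, ov.Override) // prints "reject keep true"
}
```

Because the timestamp is an argument, the same inputs always produce the same record, which keeps the function unit-testable in the spirit of R-GLOBAL-7.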
5. Non-goals (this spec)
- The learner itself — model, profile auto-tuning, the learn command: Phase 4.
- A cross-shoot catalogue / global store — explicitly out (0001 §2; R-CAP-1).
- Full event sourcing — we keep the latest decision per frame (signals + proposed + final + decided-at), not every intermediate flip. An append-only event log is a possible future enrichment if the learner wants trajectories; flagged, not built (§7-Q2).
- PII / consent for training — the records are local, per-shoot, never leave the machine; revisit only if learning ever moves cloud-side.
6. Implementation plan (alongside the eye wiring)
The minimal substrate, TDD, deterministic-core unit-tested:
- shoot.FrameVerdict gains Proposed + DecidedAt; Verdicts.Override preserves Proposed, stamps DecidedAt (passed in, not read from the clock, to stay testable — R-GLOBAL-7). Round-trip + override unit tests.
- pipeline.Cull sets Proposed = the resolved verdict and persists the per-frame Signals to the analysis cache.
- shoot gains analysis-cache read/write (.krites/analysis/), keyed by frame + analysis-version. Pure (de)serialisation, unit-tested.
- Docs: a short "what krites records, and why (Phase-4 ready)" page.
Layered so it lands with the eye-signal wiring — the first real culls then bank labelled data including the new eye signal.
7. Open questions
- Q1 — analysis-cache granularity. One file per frame (.krites/analysis/<frame>.yaml) vs one analysis.yaml map for the shoot. A per-frame file diffs cleanly and resumes well on 4,000 frames; a single map is simpler. Recommend: one map file now (matches verdicts.yaml), revisit if it gets heavy.
- Q2 — latest-state vs event log. Keep only the latest decision per frame, or append every change? Recommend: latest-state now (§5); event log deferred to Phase 4 if the learner wants trajectories.
- Q3 — profile identity. Store just the profile name, or name + a content hash of its thresholds? Recommend: name + hash — the thresholds move, and the learner needs to know which were in force.
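For Q3, a name-plus-hash identity could be derived along the following lines. This is a sketch only and does not describe how Profile.Fingerprint is actually implemented: the `Profile` shape, the `fingerprint` function, the threshold names, the `name@hash` format, and the 12-character truncation are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strconv"
)

// Profile is a hypothetical cull profile: a name and its thresholds.
type Profile struct {
	Name       string
	Thresholds map[string]float64
}

// fingerprint returns "name@hash", where hash covers the thresholds in
// sorted key order so the result does not depend on map iteration order.
func fingerprint(p Profile) string {
	keys := make([]string, 0, len(p.Thresholds))
	for k := range p.Thresholds {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// 'g' with precision -1 yields the shortest text that round-trips.
		v := strconv.FormatFloat(p.Thresholds[k], 'g', -1, 64)
		fmt.Fprintf(h, "%s=%s\n", k, v)
	}
	return p.Name + "@" + hex.EncodeToString(h.Sum(nil))[:12]
}

func main() {
	a := Profile{Name: "default", Thresholds: map[string]float64{
		"sharpness_floor": 50, "eye_open_soft": 0.2,
	}}
	b := a
	b.Thresholds = map[string]float64{"sharpness_floor": 55, "eye_open_soft": 0.2}

	fmt.Println(fingerprint(a) == fingerprint(a)) // true: stable for the same thresholds
	fmt.Println(fingerprint(a) == fingerprint(b)) // false: a moved threshold changes the hash
}
```

Storing this string with each decision would let a learner group examples by the exact thresholds in force, even when the profile name never changes.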