0006 — develop: straighten, crop, looks & the develop-aware export
Status: DRAFT (2026-06-23) — the Phase-2 master spec (0001 §11). Companion
to 0001-krites.md (§4.3 develop, §4.6 export, §4.7 XMP, §6
catalogs), 0002-interface-contracts.md (R-DEV-*,
R-LOOK-*, R-EXP-*, R-XMP-1b, R-CAT-*), and the patterns proven in
0004-face-eye.md (native adapter behind an interface) and
0005-decision-capture.md (non-destructive records).
Decisions (2026-06-23, Matt). §9 resolved:
- Q1 — embedded-preview-first RAW (cull/preview/export off the embedded JPEG; full-latitude Decode() deferred, keeping the build CGO-free).
- Q2 — looks are white-balance + tone curve + an optional 3D-LUT slot; retrofittable because looks are config the edit record references by id, applied through a pure seam, so changing the model later is additive and doesn't break stored edits.
- Q5 — aesthetic/expression signals are IN this Phase 2 (§8), not split out — develop and the richer cull signals land together.

Scope: auto-correct the keepers — auto-straighten, composition crop, and a white-balance + tone + grade look — recorded as reversible edit instructions, baked to pixels only on export, and carried (geometry + WB only) into Camera-Raw XMP so keepers open in Lightroom already leveled, cropped and WB-corrected (the ~90% finish-in-Lightroom deliverable, 0001 §4.7). Phase 2 also adds the aesthetic / expression cull signals (smile, looking-at-camera, a strength score) as new ONNX-backed signals feeding the cull (§8). Bounded retouch is in scope but minimal; object removal (Phase 3) is out.
1. The shape of the work, and the platform split
Phase 1 taught the lesson this spec leans on: separate the deterministic math (pure-Go, builds and tests on the Linux box now, stays WASM-safe) from the heavy native backend (its own adapter behind an interface). Develop divides cleanly:
| Pure-Go, buildable now | Native, behind a provider |
|---|---|
| edit-record model (straighten/crop/look/retouch instructions) | RAW decode (Decoder) — ARW/CR2/NEF → pixels |
| straighten-angle detection (horizon / verticals) | RAW embedded-preview extraction for culling |
| crop geometry (aspect + rule-of-thirds) | |
| look definition + WB/tone/LUT maths | |
| the export bake (rotate/crop/resize/colour) on a decoded image | |
| geometry + WB → Camera-Raw XMP | |
So the entire develop engine and the JPEG/PNG export path build and validate on
Linux today — straighten, crop, develop --look, and a develop-aware
export/xmp write all work end-to-end on the JPEG/PNG keepers krites already
decodes (and on Hailey's Lightroom JPEG exports). The only native gap is
decoding RAW originals, which is exactly the 0001 §13-Q4 boundary: cull and
preview off the embedded JPEG; full RAW decode only for developed/exported
keepers. RAW is its own slice (§7), so it doesn't gate the develop engine.
Build order within Phase 2: edit records → straighten → crop → looks →
develop-aware export + XMP (all JPEG/PNG, pure-Go) → the RAW Decoder → bounded
retouch.
2. Edit records — the non-destructive develop state
Cull has verdicts.yaml; develop has edit records. Per 0001 §3/§7 the
shoot holds the state under .krites/edits/.
- R-EDIT-1 (MUST) Each frame's edits live as a reversible instruction record (.krites/edits/<frame>.yaml): straighten angle, crop rectangle, look id + per-frame overrides, retouch toggles — never baked pixels (0001 §3, §4.6; 0002 R-ND-2). Only export renders pixels (R-EXP-2).
- R-EDIT-2 (MUST) Every edit is reversible: reset <frame> clears edits (extending the Phase-1 reset, R-VRD-2), and the studio's undo/redo is total (R-UI-9). Removing .krites/edits/ returns every keeper to its culled state.
- R-EDIT-3 (MUST) Edit records are human-readable YAML and deterministic given (frame, look, detector outputs) — the geometry/look maths are pure (R-GLOBAL-7).
- R-EDIT-4 (SHOULD) An edit record carries the detector basis it was proposed from (the detected horizon angle, the saliency box) so a re-propose is explainable and the Phase-4 learner can later relate Hailey's crop/level overrides to what the machine proposed (0005).
A new package pkg/shoot gains edit-record load/save beside verdicts; the
geometry/look types live in pkg/develop.
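
As an illustration of R-EDIT-1 to R-EDIT-4, an edit record could be modelled roughly as below. This is a minimal sketch: every type name, field name and YAML key is an assumption made for the example, not the krites schema, and the YAML encoding itself is left to whichever library the project already uses.

```go
package main

// Hypothetical shapes for the record described in R-EDIT-1..4; all names
// and YAML keys are assumptions for illustration only.

// Rect is a crop rectangle in source-pixel coordinates.
type Rect struct {
	X, Y, W, H int
}

// Basis records what the detectors reported when the edit was proposed (R-EDIT-4).
type Basis struct {
	HorizonDeg  float64 `yaml:"horizon_deg"`
	SaliencyBox *Rect   `yaml:"saliency_box,omitempty"`
}

// EditRecord is one frame's reversible develop state: instructions, never pixels.
type EditRecord struct {
	Frame         string             `yaml:"frame"`
	StraightenDeg float64            `yaml:"straighten_deg"`
	Crop          *Rect              `yaml:"crop,omitempty"`
	LookID        string             `yaml:"look,omitempty"`
	Overrides     map[string]float64 `yaml:"overrides,omitempty"`
	Retouch       []string           `yaml:"retouch,omitempty"`
	Basis         *Basis             `yaml:"basis,omitempty"`
}

// Reset returns the record to the culled state, keeping only the frame name.
func (e *EditRecord) Reset() { *e = EditRecord{Frame: e.Frame} }

// IsZero reports whether the record carries no edits.
func (e EditRecord) IsZero() bool {
	return e.StraightenDeg == 0 && e.Crop == nil && e.LookID == "" &&
		len(e.Overrides) == 0 && len(e.Retouch) == 0
}
```

Because the record holds only instructions, a reset is a plain value overwrite and needs no pixel work.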
3. Straighten (krites straighten)
- R-DEV-1/2 (MUST, restated) Detects the dominant horizon / verticals and computes a level angle, clamped to a sane maximum (e.g. ±8°, config), recorded on the edit record — proposed, never auto-baked; the studio previews it (R-UI-12). The geometry (angle → rotation, and the auto-crop inside the rotated frame so no empty corners show) is pure and unit-tested.
- R-DEV-3 (SHOULD) --all batch-proposes across keepers; each stays individually overridable.
Detection method is §9-Q3. Lives in pkg/develop/straighten; the angle→transform
+ corner-safe inner-crop geometry is pure Go (golden-fixture tested).
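
The clamp and the corner-safe inner crop can be sketched as two pure functions. The function names are hypothetical. The sketch assumes the inner crop keeps the original aspect ratio and stays centred, which is one reasonable reading of "no empty corners", not necessarily the geometry krites will ship. Under that assumption the scale factor follows from requiring the bounding box of the counter-rotated crop to fit the frame: s = 1 / (cos θ + max(w/h, h/w) · sin θ).

```go
package main

import "math"

// ClampAngle limits a proposed level angle to ±maxDeg (the configurable ceiling).
func ClampAngle(deg, maxDeg float64) float64 {
	return math.Max(-maxDeg, math.Min(maxDeg, deg))
}

// InnerCropScale returns the factor s (0 < s <= 1) such that a centred
// (s*w) x (s*h) rectangle fits inside a w x h frame rotated by deg degrees,
// so no empty corner is visible. Valid for |deg| < 90.
func InnerCropScale(w, h, deg float64) float64 {
	t := math.Abs(deg) * math.Pi / 180
	ratio := math.Max(w/h, h/w)
	return 1 / (math.Cos(t) + ratio*math.Sin(t))
}
```

For a 6000×4000 frame leveled by 2° the factor is roughly 0.95, so about 5% of each dimension is lost to the corner-safe crop.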
4. Crop (krites crop)
- R-CROP-1 (MUST) Proposes a crop rectangle improving composition — a target aspect (config; keep-original by default) and a placement heuristic (rule-of-thirds on the subject / saliency), recorded on the edit record, staying within the frame (R-DEV-2). Pure geometry, unit-tested.
- R-CROP-2 (SHOULD) Respects the straighten edit (crop within the leveled frame) and never crops below a minimum resolution.
Placement heuristic is §9-Q4. Lives in pkg/develop/crop.
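
One way the rule-of-thirds placement in R-CROP-1 might look as pure geometry is sketched below. The names and the exact heuristic (largest crop of the target aspect, subject placed on the nearer third line, then clamped to the frame) are assumptions for illustration. The minimum-resolution check from R-CROP-2 is left out.

```go
package main

// ProposeCrop returns the origin and size of the largest rectangle with the
// given aspect (width/height) that fits a frameW x frameH frame, shifted so
// the subject sits on the nearer third line and clamped to stay in frame.
func ProposeCrop(frameW, frameH int, aspect float64, subjX, subjY int) (x, y, w, h int) {
	w, h = frameW, int(float64(frameW)/aspect+0.5)
	if h > frameH { // frame is wider than the target aspect: height-limited
		w, h = int(float64(frameH)*aspect+0.5), frameH
	}
	return placeThird(subjX, w, frameW), placeThird(subjY, h, frameH), w, h
}

// placeThird picks the crop origin along one axis.
func placeThird(subject, size, extent int) int {
	line := size / 3
	if subject > extent/2 {
		line = 2 * size / 3 // subject in the far half: use the far third line
	}
	origin := subject - line
	if origin > extent-size {
		origin = extent - size
	}
	if origin < 0 {
		origin = 0
	}
	return origin
}
```

When the subject is too close to an edge for a third line to be reachable, the clamp wins, so the crop never leaves the frame.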
5. Looks (krites develop [--look <name>])
- R-LOOK-1/2 (MUST, restated) Applies a look — white balance + tone curve + grade/LUT — from the config-driven catalog (0001 §6) as develop settings on the edit record (not baked until export). --look selects, else the configured default. No hardcoded grades/curves in Go — wired through the catalog (R-CAT-1).
- R-LOOK-3 (MUST) The look maths (WB scaling, tone-curve evaluation, 3D-LUT sampling) are pure and unit-tested on golden pixels — the deterministic core (R-GLOBAL-7); only where the pixels come from (decode) is native.
- R-LOOK-4 (MUST) init seeds a neutral look (0001 §6); Hailey's signature grade and per-scene variants (ceremony / golden-hour / reception) are catalog entries she tunes, like the cull profile.
Look representation is §9-Q2. Lives in pkg/develop/look (definition + maths);
the catalog is managed through the GTB config layer (R-CAT-*).
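
The pure look maths of R-LOOK-3 could be sketched as follows, covering only white-balance gains and a tone curve. The types and field names are hypothetical, the tone curve is assumed to be piecewise-linear with knots sorted by input value, and 3D-LUT sampling is omitted to keep the example short.

```go
package main

import "sort"

// RGB is a pixel with channels in [0,1].
type RGB struct{ R, G, B float64 }

// CurvePoint is one knot of a piecewise-linear tone curve.
type CurvePoint struct{ In, Out float64 }

// Look is a hypothetical catalog entry: per-channel WB gains plus a tone
// curve whose knots are sorted by In.
type Look struct {
	ID    string
	Gains RGB
	Curve []CurvePoint
}

func clamp01(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

// EvalCurve interpolates linearly between the two knots that bracket v and
// holds the end values outside the knot range. An empty curve is identity.
func EvalCurve(curve []CurvePoint, v float64) float64 {
	if len(curve) == 0 {
		return v
	}
	i := sort.Search(len(curve), func(i int) bool { return curve[i].In >= v })
	switch {
	case i == 0:
		return curve[0].Out
	case i == len(curve):
		return curve[len(curve)-1].Out
	}
	a, b := curve[i-1], curve[i]
	return a.Out + (b.Out-a.Out)*(v-a.In)/(b.In-a.In)
}

// Apply scales each channel by its WB gain, then maps it through the curve.
func (l Look) Apply(p RGB) RGB {
	ch := func(v, gain float64) float64 {
		return clamp01(EvalCurve(l.Curve, clamp01(v*gain)))
	}
	return RGB{ch(p.R, l.Gains.R), ch(p.G, l.Gains.G), ch(p.B, l.Gains.B)}
}
```

A neutral look in this sketch is unit gains with the curve (0,0) to (1,1), which returns every pixel unchanged and makes a convenient golden-pixel baseline.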
6. Develop-aware export & XMP
The export and XMP commands already exist (Phase 1); Phase 2 makes them edit-aware.
- R-EXP-1 (MUST, restated) export renders the chosen verdict set applying each frame's edits — straighten → crop → look → retouch — into export/. It stays the only pixel-producing command (R-EXP-2), never touching originals.
- R-EXP-4 (MUST) The bake is a deterministic pure-Go image pipeline on the decoded image: rotate (straighten), crop, resize, and the colour ops (WB / tone / LUT). golang.org/x/image covers rotate/crop/resize; the colour ops are our pure maths (§5). No native image-ops library is required for the bake — only decode is native, and only for RAW (§7).
- R-XMP-1b (MUST) xmp write additionally carries crop, straighten angle and white balance into Camera-Raw XMP, so keepers open in Lightroom already leveled/cropped/WB-corrected (0001 §4.7, §13-Q5). Looks / retouch / removals are never written to XMP — Adobe can't represent them, so they serve the krites-export path only; no fragile half-applied mapping (R-XMP-1b, unit-tested round-trip).
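
The fixed stage order in R-EXP-1 and the "never touch originals" rule can be illustrated with a small stage pipeline. This is a sketch using only the standard library: the Stage type and the helper names are hypothetical, the crop copies pixels instead of aliasing the source, a plain per-channel gain stands in for the real WB/tone/LUT maths, and rotation and resize are left out.

```go
package main

import (
	"image"
	"image/color"
)

// Stage is one bake step; it returns a new image and leaves its input alone.
type Stage func(image.Image) image.Image

// Bake runs the stages in the order given. A caller would pass them as
// straighten, crop, look, retouch, skipping any the edit record leaves unset.
func Bake(src image.Image, stages ...Stage) image.Image {
	out := src
	for _, s := range stages {
		out = s(out)
	}
	return out
}

// CropTo copies the pixels inside (x0,y0)-(x1,y1) into a fresh RGBA image.
func CropTo(x0, y0, x1, y1 int) Stage {
	return func(in image.Image) image.Image {
		r := image.Rect(x0, y0, x1, y1).Intersect(in.Bounds())
		out := image.NewRGBA(image.Rect(0, 0, r.Dx(), r.Dy()))
		for y := 0; y < r.Dy(); y++ {
			for x := 0; x < r.Dx(); x++ {
				out.Set(x, y, in.At(r.Min.X+x, r.Min.Y+y))
			}
		}
		return out
	}
}

// Gain scales each channel by a non-negative factor, clipping at 255.
// It is a stand-in for the look maths and assumes opaque input.
func Gain(rg, gg, bg float64) Stage {
	scale := func(v uint32, g float64) uint8 {
		f := float64(v>>8)*g + 0.5
		if f > 255 {
			f = 255
		}
		return uint8(f)
	}
	return func(in image.Image) image.Image {
		b := in.Bounds()
		out := image.NewRGBA(b)
		for y := b.Min.Y; y < b.Max.Y; y++ {
			for x := b.Min.X; x < b.Max.X; x++ {
				r, g, bl, a := in.At(x, y).RGBA()
				out.SetRGBA(x, y, color.RGBA{R: scale(r, rg), G: scale(g, gg), B: scale(bl, bg), A: uint8(a >> 8)})
			}
		}
		return out
	}
}

// solid builds a w x h opaque image of one colour, handy for golden tests.
func solid(w, h int, r, g, b uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for i := 0; i < len(img.Pix); i += 4 {
		img.Pix[i], img.Pix[i+1], img.Pix[i+2], img.Pix[i+3] = r, g, b, 255
	}
	return img
}
```

Calling `Bake(src, CropTo(1, 1, 3, 3), Gain(2, 1, 0.5))` on a 4×4 grey image yields a 2×2 result while `src` keeps its original pixels, which is the property a golden-fixture test of the bake would pin down.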
7. The RAW Decoder provider (the one native dependency)
Culling and develop work on JPEG/PNG today; Hailey's originals are Sony ARW.
Per 0001 §13-Q4: cull + preview off the embedded JPEG; full RAW decode only
for the developed/exported keepers (a few hundred frames, not all 4,000).
- R-DEC-1 (MUST) A Decoder provider exposes a cheap Preview() (the embedded JPEG every RAW carries — enough for cull/preview) and a full Decode() (full bit-depth RAW → pixels + WB-as-shot, for export). ingest/cull use the former; export the latter (0001 §13-Q4). It is injected and faked in tests (the keryx rule), a native adapter behind the interface — the deterministic engine never imports it (0001 §7, like pkg/face/onnx).
- R-DEC-2 (MUST) Embedded-preview extraction needs no full RAW decoder — it parses the RAW container for the embedded JPEG. This unblocks culling RAW shoots without the heavy native path and can ship first.
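
The provider seam in R-DEC-1 might be shaped like the interface below, with an in-memory fake for tests. The method signatures, the Decoded struct and the sentinel error are assumptions for the sketch. The spec only fixes that there is a cheap Preview() and a full Decode().

```go
package main

import "errors"

// Decoded is a hypothetical full-decode result.
type Decoded struct {
	Width, Height int
	Pixels        []uint16   // interleaved RGB at full bit depth
	WBAsShot      [3]float64 // per-channel as-shot white-balance gains
}

// Decoder is a sketch of the provider seam: the engine sees only this
// interface, and a native adapter implements it elsewhere.
type Decoder interface {
	Preview(path string) ([]byte, error)  // embedded JPEG bytes
	Decode(path string) (*Decoded, error) // full RAW decode
}

// ErrDecodeUnavailable is what a preview-only build could return from Decode.
var ErrDecodeUnavailable = errors.New("full RAW decode not available in this build")

// FakeDecoder is an in-memory stand-in for tests; it counts calls so a test
// can assert that culling never triggers the expensive path.
type FakeDecoder struct {
	JPEG                      []byte
	PreviewCalls, DecodeCalls int
}

func (f *FakeDecoder) Preview(string) ([]byte, error) {
	f.PreviewCalls++
	return f.JPEG, nil
}

func (f *FakeDecoder) Decode(string) (*Decoded, error) {
	f.DecodeCalls++
	return nil, ErrDecodeUnavailable
}

// PreviewForCull is what an ingest/cull caller would use: the cheap path only.
func PreviewForCull(d Decoder, path string) ([]byte, error) {
	return d.Preview(path)
}
```

With the embedded-preview-first decision, a first adapter could implement Preview() for real and return the sentinel error from Decode() until the full-latitude slice lands.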
The CGO question (§9-Q1). Full RAW decode means libraw/libvips, and the mature Go bindings (govips, go-libraw) are CGO — the same tension the eye model hit, and CGO breaks goreleaser's pure-Go cross-compile of the binary. Options, to decide before the full-Decode() slice: (a) embedded-preview-only for now (no full RAW — develop/export RAW keepers off the preview, lower latitude); (b) shell out to a bundled libraw/dcraw/exiftool binary (no CGO in our build; an external tool dependency); (c) accept CGO for a separate, optionally-built decode binary/sidecar; (d) a purego + libraw dylib binding (no mature one exists — DIY, like we considered for ONNX). The embedded preview (R-DEC-2) is CGO-free and ships regardless; only full latitude waits on this call.
8. Aesthetic & expression cull signals (the ONNX track)
0001 §4.2 lists these "Phase ≥2" signals; this spec folds them in. They are
new cull signals in the exact mould of the eye/blink signal (0004): a pure
projection + verdict/ranking integration, fed by an ONNX model behind an
interface. They reuse the proven 0004 machinery — the native adapter, the
config-gated provider (off by default), the calibrated-anchor approach, the
decision-capture (0005) plumbing.
- R-AES-1 (MUST) Expression (smile / neutral / awkward, looking-at-camera) is a per-face attribute — it extends the existing FaceAnalyzer (which already returns faces + eye state) with attribute outputs, so one detector pass feeds eyes and expression. The projection to a cull signal is pure and unit-tested (R-GLOBAL-7), like pkg/analyze/face.
- R-AES-2 (MUST) Aesthetic score — a general "is this a strong frame" score — is its own provider (AestheticScorer, 0001 §5), a separate ONNX model, off by default. It feeds the cull as a soft ranking signal (best-of-burst and a maybe-floor), never a hard reject — a strong-but-imperfect frame is still the photographer's call (cf. R-EYE-3).
- R-AES-3 (MUST) Both extend cull.Signals + the cull profile with config-driven thresholds (no hardcoded constants, 0001 §6), are validated + calibrated against real frames (the 0004 method), and are captured in the analysis cache for Phase-4 learning (0005). Engine stays CGO-free; the models are native adapters (CPU on Linux now, CoreML later — 0004 Layer 3).
- R-AES-4 (SHOULD) Aesthetic/expression are never written to XMP (Adobe can't represent them) — they inform the cull verdict/ranking only.
Build order: land after the geometric/colour develop, reusing the 0004 adapter
scaffolding; expression rides the existing detector pass, aesthetic is the one
new model.
9. Open questions
- Q1 — RAW full-decode strategy. ✅ Embedded-preview-first — ship the CGO-free embedded-JPEG path (R-DEC-2) for cull/preview/export; defer full-latitude Decode() and revisit the libraw approach when it's needed.
- Q2 — look representation. ✅ WB + parametric tone curve + an optional 3D-LUT slot. Retrofittable: looks are config the edit references by id, applied through a pure seam, so a richer model is additive (§5).
- Q5 — aesthetic / expression. ✅ In this Phase 2 (§8), not split out.
- Q3 — straighten detection. Horizon-line (Hough on long edges) vs vertical-edge alignment vs both. Recommend: edge-orientation histogram (dominant near-horizontal/near-vertical), clamped — pure, testable, no ML. Gates §3; a refinement, not a blocker.
- Q4 — crop placement. Rule-of-thirds on the largest face/subject (we already detect faces) vs a saliency model. Recommend: face/subject + rule-of-thirds first (reuses the Phase-1 detector); saliency later. Gates §4.
- Q6 — image-ops library. Pure-Go (x/image + our colour maths) for the export bake rather than libvips. Recommend: pure-Go — keeps the bake CGO-free and only RAW decode native. Gates §6.
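
To make the Q3 recommendation concrete, here is a rough sketch of an edge-orientation histogram for near-horizontal edges. It is one possible realisation, not a settled design: the function name, the Sobel operator, the 0.5° bin width and the magnitude weighting are all assumptions, and the near-vertical case is omitted.

```go
package main

import "math"

// EstimateTilt returns the dominant near-horizontal edge angle in degrees
// (positive when the edge descends to the right in image coordinates), or 0
// when no edge within ±maxDeg is found. gray is row-major luminance of
// length w*h.
func EstimateTilt(gray []float64, w, h int, maxDeg float64) float64 {
	const binDeg = 0.5
	n := int(2*maxDeg/binDeg) + 1
	weight := make([]float64, n) // summed gradient magnitude per bin
	moment := make([]float64, n) // summed magnitude*angle per bin
	at := func(x, y int) float64 { return gray[y*w+x] }
	for y := 1; y < h-1; y++ {
		for x := 1; x < w-1; x++ {
			// Sobel gradients.
			gx := at(x+1, y-1) + 2*at(x+1, y) + at(x+1, y+1) -
				at(x-1, y-1) - 2*at(x-1, y) - at(x-1, y+1)
			gy := at(x-1, y+1) + 2*at(x, y+1) + at(x+1, y+1) -
				at(x-1, y-1) - 2*at(x, y-1) - at(x+1, y-1)
			mag := math.Hypot(gx, gy)
			if mag == 0 {
				continue
			}
			// The edge runs perpendicular to the gradient; fold into (-90, 90].
			a := math.Atan2(gy, gx)*180/math.Pi - 90
			for a <= -90 {
				a += 180
			}
			for a > 90 {
				a -= 180
			}
			if math.Abs(a) > maxDeg {
				continue
			}
			i := int((a + maxDeg) / binDeg)
			weight[i] += mag
			moment[i] += mag * a
		}
	}
	best := -1
	for i := range weight {
		if weight[i] > 0 && (best < 0 || weight[i] > weight[best]) {
			best = i
		}
	}
	if best < 0 {
		return 0
	}
	// Magnitude-weighted mean angle inside the strongest bin.
	return moment[best] / weight[best]
}
```

Restricting the histogram to ±maxDeg doubles as the clamp: edges outside the allowed range never vote, so the proposal cannot exceed the configured maximum.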
10. Non-goals (this spec)
- Object removal / inpainting — Phase 3 (0001 §4.5; the Inpainter).
- Aesthetic/expression cull signals — own spec (Q5).
- A Lightroom-grade develop module — krites auto-corrects and hands off (0001 §2 non-goals); it is not a full editor.
- Heavy retouch — only the bounded set (0001 §4.4) is in scope, and minimally.