
0006 — develop: straighten, crop, looks & the develop-aware export

Status: DRAFT (2026-06-23) — the Phase-2 master spec (0001 §11). Companion to 0001-krites.md (§4.3 develop, §4.6 export, §4.7 XMP, §6 catalogs), 0002-interface-contracts.md (R-DEV-*, R-LOOK-*, R-EXP-*, R-XMP-1b, R-CAT-*), and the patterns proven in 0004-face-eye.md (native adapter behind an interface) and 0005-decision-capture.md (non-destructive records).

Decisions (2026-06-23, Matt). §9 resolved: Q1 — embedded-preview-first RAW (cull/preview/export off the embedded JPEG; full-latitude Decode() deferred, keeping the build CGO-free). Q2 — looks are white-balance + tone curve + an optional 3D-LUT slot; retrofittable because looks are config the edit record references by id, applied through a pure seam, so changing the model later is additive and doesn't break stored edits. Q5 — aesthetic/expression signals are IN this Phase 2 (§8), not split out — develop and the richer cull signals land together.

Scope: auto-correct the keepers — auto-straighten, composition crop, and a white-balance + tone + grade look — recorded as reversible edit instructions, baked to pixels only on export, and carried (geometry + WB only) into Camera-Raw XMP so keepers open in Lightroom already leveled, cropped and WB-corrected (the ~90% finish-in-Lightroom deliverable, 0001 §4.7). Phase 2 also adds the aesthetic / expression cull signals (smile, looking-at-camera, a strength score) as new ONNX-backed signals feeding the cull (§8). Bounded retouch is in scope but minimal; object removal (Phase 3) is out.


1. The shape of the work, and the platform split

Phase 1 taught the lesson this spec leans on: separate the deterministic math (pure-Go, builds and tests on the Linux box now, stays WASM-safe) from the heavy native backend (its own adapter behind an interface). Develop divides cleanly:

| Pure-Go, buildable now | Native, behind a provider |
| --- | --- |
| edit-record model (straighten/crop/look/retouch instructions) | RAW decode (Decoder) — ARW/CR2/NEF → pixels |
| straighten-angle detection (horizon / verticals) | RAW embedded-preview extraction for culling |
| crop geometry (aspect + rule-of-thirds) | |
| look definition + WB/tone/LUT maths | |
| the export bake (rotate/crop/resize/colour) on a decoded image | |
| geometry + WB → Camera-Raw XMP | |

So the entire develop engine and the JPEG/PNG export path build and validate on Linux today: straighten, crop, develop --look, and a develop-aware export/xmp write all work end-to-end on the JPEG/PNG keepers krites already decodes (and on Hailey's Lightroom JPEG exports). The only native gap is decoding RAW originals, which is exactly the 0001 §13-Q4 boundary: cull and preview off the embedded JPEG; full RAW decode only for developed/exported keepers. RAW is its own slice (§7), so it doesn't gate the develop engine.

Build order within Phase 2: edit records → straighten → crop → looks → develop-aware export + XMP (all JPEG/PNG, pure-Go) → the RAW Decoder → bounded retouch.

2. Edit records — the non-destructive develop state

Cull has verdicts.yaml; develop has edit records. Per 0001 §3/§7 the shoot holds the state under .krites/edits/.

  • R-EDIT-1 (MUST) Each frame's edits live as a reversible instruction record (.krites/edits/<frame>.yaml): straighten angle, crop rectangle, look id + per-frame overrides, retouch toggles — never baked pixels (0001 §3, §4.6; 0002 R-ND-2). Only export renders pixels (R-EXP-2).
  • R-EDIT-2 (MUST) Every edit is reversible: reset <frame> clears edits (extending the Phase-1 reset, R-VRD-2), and the studio's undo/redo is total (R-UI-9). Removing .krites/edits/ returns every keeper to its culled state.
  • R-EDIT-3 (MUST) Edit records are human-readable YAML and deterministic given (frame, look, detector outputs) — the geometry/look maths are pure (R-GLOBAL-7).
  • R-EDIT-4 (SHOULD) An edit record carries the detector basis it was proposed from (the detected horizon angle, the saliency box) so a re-propose is explainable and the Phase-4 learner can later relate Hailey's crop/level overrides to what the machine proposed (0005).

A new package pkg/shoot gains edit-record load/save beside verdicts; the geometry/look types live in pkg/develop.
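
A minimal sketch of what an edit record type could look like, assuming the fields listed in R-EDIT-1 and R-EDIT-4. Every name here (Edit, Rect, Basis, the YAML keys) is hypothetical and only illustrates the shape: instructions plus detector basis, never pixels, with a reset that returns the frame to its culled state.

```go
package main

import "fmt"

// Rect is a rectangle in normalised frame coordinates (0..1). Assumed convention.
type Rect struct {
	X0 float64 `yaml:"x0"`
	Y0 float64 `yaml:"y0"`
	X1 float64 `yaml:"x1"`
	Y1 float64 `yaml:"y1"`
}

// Basis records what the detectors reported when the edit was proposed (R-EDIT-4).
type Basis struct {
	HorizonDeg float64 `yaml:"horizon_deg"`
	Saliency   *Rect   `yaml:"saliency,omitempty"`
}

// Edit is a hypothetical per-frame record: instructions only, no baked pixels.
type Edit struct {
	Frame         string             `yaml:"frame"`
	StraightenDeg float64            `yaml:"straighten_deg,omitempty"`
	Crop          *Rect              `yaml:"crop,omitempty"`
	LookID        string             `yaml:"look,omitempty"`
	Overrides     map[string]float64 `yaml:"overrides,omitempty"`
	Retouch       []string           `yaml:"retouch,omitempty"`
	Basis         *Basis             `yaml:"basis,omitempty"`
}

// Reset drops every instruction and keeps only the frame identity.
func (e Edit) Reset() Edit { return Edit{Frame: e.Frame} }

// IsZero reports whether the record carries no edit instructions.
func (e Edit) IsZero() bool {
	return e.StraightenDeg == 0 && e.Crop == nil && e.LookID == "" &&
		len(e.Overrides) == 0 && len(e.Retouch) == 0
}

func main() {
	e := Edit{
		Frame:         "DSC0001",
		StraightenDeg: 1.5,
		LookID:        "neutral",
		Basis:         &Basis{HorizonDeg: 1.5},
	}
	fmt.Println(e.IsZero(), e.Reset().IsZero())
}
```

The struct tags are plain strings, so the sketch compiles without a YAML library; which library does the actual load/save is not decided here.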

3. Straighten (krites straighten)

  • R-DEV-1/2 (MUST, restated) Detects the dominant horizon / verticals and computes a level angle, clamped to a sane maximum (e.g. ±8°, config), recorded on the edit record — proposed, never auto-baked; the studio previews it (R-UI-12). The geometry (angle → rotation, and the auto-crop inside the rotated frame so no empty corners show) is pure and unit-tested.
  • R-DEV-3 (SHOULD) --all batch-proposes across keepers; each stays individually overridable.

Detection method is §9-Q3. Lives in pkg/develop/straighten; the angle→transform + corner-safe inner-crop geometry is pure Go (golden-fixture tested).
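
The clamp and the corner-safe inner crop can be sketched as pure functions. This assumes the inner crop keeps the original aspect ratio and stays centred, which is one reasonable reading of "no empty corners show", not a settled choice. ClampAngle and InnerCropScale are hypothetical names. The scale follows from requiring the bounding box of the counter-rotated crop to fit inside the frame: s = min(w / (w cos θ + h sin θ), h / (w sin θ + h cos θ)).

```go
package main

import (
	"fmt"
	"math"
)

// ClampAngle limits a detected level angle to the range [-maxDeg, +maxDeg].
func ClampAngle(deg, maxDeg float64) float64 {
	return math.Max(-maxDeg, math.Min(maxDeg, deg))
}

// InnerCropScale returns s in (0, 1]: the largest centred crop of size
// (s*w) x (s*h) that lies fully inside a w x h frame rotated by deg.
// Valid for |deg| <= 90 and w, h > 0.
func InnerCropScale(w, h, deg float64) float64 {
	t := math.Abs(deg) * math.Pi / 180
	c, s := math.Cos(t), math.Sin(t)
	return math.Min(w/(w*c+h*s), h/(w*s+h*c))
}

func main() {
	angle := ClampAngle(11.3, 8) // detector said 11.3 degrees, config caps at 8
	scale := InnerCropScale(6000, 4000, angle)
	fmt.Printf("angle=%.1f scale=%.4f crop=%.0fx%.0f\n",
		angle, scale, 6000*scale, 4000*scale)
}
```

Because the function depends only on width, height and angle, it is a natural fit for golden-fixture tests.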

4. Crop (krites crop)

  • R-CROP-1 (MUST) Proposes a crop rectangle improving composition — a target aspect (config; keep-original by default) and a placement heuristic (rule-of-thirds on the subject / saliency), recorded on the edit record, staying within the frame (R-DEV-2). Pure geometry, unit-tested.
  • R-CROP-2 (SHOULD) Respects the straighten edit (crop within the leveled frame) and never crops below a minimum resolution.

Placement heuristic is §9-Q4. Lives in pkg/develop/crop.
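
As an illustration of the geometry in R-CROP-1, the sketch below fits a target aspect inside the frame, then slides the crop so the subject centre lands on the nearest rule-of-thirds line, clamped to the frame. FitAspect, PlaceThirds and the "nearest third" choice are assumptions for illustration. The minimum-resolution check from R-CROP-2 is left out.

```go
package main

import (
	"fmt"
	"math"
)

// Rect is a crop rectangle in pixels: top-left corner plus size.
type Rect struct{ X, Y, W, H float64 }

// FitAspect returns the largest w x h with w/h == aspect that fits in the
// frame. aspect <= 0 means "keep original".
func FitAspect(frameW, frameH, aspect float64) (w, h float64) {
	if aspect <= 0 {
		return frameW, frameH
	}
	if frameW/frameH > aspect {
		return frameH * aspect, frameH
	}
	return frameW, frameW / aspect
}

func clamp(v, lo, hi float64) float64 { return math.Max(lo, math.Min(hi, v)) }

// nearestThird picks 1/3 or 2/3, whichever is closer to the subject's
// relative position in the frame.
func nearestThird(rel float64) float64 {
	if rel < 0.5 {
		return 1.0 / 3
	}
	return 2.0 / 3
}

// PlaceThirds positions a cw x ch crop so the subject centre (sx, sy) sits on
// the nearest thirds intersection, then clamps the crop to stay in the frame.
func PlaceThirds(frameW, frameH, cw, ch, sx, sy float64) Rect {
	fx := nearestThird(sx / frameW)
	fy := nearestThird(sy / frameH)
	x := clamp(sx-fx*cw, 0, frameW-cw)
	y := clamp(sy-fy*ch, 0, frameH-ch)
	return Rect{X: x, Y: y, W: cw, H: ch}
}

func main() {
	cw, ch := FitAspect(6000, 4000, 1) // square crop from a 3:2 frame
	fmt.Printf("%+v\n", PlaceThirds(6000, 4000, cw, ch, 2400, 1800))
}
```

Clamping after placement means a subject near the frame edge gets the closest achievable composition rather than a crop that leaves the frame.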

5. Looks (krites develop [--look <name>])

  • R-LOOK-1/2 (MUST, restated) Applies a look — white balance + tone curve + grade/LUT — from the config-driven catalog (0001 §6) as develop settings on the edit record (not baked until export). --look selects, else the configured default. No hardcoded grades/curves in Go — wired through the catalog (R-CAT-1).
  • R-LOOK-3 (MUST) The look maths (WB scaling, tone-curve evaluation, 3D-LUT sampling) are pure and unit-tested on golden pixels — the deterministic core (R-GLOBAL-7); only where the pixels come from (decode) is native.
  • R-LOOK-4 (MUST) init seeds a neutral look (0001 §6); Hailey's signature grade and per-scene variants (ceremony / golden-hour / reception) are catalog entries she tunes, like the cull profile.

Look representation is §9-Q2. Lives in pkg/develop/look (definition + maths); the catalog is managed through the GTB config layer (R-CAT-*).
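
A sketch of the pure look maths, under stated assumptions: pixels are float RGB in 0..1, white balance is a per-channel gain, and the tone curve is piecewise linear between control points. The parametric curve named in Q2 may well be something smoother, and 3D-LUT sampling is omitted to keep the example short. All type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// RGB is one pixel with channels in 0..1.
type RGB struct{ R, G, B float64 }

func clamp01(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

// ApplyWB scales each channel by its gain and clamps to 0..1.
func ApplyWB(p, gain RGB) RGB {
	return RGB{clamp01(p.R * gain.R), clamp01(p.G * gain.G), clamp01(p.B * gain.B)}
}

// Point is one tone-curve control point.
type Point struct{ In, Out float64 }

// Curve is a tone curve: control points sorted by strictly increasing In.
type Curve []Point

// Eval interpolates linearly between the two control points around x.
// An empty curve is the identity; inputs outside the range hold the end values.
func (c Curve) Eval(x float64) float64 {
	if len(c) == 0 {
		return x
	}
	if x <= c[0].In {
		return c[0].Out
	}
	last := c[len(c)-1]
	if x >= last.In {
		return last.Out
	}
	i := sort.Search(len(c), func(i int) bool { return c[i].In >= x })
	a, b := c[i-1], c[i]
	t := (x - a.In) / (b.In - a.In)
	return a.Out + t*(b.Out-a.Out)
}

// Look is a catalog entry referenced from the edit record by ID.
type Look struct {
	ID   string
	WB   RGB
	Tone Curve
}

// Apply runs white balance, then the tone curve on each channel.
func (l Look) Apply(p RGB) RGB {
	q := ApplyWB(p, l.WB)
	return RGB{l.Tone.Eval(q.R), l.Tone.Eval(q.G), l.Tone.Eval(q.B)}
}

func main() {
	look := Look{
		ID:   "neutral-lift",
		WB:   RGB{1.1, 1.0, 0.9},
		Tone: Curve{{0, 0}, {0.5, 0.6}, {1, 1}},
	}
	fmt.Printf("%+v\n", look.Apply(RGB{0.4, 0.4, 0.4}))
}
```

The values in main are made up; in krites they would come from the catalog rather than from Go source (R-CAT-1).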

6. Develop-aware export & XMP

The export and XMP commands already exist (Phase 1); Phase 2 makes them edit-aware.

  • R-EXP-1 (MUST, restated) export renders the chosen verdict set applying each frame's edits — straighten → crop → look → retouch — into export/. It stays the only pixel-producing command (R-EXP-2), never touching originals.
  • R-EXP-4 (MUST) The bake is a deterministic pure-Go image pipeline on the decoded image: rotate (straighten), crop, resize, and the colour ops (WB / tone / LUT). golang.org/x/image covers rotate/crop/resize; the colour ops are our pure maths (§5). No native image-ops library is required for the bake — only decode is native, and only for RAW (§7).
  • R-XMP-1b (MUST) xmp write additionally carries crop, straighten angle and white balance into Camera-Raw XMP, so keepers open in Lightroom already leveled/cropped/WB-corrected (0001 §4.7, §13-Q5). Looks / retouch / removals are never written to XMP — Adobe can't represent them, so they serve the krites-export path only; no fragile half-applied mapping (R-XMP-1b, unit-tested round-trip).
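
The split in R-XMP-1b (geometry and WB go to XMP, looks and retouch stay on the krites export path) can be sketched as a pure projection from the edit record to Camera-Raw attributes. The crs: property names are the Camera Raw settings names for crop and white balance. The rest is assumed and needs checking against a Lightroom round-trip: the normalised crop convention, the sign of CropAngle, the Kelvin/tint representation, the full-frame default when only an angle is set, and the Edit type itself.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Crop is a crop rectangle in normalised 0..1 coordinates (assumed convention).
type Crop struct{ Left, Top, Right, Bottom float64 }

// WB is white balance as Kelvin temperature plus tint (assumed representation).
type WB struct{ TempK, Tint int }

// Edit is a hypothetical, trimmed-down edit record.
type Edit struct {
	StraightenDeg float64
	Crop          *Crop
	WB            *WB
	LookID        string   // export-only: never written to XMP
	Retouch       []string // export-only: never written to XMP
}

func ff(v float64) string { return strconv.FormatFloat(v, 'f', 6, 64) }

// XMPFields projects only geometry and WB onto crs: attributes.
func XMPFields(e Edit) map[string]string {
	out := map[string]string{}
	if e.Crop != nil || e.StraightenDeg != 0 {
		c := Crop{Left: 0, Top: 0, Right: 1, Bottom: 1} // full frame if only levelled
		if e.Crop != nil {
			c = *e.Crop
		}
		out["crs:HasCrop"] = "True"
		out["crs:CropLeft"] = ff(c.Left)
		out["crs:CropTop"] = ff(c.Top)
		out["crs:CropRight"] = ff(c.Right)
		out["crs:CropBottom"] = ff(c.Bottom)
		out["crs:CropAngle"] = ff(e.StraightenDeg) // sign convention is an assumption
	}
	if e.WB != nil {
		out["crs:WhiteBalance"] = "Custom"
		out["crs:Temperature"] = strconv.Itoa(e.WB.TempK)
		out["crs:Tint"] = strconv.Itoa(e.WB.Tint)
	}
	return out
}

// Render writes the attributes in sorted key order so output is deterministic.
func Render(fields map[string]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%q", k, fields[k])
	}
	return b.String()
}

func main() {
	e := Edit{
		StraightenDeg: 1.5,
		Crop:          &Crop{0.1, 0.05, 0.9, 0.95},
		WB:            &WB{TempK: 5200, Tint: 8},
		LookID:        "golden-hour",
	}
	fmt.Println(Render(XMPFields(e)))
}
```

Keeping the projection free of any look or retouch field makes the "never written to XMP" rule something a unit test can assert directly.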

7. The RAW Decoder provider (the one native dependency)

Culling and develop work on JPEG/PNG today; Hailey's originals are Sony ARW. Per 0001 §13-Q4: cull + preview off the embedded JPEG; full RAW decode only for the developed/exported keepers (a few hundred frames, not all 4,000).

  • R-DEC-1 (MUST) A Decoder provider exposes a cheap Preview() (the embedded JPEG every RAW carries — enough for cull/preview) and a full Decode() (full bit-depth RAW → pixels + WB-as-shot, for export). ingest/cull use the former; export uses the latter (0001 §13-Q4). It is injected and faked in tests (the keryx rule), a native adapter behind the interface — the deterministic engine never imports it (0001 §7, like pkg/face/onnx).
  • R-DEC-2 (MUST) Embedded-preview extraction needs no full RAW decoder — it parses the RAW container for the embedded JPEG. This unblocks culling RAW shoots without the heavy native path and can ship first.
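
One possible shape for the provider seam in R-DEC-1, with a test fake. The method signatures, the Decoded type and the FakeDecoder are assumptions for illustration. The spec only fixes that there is a cheap Preview() and a full Decode() behind an injected interface.

```go
package main

import (
	"fmt"
	"image"
)

// Decoded is the result of a full RAW decode: pixels plus as-shot WB gains.
type Decoded struct {
	Pixels   image.Image
	WBAsShot [3]float64 // assumed representation: R, G, B gains
}

// Decoder is the provider interface. The engine depends on this, never on
// a native adapter package.
type Decoder interface {
	Preview(path string) (image.Image, error) // embedded JPEG, cheap
	Decode(path string) (Decoded, error)      // full bit depth, native
}

// FakeDecoder is a test double that counts calls and returns tiny images.
type FakeDecoder struct{ Previews, Decodes int }

func (f *FakeDecoder) Preview(path string) (image.Image, error) {
	f.Previews++
	return image.NewRGBA(image.Rect(0, 0, 4, 3)), nil
}

func (f *FakeDecoder) Decode(path string) (Decoded, error) {
	f.Decodes++
	return Decoded{
		Pixels:   image.NewRGBA64(image.Rect(0, 0, 8, 6)),
		WBAsShot: [3]float64{1, 1, 1},
	}, nil
}

// PreviewSize stands in for a cull-side step: it only ever calls Preview.
func PreviewSize(d Decoder, path string) (w, h int, err error) {
	img, err := d.Preview(path)
	if err != nil {
		return 0, 0, err
	}
	b := img.Bounds()
	return b.Dx(), b.Dy(), nil
}

func main() {
	fake := &FakeDecoder{}
	w, h, err := PreviewSize(fake, "DSC0001.ARW")
	fmt.Println(w, h, err, fake.Previews, fake.Decodes)
}
```

With the fake counting calls, a test can assert that cull-side code never triggers a full decode.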

The CGO question (§9-Q1). Full RAW decode means libraw/libvips, and the mature Go bindings (govips, go-libraw) are CGO — the same tension the eye model hit, and CGO breaks goreleaser's pure-Go cross-compile of the binary. Options, to decide before the full-Decode() slice: (a) embedded-preview-only for now (no full RAW — develop/export RAW keepers off the preview, lower latitude); (b) shell out to a bundled libraw/dcraw/exiftool binary (no CGO in our build; an external tool dependency); (c) accept CGO for a separate, optionally-built decode binary/sidecar; (d) a purego+libraw dylib binding (no mature one exists — DIY, like we considered for ONNX). The embedded preview (R-DEC-2) is CGO-free and ships regardless; only full latitude waits on this call.

8. Aesthetic & expression cull signals (the ONNX track)

0001 §4.2 lists these "Phase ≥2" signals; this spec folds them in. They are new cull signals in the exact mould of the eye/blink signal (0004): a pure projection + verdict/ranking integration, fed by an ONNX model behind an interface. They reuse the proven 0004 machinery — the native adapter, the config-gated provider (off by default), the calibrated-anchor approach, the decision-capture (0005) plumbing.

  • R-AES-1 (MUST) Expression (smile / neutral / awkward, looking-at-camera) is a per-face attribute — it extends the existing FaceAnalyzer (which already returns faces + eye state) with attribute outputs, so one detector pass feeds eyes and expression. The projection to a cull signal is pure and unit-tested (R-GLOBAL-7), like pkg/analyze/face.
  • R-AES-2 (MUST) Aesthetic score — a general "is this a strong frame" score — is its own provider (AestheticScorer, 0001 §5), a separate ONNX model, off by default. It feeds the cull as a soft ranking signal (best-of-burst and a maybe-floor), never a hard reject — a strong-but-imperfect frame is still the photographer's call (cf. R-EYE-3).
  • R-AES-3 (MUST) Both extend cull.Signals + the cull profile with config-driven thresholds (no hardcoded constants, 0001 §6), are validated + calibrated against real frames (the 0004 method), and are captured in the analysis cache for Phase-4 learning (0005). Engine stays cgo-free; the models are native adapters (CPU on Linux now, CoreML later — 0004 Layer 3).
  • R-AES-4 (SHOULD) Aesthetic/expression are never written to XMP (Adobe can't represent them) — they inform the cull verdict/ranking only.
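
To make "soft ranking signal, never a hard reject" concrete, here is a sketch under one reading of R-AES-2: a low aesthetic score can demote a keep to maybe (the "maybe-floor") but can never produce a reject, and within a burst frames are ranked by score. The Verdict values, function names and threshold parameter are hypothetical; in krites the threshold would come from the cull profile, not from code.

```go
package main

import (
	"fmt"
	"sort"
)

// Verdict is a cull verdict, ordered from weakest to strongest.
type Verdict int

const (
	Reject Verdict = iota
	Maybe
	Keep
)

// Frame is one burst member with its current verdict and aesthetic score.
type Frame struct {
	Name      string
	Verdict   Verdict
	Aesthetic float64 // 0..1, higher is stronger (assumed scale)
}

// ApplyAesthetic demotes Keep to Maybe when the score is below maybeBelow.
// It never returns Reject unless the verdict was already Reject.
func ApplyAesthetic(v Verdict, score, maybeBelow float64) Verdict {
	if v == Keep && score < maybeBelow {
		return Maybe
	}
	return v
}

// RankBurst returns a copy of the burst ordered strongest first.
// The sort is stable, so ties keep capture order.
func RankBurst(frames []Frame) []Frame {
	out := append([]Frame(nil), frames...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Aesthetic > out[j].Aesthetic
	})
	return out
}

func main() {
	burst := []Frame{
		{"DSC0101", Keep, 0.41},
		{"DSC0102", Keep, 0.78},
		{"DSC0103", Keep, 0.12},
	}
	for _, f := range RankBurst(burst) {
		fmt.Println(f.Name, ApplyAesthetic(f.Verdict, f.Aesthetic, 0.3))
	}
}
```

The scores and the 0.3 threshold in main are illustrative only; real values would come out of the calibration against real frames that R-AES-3 requires.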

Build order: land after the geometric/colour develop, reusing the 0004 adapter scaffolding; expression rides the existing detector pass, aesthetic is the one new model.

9. Open questions

  • Q1 — RAW full-decode strategy. Resolved: embedded-preview-first — ship the CGO-free embedded-JPEG path (R-DEC-2) for cull/preview/export; defer full-latitude Decode() and revisit the libraw approach when it's needed.
  • Q2 — look representation. Resolved: WB + parametric tone curve + an optional 3D-LUT slot. Retrofittable: looks are config the edit references by id, applied through a pure seam, so a richer model is additive (§5).
  • Q5 — aesthetic / expression. Resolved: in this Phase 2 (§8), not split out.
  • Q3 — straighten detection. Horizon-line (Hough on long edges) vs vertical-edge alignment vs both. Recommend: edge-orientation histogram (dominant near-horizontal/near-vertical), clamped — pure, testable, no ML. Gates §3; a refinement, not a blocker.
  • Q4 — crop placement. Rule-of-thirds on the largest face/subject (we already detect faces) vs a saliency model. Recommend: face/subject + rule-of-thirds first (reuses the Phase-1 detector); saliency later. Gates §4.
  • Q6 — image-ops library. Pure-Go (x/image + our colour maths) for the export bake rather than libvips. Recommend: pure-Go — keeps the bake cgo-free and only RAW decode native. Gates §6.

10. Non-goals (this spec)

  • Object removal / inpainting — Phase 3 (0001 §4.5; the Inpainter).
  • Writing aesthetic/expression signals to XMP — the signals themselves are in scope here (§8, Q5), but they inform the cull only (R-AES-4).
  • A Lightroom-grade develop module — krites auto-corrects and hands off (0001 §2 non-goals); it is not a full editor.
  • Heavy retouch — only the bounded set (0001 §4.4) is in scope, and minimally.