# Karapace Store Format Specification (v2)

## Overview

The Karapace store is a content-addressable filesystem structure that holds all environment data: objects, layers, metadata, environment directories, and crash recovery state.

## Directory Layout

```
/
  store/
    version         # JSON: { "format_version": 2 }
    .lock           # flock(2) file for exclusive access
    objects/        # Content-addressable blobs (blake3)
    layers/         # Layer manifests (JSON)
    metadata/       # Environment metadata (JSON)
    staging/        # Temporary workspace for atomic operations
    wal/            # Write-ahead log entries (JSON)
  env/
    /
      upper/        # Writable overlay layer (fuse-overlayfs upperdir)
      lower -> ...  # Symlink to base image rootfs
      work/         # Overlay workdir (ephemeral)
      merged/       # Overlay mount point
  images/
    /
      rootfs/       # Extracted base image rootfs
```

## Format Version

- Current version: **2**
- Stored in `store/version` as JSON.
- Checked on every store access; mismatches are rejected.
- Version 1 stores are not auto-migrated; a clean rebuild is required.

## Objects

- Keyed by blake3 hex digest of their content.
- Written atomically: write to tempfile, then rename.
- Integrity verified on every read: content re-hashed and compared to filename.
- Idempotent: writing the same content twice is a no-op.

## Layers

Each layer is a JSON manifest:

```json
{
  "hash": "",
  "kind": "Base" | "Dependency" | "Policy" | "Snapshot",
  "parent": "" | null,
  "object_refs": ["", ...],
  "read_only": true,
  "tar_hash": ""
}
```

- `tar_hash` (v2): blake3 hash of the deterministic tar archive stored in the object store.
- Base layers have no parent. Their `hash` equals their `tar_hash`.
- Dependency layers reference a base parent.
- Snapshot layers are created by `commit`. Their `hash` is a composite identity: `blake3("snapshot:{env_id}:{base_layer}:{tar_hash}")` to prevent collision with base layers.
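
The composite snapshot identity above can be sketched as follows. This is a minimal illustration, not Karapace's actual code: `snapshot_preimage` and `snapshot_layer_hash` are hypothetical names, and the `hash_hex` parameter stands in for a blake3 hex-digest function so that the sketch needs only the standard library.

```rust
/// Builds the string that is hashed to form a snapshot layer's identity.
/// The `snapshot:` prefix keeps the result distinct from a base layer,
/// whose `hash` is simply its `tar_hash`.
fn snapshot_preimage(env_id: &str, base_layer: &str, tar_hash: &str) -> String {
    format!("snapshot:{env_id}:{base_layer}:{tar_hash}")
}

/// Computes the snapshot layer `hash` from its preimage.
/// `hash_hex` is an assumption: any function that maps bytes to a hex
/// digest, with blake3 being the one the spec names.
fn snapshot_layer_hash<F>(env_id: &str, base_layer: &str, tar_hash: &str, hash_hex: F) -> String
where
    F: Fn(&[u8]) -> String,
{
    let preimage = snapshot_preimage(env_id, base_layer, tar_hash);
    hash_hex(preimage.as_bytes())
}
```

Because the environment ID and base layer are part of the preimage, two snapshots with identical tar contents in different environments still receive different layer hashes.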

## Metadata

Each environment has a JSON metadata file:

```json
{
  "env_id": "...",
  "short_id": "...",
  "name": "my-env",
  "state": "Defined" | "Built" | "Running" | "Frozen" | "Archived",
  "manifest_hash": "",
  "base_layer": "",
  "dependency_layers": ["", ...],
  "policy_layer": null | "",
  "created_at": "RFC3339",
  "updated_at": "RFC3339",
  "ref_count": 1
}
```

- `name` is optional (`#[serde(default)]`). Old metadata without this field deserializes correctly.

## Atomic Write Contract

All writes follow the pattern:

1. Create `NamedTempFile` in the target directory.
2. Write full content.
3. `flush()`.
4. `persist()` (atomic rename).

This ensures no partial files are visible.

## Garbage Collection

- Environments with `ref_count == 0` and state not in {`Running`, `Archived`} are eligible for collection.
- Layers not referenced by any live environment are orphaned.
- Objects not referenced by any live layer or live metadata (`manifest_hash`) are orphaned.
- GC never deletes running or archived environments.
- GC supports graceful cancellation via signal handler (`SIGINT`/`SIGTERM`).
- `--dry-run` reports what would be removed without acting.
- The caller must hold the store lock before running GC.

## Write-Ahead Log (WAL)

The `store/wal/` directory contains JSON entries for in-flight mutating operations. Each entry tracks:

```json
{
  "op_id": "20260215120000123-a1b2c3d4",
  "kind": "Build" | "Rebuild" | "Commit" | "Restore" | "Destroy",
  "env_id": "...",
  "timestamp": "RFC3339",
  "rollback_steps": [
    { "RemoveDir": "/path/to/orphaned/dir" },
    { "RemoveFile": "/path/to/orphaned/file" }
  ]
}
```

### Recovery Protocol

1. On `Engine::new()`, the WAL directory is scanned for incomplete entries.
2. Each entry's rollback steps are executed in **reverse order**.
3. The WAL entry is then removed.
4. Corrupt or unreadable WAL entries are silently deleted.

### Invariants

- **INV-W1**: Kill during rebuild → next startup rolls back orphaned env_dir.
- **INV-W2**: Kill during build → orphaned env_dir cleaned.
- **INV-W3**: Successful operations leave zero WAL entries.

## Staging Directory

The `store/staging/` directory is a temporary workspace used for atomic operations:

- **Restore**: snapshot tar is unpacked into `staging/restore-{env_id}`, then renamed to replace the overlay upper directory.
- **Layer packing**: temporary files during tar creation.

The staging directory is cleaned up after each operation. Leftover staging data is safe to delete.

## Backward Compatibility

- Layout changes require a format version bump.
- Karapace 1.0 requires format version 2.
- Version 1 stores are not auto-migrated; environments must be rebuilt.
- The `name` and `tar_hash` fields use `#[serde(default)]` for forward-compatible deserialization.
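
The version rule above (format version 2 is required, and a version 1 store is rejected rather than migrated) could be enforced with a check along these lines. This is a minimal sketch, not Karapace's actual code: `SUPPORTED_FORMAT_VERSION`, `VersionMismatch`, and `check_format_version` are hypothetical names, and the function assumes `format_version` has already been parsed out of `store/version`.

```rust
/// The only format version this sketch accepts (version 2, per the spec).
const SUPPORTED_FORMAT_VERSION: u32 = 2;

/// Hypothetical error describing a rejected store.
#[derive(Debug, PartialEq)]
struct VersionMismatch {
    found: u32,
    expected: u32,
}

/// Rejects any store whose `format_version` differs from the supported one.
/// No migration is attempted: a version 1 store has to be rebuilt.
fn check_format_version(found: u32) -> Result<(), VersionMismatch> {
    if found == SUPPORTED_FORMAT_VERSION {
        Ok(())
    } else {
        Err(VersionMismatch {
            found,
            expected: SUPPORTED_FORMAT_VERSION,
        })
    }
}
```

Returning both the found and expected versions lets the caller tell the user that a rebuild is needed instead of failing with an opaque parse error.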