karapace/docs/store-spec.md
Marco Allegretti 5306963cce docs: comprehensive public documentation
2026-02-22 18:38:41 +01:00

Karapace Store Format Specification (v2)

Overview

The Karapace store is a content-addressable filesystem structure that holds all environment data: objects, layers, metadata, environment directories, and crash recovery state.

Directory Layout

<store_root>/
  store/
    version          # JSON: { "format_version": 2 }
    .lock            # flock(2) file for exclusive access
    objects/<hash>   # Content-addressable blobs (blake3)
    layers/<hash>    # Layer manifests (JSON)
    metadata/<env_id> # Environment metadata (JSON)
    staging/         # Temporary workspace for atomic operations
    wal/             # Write-ahead log entries (JSON)
  env/
    <env_id>/
      upper/         # Writable overlay layer (fuse-overlayfs upperdir)
      lower -> ...   # Symlink to base image rootfs
      work/          # Overlay workdir (ephemeral)
      merged/        # Overlay mount point
  images/
    <cache_key>/
      rootfs/        # Extracted base image rootfs
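
The layout above can be expressed as a small path helper. This is a minimal sketch: `StoreLayout` and its method names are hypothetical and not taken from the Karapace code base; only the directory names come from the layout itself.

```rust
use std::path::PathBuf;

/// Hypothetical helper that maps the documented layout onto paths.
pub struct StoreLayout {
    root: PathBuf,
}

impl StoreLayout {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// <store_root>/store/version
    pub fn version_file(&self) -> PathBuf {
        self.root.join("store").join("version")
    }

    /// <store_root>/store/objects/<hash>
    pub fn object(&self, hash: &str) -> PathBuf {
        self.root.join("store").join("objects").join(hash)
    }

    /// <store_root>/store/metadata/<env_id>
    pub fn metadata(&self, env_id: &str) -> PathBuf {
        self.root.join("store").join("metadata").join(env_id)
    }

    /// <store_root>/env/<env_id>/upper
    pub fn env_upper(&self, env_id: &str) -> PathBuf {
        self.root.join("env").join(env_id).join("upper")
    }

    /// <store_root>/images/<cache_key>/rootfs
    pub fn image_rootfs(&self, cache_key: &str) -> PathBuf {
        self.root.join("images").join(cache_key).join("rootfs")
    }
}
```

Keeping path construction in one place means a layout change (which requires a format version bump) touches a single type.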

Format Version

  • Current version: 2
  • Stored in store/version as JSON.
  • Checked on every store access; mismatches are rejected.
  • Version 1 stores are not auto-migrated; a clean rebuild is required.

Objects

  • Keyed by blake3 hex digest of their content.
  • Written atomically: write to tempfile, then rename.
  • Integrity verified on every read: content re-hashed and compared to filename.
  • Idempotent: writing the same content twice is a no-op.
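
The object rules above (content-derived key, verification on read, idempotent writes) can be illustrated with an in-memory sketch. Two loud assumptions: `ObjectStore` is a hypothetical type that keeps blobs in a map instead of files, and `digest_hex` uses the standard library's `DefaultHasher` purely as a stand-in so the example has no dependencies. The real store uses blake3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Stand-in digest, NOT blake3: used only to keep the sketch dependency-free.
fn digest_hex(content: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    hasher.write(content);
    format!("{:016x}", hasher.finish())
}

/// Hypothetical in-memory model of the object directory.
#[derive(Default)]
pub struct ObjectStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl ObjectStore {
    /// Store content under its digest. Writing the same content again
    /// finds the key already present and changes nothing.
    pub fn put(&mut self, content: &[u8]) -> String {
        let key = digest_hex(content);
        self.blobs
            .entry(key.clone())
            .or_insert_with(|| content.to_vec());
        key
    }

    /// Return the content only if it still hashes to the key it is stored under.
    pub fn get(&self, key: &str) -> Result<&[u8], String> {
        let content = self
            .blobs
            .get(key)
            .ok_or_else(|| format!("object {} not found", key))?;
        if digest_hex(content) != key {
            return Err(format!("object {} failed integrity check", key));
        }
        Ok(content.as_slice())
    }
}
```

Because the key is derived from the content, a second `put` of identical bytes yields the same key, and any on-disk corruption shows up as a mismatch in `get`.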

Layers

Each layer is a JSON manifest:

{
  "hash": "<layer_hash>",
  "kind": "Base" | "Dependency" | "Policy" | "Snapshot",
  "parent": "<parent_hash>" | null,
  "object_refs": ["<hash>", ...],
  "read_only": true,
  "tar_hash": "<blake3_hash>"
}

  • tar_hash (v2): blake3 hash of the deterministic tar archive stored in the object store.
  • Base layers have no parent. Their hash equals their tar_hash.
  • Dependency layers reference a base parent.
  • Snapshot layers are created by commit. Their hash is a composite identity: blake3("snapshot:{env_id}:{base_layer}:{tar_hash}") to prevent collision with base layers.
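
The two identity rules can be sketched as follows. The function names are hypothetical, and the hash function is passed in as a parameter so the example needs no blake3 dependency; in the real store that parameter would be a blake3 hex digest. The sketch also assumes the quoted string is a plain format template with the three fields substituted in.

```rust
/// Preimage for a snapshot layer identity, following the documented template.
pub fn snapshot_preimage(env_id: &str, base_layer: &str, tar_hash: &str) -> String {
    format!("snapshot:{}:{}:{}", env_id, base_layer, tar_hash)
}

/// Snapshot layer hash: the digest of the composite preimage.
/// `hash` stands in for a blake3 hex digest function (assumption).
pub fn snapshot_layer_hash(
    env_id: &str,
    base_layer: &str,
    tar_hash: &str,
    hash: impl Fn(&[u8]) -> String,
) -> String {
    hash(snapshot_preimage(env_id, base_layer, tar_hash).as_bytes())
}

/// Base layer hash: identical to the tar hash.
pub fn base_layer_hash(tar_hash: &str) -> String {
    tar_hash.to_string()
}
```

The "snapshot:" prefix and the extra fields act as domain separation: even if a snapshot's tar archive is byte-identical to a base layer's, the two layers end up with different hashes, and snapshots of different environments do not collide with each other.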

Metadata

Each environment has a JSON metadata file:

{
  "env_id": "...",
  "short_id": "...",
  "name": "my-env",
  "state": "Defined" | "Built" | "Running" | "Frozen" | "Archived",
  "manifest_hash": "<object_hash>",
  "base_layer": "<layer_hash>",
  "dependency_layers": ["<hash>", ...],
  "policy_layer": null | "<hash>",
  "created_at": "RFC3339",
  "updated_at": "RFC3339",
  "ref_count": 1
}

  • name is optional (#[serde(default)]). Old metadata without this field deserializes correctly.

Atomic Write Contract

All writes follow the pattern:

  1. Create NamedTempFile in the target directory.
  2. Write full content.
  3. flush().
  4. persist() (atomic rename).

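Because the temporary file is created in the target directory, the final rename stays on one filesystem and is atomic: readers see either the previous file or the complete new one, never a partial write.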
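
The four steps can be approximated with the standard library alone. This is a sketch, not the store's implementation: the steps above use NamedTempFile, whereas `atomic_write` below is a hypothetical function that picks its own temporary name from the process ID (an assumption made to avoid a dependency).

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Steps 1 to 3: create the temporary file, write everything, flush.
fn write_temp(tmp: &Path, content: &[u8]) -> io::Result<()> {
    let mut file = File::create(tmp)?;
    file.write_all(content)?;
    file.flush()
}

/// Write `content` to `target` so that no partial file is ever visible.
pub fn atomic_write(target: &Path, content: &[u8]) -> io::Result<()> {
    let invalid = |msg: &str| io::Error::new(io::ErrorKind::InvalidInput, msg.to_string());
    let dir = target.parent().ok_or_else(|| invalid("target has no parent"))?;
    let name = target.file_name().ok_or_else(|| invalid("target has no file name"))?;

    // Temporary file in the SAME directory as the target, so the rename
    // never crosses a filesystem boundary. The naming scheme is an assumption.
    let tmp = dir.join(format!(".{}.{}.tmp", name.to_string_lossy(), std::process::id()));

    // Step 4: rename over the target only after the content is fully written.
    let result = write_temp(&tmp, content).and_then(|_| fs::rename(&tmp, target));
    if result.is_err() {
        // Best-effort cleanup so a failed write leaves no stray temp file.
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

A failed write leaves the previous target untouched, because the target path is only ever modified by the final rename.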

Garbage Collection

  • Environments with ref_count == 0 and state not in {Running, Archived} are eligible for collection.
  • Layers not referenced by any live environment are orphaned.
  • Objects not referenced by any live layer or live metadata (manifest_hash) are orphaned.
  • GC never deletes running or archived environments.
  • GC supports graceful cancellation via signal handler (SIGINT/SIGTERM).
  • --dry-run reports what would be removed without acting.
  • The caller must hold the store lock before running GC.
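
The eligibility and orphan rules can be sketched as a mark phase over in-memory records. All type and function names here are hypothetical. The sketch also makes simplifying assumptions that the rules above do not settle: a "live" environment is taken to mean any environment that is not eligible for collection, and parent links and tar_hash are ignored, although a real collector would presumably have to account for them.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum EnvState { Defined, Built, Running, Frozen, Archived }

pub struct EnvMeta {
    pub state: EnvState,
    pub ref_count: u32,
    pub manifest_hash: String,
    pub base_layer: String,
    pub dependency_layers: Vec<String>,
    pub policy_layer: Option<String>,
}

pub struct Layer {
    pub hash: String,
    pub object_refs: Vec<String>,
}

/// ref_count == 0 and state not in {Running, Archived}.
pub fn is_collectable(env: &EnvMeta) -> bool {
    env.ref_count == 0 && !matches!(env.state, EnvState::Running | EnvState::Archived)
}

pub struct Orphans {
    pub layers: BTreeSet<String>,
    pub objects: BTreeSet<String>,
}

/// Mark everything reachable from live environments; the rest is orphaned.
pub fn find_orphans(envs: &[EnvMeta], layers: &[Layer], objects: &[String]) -> Orphans {
    let mut live_layers = BTreeSet::new();
    let mut live_objects = BTreeSet::new();
    // Assumption: "live" means "not eligible for collection".
    for env in envs.iter().filter(|e| !is_collectable(e)) {
        live_layers.insert(env.base_layer.as_str());
        live_layers.extend(env.dependency_layers.iter().map(String::as_str));
        live_layers.extend(env.policy_layer.as_deref());
        live_objects.insert(env.manifest_hash.as_str());
    }
    for layer in layers.iter().filter(|l| live_layers.contains(l.hash.as_str())) {
        live_objects.extend(layer.object_refs.iter().map(String::as_str));
    }
    Orphans {
        layers: layers.iter().map(|l| &l.hash)
            .filter(|h| !live_layers.contains(h.as_str())).cloned().collect(),
        objects: objects.iter()
            .filter(|o| !live_objects.contains(o.as_str())).cloned().collect(),
    }
}
```

Returning the orphan sets without deleting anything corresponds to what --dry-run reports; an actual collection pass would delete the listed entries while holding the store lock.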

Write-Ahead Log (WAL)

The store/wal/ directory contains JSON entries for in-flight mutating operations. Each entry tracks:

{
  "op_id": "20260215120000123-a1b2c3d4",
  "kind": "Build" | "Rebuild" | "Commit" | "Restore" | "Destroy",
  "env_id": "...",
  "timestamp": "RFC3339",
  "rollback_steps": [
    { "RemoveDir": "/path/to/orphaned/dir" },
    { "RemoveFile": "/path/to/orphaned/file" }
  ]
}

Recovery Protocol

  1. On Engine::new(), the WAL directory is scanned for incomplete entries.
  2. Each entry's rollback steps are executed in reverse order.
  3. The WAL entry is then removed.
  4. Corrupt or unreadable WAL entries are silently deleted.
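
Steps 2 and 3 can be sketched for a single, already-parsed entry. The types mirror the JSON shape of a WAL entry, but `recover_entry` and `ignore_missing` are hypothetical names, JSON parsing is left out, and treating an already-missing path as success is an assumption (a crash may happen before the recorded path was ever created).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

pub enum RollbackStep {
    RemoveDir(PathBuf),
    RemoveFile(PathBuf),
}

pub struct WalEntry {
    pub op_id: String,
    pub rollback_steps: Vec<RollbackStep>,
}

/// Assumption: a path that is already gone counts as successfully rolled back.
fn ignore_missing(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}

/// Undo one in-flight operation, then drop its WAL entry.
pub fn recover_entry(entry: &WalEntry, entry_path: &Path) -> io::Result<()> {
    // Step 2: rollback steps run in reverse order of registration.
    for step in entry.rollback_steps.iter().rev() {
        match step {
            RollbackStep::RemoveDir(path) => ignore_missing(fs::remove_dir_all(path))?,
            RollbackStep::RemoveFile(path) => ignore_missing(fs::remove_file(path))?,
        }
    }
    // Step 3: the entry is removed only after every step succeeded.
    ignore_missing(fs::remove_file(entry_path))
}
```

Removing the entry last makes recovery itself restartable: if the process dies partway through, the entry is still present on the next startup and the remaining steps run again.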

Invariants

  • INV-W1: Kill during rebuild → next startup rolls back orphaned env_dir.
  • INV-W2: Kill during build → orphaned env_dir cleaned.
  • INV-W3: Successful operations leave zero WAL entries.

Staging Directory

The store/staging/ directory is a temporary workspace used for atomic operations:

  • Restore: snapshot tar is unpacked into staging/restore-{env_id}, then renamed to replace the overlay upper directory.
  • Layer packing: temporary files during tar creation.

The staging directory is cleaned up after each operation. Leftover staging data is safe to delete.
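
The final step of a restore, moving the unpacked tree into place, might look like the sketch below. `promote_restored_upper` is a hypothetical name, and removing the old upper directory before the rename is an assumption about how the replacement is done; the source only states that the staged directory is renamed to replace it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Move staging/restore-{env_id} into place as the environment's upper directory.
pub fn promote_restored_upper(store_root: &Path, env_id: &str) -> io::Result<()> {
    let staged = store_root
        .join("store")
        .join("staging")
        .join(format!("restore-{}", env_id));
    let upper = store_root.join("env").join(env_id).join("upper");

    // Assumption: the old upper directory is deleted first, because renaming
    // a directory onto a non-empty directory fails.
    if upper.exists() {
        fs::remove_dir_all(&upper)?;
    }
    // Staging and env/ share <store_root>, so this is a same-filesystem rename.
    fs::rename(&staged, &upper)
}
```

Unpacking into staging first means a failed or interrupted unpack never touches the live upper directory; only a fully unpacked tree is moved into place.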

Backward Compatibility

  • Layout changes require a format version bump.
  • Karapace 1.0 requires format version 2.
  • Version 1 stores are not auto-migrated; environments must be rebuilt.
  • The name and tar_hash fields use #[serde(default)], so files written before those fields existed still deserialize (backward-compatible deserialization).