Delete 14 old docs files (AI-generated, riddled with Phase/M1/1.0 jargon, references to non-existent commands, stale CI snippets). New documentation (6 files), written from repository source analysis: - docs/architecture.md — crate graph, engine lifecycle, identity computation, runtime backends, store design, WAL, GC, unsafe blocks - docs/cli-reference.md — all 23 commands with syntax, args, flags, exit codes, env vars, verified against crates/karapace-cli/src/main.rs - docs/storage-format.md — directory layout, objects, layers, metadata, manifest format, lock file, WAL, atomic write contract - docs/security-model.md — mount/device/env var policies with exact defaults from security.rs, trust assumptions, what is NOT protected - docs/build-and-reproducibility.md — CI env vars, RUSTFLAGS, cargo profile, reproducibility verification, toolchain pinning - docs/contributing.md — setup, verification, project layout, code standards, testing, CI workflows README.md rewritten: concise, no marketing language, prerequisites first, usage example, command table, limitations section. CONTRIBUTING.md now points to docs/contributing.md. CHANGELOG.md cleaned: removed M1-M8 labels, Phase refs, stale counts.
5.6 KiB
Storage Format
Store format version: 2. Defined in karapace-store/src/layout.rs::STORE_FORMAT_VERSION.
Directory layout
Default root: ~/.local/share/karapace.
<root>/
store/
version # { "format_version": 2 }
.lock # flock(2) exclusive lock
objects/<blake3_hex> # content-addressable blobs
layers/<blake3_hex> # layer manifests (JSON)
metadata/<env_id> # environment metadata (JSON)
staging/ # temp workspace for atomic operations
wal/<op_id>.json # write-ahead log entries
env/
<env_id>/
upper/ # overlay writable layer
overlay/ # overlay mount point
images/
<cache_key>/
rootfs/ # extracted base image filesystem
Paths defined in karapace-store/src/layout.rs::StoreLayout.
Version file
{ "format_version": 2 }
Checked on every store access. Mismatched versions are rejected with StoreError::VersionMismatch.
Objects
Content-addressable blobs keyed by blake3 hex digest of their content.
- Write:
NamedTempFilein objects dir → write content →sync_all()→persist()(atomic rename) - Read: read file → recompute blake3 → compare to filename → reject on mismatch
- Idempotent: writing identical content is a no-op
Defined in karapace-store/src/objects.rs::ObjectStore.
Layers
JSON files in store/layers/. Each describes a tar archive stored in the object store.
{
"hash": "<layer_hash>",
"kind": "Base | Dependency | Policy | Snapshot",
"parent": "<parent_hash> | null",
"object_refs": ["<hash>", ...],
"read_only": true,
"tar_hash": "<blake3_of_tar>"
}
Defined in karapace-store/src/layers.rs::LayerManifest.
Layer kinds:
| Kind | Hash computation | Parent |
|---|---|---|
Base |
tar_hash |
None |
Dependency |
tar_hash |
Base layer |
Policy |
tar_hash |
— |
Snapshot |
blake3("snapshot:{env_id}:{base_layer}:{tar_hash}") |
Base layer |
Layer integrity is verified on read: the file content is re-hashed and compared to the filename.
Deterministic tar packing
karapace-store/src/layers.rs::pack_layer(source_dir):
- Entries sorted by path
- Timestamps set to 0
- Owner set to
0:0 - Permissions preserved
- Symlink targets preserved
Dropped during packing: extended attributes, device nodes, hardlinks (stored as regular files), SELinux labels, ACLs, sparse file holes.
unpack_layer(tar_data, target_dir) reverses the process.
Metadata
JSON files in store/metadata/, one per environment. Filename is the env_id.
{
"env_id": "...",
"short_id": "...",
"name": null,
"state": "Built",
"manifest_hash": "<object_hash>",
"base_layer": "<layer_hash>",
"dependency_layers": [],
"policy_layer": null,
"created_at": "RFC3339",
"updated_at": "RFC3339",
"ref_count": 1,
"checksum": "<blake3_of_json>"
}
Defined in karapace-store/src/metadata.rs::EnvMetadata.
States: Defined, Built, Running, Frozen, Archived.
Checksum: blake3 of the JSON content (excluding the checksum field itself). Computed on every put(), verified on every get(). Absent in legacy metadata (#[serde(default)]).
Names: optional, validated by validate_env_name: pattern [a-zA-Z0-9_-], 1–64 characters. Unique across all environments.
Manifest format
File: karapace.toml. Parsed by karapace-schema/src/manifest.rs.
manifest_version = 1
[base]
image = "rolling"
[system]
packages = ["git", "curl"]
[gui]
apps = []
[hardware]
gpu = false
audio = false
[mounts]
workspace = "./:/workspace"
[runtime]
backend = "namespace"
network_isolation = false
[runtime.resource_limits]
cpu_shares = 1024
memory_limit_mb = 4096
Required: manifest_version (must be 1), base.image (non-empty).
Optional: all other sections. Unknown fields cause a parse error (deny_unknown_fields).
Normalization (ManifestV1::normalize): trim strings, sort and deduplicate packages/apps, sort mounts by label, lowercase backend name. Produces NormalizedManifest with a canonical_json() method.
Lock file
File: karapace.lock. Written next to the manifest. TOML format.
lock_version = 2
env_id = "46e1d96f..."
short_id = "46e1d96fdd6f"
base_image = "rolling"
base_image_digest = "a1b2c3d4..."
runtime_backend = "namespace"
hardware_gpu = false
hardware_audio = false
network_isolation = false
[[resolved_packages]]
name = "git"
version = "2.44.0-1"
Defined in karapace-schema/src/lock.rs::LockFile.
Verification:
verify_integrity(): recomputesenv_idfrom locked fields, compares to stored valueverify_manifest_intent(): checks manifest hasn't drifted from what was locked
Hashing
All hashing uses blake3, 256-bit output, hex-encoded (64 characters).
Used for: object keys, layer hashes, env_id computation, metadata checksums, image content digests.
Write-ahead log
store/wal/<op_id>.json. Defined in karapace-store/src/wal.rs.
{
"op_id": "20260215120000123-a1b2c3d4",
"kind": "Build",
"env_id": "...",
"timestamp": "RFC3339",
"rollback_steps": [
{ "RemoveDir": "/path" },
{ "RemoveFile": "/path" }
]
}
Operations: Build, Rebuild, Commit, Restore, Destroy, Gc.
Recovery: on Engine::new(), all WAL entries are scanned. Each entry's rollback steps execute in reverse order. The entry is then deleted. Corrupt entries are silently removed.
Atomic write contract
All store writes follow: NamedTempFile::new_in(dir) → write → flush() → persist() (atomic rename). No partial files are visible. Defined throughout karapace-store.