diff --git a/.windsurf/plans/likwid-stabilization-6061c4.md b/.windsurf/plans/likwid-stabilization-6061c4.md
new file mode 100644
index 0000000..2682386
--- /dev/null
+++ b/.windsurf/plans/likwid-stabilization-6061c4.md
@@ -0,0 +1,210 @@
+# Likwid Stabilization Megaplan (6061c4)
+
+Stabilize Likwid into a production-usable system by shipping a coherent admin modular-rule management UX (built-in + WASM), making WASM packages production-grade (including background jobs + SSRF-hardening), and converging authorization on roles/permissions.
+
+## Decisions (confirmed)
+- Git history: **linear `main` via squash merges**.
+- Demo VPS: **tracks `main`**.
+- Modular rules approach: **Hybrid** (built-in DB-backed settings + built-in modules + WASM plugin packages).
+- Authz direction: **roles/permissions is authoritative**.
+- Phase 1 default instance guidance: `instance_type=multi_community`, `platform_mode=admin_only`.
+- `plugin_allow_background_jobs`: **full implementation** (end-to-end semantics, not just a stored flag).
+- Registry SSRF hardening: **yes** (DNS-aware; do not rely only on host string / IP-literal checks).
+- Community admin UX scope: **full** (plugin policy + WASM packages + built-in plugins in one coherent flow).
+- Plan handling: **commit plan to `main`** once approved (plan-first discipline).
+
+## Non-negotiable outcomes (Phase 1)
+- Operational reliability + install/upgrade story.
+- Admin modular rule management UX.
+- WASM third-party plugin packages are production-grade.
+- End-user UX consistency (avoid confusing partial configuration states).
+
+## Current state (source-backed highlights)
+- Backend exposes community endpoints:
+  - `GET/PUT /api/communities/{id}/plugin-policy`
+  - `GET/POST/PUT /api/communities/{id}/plugin-packages` (+ `install-registry`)
+- WASM runtime exists (wasmtime; fuel + timeout + memory limits).
+- WASM outbound HTTP is capability-gated and allowlisted.
+- Registry allowlist currently:
+  - blocks `localhost`
+  - blocks IP-literal loopback/private/link-local/unspecified
+  - matches exact host or `*.suffix`
+  - **does not** do DNS resolution + post-resolution IP classification.
+- Frontend currently does not call `plugin-policy` / `plugin-packages` (community UI covers built-in plugins only).
+
+## Scope boundaries
+- No architectural rewrite.
+- Minimal, targeted changes per milestone.
+- No dependency additions unless clearly required for an explicit acceptance criterion.
+
+## Milestone discipline / verification
+- Each milestone lands as a single squash-merge PR to `main`.
+- Required verification per milestone:
+  - Backend: `cargo check` (and `cargo test` if stable)
+  - Frontend: `npm run build`
+  - Demo VPS: deploy `main` and run `./scripts/smoke-test.sh demo`
+
+---
+
+## Phase 0 — Baseline + operator invariants (gate)
+
+### Deliverables
+- Single authoritative operator workflow for:
+  - local dev start/stop
+  - demo deploy/update
+  - rollback
+  - smoke test
+- Confirm demo systemd + compose wiring is consistent with docs and scripts.
+
+### Acceptance criteria
+- Demo VPS updated to latest `main` and `./scripts/smoke-test.sh demo` passes.
+
+### Verification
+- Run smoke test on VPS.
+
+---
+
+## Phase 1 — Admin modular rule management UX (hybrid)
+
+### Goal
+One coherent admin flow for:
+- community built-in plugins (`plugins` + `community_plugins`)
+- community WASM plugin packages (`plugin_packages` + `community_plugin_packages`)
+- community plugin policy (`communities.settings` keys)
+
+### Deliverables (frontend)
+- Community admin UI adds a “Plugins / Rules” surface that includes:
+  - Built-in community plugins management (existing functionality retained).
+  - Plugin policy editor (read + update):
+    - trust policy
+    - install sources
+    - registry allowlist
+    - trusted publishers
+    - outbound HTTP toggle + allowlist
+    - background jobs toggle
+  - WASM package manager:
+    - list installed packages
+    - upload package
+    - install from registry URL
+    - activate/deactivate
+    - edit package settings (schema-driven when available; raw JSON fallback)
+- Clear error messages for:
+  - policy forbids uploads / registry installs
+  - signature requirements fail
+  - registry allowlist blocks URL
+
+### Deliverables (backend contract hardening)
+- Ensure API error responses are stable and actionable for UI (status code + message consistency).
+- Ensure event emission for key actions is consistent (`public_events`).
+
+### Acceptance criteria
+- As community admin/moderator:
+  - can view/update plugin policy
+  - can install a WASM package (upload + registry) when policy allows
+  - can activate/deactivate packages
+  - can edit package settings and receive server-side schema validation errors when invalid
+
+### Verification
+- Manual UI walkthrough covering:
+  - signed-only policy
+  - registry allowlist allow/deny
+  - outbound HTTP allowlist allow/deny
+  - background jobs on/off behavior (see Phase 2 definition)
+
+---
+
+## Phase 2 — WASM packages production-grade hardening
+
+### 2.1 Background jobs: **full implementation**
+
+#### Proposed semantics (must be implemented consistently)
+- `plugin_allow_background_jobs=false` means:
+  - WASM plugins must **not** be invoked for cron hooks for that community.
+  - Any future “scheduled” behavior for WASM must be gated by the same setting.
+- `plugin_allow_background_jobs=true` means:
+  - WASM plugins may receive cron hooks they declare in their manifest (e.g. `cron.minute`, `cron.hourly`, `cron.daily`, ...).
+
+#### Implementation outline (expected code touchpoints)
+- Resolve where WASM cron hooks are dispatched (currently cron loop exists in `backend/src/main.rs` and invokes `PluginManager::do_wasm_action_for_community`).
+- Add a community-settings check (`communities.settings.plugin_allow_background_jobs`) in the WASM cron dispatch path.
+- Ensure policy API default behavior is explicit and safe:
+  - confirm default is false (current parse default is false).
+
+#### Acceptance criteria
+- When `plugin_allow_background_jobs=false`, WASM cron hooks are not executed for that community.
+- When `plugin_allow_background_jobs=true`, WASM cron hooks execute normally.
+
+### 2.2 Registry install SSRF hardening (DNS-aware)
+
+#### Goal
+Registry install should not be able to reach internal/private addresses via DNS rebinding or private resolution.
+
+#### Deliverables
+- Extend registry allowlist enforcement to:
+  - resolve DNS for hostname-based registry URLs
+  - reject if any resolved IP is loopback/private/link-local/unspecified/unique-local (IPv6)
+- Keep existing protections:
+  - reject `localhost`
+  - reject IP-literal private/loopback
+  - enforce allowlist patterns
+
+#### Acceptance criteria
+- Registry install is blocked when a hostname resolves to a private/loopback/link-local address.
+
+### 2.3 Registry fetch hardening (timeouts/size caps)
+
+#### Deliverables
+- Add explicit timeout and size bounds to registry bundle fetch.
+  - Current code path uses `reqwest::get(...)` without explicit timeout/size cap.
+
+#### Acceptance criteria
+- Registry fetch cannot hang indefinitely.
+- Registry fetch cannot load an unbounded payload.
+
+### 2.4 Operator-visible metadata
+
+#### Deliverables
+- UI shows package metadata:
+  - publisher
+  - sha256
+  - signature present
+  - source (upload/registry)
+  - registry URL
+  - manifest-declared hooks + capabilities
+  - effective outbound HTTP permission status
+
+---
+
+## Phase 3 — Authz convergence (roles/permissions authoritative)
+
+### Goal
+Stop using `community_members.role` as the primary enforcement mechanism for privileged actions.
+
+### Deliverables
+- Inventory endpoints that currently use `ensure_admin_or_moderator` style gates.
+- Introduce/confirm permissions for:
+  - managing plugin policy
+  - managing plugin packages
+  - managing community plugin settings
+- Migrate gates to permission checks consistently.
+
+### Acceptance criteria
+- Plugin policy + package management endpoints authorize via roles/permissions.
+
+---
+
+## Phase 4 — Technical debt hotspot inventory + targeted fixes
+
+### Deliverables
+- Evidence-backed inventory (file/function-level) of:
+  - cross-layer coupling hotspots
+  - duplicated policy parsing/enforcement
+  - plugin plane confusion (instance defaults vs community plugins vs wasm packages)
+  - any unstable areas discovered during Phases 1–3
+- Only fix hotspots that block Phase 1–3 acceptance criteria.
+
+---
+
+## Commit plan (after this plan is approved)
+- Add this plan into the repo under `.windsurf/plans/` and commit to `main`.
+- Implementation starts only after the plan commit lands.