# Likwid Stabilization Megaplan (6061c4)

Stabilize Likwid into a production-usable system by shipping a coherent admin modular-rule management UX (built-in + WASM), making WASM packages production-grade (including background jobs + SSRF-hardening), and converging authorization on roles/permissions.

## Decisions (confirmed)

- Git history: **linear `main` via squash merges**.
- Demo VPS: **tracks `main`**.
- Modular rules approach: **Hybrid** (built-in DB-backed settings + built-in modules + WASM plugin packages).
- Authz direction: **roles/permissions is authoritative**.
- Phase 1 default instance guidance: `instance_type=multi_community`, `platform_mode=admin_only`.
- `plugin_allow_background_jobs`: **full implementation** (end-to-end semantics, not just a stored flag).
- Registry SSRF hardening: **yes** (DNS-aware; do not rely only on host string / IP-literal checks).
- Community admin UX scope: **full** (plugin policy + WASM packages + built-in plugins in one coherent flow).
- Plan handling: **commit plan to `main`** once approved (plan-first discipline).

## Non-negotiable outcomes (Phase 1)

- Operational reliability + install/upgrade story.
- Admin modular rule management UX.
- WASM third-party plugin packages are production-grade.
- End-user UX consistency (avoid confusing partial configuration states).

## Current state (source-backed highlights)

- Backend exposes community endpoints:
  - `GET/PUT /api/communities/{id}/plugin-policy`
  - `GET/POST/PUT /api/communities/{id}/plugin-packages` (+ `install-registry`)
- WASM runtime exists (wasmtime; fuel + timeout + memory limits).
- WASM outbound HTTP is capability-gated and allowlisted.
- Registry allowlist currently:
  - blocks `localhost`
  - blocks IP-literal loopback/private/link-local/unspecified
  - matches exact host or `*.suffix`
  - **does not** do DNS resolution + post-resolution IP classification.
- Frontend currently does not call `plugin-policy` / `plugin-packages` (community UI covers built-in plugins only).

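The registry allowlist behaviour listed above can be illustrated with a minimal Rust sketch. It is not Likwid's actual code: the function names `host_matches` and `registry_host_allowed` are hypothetical, and it assumes that `*.suffix` matches subdomains only (not the bare suffix) and that matching is case-insensitive. Like the current behaviour described above, it performs no DNS resolution.

```rust
use std::net::IpAddr;

/// Hypothetical helper: does `host` match one allowlist `pattern`?
/// A pattern is either an exact host or `*.suffix`.
fn host_matches(pattern: &str, host: &str) -> bool {
    let pattern = pattern.trim().to_ascii_lowercase();
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    match pattern.strip_prefix("*.") {
        // Assumption: `*.suffix` matches subdomains only, and the match
        // must start at a label boundary (a '.' right before the suffix).
        Some(suffix) => {
            host.len() > suffix.len() + 1
                && host.ends_with(suffix)
                && host.as_bytes()[host.len() - suffix.len() - 1] == b'.'
        }
        None => host == pattern,
    }
}

/// Hypothetical helper mirroring the pre-DNS checks listed above:
/// reject `localhost`, reject internal IP literals, then apply the allowlist.
fn registry_host_allowed(host: &str, allowlist: &[&str]) -> bool {
    let host = host.trim_end_matches('.');
    if host.eq_ignore_ascii_case("localhost") {
        return false;
    }
    // IP literals (optionally bracketed, as IPv6 appears in URLs).
    let literal = host.trim_matches(|c| c == '[' || c == ']');
    if let Ok(ip) = literal.parse::<IpAddr>() {
        let blocked = match ip {
            IpAddr::V4(v4) => {
                v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
            }
            IpAddr::V6(v6) => {
                v6.is_loopback()
                    || v6.is_unspecified()
                    || (v6.segments()[0] & 0xffc0) == 0xfe80 // link-local fe80::/10
            }
        };
        if blocked {
            return false;
        }
    }
    allowlist.iter().any(|pattern| host_matches(pattern, host))
}
```

Under these assumptions, `pkg.example.org` passes a `*.example.org` entry while `evilexample.org` does not, and `localhost` or `127.0.0.1` are rejected even if they appear in the allowlist. A hostname that merely resolves to an internal address still passes, which is the gap Phase 2.2 targets.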

## Scope boundaries

- No architectural rewrite.
- Minimal, targeted changes per milestone.
- No dependency additions unless clearly required for an explicit acceptance criterion.

## Milestone discipline / verification

- Each milestone lands as a single squash-merge PR to `main`.
- Required verification per milestone:
  - Backend: `cargo check` (and `cargo test` if stable)
  - Frontend: `npm run build`
  - Demo VPS: deploy `main` and run `./scripts/smoke-test.sh demo`

---

## Phase 0 — Baseline + operator invariants (gate)

### Deliverables

- Single authoritative operator workflow for:
  - local dev start/stop
  - demo deploy/update
  - rollback
  - smoke test
- Confirm demo systemd + compose wiring is consistent with docs and scripts.

### Acceptance criteria

- Demo VPS updated to latest `main` and `./scripts/smoke-test.sh demo` passes.

### Verification

- Run smoke test on VPS.

---

## Phase 1 — Admin modular rule management UX (hybrid)

### Goal

One coherent admin flow for:

- community built-in plugins (`plugins` + `community_plugins`)
- community WASM plugin packages (`plugin_packages` + `community_plugin_packages`)
- community plugin policy (`communities.settings` keys)

### Deliverables (frontend)

- Community admin UI adds a “Plugins / Rules” surface that includes:
  - Built-in community plugins management (existing functionality retained).
  - Plugin policy editor (read + update):
    - trust policy
    - install sources
    - registry allowlist
    - trusted publishers
    - outbound HTTP toggle + allowlist
    - background jobs toggle
  - WASM package manager:
    - list installed packages
    - upload package
    - install from registry URL
    - activate/deactivate
    - edit package settings (schema-driven when available; raw JSON fallback)
- Clear error messages for:
  - policy forbids uploads / registry installs
  - signature requirements fail
  - registry allowlist blocks URL

### Deliverables (backend contract hardening)

- Ensure API error responses are stable and actionable for UI (status code + message consistency).
- Ensure event emission for key actions is consistent (`public_events`).

### Acceptance criteria

- As community admin/moderator:
  - can view/update plugin policy
  - can install a WASM package (upload + registry) when policy allows
  - can activate/deactivate packages
  - can edit package settings and receive server-side schema validation errors when invalid

### Verification

- Manual UI walkthrough covering:
  - signed-only policy
  - registry allowlist allow/deny
  - outbound HTTP allowlist allow/deny
  - background jobs on/off behavior (see Phase 2 definition)

---

## Phase 2 — WASM packages production-grade hardening

### 2.1 Background jobs: **full implementation**

#### Proposed semantics (must be implemented consistently)

- `plugin_allow_background_jobs=false` means:
  - WASM plugins must **not** be invoked for cron hooks for that community.
  - Any future “scheduled” behavior for WASM must be gated by the same setting.
- `plugin_allow_background_jobs=true` means:
  - WASM plugins may receive cron hooks they declare in their manifest (e.g. `cron.minute`, `cron.hourly`, `cron.daily`, ...).

#### Implementation outline (expected code touchpoints)

- Resolve where WASM cron hooks are dispatched (currently cron loop exists in `backend/src/main.rs` and invokes `PluginManager::do_wasm_action_for_community`).
- Add a community-settings check (`communities.settings.plugin_allow_background_jobs`) in the WASM cron dispatch path.
- Ensure policy API default behavior is explicit and safe:
  - confirm default is false (current parse default is false).

#### Acceptance criteria

- When `plugin_allow_background_jobs=false`, WASM cron hooks are not executed for that community.
- When `plugin_allow_background_jobs=true`, WASM cron hooks execute normally.

### 2.2 Registry install SSRF hardening (DNS-aware)

#### Goal

Registry install should not be able to reach internal/private addresses via DNS rebinding or private resolution.

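A minimal sketch of the post-resolution check this goal implies, using only the Rust standard library. The names `is_internal_ip` and `resolve_public_addrs` are hypothetical, the blocking `ToSocketAddrs` lookup stands in for whatever resolver the backend actually uses, and treating IPv4-mapped IPv6 addresses like their IPv4 form is an added assumption rather than something the plan specifies.

```rust
use std::net::{IpAddr, ToSocketAddrs};

/// Hypothetical classifier: true for loopback, private, link-local,
/// unspecified, and IPv6 unique-local addresses.
fn is_internal_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // Assumption: judge IPv4-mapped addresses (::ffff:a.b.c.d) by their IPv4 form.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_internal_ip(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xffc0) == 0xfe80 // link-local fe80::/10
                || (first & 0xfe00) == 0xfc00 // unique-local fc00::/7
        }
    }
}

/// Hypothetical helper: resolve `host` and reject the install if ANY
/// resolved address is internal. Returns the validated addresses.
fn resolve_public_addrs(host: &str, port: u16) -> Result<Vec<IpAddr>, String> {
    let addrs: Vec<IpAddr> = (host, port)
        .to_socket_addrs() // blocking lookup; IP literals are parsed without DNS
        .map_err(|e| format!("DNS resolution failed for {host}: {e}"))?
        .map(|socket_addr| socket_addr.ip())
        .collect();
    if addrs.is_empty() {
        return Err(format!("{host} did not resolve to any address"));
    }
    if let Some(bad) = addrs.iter().find(|ip| is_internal_ip(**ip)) {
        return Err(format!("{host} resolves to internal address {bad}"));
    }
    Ok(addrs)
}
```

A check like this only helps against DNS rebinding if the subsequent fetch connects to the addresses that were validated rather than resolving the hostname a second time. With `reqwest`, one option may be to pin the validated address via `ClientBuilder::resolve`.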

#### Deliverables

- Extend registry allowlist enforcement to:
  - resolve DNS for hostname-based registry URLs
  - reject if any resolved IP is loopback/private/link-local/unspecified/unique-local (IPv6)
- Keep existing protections:
  - reject `localhost`
  - reject IP-literal private/loopback
  - enforce allowlist patterns

#### Acceptance criteria

- Registry install is blocked when a hostname resolves to a private/loopback/link-local address.

### 2.3 Registry fetch hardening (timeouts/size caps)

#### Deliverables

- Add explicit timeout and size bounds to registry bundle fetch.
  - Current code path uses `reqwest::get(...)` without explicit timeout/size cap.

#### Acceptance criteria

- Registry fetch cannot hang indefinitely.
- Registry fetch cannot load an unbounded payload.

### 2.4 Operator-visible metadata

#### Deliverables

- UI shows package metadata:
  - publisher
  - sha256
  - signature present
  - source (upload/registry)
  - registry URL
  - manifest-declared hooks + capabilities
  - effective outbound HTTP permission status

---

## Phase 3 — Authz convergence (roles/permissions authoritative)

### Goal

Stop using `community_members.role` as the primary enforcement mechanism for privileged actions.

### Deliverables

- Inventory endpoints that currently use `ensure_admin_or_moderator` style gates.
- Introduce/confirm permissions for:
  - managing plugin policy
  - managing plugin packages
  - managing community plugin settings
- Migrate gates to permission checks consistently.

### Acceptance criteria

- Plugin policy + package management endpoints authorize via roles/permissions.

---

## Phase 4 — Technical debt hotspot inventory + targeted fixes

### Deliverables

- Evidence-backed inventory (file/function-level) of:
  - cross-layer coupling hotspots
  - duplicated policy parsing/enforcement
  - plugin plane confusion (instance defaults vs community plugins vs wasm packages)
  - any unstable areas discovered during Phases 1–3
- Only fix hotspots that block Phase 1–3 acceptance criteria.

---

## Commit plan (after this plan is approved)

- Add this plan into the repo under `.windsurf/plans/` and commit to `main`.
- Implementation starts only after the plan commit lands.