likwid/.windsurf/plans/likwid-stabilization-6061c4.md

Likwid Stabilization Megaplan (6061c4)

Stabilize Likwid into a production-usable system by shipping a coherent admin modular-rule management UX (built-in + WASM), making WASM packages production-grade (including background jobs + SSRF-hardening), and converging authorization on roles/permissions.

Decisions (confirmed)

  • Git history: linear main via squash merges.
  • Demo VPS: tracks main.
  • Modular rules approach: Hybrid (built-in DB-backed settings + built-in modules + WASM plugin packages).
  • Authz direction: roles/permissions is authoritative.
  • Phase 1 default instance guidance: instance_type=multi_community, platform_mode=admin_only.
  • plugin_allow_background_jobs: full implementation (end-to-end semantics, not just a stored flag).
  • Registry SSRF hardening: yes (DNS-aware; do not rely only on host string / IP-literal checks).
  • Community admin UX scope: full (plugin policy + WASM packages + built-in plugins in one coherent flow).
  • Plan handling: commit plan to main once approved (plan-first discipline).

Non-negotiable outcomes (Phase 1)

  • Operational reliability + install/upgrade story.
  • Admin modular rule management UX.
  • WASM third-party plugin packages are production-grade.
  • End-user UX consistency (avoid confusing partial configuration states).

Current state (source-backed highlights)

  • Backend exposes community endpoints:
    • GET/PUT /api/communities/{id}/plugin-policy
    • GET/POST/PUT /api/communities/{id}/plugin-packages (+ install-registry)
  • WASM runtime exists (wasmtime; fuel + timeout + memory limits).
  • WASM outbound HTTP is capability-gated and allowlisted.
  • Registry allowlist currently:
    • blocks localhost
    • blocks IP-literal loopback/private/link-local/unspecified
    • matches exact host or *.suffix
    • does not do DNS resolution + post-resolution IP classification.
  • Frontend currently does not call plugin-policy / plugin-packages (community UI covers built-in plugins only).
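
The exact-host and *.suffix matching described above can be sketched as follows. This is a minimal illustration, not the backend's actual code: the function names are hypothetical, and it assumes that `*.example.com` matches subdomains only (not `example.com` itself) and that comparison is case-insensitive.

```rust
/// Returns true if `host` matches one allowlist pattern.
/// Assumed semantics: a pattern is an exact host or `*.suffix`.
fn host_matches_pattern(host: &str, pattern: &str) -> bool {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    let pattern = pattern.trim().to_ascii_lowercase();
    match pattern.strip_prefix("*.") {
        Some(suffix) => {
            // Require a label boundary so `evilexample.com`
            // does not match `*.example.com`.
            host.len() > suffix.len() + 1
                && host.ends_with(suffix)
                && host.as_bytes()[host.len() - suffix.len() - 1] == b'.'
        }
        None => host == pattern,
    }
}

/// Hypothetical string-level check: reject `localhost` outright,
/// then require at least one matching allowlist pattern.
fn host_allowed(host: &str, allowlist: &[&str]) -> bool {
    let normalized = host.trim_end_matches('.');
    if normalized.eq_ignore_ascii_case("localhost") {
        return false;
    }
    allowlist.iter().any(|p| host_matches_pattern(normalized, p))
}
```

A check like this only inspects the host string, which is why the last gap in the list (no post-resolution IP classification) matters: an allowlisted hostname can still resolve to an internal address.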

Scope boundaries

  • No architectural rewrite.
  • Minimal, targeted changes per milestone.
  • No dependency additions unless clearly required for an explicit acceptance criterion.

Milestone discipline / verification

  • Each milestone lands as a single squash-merge PR to main.
  • Required verification per milestone:
    • Backend: cargo check (and cargo test if stable)
    • Frontend: npm run build
    • Demo VPS: deploy main and run ./scripts/smoke-test.sh demo

Phase 0 — Baseline + operator invariants (gate)

Deliverables

  • Single authoritative operator workflow for:
    • local dev start/stop
    • demo deploy/update
    • rollback
    • smoke test
  • Confirm demo systemd + compose wiring is consistent with docs and scripts.

Acceptance criteria

  • Demo VPS updated to latest main and ./scripts/smoke-test.sh demo passes.

Verification

  • Run smoke test on VPS.

Phase 1 — Admin modular rule management UX (hybrid)

Goal

One coherent admin flow for:

  • community built-in plugins (plugins + community_plugins)
  • community WASM plugin packages (plugin_packages + community_plugin_packages)
  • community plugin policy (communities.settings keys)

Deliverables (frontend)

  • Community admin UI adds a “Plugins / Rules” surface that includes:
    • Built-in community plugins management (existing functionality retained).
    • Plugin policy editor (read + update):
      • trust policy
      • install sources
      • registry allowlist
      • trusted publishers
      • outbound HTTP toggle + allowlist
      • background jobs toggle
    • WASM package manager:
      • list installed packages
      • upload package
      • install from registry URL
      • activate/deactivate
      • edit package settings (schema-driven when available; raw JSON fallback)
  • Clear error messages for:
    • policy forbids uploads / registry installs
    • signature requirements fail
    • registry allowlist blocks URL

Deliverables (backend contract hardening)

  • Ensure API error responses are stable and actionable for UI (status code + message consistency).
  • Ensure event emission for key actions is consistent (public_events).
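
One way to keep status codes and messages consistent is to route every policy failure through a single error type. The sketch below is illustrative only: the enum, its variant names, the machine-readable codes, and the chosen HTTP status codes are all assumptions and are not taken from the Likwid backend.

```rust
/// Hypothetical error taxonomy for the plugin policy/package endpoints.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PluginApiError {
    UploadsForbiddenByPolicy,
    RegistryInstallsForbiddenByPolicy,
    SignatureRequired,
    RegistryUrlNotAllowlisted,
    InvalidSettings,
}

impl PluginApiError {
    /// HTTP status the UI can rely on (assumed mapping).
    fn status(self) -> u16 {
        match self {
            PluginApiError::UploadsForbiddenByPolicy
            | PluginApiError::RegistryInstallsForbiddenByPolicy
            | PluginApiError::RegistryUrlNotAllowlisted => 403,
            PluginApiError::SignatureRequired
            | PluginApiError::InvalidSettings => 422,
        }
    }

    /// Stable machine-readable code, independent of message wording.
    fn code(self) -> &'static str {
        match self {
            PluginApiError::UploadsForbiddenByPolicy => "uploads_forbidden",
            PluginApiError::RegistryInstallsForbiddenByPolicy => "registry_installs_forbidden",
            PluginApiError::SignatureRequired => "signature_required",
            PluginApiError::RegistryUrlNotAllowlisted => "registry_url_not_allowlisted",
            PluginApiError::InvalidSettings => "invalid_settings",
        }
    }
}
```

With a stable `code`, the frontend can map each failure to one of the clear error messages listed under the frontend deliverables without parsing message text.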

Acceptance criteria

  • As community admin/moderator:
    • can view/update plugin policy
    • can install a WASM package (upload + registry) when policy allows
    • can activate/deactivate packages
    • can edit package settings and receive server-side schema validation errors when invalid

Verification

  • Manual UI walkthrough covering:
    • signed-only policy
    • registry allowlist allow/deny
    • outbound HTTP allowlist allow/deny
    • background jobs on/off behavior (see Phase 2 definition)

Phase 2 — WASM packages production-grade hardening

2.1 Background jobs: full implementation

Proposed semantics (must be implemented consistently)

  • plugin_allow_background_jobs=false means:
    • WASM plugins must not be invoked for cron hooks for that community.
    • Any future “scheduled” behavior for WASM must be gated by the same setting.
  • plugin_allow_background_jobs=true means:
    • WASM plugins may receive cron hooks they declare in their manifest (e.g. cron.minute, cron.hourly, cron.daily, ...).

Implementation outline (expected code touchpoints)

  • Locate the WASM cron dispatch path (the cron loop currently lives in backend/src/main.rs and invokes PluginManager::do_wasm_action_for_community).
  • Add a community-settings check (communities.settings.plugin_allow_background_jobs) in the WASM cron dispatch path.
  • Ensure policy API default behavior is explicit and safe:
    • confirm default is false (current parse default is false).

Acceptance criteria

  • When plugin_allow_background_jobs=false, WASM cron hooks are not executed for that community.
  • When plugin_allow_background_jobs=true, WASM cron hooks execute normally.
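
The proposed semantics reduce to a single predicate evaluated in the cron dispatch path. The following is a minimal sketch under stated assumptions: `CommunitySettings` and `should_dispatch_wasm_cron` are hypothetical names, and the real settings live in the communities.settings JSON rather than in a typed struct.

```rust
/// Hypothetical typed view of the relevant `communities.settings` key.
/// `None` models a community whose settings omit the key.
struct CommunitySettings {
    plugin_allow_background_jobs: Option<bool>,
}

/// Decide whether a WASM plugin should receive `hook` (for example
/// `cron.hourly`) for a community. A missing setting is treated as
/// `false`, and only hooks declared in the manifest are delivered.
fn should_dispatch_wasm_cron(
    settings: &CommunitySettings,
    declared_hooks: &[&str],
    hook: &str,
) -> bool {
    let enabled = settings.plugin_allow_background_jobs.unwrap_or(false);
    enabled && declared_hooks.iter().any(|h| *h == hook)
}
```

Keeping the default inside the predicate means a community with no stored value behaves the same as one with the setting explicitly off.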

2.2 Registry install SSRF hardening (DNS-aware)

Goal

Registry install should not be able to reach internal/private addresses via DNS rebinding or private resolution.

Deliverables

  • Extend registry allowlist enforcement to:
    • resolve DNS for hostname-based registry URLs
    • reject if any resolved IP is loopback/private/link-local/unspecified/unique-local (IPv6)
  • Keep existing protections:
    • reject localhost
    • reject IP-literal private/loopback
    • enforce allowlist patterns

Acceptance criteria

  • Registry install is blocked when a hostname resolves to any loopback, private, link-local, unspecified, or unique-local (IPv6) address.
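
The post-resolution classification could look roughly like the sketch below, which uses only the Rust standard library. All function names are hypothetical. IPv6 unique-local (fc00::/7) and link-local (fe80::/10) ranges are checked by prefix, and IPv4-mapped IPv6 addresses are unwrapped before classification.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, ToSocketAddrs};

fn is_forbidden_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback() || ip.is_private() || ip.is_link_local() || ip.is_unspecified()
}

fn is_forbidden_v6(ip: Ipv6Addr) -> bool {
    // Treat ::ffff:a.b.c.d as the embedded IPv4 address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_forbidden_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique-local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

fn is_forbidden_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_forbidden_v4(v4),
        IpAddr::V6(v6) => is_forbidden_v6(v6),
    }
}

/// Reject the whole set if any resolved address is forbidden.
fn check_resolved(addrs: &[IpAddr]) -> Result<(), String> {
    if addrs.is_empty() {
        return Err("hostname resolved to no addresses".to_string());
    }
    match addrs.iter().find(|ip| is_forbidden_ip(**ip)) {
        Some(ip) => Err(format!("resolved to forbidden address {ip}")),
        None => Ok(()),
    }
}

/// Resolve `host` and return its addresses only if all are acceptable.
fn resolve_public_only(host: &str, port: u16) -> Result<Vec<IpAddr>, String> {
    let addrs: Vec<IpAddr> = (host, port)
        .to_socket_addrs()
        .map_err(|e| format!("DNS resolution failed for {host}: {e}"))?
        .map(|sa| sa.ip())
        .collect();
    check_resolved(&addrs)?;
    Ok(addrs)
}
```

A check of this kind narrows but does not by itself close the DNS rebinding window: if the HTTP client resolves the hostname again when it connects, it may receive a different answer. Connecting to the already-vetted addresses is one way to address that; whether the plan requires it is left to the implementation.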

2.3 Registry fetch hardening (timeouts/size caps)

Deliverables

  • Add explicit timeout and size bounds to registry bundle fetch.
    • Current code path uses reqwest::get(...) without explicit timeout/size cap.

Acceptance criteria

  • Registry fetch cannot hang indefinitely.
  • Registry fetch cannot load an unbounded payload.
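
The size bound can be illustrated independently of the HTTP client. The sketch below caps any `std::io::Read` source. `read_capped` is a hypothetical helper, and it assumes the response body can be consumed as a reader or in chunks; the request timeout would be configured on the HTTP client separately.

```rust
use std::io::{self, Read};

/// Read at most `max_bytes` from `reader`, failing if the source has more.
fn read_capped<R: Read>(reader: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Read one byte past the cap so that oversize input is detectable
    // without ever buffering more than `max_bytes + 1` bytes.
    reader
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "registry bundle exceeds size cap",
        ));
    }
    Ok(buf)
}
```

Because the reader is wrapped in `take`, even an endless source terminates after `max_bytes + 1` bytes and is reported as an error rather than exhausting memory.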

2.4 Operator-visible metadata

Deliverables

  • UI shows package metadata:
    • publisher
    • sha256
    • signature present
    • source (upload/registry)
    • registry URL
    • manifest-declared hooks + capabilities
    • effective outbound HTTP permission status

Phase 3 — Authz convergence (roles/permissions authoritative)

Goal

Stop using community_members.role as the primary enforcement mechanism for privileged actions.

Deliverables

  • Inventory endpoints that currently use ensure_admin_or_moderator style gates.
  • Introduce/confirm permissions for:
    • managing plugin policy
    • managing plugin packages
    • managing community plugin settings
  • Migrate gates to permission checks consistently.

Acceptance criteria

  • Plugin policy + package management endpoints authorize via roles/permissions.
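
A permission-based gate might take the following shape. This is a sketch only: the `Permission` variants mirror the three capabilities listed above, but their names, the `RoleTable` type, and the role strings are hypothetical and do not reflect the existing schema.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical permissions matching the deliverables above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Permission {
    ManagePluginPolicy,
    ManagePluginPackages,
    ManageCommunityPluginSettings,
}

/// Maps a role name to the permissions it grants.
struct RoleTable {
    grants: HashMap<String, HashSet<Permission>>,
}

impl RoleTable {
    /// True if any of the caller's roles grants `needed`.
    /// Unknown roles grant nothing.
    fn has_permission(&self, roles: &[&str], needed: Permission) -> bool {
        roles.iter().any(|role| {
            self.grants
                .get(*role)
                .map_or(false, |perms| perms.contains(&needed))
        })
    }
}
```

Endpoints would then ask for a specific permission instead of checking whether the caller is an admin or moderator, so that role definitions can change without touching the gates.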

Phase 4 — Technical debt hotspot inventory + targeted fixes

Deliverables

  • Evidence-backed inventory (file/function-level) of:
    • cross-layer coupling hotspots
    • duplicated policy parsing/enforcement
    • plugin plane confusion (instance defaults vs community plugins vs wasm packages)
    • any unstable areas discovered during Phases 1–3
  • Only fix hotspots that block Phase 1–3 acceptance criteria.

Commit plan (after this plan is approved)

  • Add this plan into the repo under .windsurf/plans/ and commit to main.
  • Implementation starts only after the plan commit lands.