# Phase 5: Event Tracking, Business Dashboards, Per-User Stats

## Summary

Add event tracking infrastructure, revenue/conversion/retention dashboards, and per-user usage stats in the account portal. Three layers that build on each other.

## Layer 1: Events Table

Append-only `events` table in accounts-db (new migration in web-api):

```sql
CREATE TABLE events (
    id         BIGSERIAL PRIMARY KEY,
    user_sub   TEXT NOT NULL REFERENCES users(keycloak_sub),
    event_type TEXT NOT NULL,
    metadata   JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_events_type_created ON events (event_type, created_at);
CREATE INDEX idx_events_user ON events (user_sub, created_at);
```

### Event types

| event_type       | Where recorded                   | metadata                        |
|------------------|----------------------------------|---------------------------------|
| `signup`         | auth callback (first login)      | `{}`                            |
| `login`          | auth callback (every login)      | `{}`                            |
| `bouncer_create` | bouncer creation route           | `{"bouncer": ""}`               |
| `bouncer_delete` | bouncer deletion route           | `{"bouncer": ""}`               |
| `plan_upgrade`   | billing webhook                  | `{"from": "free", "to": "pro"}` |
| `plan_downgrade` | billing webhook                  | `{"from": "pro", "to": "free"}` |
| `paste_create`   | txt service (internal API call)  | `{"paste_id": ""}`              |
| `pic_upload`     | pics service (internal API call) | `{"pic_id": ""}`                |

Recording is inline at each call site; there is no async queue. A helper function `record_event(pool, user_sub, event_type, metadata)` keeps it simple.

For `paste_create` and `pic_upload`: txt/pics write to their own DBs, and web-api does not record these. They are tracked via Prometheus counters on the respective services, and per-user counts come from the internal usage API (Layer 3).
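The shape of the `record_event` helper mentioned above can be sketched dependency-free. Everything here is illustrative: the `Event` struct is invented for the example, and the statement plus its binds are rendered as a string rather than executed, since the real helper would run this through the web-api's Postgres pool.

```rust
// Illustrative sketch only: the real record_event would execute this
// statement against the accounts-db pool instead of rendering it.
struct Event<'a> {
    user_sub: &'a str,
    event_type: &'a str,
    metadata: &'a str, // JSON text, e.g. {"from": "free", "to": "pro"}
}

const INSERT_EVENT_SQL: &str =
    "INSERT INTO events (user_sub, event_type, metadata) VALUES ($1, $2, $3)";

fn record_event(ev: &Event) -> String {
    // Stand-in for the parameterized execute: statement plus its binds.
    format!(
        "{} -- binds: [{}, {}, {}]",
        INSERT_EVENT_SQL, ev.user_sub, ev.event_type, ev.metadata
    )
}

fn main() {
    let ev = Event {
        user_sub: "kc-sub-123",
        event_type: "plan_upgrade",
        metadata: r#"{"from": "free", "to": "pro"}"#,
    };
    println!("{}", record_event(&ev));
}
```

Keeping the statement a single parameterized insert is what makes inline recording at each call site cheap enough to skip an async queue.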
## Layer 2: Business Dashboards (Grafana)

### New Prometheus metrics

**Stripe MRR (300s background task in web-api):**

```
irc_now_mrr_cents                  gauge -- total MRR in cents
irc_now_subscriptions_active       gauge -- active subscription count
irc_now_subscriptions_canceled_30d gauge -- cancellations in last 30 days
```

Implementation: `stripe::Subscription::list()` with `status=active` to sum `plan.amount` for MRR. Canceled count via `status=canceled` with `created > now - 30d`.

**Event-derived metrics (60s background task in web-api):**

```
irc_now_signups_7d          gauge   -- signups in last 7 days
irc_now_logins_total        counter -- incremented on each login (already referenced in dashboard)
irc_now_conversion_rate_30d gauge   -- plan_upgrade events / signup events over 30 days
irc_now_churn_rate_30d      gauge   -- plan_downgrade events / active pro users over 30 days
```

Retention cohorts: monthly signup cohorts with the percentage still active (has `downstream_interacted_at` in soju within the last 30 days). Exposed as a gauge with a `{cohort="2026-01"}`-style label:

```
irc_now_retention_cohort{cohort} gauge -- % of cohort still active
```

**Fix active users:** replace the `last_login_at`-based `irc_now_active_users_24h` with the soju `downstream_interacted_at > now() - 24h` query. This is already computed in `record_soju_metrics` as `irc_now_bouncer_active_users`. Either rename it, or keep both and update the dashboards to use the soju-based one.

### New Grafana panels

Admin dashboard additions:

- MRR (stat panel, formatted as currency)
- Active subscriptions (stat)
- Signups this week (stat)
- Conversion rate 30d (stat, percentage)
- Churn rate 30d (stat, percentage)
- Retention by cohort (bar chart)
- Messages sent total (timeseries, already have the data)

## Layer 3: Per-User Portal Stats

### Internal usage APIs

**txt service** -- new endpoint `GET /api/usage/{sub}`:

```json
{
  "paste_count": 12,
  "storage_bytes": 45678
}
```

Requires an authenticated internal call (service-to-service).
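As a sketch of the payload above: the `Usage` struct and its hand-rolled serialization here are illustrative only, kept dependency-free for the example; a real handler would derive `serde::Serialize` instead.

```rust
// Illustrative sketch of the txt usage response; field names mirror the
// JSON example above.
struct Usage {
    paste_count: u64,
    storage_bytes: u64,
}

fn to_json(u: &Usage) -> String {
    // Hand-rolled serialization to keep the sketch self-contained.
    format!(
        r#"{{"paste_count": {}, "storage_bytes": {}}}"#,
        u.paste_count, u.storage_bytes
    )
}

fn main() {
    let u = Usage { paste_count: 12, storage_bytes: 45678 };
    println!("{}", to_json(&u));
}
```

The pics endpoint follows the same two-field shape, so both services can share one response convention.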
Use a shared secret, or just restrict by network policy (cluster-internal only, no Route).

**pics service** -- new endpoint `GET /api/usage/{sub}`:

```json
{
  "image_count": 8,
  "storage_bytes": 234567
}
```

Same pattern as txt.

### Dashboard page expansion

The dashboard template gets new data:

```rust
struct DashboardTemplate {
    page: Page,
    has_bouncer: bool,
    is_pro: bool,
    plan: String,
    // new fields:
    networks: Vec<Network>,           // name, address, enabled
    channels: Vec<String>,            // channel names
    messages_relayed: i64,            // total from soju "Message" table
    bouncer_uptime: Option<Duration>, // pod age from K8s
    paste_count: i64,
    paste_storage: i64, // bytes
    pic_count: i64,
    pic_storage: i64,   // bytes
}
```

(`Network` and `Duration` stand in for whatever concrete types the implementation defines.)

Data sources on page load:

- Networks/channels/messages: query the user's bouncer soju DB (existing pattern)
- Uptime: K8s pod creation timestamp for the bouncer deployment
- Paste/pic stats: HTTP calls to the txt/pics internal usage APIs

### Dashboard UI

Stats displayed in a grid of cards:

- Bouncer: network count, channel count, messages relayed, uptime
- Content: paste count, image count, total storage used

## Configuration changes

web-api needs new env vars:

- `TXT_INTERNAL_URL` -- e.g. `http://txt:8080`
- `PIC_INTERNAL_URL` -- e.g. `http://pics:8080`

These are cluster-internal service URLs; no auth is needed if they are restricted by NetworkPolicy or simply trusted within the namespace.

## Migration sequence

1. Apply the events table migration to accounts-db
2. Deploy txt/pics with the new `/api/usage/{sub}` endpoints
3. Deploy web-api with event recording, Stripe metrics, and the dashboard expansion
4. Update Grafana dashboards with the new panels

## Retention / cleanup

Events table: no automatic cleanup initially. At the current scale (a handful of users), growth is negligible. Add a retention policy (delete events older than 1 year) once the table exceeds ~1M rows.
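One small piece the dashboard cards above need is presenting the raw `storage_bytes` totals from the usage APIs in human-readable form. A dependency-free sketch; the helper name and unit thresholds are illustrative, not part of the plan:

```rust
// Illustrative helper: renders a byte count for the "total storage used"
// card, stepping through binary units until the value drops below 1024.
fn human_bytes(bytes: i64) -> String {
    const UNITS: [&str; 4] = ["B", "KiB", "MiB", "GiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{} {}", bytes, UNITS[unit])
    } else {
        format!("{:.1} {}", value, UNITS[unit])
    }
}

fn main() {
    // paste_storage + pic_storage, using the example figures above
    let total: i64 = 45_678 + 234_567;
    println!("total storage: {}", human_bytes(total));
}
```

Summing the two services' `storage_bytes` in web-api keeps the card a single number even as more content services are added.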