pelagia-portal/automation
Hardik 938ff6df89 test+ci: green the test baseline and make type-check + unit tests hard gates
Green-lights the test suite so the PR checks can enforce it:
- Fix the NextAuth v5 auth() mock typing across all integration tests (cast to a
  simple async fn so mockResolvedValue accepts the session) — clears ~86 errors.
- Fix stale test values: intent 'resubmit'->'submit' / 'save'->'draft'; ParsedImportLine
  .description -> .name; approvepo -> approvePo; add missing beforeEach/beforeAll imports.
- permissions: MANAGER *can* process_payment (intentional since e1340b9) — update the
  stale assertion.
- po-import-parser: skip the Sample_PO.xlsx fixture tests when the file is absent (it
  lives outside the repo); synthetic-workbook tests still cover the parser.

type-check is now 0 errors and unit tests pass (167 passed, 13 skipped). pr-checks.yml
flips type-check (whole project) and unit tests to HARD gates.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 13:03:54 +05:30
..
claude-issue-watcher.ps1 feat(automation): add triage phase to issue watcher 2026-06-19 04:20:21 +05:30
claude-issue-watcher.sh ci: enforce PR policy (tests-present + app type-check) and PR template 2026-06-19 12:49:32 +05:30
README.md test+ci: green the test baseline and make type-check + unit tests hard gates 2026-06-19 13:03:54 +05:30
refresh-test-db.sh fix(automation): apply master migrations to the test DB 2026-06-19 11:51:59 +05:30
register-watcher-task.ps1 feat(automation): issue-to-deploy pipeline — Report Issue button, Claude watcher, tag-triggered deploy 2026-06-11 16:39:43 +05:30
run-watcher.cmd feat(automation): add manual run-watcher launcher for desktop shortcut 2026-06-19 03:28:12 +05:30
staging-tunnel.cmd feat(automation): lock staging to SSH tunnel + dev banner + desktop shortcut 2026-06-19 11:59:25 +05:30
staging-up.sh feat(automation): lock staging to SSH tunnel + dev banner + desktop shortcut 2026-06-19 11:59:25 +05:30
watcher.config.example.json feat(automation): add triage phase to issue watcher 2026-06-19 04:20:21 +05:30

Automated issue-to-deploy pipeline

End-to-end flow from a user clicking Report Issue in the portal to a fix running in production:

Portal header (bug icon)                          [App/components/layout/report-issue-button.tsx]
        │  server action → Forgejo API
        ▼
Forgejo issue  (label: portal)                    [git.pelagiamarine.com/shad0w/pelagia-portal]
        │  polled every 10 min by Windows Scheduled Task "PelagiaClaudeIssueWatcher"
        ▼
TRIAGE  (watcher phase 1)                          [dev PC, headless Claude Code, analysis only]
        │  Claude reads the issue + repo, posts a requirements-breakdown comment,
        │  and routes it: adds `claude-queue` (auto-fixable) or `interactive` (human)
        ▼
FIX  (watcher phase 2, only for claude-queue)      [headless Claude Code in C:\...\src\pelagia-autofix]
        │  Claude implements + verifies fix; watcher pushes branch claude/issue-N
        │  and opens a PR (label: claude-pr)
        ▼
Human review: merge the PR, then create a release tag vX.Y.Z
        │  tag push triggers .forgejo/workflows/deploy.yml
        ▼
forgejo-runner on pms1 (pm2: forgejo-runner, label "host")
        │  checks out the tag in ~/pms, pnpm install + build + prisma migrate deploy
        ▼
pm2 restart ppms  →  live at pms.pelagiamarine.com

interactive-routed issues stop after triage for a human to pick up (run with Claude in a steered session). The triage breakdown comment is plain (no bot marker) so, for claude-queue issues, the fix stage reads it back as refined requirements.

Contribution policy (all changes via PR)

Every change lands through a pull request — no direct pushes to master. This applies to humans and to the automated pipeline alike (the watcher already opens PRs).

Each PR must include:

  • Tests for any code change. Model: the integration test on claude/issue-12 — it targets the prod-mirror test DB, anchors on existing rows, inserts fixtures via raw SQL (schema-tolerant), isolates them with a unique prefix, and cleans up in afterEach. Docs/config/automation-only PRs are exempt.
  • Docs updates where relevant (App/README.md, App/CLAUDE.md, Docs/, this file, CHANGELOG.md).

Enforcement.forgejo/workflows/pr-checks.yml runs on every PR into master:

  1. Test-presence gate: a PR touching App/app|lib|components|hooks with no test change fails. Justify genuine exceptions in the PR body for a reviewer to override.
  2. Type-check: pnpm type-check must be clean across the whole project (tests included). The test suite's old type baseline was repaired when this gate landed.
  3. Unit tests: pnpm test must pass.

All three are hard gates. pnpm lint is intentionally not run — it currently requires an interactive ESLint migration (a follow-up). Integration tests are type-checked here but executed against the pelagia_test DB by the autofix / locally (not in this shared CI, to avoid prod-mirror schema drift).

A PULL_REQUEST_TEMPLATE.md carries the checklist.

Components

Piece Where Notes
Report Issue button App/components/layout/report-issue-button.tsx + report-issue-actions.ts Any signed-in user; files issue with only the portal label (triage routes it)
Forgejo helper App/lib/forgejo.ts Needs FORGEJO_URL, FORGEJO_REPO, FORGEJO_TOKEN env (token scope: write:issue)
Issue watcher (active) automation/claude-issue-watcher.sh on pms1 Bash port; runs 24/7 via cron. Config + logs under ~/issue-watcher/
Issue watcher (Windows, disabled) automation/claude-issue-watcher.ps1 PowerShell original. PelagiaClaudeIssueWatcher task is disabled (pms1 is the sole worker; two pollers would race)
Forgejo helper App/lib/forgejo.ts Needs FORGEJO_URL, FORGEJO_REPO, FORGEJO_TOKEN env (token scope: write:issue)
Deploy workflow .forgejo/workflows/deploy.yml Triggers on v* tags; runs on the host runner
Runner pms1 ~/forgejo-runner, pm2 process forgejo-runner Registered as pms1-host with labels host, docker

Where the watcher runs (pms1)

The watcher runs on pms1 under cron (every 10 min), polling Forgejo over the local loopback (http://127.0.0.1:3001).

  • Script: ~/issue-watcher/claude-issue-watcher.sh (source: automation/claude-issue-watcher.sh)
  • Config: ~/issue-watcher/watcher.config.json (gitignored; holds the token + claudeExe = the nvm claude path)
  • Work clone: ~/pelagia-autofix (separate from the deployed ~/pms)
  • Logs: ~/issue-watcher/logs/ (watcher-<date>.log, per-issue claude-*.log, cron.log)
  • Crontab: */10 * * * * PATH=<nvm bin>:... ~/issue-watcher/claude-issue-watcher.sh >> ~/issue-watcher/logs/cron.log 2>&1

Auth: Claude Code must be signed in on pms1 (ssh in, run claude, complete the login → writes ~/.claude/.credentials.json). The watcher has a preflight that no-ops until those credentials exist, so cron can be enabled before sign-in and activates automatically once signed in. (An ANTHROPIC_API_KEY env var also satisfies it.)

The Windows variant (.ps1 + register-watcher-task.ps1) is the portable fallback; re-enable its task only if pms1 is unavailable, and disable one before enabling the other.

Test database (for autofix verification)

So the fix stage can verify against realistic data without touching production:

  • pelagia_test — a PostgreSQL database on pms1, owned by pelagia_user, that is a daily mirror of production (pelagia). Created once as superuser; refreshed by automation/refresh-test-db.sh via cron at 03:30 (pg_dump pelagia | psql pelagia_test).
  • The autofix clone's ~/pelagia-autofix/App/.env points DATABASE_URL at pelagia_test and runs in safe dev mode — no Resend/SSO secrets, so email is console-logged and storage is local. NEXTAUTH_URL/PORT are set to 3100 (production app is on 3000).
  • The fix prompt tells Claude it may run integration tests against this DB (set -a; . ./.env; set +a; pnpm test:integration) and may start a dev server on port 3100 only, stopping it by port (fuser -k 3100/tcp) — never a broad pkill next, which would take down production (it also runs a next-server).

Because the test DB is refreshed daily, anything the autofix writes to it (test data, schema experiments) is disposable. Schema-migration issues are routed to interactive by triage, so the unattended fixer should not be altering the schema anyway.

Staging (smoke test before deploy)

automation/staging-up.sh (deployed to ~/issue-watcher/ on pms1) brings up a staging instance of the latest master so changes can be clicked through before a release tag deploys them to prod.

  • Checkout: ~/pelagia-staging (separate from ~/pms and ~/pelagia-autofix)
  • Process: pm2 ppms-staging on port 3200, against the prod-mirror test DB (pelagia_test), safe dev mode (console email, local storage, SSO disabled).
  • Refresh to newer master + restart: re-run ~/issue-watcher/staging-up.sh.
  • Stop: pm2 delete ppms-staging.
  • Access is SSH-tunnel only — the dev server binds to 127.0.0.1:3200, so it is not reachable from the public internet. Open a tunnel and browse http://localhost:3200: ssh -L 3200:localhost:3200 shad0w@<pms1>. On Windows, the desktop shortcut "Pelagia Staging (tunnel)" (automation/staging-tunnel.cmd) opens the tunnel and the browser in one click.
  • A fixed banner "INTERNAL DEV / STAGING - NOT PRODUCTION" is shown (driven by NEXT_PUBLIC_ENV_LABEL in the staging .env; the EnvBanner component renders nothing when the var is unset, so production is unaffected).
  • Log in with a password user (SSO is off here), e.g. admin@pelagiamarine.com.

Issue label lifecycle

portal ──(triage)──▶ claude-queue ─▶ claude-working ─▶ claude-pr | claude-failed
              └────▶ interactive  (stops here — handle with Claude interactively)
  • A portal issue with no decision label yet is triaged once (maxTriagePerRun per run). Triage adds claude-queue or interactive and posts a breakdown.
  • claude-queueclaude-workingclaude-pr (PR opened) or claude-failed.
  • To retry a failed issue, re-add claude-queue.
  • To queue any manually-created issue for Claude (skipping triage), add claude-queue directly. To force human handling, add interactive.
  • Triage is skipped for issues that already carry any decision label.

Releasing

After merging a Claude PR (or any change) on master:

git pull
git tag v0.2.0          # semver: bump patch for fixes, minor for features
git push pms1 master --tags

The runner deploys the tag and restarts the app. Watch progress under Actions on the Forgejo repo, or pm2 logs forgejo-runner on pms1.

Operational notes

  • The watcher runs Claude Code with --dangerously-skip-permissions inside the dedicated pelagia-autofix clone — never point workDir at your main checkout.

  • Watcher only works issues while this PC is on; queued issues are picked up on the next run after boot (-StartWhenAvailable).

  • Tokens: portal-report-issue (write:issue, used by the app) and claude-watcher (write:issue + write:repository, used by the watcher). Both belong to the shad0w Forgejo account. Rotate via docker exec -u 1000 forgejo forgejo admin user generate-access-token ....

  • Server-side env for the button lives in ~/pms/App/.env on pms1 (FORGEJO_URL=http://127.0.0.1:3001 so it does not depend on the tunnel).

  • Known Forgejo 10 bug: clicking Update branch on a PR (or pushing to its head branch) can make the page show "This pull request is broken due to missing fork information" even though the PR is fine (API still reports mergeable: true). Fix: close and reopen the PR — via the UI, or:

    $h = @{ Authorization = "token <claude-watcher token>" }
    Invoke-RestMethod -Method Patch -Headers $h -ContentType application/json `
      -Uri https://git.pelagiamarine.com/api/v1/repos/shad0w/pelagia-portal/pulls/<N> -Body '{"state":"closed"}'
    Invoke-RestMethod -Method Patch -Headers $h -ContentType application/json `
      -Uri https://git.pelagiamarine.com/api/v1/repos/shad0w/pelagia-portal/pulls/<N> -Body '{"state":"open"}'
    

    Fixed upstream in newer Gitea/Forgejo — resolves itself if Forgejo is upgraded past v10.