Production event platform for college communities; owned the deploy pipeline, security hardening, and disaster recovery.
✦Built a 3-mode GitHub Actions deploy workflow with a post-deploy SHA probe and automated rollback from a `.deploy-history` ledger — cutting MTTR for a broken deploy to ~90 seconds.
✦Restricted the CI deploy SSH key with an `authorized_keys` forced-command wrapper (strict `IMAGE_TAG` regex, op allowlist, audited invocations); a leaked key can no longer obtain a shell.
✦Built a schema-versioned deploy state machine in Bash that holds one remote `flock` for the full deploy lifecycle, plus a `peek_claim` gate on the drift CI job — making it structurally impossible for two prod-mutating ops to interleave.
✦Shipped automated daily PostgreSQL → S3 backups: a `pg_dump | gzip` pipeline with integrity checks, a 14-day S3 lifecycle policy, and EC2 instance-profile auth (no static AWS keys); executed a live restore drill from a prod backup.
✦Migrated PostgreSQL from a superuser connection to a least-privilege role via idempotent public-schema ownership transfer — shrinking SQL-injection blast radius from full DBA access to only the data the app needs.
✦Tightened CI: an SSH-based prod-drift check, ShellCheck + Bats coverage on every infra script, and ops-class workflows moved off hosted runners onto a self-hosted homelab.
✦Led the live production domain cutover from heywhatsup.app to joinflyer.com: DNS, dual Let's Encrypt certs, host-based 301s, Resend DKIM swap, NextAuth + Google OAuth callback updates, and a PWA service-worker cache-name bump.
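The forced-command wrapper can be sketched as below. This is a minimal illustration, not the production script: the entrypoint path, op names, and the tag regex are assumptions, and `sshd` supplies the client's request via `SSH_ORIGINAL_COMMAND` when the key line in `authorized_keys` is pinned with `command="…"`.

```shell
#!/usr/bin/env bash
# Hypothetical forced-command wrapper. The CI key's authorized_keys line would
# pin it, e.g.:
#   command="/usr/local/bin/deploy-wrapper",no-pty,no-port-forwarding ssh-ed25519 AAAA...
# Only this script ever runs, so the key cannot open a shell.

# check_request OP TAG -> succeeds only for an allowlisted op with a sane tag
check_request() {
  local op="${1:-}" tag="${2:-}"
  case "$op" in
    deploy|rollback)
      [[ "$tag" =~ ^[0-9a-f]{7,40}$ ]] ;;  # strict IMAGE_TAG regex: a git SHA
    status)
      true ;;                              # read-only op, no tag required
    *)
      false ;;                             # everything else is denied
  esac
}

if [[ -n "${SSH_ORIGINAL_COMMAND:-}" ]]; then
  read -r op tag <<<"$SSH_ORIGINAL_COMMAND"
  logger -t deploy-wrapper -- "request: ${op:-<none>} ${tag:-}"   # audit trail
  check_request "$op" "$tag" || { logger -t deploy-wrapper -- "denied"; exit 1; }
  exec /opt/flyer/deploy.sh "$op" "$tag"   # hypothetical deploy entrypoint
fi
```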
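The single-lock lifecycle can be illustrated with `flock(1)` held on a file descriptor that stays open for the whole process; `peek_claim` here mirrors the drift-job gate in name only (the lock path and function bodies are illustrative):

```shell
# Hypothetical sketch: one flock is taken at the start of a deploy and held on
# fd 9 until the process exits, so it spans the full lifecycle; peek_claim lets
# a read-only job observe the lock without ever taking it.
LOCKFILE="${TMPDIR:-/tmp}/deploy.lock"   # illustrative path

acquire_deploy_lock() {
  exec 9>"$LOCKFILE"   # fd 9 stays open for the life of the deploy process
  flock -n 9           # non-blocking: a second prod-mutating op fails fast
}

peek_claim() {         # gate for the drift CI job: observe, never block
  if flock -n "$LOCKFILE" -c true; then echo free; else echo claimed; fi
}
```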
Homelab — Self-hosted CI on a k3s cluster (2026 – Present)
Two-node k3s cluster on repurposed Intel hardware (Dell desktop + Mac mini reflashed from macOS to Ubuntu, with additional MacBooks reflashed and staged to join); hosts an ephemeral GitHub Actions runner pool for Flyer.
✦Runs Actions Runner Controller (ARC) on k3s (Ubuntu 24.04 + containerd) with a runner scale set labeled `homelab` — each CI job spawns an ephemeral pod: clean filesystem per job, zero idle compute when the queue is empty. Hosts ops-class CI for Flyer.
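A scale set like this is typically installed from ARC's official Helm charts; the sketch below uses placeholder org/repo and token values, and the release name (`homelab`) is what CI jobs target in `runs-on`:

```shell
# Controller (cluster-wide, installed once)
helm install arc \
  --namespace arc-systems --create-namespace \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller

# Runner scale set; release name "homelab" becomes the runs-on label
helm install homelab \
  --namespace arc-runners --create-namespace \
  --set githubConfigUrl="https://github.com/ORG/REPO" \
  --set githubConfigSecret.github_token="PLACEHOLDER_PAT" \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
```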
Empirical study of whether code style affects AI coding-agent bug-fix performance.
✦Designed and ran a 1,920-trial benchmark measuring whether code style affects AI coding-agent bug-fix rates — 4 Python projects (~20k LOC, 3,039 tests), 6 style variants, 14 mutation types, 2 evaluation modes.
✦Built the supporting infrastructure: a tree-sitter AST transformation framework (whitelist renames, `pytest.mark.parametrize` syncing) and a multi-agent harness with pluggable Claude Code / Codex CLI / Gemini CLI adapters — rate-limit detection with checkpoint resumption, manifest mode for byte-identical input across agents, process-group termination on timeout.
✦Headline finding: code style had no statistically significant effect on fix rate (p = 0.998; 1pp spread). Repository difficulty (28pp spread) and mutation type (29pp spread) dominated.
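Process-group termination on timeout, as used by the harness, can be sketched in shell (the function name and timings are illustrative): `setsid` gives the agent its own process group, and the timeout signal targets the negative PGID so every subprocess dies with it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run a command in its own process group so a timeout
# kills the whole tree (agent + any subprocesses), not just the top PID.

run_with_timeout() {
  local secs="$1"; shift
  setsid "$@" &                  # new session => child leads its own group
  local pid=$!
  ( sleep "$secs"; kill -TERM -- "-$pid" 2>/dev/null ) &   # signal the group
  local watchdog=$!
  local rc=0
  wait "$pid" || rc=$?           # 143 (128+SIGTERM) if the watchdog fired
  kill "$watchdog" 2>/dev/null || true
  return "$rc"
}
```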