Flyer · 5-person agent-driven team
2025 – Present
TypeScript · Next.js 16 · PostgreSQL 16 · NextAuth v5 · Docker · AWS EC2 Graviton (ARM64) · Claude Code / Copilot CLI
Production event platform built from the ground up by a 5-person team driving AI coding agents (~77k LOC). Engineered the guardrails (deploy pipeline, security, disaster recovery) that make agent-driven shipping safe.
- ✦Designed a multi-mode GitHub Actions deploy workflow with post-deploy health probes and auto-rollback from a deploy-history ledger — a bad agent-authored deploy reverts itself in under two minutes.
- ✦Built a deploy state machine in Bash with remote locking, drift detection, and a CI-gating layer that blocks promotion until tests for the exact deploy commit pass — two prod-mutating ops can't interleave even when multiple agent sessions race the same branch.
- ✦Locked down the CI deploy key with a forced-command wrapper, strict argument allowlist, and audited invocations — a leaked key (or a misbehaving agent) can't open an interactive shell.
- ✦Hardened the stack end-to-end: added daily encrypted Postgres → S3 backups (restore-tested via a scratch-container drill), migrated off a DB superuser to a least-privilege role, closed a PII leak in proxy logs, and SHA-pinned third-party actions, with shell linting + bats coverage on every infra script.
- ✦Migrated all CI + deploy jobs onto a self-hosted k3s runner cluster (Actions Runner Controller, ephemeral pod per job); CI rode through a hosted-runner billing outage that would have blocked every hosted-runner job.
Homelab · Self-hosted CI for the agent-built pipeline
2026 – Present
k3s · Actions Runner Controller · GitHub Actions · Ubuntu 24.04 · Lima (Apple Silicon arm64 VM)
Mixed-arch k3s cluster on repurposed Intel + Apple Silicon hardware; hosts ephemeral GitHub Actions runner pools that keep Flyer's agent-built CI cheap, fast, and outage-resilient.
- ✦Built the 3-node cluster around Actions Runner Controller: each CI job spawns an ephemeral pod with a clean filesystem, and runner pools scale to zero when idle.
- ✦Cut Flyer CI wall-clock ~42% end-to-end (13m35s → 7m50s) with a custom multi-arch runner image, node-local npm/Playwright caches that replaced WAN-bound hosted caches, and an arm64-native pool that skips QEMU emulation for build + heavy test jobs.
StyleBench · Master's Capstone — Empirical Study of AI Coding Agents
Python · tree-sitter · pytest · matplotlib · Claude Code · Codex CLI · Gemini CLI · uv · ruff
Empirical study of whether source-code style affects AI coding-agent bug-fix performance — a 1,920-trial benchmark on Claude Code under six controlled stylistic variants, run on a pluggable harness built for Claude Code, Codex CLI, and Gemini CLI.
- ✦Ran a 1,920-trial bug-fix benchmark on Claude Code (Haiku 4.5) across 4 real-world Python projects (~20k LOC, 3,000+ tests), 6 stylistic variants, and 14 mutation types — plus an 80-trial Codex CLI pilot.
- ✦Built a tree-sitter AST transformation framework + a multi-agent harness with pluggable CLI adapters (Claude Code, Codex CLI, Gemini CLI) — checkpoint-resume on rate limits, deterministic manifest mode for byte-identical inputs, and process-group cleanup on timeout.
- ✦Found that code style had no statistically significant effect on agent fix rate (p ≈ 1.0, ~1pp spread across variants); repository difficulty and mutation type dominated, each accounting for a ~30pp spread.