April: Agentic engineering in production, AWS wired in Terraform, benchmarking SCA tools

April was about three things: putting the agentic workflow rig from March on real rails, sorting out how AWS dev environments hold together end-to-end in Terraform, and a tour through the SCA tool market that ended up being more about vendor go-to-market than about scanners.

Agentic engineering, productionized

The agentic workflow stack I was sketching in March is mature enough now to leave running unattended. The pattern that mattered most is the two-file split between source and lock: a markdown workflow you write, compiled into a SHA-pinned .lock.yml that CI actually runs. The mental model is the same as package.json plus package-lock.json, applied to an agent run instead of a Node build. The reference implementation worth knowing is GitHub’s gh-aw.

.github/workflows/
├── workflow.md          # source, ~10 KB, edited by humans
└── workflow.lock.yml    # compiled, ~1300 lines, run by CI

Pinning lives at three layers: an aw/actions-lock.json that fixes which action versions a workflow can compile against, the compiled .lock.yml carrying SHA-pinned references for every step, and a small firewall container that constrains outbound HTTP from inside the agent runtime to an allowlist. None alone is enough. Together they make the difference between “an agent on a schedule” and “an agent on a schedule that I’m willing to leave unattended on a Sunday.”
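
The SHA-pinning layer is the easiest of the three to check mechanically. A minimal sketch, assuming the compiled lock file uses ordinary GitHub Actions `uses:` lines; the function name and the sample workflow are illustrative, the 40-character SHA is a placeholder, and none of this is part of gh-aw itself:

```python
import re

# Matches a `uses:` step, with or without the leading list dash.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*(\S+)")
# A reference counts as pinned only if it ends in a full 40-character commit SHA.
SHA_RE = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(lock_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit SHA."""
    unpinned = []
    for line in lock_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        # Local actions and docker:// images are out of scope for this sketch.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        if not SHA_RE.search(ref):
            unpinned.append(ref)
    return unpinned


# Hypothetical lock-file excerpt; the SHA below is a placeholder, not a real commit.
sample = """
steps:
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # pinned
  - name: setup
    uses: actions/setup-node@v4
"""

print(unpinned_actions(sample))  # ['actions/setup-node@v4']
```

Run as a CI gate, a non-empty result would fail the build before the agent ever starts.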

The MCP layer is where the operational pain shows up. Cross-version compatibility issues between MCP servers and the agent runtime are common enough to plan for; a policy change in one MCP server interacting badly with the runtime can burn weeks of debugging before it’s tracked down. The lesson generalizes: anything an agent depends on (MCP servers, tools, model versions) needs to live on the same release-discipline ladder as the rest of the lock, with a changelog you can read before bumping. Debugging behavior across an agent boundary is dramatically harder than debugging an HTTP API.
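
One way to hold agent dependencies to that discipline is to refuse anything that floats. A rough sketch, assuming the dependencies are declared somewhere as a name-to-version mapping; the mapping, the names in it, and the definition of "floating" used here are all assumptions for illustration:

```python
def is_floating(version: str) -> bool:
    """True if the version string can resolve to different releases over time."""
    v = version.strip().lower()
    # Empty, "latest" and "*" float by definition; so do range operators and wildcards.
    return (
        v in {"", "latest", "*"}
        or v[0] in "^~<>"
        or v.endswith((".x", ".*"))
    )


def floating_dependencies(deps: dict[str, str]) -> list[str]:
    """Return the sorted names of dependencies that are not pinned to one release."""
    return sorted(name for name, version in deps.items() if is_floating(version))


# Hypothetical manifest: one MCP server, one tool, one model reference.
agent_deps = {
    "mcp-server-example": "1.4.2",
    "tool-example": "^2.0.0",
    "model-example": "latest",
}

print(floating_dependencies(agent_deps))  # ['model-example', 'tool-example']
```

This only catches unpinned versions; reading the changelog before a bump is still a human step.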

AWS environments, wired in Terraform

Dev environments built to last on AWS belong in Terraform end-to-end: networking, security groups, load balancers, Parameter Store, the lot. The interesting work is in the wiring. Every cloud provider gives you the primitives, and on AWS that means VPC, ALB, NLB, security groups, Parameter Store; where you spend your time is in how those compose, where state lives, and what happens to costs when service count crosses some threshold.

service
├── ALB (layer 7)  or  NLB (layer 4)
├── security groups  ── ingress from ALB SG, egress to dependencies
├── SSM /env/service/{...}  ── IAM scoped to the prefix
└── ECS task:  1 × Fargate on-demand   (guaranteed baseline)
              + N × Fargate Spot       (autoscale, ~70% cheaper)

The wiring shape that holds up: each service gets its own SG that accepts ingress from the upstream ALB SG by reference (not by CIDR); SSM parameters live under a per-environment prefix with IAM scoped to that prefix; ALB at layer 7 is the default for service-to-service HTTP, NLB at layer 4 only when something downstream needs a stable IP. An inline cidr_blocks in an SG ingress rule is a smell: it says somebody didn’t have an SG to reference, or didn’t bother.
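
The CIDR smell can be linted from the plan rather than by eye. A sketch under assumptions: the input is the parsed output of `terraform show -json`, and rules are declared as standalone `aws_security_group_rule` or `aws_vpc_security_group_ingress_rule` resources (inline `ingress` blocks on `aws_security_group` are not covered). The function name and the sample plan are hypothetical:

```python
# Resource types to inspect, and the attributes that carry a CIDR on each.
CIDR_ATTRS = {
    "aws_security_group_rule": ("cidr_blocks", "ipv6_cidr_blocks"),
    "aws_vpc_security_group_ingress_rule": ("cidr_ipv4", "cidr_ipv6"),
}


def inline_cidr_ingress(plan: dict) -> list[str]:
    """Return addresses of ingress rules that admit traffic by CIDR, not by SG reference."""
    flagged = []
    for rc in plan.get("resource_changes", []):
        attrs = CIDR_ATTRS.get(rc.get("type"))
        if attrs is None:
            continue
        after = (rc.get("change") or {}).get("after") or {}
        # aws_security_group_rule covers both directions; only ingress is checked here.
        if rc["type"] == "aws_security_group_rule" and after.get("type") != "ingress":
            continue
        if any(after.get(attr) for attr in attrs):
            flagged.append(rc["address"])
    return flagged


# Hypothetical, heavily trimmed plan: one rule by SG reference, one by CIDR, one egress.
plan = {
    "resource_changes": [
        {
            "address": "aws_security_group_rule.from_alb",
            "type": "aws_security_group_rule",
            "change": {"after": {"type": "ingress", "cidr_blocks": None}},
        },
        {
            "address": "aws_security_group_rule.from_office",
            "type": "aws_security_group_rule",
            "change": {"after": {"type": "ingress", "cidr_blocks": ["10.0.0.0/16"]}},
        },
        {
            "address": "aws_security_group_rule.egress_all",
            "type": "aws_security_group_rule",
            "change": {"after": {"type": "egress", "cidr_blocks": ["0.0.0.0/0"]}},
        },
    ]
}

print(inline_cidr_ingress(plan))  # ['aws_security_group_rule.from_office']
```

In practice the dict would come from `json.load` on a saved plan. Some CIDR ingress is legitimate (a public ALB, for instance), so the output is a review list, not a hard failure.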

Cost optimization at scale on ECS is mostly about Fargate Spot. The pattern: one Fargate on-demand task per service as a guaranteed baseline, then scale out on Spot above that. Spot is roughly 70% off on-demand, but tasks can be reclaimed by AWS with two minutes’ notice, so a service running purely on Spot is one capacity event away from being down. The on-demand baseline is the safety net: one on-demand task plus N on Spot is meaningfully cheaper over a month than N+1 tasks all on-demand, without giving up the availability floor.
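
The arithmetic is worth writing down once. A back-of-envelope sketch, assuming a flat 70% Spot discount and identical task sizes; the hourly price is a made-up placeholder, and the 730-hour month is an approximation:

```python
def monthly_cost(on_demand_tasks: int, spot_tasks: int, on_demand_hourly: float,
                 spot_discount: float = 0.70, hours: float = 730.0) -> float:
    """Monthly cost of a mix of on-demand and Spot tasks of the same size."""
    spot_hourly = on_demand_hourly * (1.0 - spot_discount)
    return hours * (on_demand_tasks * on_demand_hourly + spot_tasks * spot_hourly)


def savings_fraction(n: int, spot_discount: float = 0.70) -> float:
    """Fraction saved by 1 on-demand + n Spot versus n + 1 on-demand.

    The hourly price cancels out, so only n and the discount matter.
    """
    mixed = 1.0 + n * (1.0 - spot_discount)
    return 1.0 - mixed / (n + 1)


price = 0.05  # hypothetical on-demand price per task-hour, not a real AWS rate
baseline = monthly_cost(on_demand_tasks=5, spot_tasks=0, on_demand_hourly=price)
mixed = monthly_cost(on_demand_tasks=1, spot_tasks=4, on_demand_hourly=price)

print(f"all on-demand: {baseline:.2f}, one plus four on Spot: {mixed:.2f}")
print(f"saved: {savings_fraction(4):.0%}")  # saved: 56%
```

The saving approaches the full Spot discount as N grows, which is why the pattern matters more once service and task counts climb.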

Benchmarking SCA tools, and the friction that makes it harder than it should be

A lot of the security work this month was less about audits and more about evaluating SCA tools (Software Composition Analysis: scanners for dependency vulnerabilities and license issues). It’s a saturated market: half a dozen serious vendors, all doing roughly the same job, differing in language coverage, false-positive rates, CI integration, and pricing model. The interesting part is not which tool wins; it’s the experience of trying to evaluate them.

Engagement model           Time to first signal   Engineer fit   Sales fit
Self-serve PoC             hours                  high           low
Sales-led intake           weeks                  low            high
Pre-sales engineer led     days                   high           medium
Forward-deployed engineer  hours                  high           high

Some vendors ship self-serve onboarding: work email, point the tool at a repo, usable signal inside an hour. Others gate every conversation behind a sales-led process where the first call is a non-technical AE asking how many developers and repos you have (the inputs for a quote) before granting any technical access. The mismatch is the problem: at evaluation time, the engineer needs a technical benchmark and the salesperson needs a license count, and neither can give the other what they need. The tool might genuinely be the best on the market, and you’ll never know, because you bounced off the front door.

The middle ground exists but few vendors do it well: a technical PoC led by a pre-sales engineer, bracketed off from commercial discussion until the tool earns its keep. The newer wave of forward-deployed engineering (technical staff embedded with customer teams instead of AE-plus-SE handoffs) is the next step in the same direction. More on FDE as a model in a future entry; for SCA specifically, the tools whose vendors invest in technical-first engagement tend to be the ones that survive contact with a real codebase.

Long-form writeups on the agentic stack and the AWS wiring patterns are in progress. They will land in the /writing section when ready.