Architecture
agent-runner is an AWS-native multi-tenant SaaS platform. A single
control plane in eu-central-1 fronts the Claude Model Proxy and the
skills / MCP data planes.
    Browser / Claude Code CLI
            |
            | HTTPS
            v
    CloudFront (console: app.<domain>, proxy: *.proxy.<domain>)
            |
            +---> S3 (Next.js static export)    [console]
            |
            +---> Lambda Response Streaming     [proxy]
                      |
                      +---> DynamoDB (tenant/grant/key lookup, billing counters)
                      +---> Cognito JWKS (JWT verification, 1h cache)
                      +---> STS AssumeRole -> Bedrock (cross-account)
                      +---> CloudWatch EMF (token metrics -> Firehose -> S3 -> Athena -> Stripe)

    API Gateway REST (/v1/*)
            |
            +---> Control-plane Lambdas
                      tenants / users / grants / keys / models / billing
                      cognito_triggers / billing_aggregator / audit_processor

Key properties
- Single-region control plane. Config, billing, and audit live in eu-central-1. DynamoDB uses a single-table design, KMS-encrypted, with PITR and Streams enabled.
- Wildcard CloudFront for the proxy. One distribution, one wildcard ACM cert (*.proxy.<domain>), zero per-tenant provisioning. Adding a tenant means adding a DynamoDB row.
- Proxy is stateless pass-through. No response buffering. Token counts are extracted from the final Bedrock SSE chunk after streaming completes.
- Cross-account Bedrock via STS AssumeRole. The proxy Lambda assumes a role in a dedicated Bedrock account. STS session tags (tenant_id, identity_id, caller_type, bedrock_region) flow into CloudTrail for per-tenant billing attribution. Bedrock regions: eu-central-1, eu-central-2 (Zurich), us-east-1.
- Daily billing aggregator. An EventBridge cron runs Athena rollup queries, cross-checks against Bedrock CloudTrail within a 1% drift gate, then pushes metered usage records to Stripe.
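The 1% drift gate in the daily billing aggregator can be pictured roughly as follows. This is a minimal sketch, not the platform's code: the UsageTotals shape, the field names, and the choice to compare counts relative to the CloudTrail figure are all assumptions made for illustration. Only the 1% threshold comes from the text above.

```python
from dataclasses import dataclass

# The 1% threshold is stated in the text; everything else here is assumed.
DRIFT_THRESHOLD = 0.01


@dataclass(frozen=True)
class UsageTotals:
    """Daily totals for one tenant (hypothetical shape)."""
    tenant_id: str
    metered: int    # count from the Athena rollup of proxy metrics
    reference: int  # count derived from Bedrock CloudTrail


def relative_drift(totals: UsageTotals) -> float:
    """Relative difference between the metered and reference counts."""
    if totals.reference == 0:
        # No reference usage: any metered usage is treated as full drift.
        return 0.0 if totals.metered == 0 else 1.0
    return abs(totals.metered - totals.reference) / totals.reference


def passes_drift_gate(totals: UsageTotals,
                      threshold: float = DRIFT_THRESHOLD) -> bool:
    """True when the rollup is close enough to the reference to be billed."""
    return relative_drift(totals) <= threshold
```

In a sketch like this, a tenant whose totals fail the gate would be held back from the Stripe push and flagged for review rather than billed on numbers that disagree.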
Tenant isolation
The platform uses the pool model: a shared Lambda and a shared DynamoDB table. Isolation is enforced at the application layer.
Tenant binding is always verified. The bearer JWT’s tenant_id claim
(or the API key’s stored tenant_id) must equal the tenant resolved
from the Host header. A mismatch is always a 403, never a 401.
Cross-tenant replay is rejected.
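The binding check described above could look something like the sketch below. It assumes the tenant is identified by the leftmost label of the proxy hostname and that a lookup table (a plain dict here, standing in for DynamoDB) maps that label to a tenant_id. The suffix proxy.example.com, the function names, and the 403 for an unknown host are illustrative assumptions. Credential validation itself (signature, expiry) is assumed to have happened earlier.

```python
from typing import Mapping, Optional

# Hypothetical suffix; the real value replaces <domain> in *.proxy.<domain>.
PROXY_SUFFIX = ".proxy.example.com"


def tenant_slug_from_host(host: str,
                          suffix: str = PROXY_SUFFIX) -> Optional[str]:
    """Return the single leftmost label of the host, or None if no match."""
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    if not hostname.endswith(suffix):
        return None
    slug = hostname[: -len(suffix)]
    # A wildcard cert covers one label only, so reject "a.b.proxy...".
    if not slug or "." in slug:
        return None
    return slug


def check_tenant_binding(host: str,
                         credential_tenant_id: str,
                         tenants_by_slug: Mapping[str, str]) -> int:
    """Return 200 when the credential belongs to the host's tenant, else 403."""
    slug = tenant_slug_from_host(host)
    host_tenant_id = tenants_by_slug.get(slug) if slug else None
    # Assumption: an unknown host is treated like a mismatch (403).
    if host_tenant_id is None or host_tenant_id != credential_tenant_id:
        return 403
    return 200
```

Because the comparison uses the tenant resolved from the Host header, a valid token for one tenant replayed against another tenant's hostname fails with 403 even though the token itself is authentic.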
Stacks
The infrastructure is split into ordered OpenTofu stacks; cross-stack
references resolve via AWS data sources using deterministic resource
names (no terraform_remote_state).
    account_prep -> data -> backend -> proxy
                                    -> frontend

See Deployment for what each stack provisions.
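The stack ordering can be expressed as a small dependency graph and sorted with Python's standard library. This is only a sketch: the dictionary below is read off the diagram, and the edge from frontend to backend in particular is an assumption, since the source does not state which stack frontend branches from.

```python
from graphlib import TopologicalSorter

# Each stack maps to the stacks it depends on.
# The frontend -> backend edge is an assumption read off the diagram.
STACK_DEPENDENCIES = {
    "account_prep": set(),
    "data": {"account_prep"},
    "backend": {"data"},
    "proxy": {"backend"},
    "frontend": {"backend"},
}


def apply_order(dependencies):
    """Return one valid order in which the stacks could be applied."""
    # static_order() yields every stack after all of its dependencies
    # and raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(" -> ".join(apply_order(STACK_DEPENDENCIES)))
```

The relative order of proxy and frontend is not fixed by the graph; either may be applied first once backend is in place.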