Architecture

agent-runner is an AWS-native multi-tenant SaaS platform. A single control plane in eu-central-1 fronts the Claude Model Proxy and the skills / MCP data planes.

Browser / Claude Code CLI
        |
        | HTTPS
        v
CloudFront (console: app.<domain>, proxy: *.proxy.<domain>)
        |
        +---> S3 (Next.js static export)                  [console]
        |
        +---> Lambda Response Streaming                   [proxy]
                |
                +---> DynamoDB (tenant/grant/key lookup, billing counters)
                +---> Cognito JWKS (JWT verification, 1h cache)
                +---> STS AssumeRole -> Bedrock (cross-account)
                +---> CloudWatch EMF (token metrics -> Firehose -> S3 -> Athena -> Stripe)

API Gateway REST (/v1/*)
        |
        +---> Control-plane Lambdas
              tenants / users / grants / keys / models / billing
              cognito_triggers / billing_aggregator / audit_processor

Key properties

  • Single-region control plane. Config, billing, and audit live in eu-central-1. DynamoDB uses a single-table design, KMS-encrypted, with PITR and Streams enabled.
  • Wildcard CloudFront for the proxy. One distribution, one wildcard ACM cert (*.proxy.<domain>), zero per-tenant provisioning. Adding a tenant means adding a DynamoDB row.
  • The proxy is a stateless pass-through. Responses are streamed without buffering. Token counts are extracted from the final Bedrock SSE chunk after streaming completes.
  • Cross-account Bedrock via STS AssumeRole. The proxy Lambda assumes a role in a dedicated Bedrock account. STS session tags (tenant_id, identity_id, caller_type, bedrock_region) flow into CloudTrail for per-tenant billing attribution. Bedrock regions: eu-central-1, eu-central-2 (Zurich), us-east-1.
  • Daily billing aggregator. An EventBridge cron runs Athena rollup queries, cross-checks against Bedrock CloudTrail within a 1% drift gate, then pushes metered usage records to Stripe.
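
The 1% drift gate in the billing aggregator can be pictured as a relative-difference check between two usage totals. The sketch below is illustrative only: the function name, the choice of the CloudTrail-derived total as the denominator, and the handling of a zero reference are assumptions, not the platform's actual implementation.

```python
def within_drift_gate(metered_total: int, reference_total: int,
                      max_drift: float = 0.01) -> bool:
    """Return True when the metered total is within max_drift of the reference.

    Assumption: metered_total comes from the Athena rollup and reference_total
    from the Bedrock CloudTrail cross-check; drift is measured relative to the
    reference total.
    """
    if reference_total == 0:
        # Assumption: with no reference usage, only zero metered usage passes.
        return metered_total == 0
    drift = abs(metered_total - reference_total) / reference_total
    return drift <= max_drift


# Hypothetical usage: only push to Stripe when the gate passes.
if within_drift_gate(1_000_000, 1_005_000):
    pass  # about 0.5% drift, so usage records would be pushed
```

Under these assumptions, a run whose totals differ by more than 1% would be held back rather than pushed to Stripe.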

Tenant isolation

The platform uses the pool model: all tenants share the same Lambda functions and the same DynamoDB table, so isolation is enforced at the application layer.

Tenant binding is always verified. The bearer JWT’s tenant_id claim (or the API key’s stored tenant_id) must equal the tenant resolved from the Host header. A mismatch is always a 403, never a 401. Cross-tenant replay is rejected.
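
The binding rule above can be sketched as a small pure function. This is a minimal illustration under stated assumptions: the `PROXY_SUFFIX` value, the `Decision` type, and the function names are hypothetical; the sketch treats the subdomain label as the tenant ID directly, whereas the real proxy resolves the tenant through a DynamoDB lookup; and returning 401 for a missing credential and 403 for an unresolvable host are assumptions not stated in the source.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the *.proxy.<domain> wildcard.
PROXY_SUFFIX = ".proxy.example.com"


@dataclass(frozen=True)
class Decision:
    status: int
    reason: str


def tenant_from_host(host: str) -> Optional[str]:
    """Extract the single subdomain label in front of the proxy suffix."""
    host = host.lower().split(":", 1)[0]  # drop any port
    if not host.endswith(PROXY_SUFFIX):
        return None
    label = host[: -len(PROXY_SUFFIX)]
    if not label or "." in label:
        return None  # the wildcard covers exactly one label
    return label


def check_tenant_binding(host: str, credential_tenant_id: Optional[str]) -> Decision:
    """Compare the tenant bound to the credential with the tenant from Host."""
    if credential_tenant_id is None:
        # Assumption: an absent or unverifiable credential is an auth failure.
        return Decision(401, "missing or invalid credential")
    host_tenant = tenant_from_host(host)
    if host_tenant is None or host_tenant != credential_tenant_id:
        # A valid credential used against another tenant's host is a 403.
        return Decision(403, "tenant mismatch")
    return Decision(200, "ok")
```

With these definitions, a token issued for `globex` and replayed against `acme.proxy.example.com` yields a 403, while the same token on `globex.proxy.example.com` passes.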

Stacks

The infrastructure is split into ordered OpenTofu stacks; cross-stack references resolve via AWS data sources using deterministic resource names (no terraform_remote_state).

account_prep -> data -> backend -> proxy -> frontend

See Deployment for what each stack provisions.
