Observability
Every Lambda is instrumented with AWS Lambda Powertools — structured JSON logging, X-Ray tracing, and EMF metrics. No custom logging, tracing, or metrics code.
Logs
- CloudWatch Logs, structured JSON via Powertools Logger. tenant_id is always in the structured log context for tenant requests.
- Log groups: /aws/lambda/agent-runner-<handler>-<env>, 30-day retention, KMS-encrypted.
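The naming convention and the tenant_id rule above can be checked mechanically, for example in a unit test against captured log output. The following is a minimal sketch using only the standard library. The helper names, the handler name "proxy", the environment "dev", and the sample log lines are hypothetical and are not taken from the codebase; real Powertools Logger output carries more fields than shown here.

```python
import json


def log_group_name(handler: str, env: str) -> str:
    """Log group for one handler in one environment, following the convention above."""
    return f"/aws/lambda/agent-runner-{handler}-{env}"


def lines_missing_tenant_id(raw_lines):
    """Return the parsed structured log lines that carry no tenant_id key."""
    missing = []
    for raw in raw_lines:
        entry = json.loads(raw)
        if "tenant_id" not in entry:
            missing.append(entry)
    return missing


# Hypothetical captured output from a single tenant request.
captured = [
    '{"level": "INFO", "message": "request accepted", "service": "proxy", "tenant_id": "t-123"}',
    '{"level": "INFO", "message": "upstream call finished", "service": "proxy"}',
]

print(log_group_name("proxy", "dev"))
print(lines_missing_tenant_id(captured))
```

In this sample the second line would be flagged, since it was logged without tenant_id in its context.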
Metrics
- CloudWatch EMF from the proxy Lambda on every request.
- Namespace agent-runner/<tenant_id>, intentionally per-tenant for cost attribution.
- Platform-wide service metrics use AgentRunner/<service> (e.g. AgentRunner/Proxy).
EMF metrics are always emitted, even on error or client disconnect. Missing metrics means missing billing — a Sev-2 condition.
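To make the per-tenant namespace and the "always emitted" rule concrete, here is a rough sketch that hand-builds an EMF record and emits it from a finally block. It is illustrative only: in the Lambdas themselves Powertools produces these records, and the function names, the RequestCount metric, and the status dimension are assumptions rather than the platform's actual schema.

```python
import json
import time


def emf_record(namespace, metric, value, unit="Count", dimensions=None, now_ms=None):
    """Build one CloudWatch Embedded Metric Format record as a dict."""
    dimensions = dimensions or {}
    record = {
        "_aws": {
            # EMF timestamps are milliseconds since the Unix epoch.
            "Timestamp": now_ms if now_ms is not None else int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions)],
                    "Metrics": [{"Name": metric, "Unit": unit}],
                }
            ],
        },
        metric: value,  # the metric value lives at the top level
    }
    record.update(dimensions)  # so do the dimension values
    return record


def proxy_request(tenant_id, call_upstream, emit=print):
    """Run one proxied call and emit a usage metric even if the call raises."""
    status = "error"
    try:
        result = call_upstream()
        status = "ok"
        return result
    finally:
        # Runs on success and on any exception, so the billing metric is never skipped.
        record = emf_record(
            f"agent-runner/{tenant_id}",
            "RequestCount",  # hypothetical metric name
            1,
            dimensions={"status": status},
        )
        emit(json.dumps(record))
```

Emitting from the finally block is one way to satisfy the rule above: a failed upstream call still writes a record in the tenant's namespace, tagged with status "error", before the exception propagates.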
Traces
X-Ray active tracing on every Lambda. Outbound Bedrock calls are wrapped
in subsegments annotated with tenant_id, user_sub, model_id, and
bedrock_region.
Billing & audit data
- Billing data: CloudWatch metric stream → Kinesis Firehose → S3 (agent-runner-usage-<env>) → Athena. Long retention.
- Audit log: DynamoDB Streams → audit processor Lambda → S3 Object Lock (WORM) bucket, 7-year retention. Every API mutation and proxy authentication event creates an audit record.
Dashboards & saved queries
One CloudWatch dashboard per environment (<name_prefix>-platform) with
proxy, control-plane, and data-pipeline sections. Saved Logs Insights
queries (backend/saved_queries.tf): errors-recent, recent-logs,
slow-requests, tenant-activity, billing-events.