Observability

Every Lambda is instrumented with AWS Lambda Powertools, which provides structured JSON logging, X-Ray tracing, and EMF (Embedded Metric Format) metrics. There is no custom logging, tracing, or metrics code.

Logs

  • CloudWatch Logs, structured JSON via Powertools Logger.
  • tenant_id is always in the structured log context for tenant requests.
  • Log groups: /aws/lambda/agent-runner-<handler>-<env>, 30-day retention, KMS-encrypted.
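
Because tenant_id is a top-level field in every structured log line, tenant logs can be pulled with a plain Logs Insights filter. The sketch below only builds the log group name and a query string. The helper names, the "proxy" handler name, and the "dev" environment are assumptions for illustration; `level` and `message` are the standard Powertools Logger keys.

```python
def log_group_name(handler: str, env: str) -> str:
    """Return the log group for one handler, following the naming scheme above."""
    return f"/aws/lambda/agent-runner-{handler}-{env}"


def tenant_query(tenant_id: str, limit: int = 50) -> str:
    """Build a Logs Insights query for one tenant's most recent log lines."""
    if '"' in tenant_id:
        # Keep the quoted filter literal well-formed.
        raise ValueError("tenant_id must not contain double quotes")
    return (
        "fields @timestamp, level, message\n"
        f'| filter tenant_id = "{tenant_id}"\n'
        "| sort @timestamp desc\n"
        f"| limit {limit}"
    )


if __name__ == "__main__":
    # Hypothetical handler, environment, and tenant values.
    print(log_group_name("proxy", "dev"))
    print(tenant_query("tenant-123"))
```

The two values could then be passed as `logGroupName` and `queryString` to the CloudWatch Logs `start_query` API, for example through boto3.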

Metrics

  • CloudWatch EMF from the proxy Lambda on every request.
  • Namespace agent-runner/<tenant_id> — intentionally per-tenant for cost attribution.
  • Platform-wide service metrics use AgentRunner/<service> (e.g. AgentRunner/Proxy).

EMF metrics are always emitted, even on error or client disconnect. Missing metrics means missing billing — a Sev-2 condition.
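
To make the "always emitted" rule concrete, here is a minimal standard-library sketch of the shape of an EMF record in the per-tenant namespace, written from a `finally` block so that it is emitted even when the request fails. In the platform itself Powertools produces these records; this code is not that implementation. The metric names (`Requests`, `LatencyMs`), the `outcome` dimension, and both function names are assumptions for illustration.

```python
import json
import time


def build_emf_record(tenant_id, metrics, dimensions=None, timestamp_ms=None):
    """Build one EMF log record in the per-tenant namespace.

    metrics:    dict of metric name -> (value, unit)
    dimensions: dict of dimension name -> value
    """
    dimensions = dimensions or {}
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": f"agent-runner/{tenant_id}",
                    # One dimension set containing every dimension name.
                    "Dimensions": [list(dimensions)],
                    "Metrics": [
                        {"Name": name, "Unit": unit}
                        for name, (_, unit) in metrics.items()
                    ],
                }
            ],
        }
    }
    # EMF expects dimension and metric values as top-level members.
    record.update(dimensions)
    record.update({name: value for name, (value, _) in metrics.items()})
    return record


def run_with_metrics(tenant_id, work, emit=print):
    """Run work() and emit the EMF record even if work() raises."""
    started = time.monotonic()
    outcome = "error"
    try:
        result = work()
        outcome = "ok"
        return result
    finally:
        latency_ms = (time.monotonic() - started) * 1000
        record = build_emf_record(
            tenant_id,
            {"Requests": (1, "Count"), "LatencyMs": (latency_ms, "Milliseconds")},
            dimensions={"outcome": outcome},
        )
        emit(json.dumps(record))
```

Emitting from `finally` means a failed request still produces exactly one record, with the failure recorded in the dimension rather than lost.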

Traces

X-Ray active tracing on every Lambda. Outbound Bedrock calls are wrapped in subsegments annotated with tenant_id, user_sub, model_id, and bedrock_region.

Billing & audit data

  • Billing data: CloudWatch metric stream → Kinesis Firehose → S3 (agent-runner-usage-<env>) → Athena. Long retention.
  • Audit log: DynamoDB Streams → audit processor Lambda → S3 Object Lock (WORM) bucket, 7-year retention. Every API mutation and proxy authentication event creates an audit record.
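
The transformation step of the audit pipeline can be sketched as follows. The input fields (`eventID`, `eventName`, `dynamodb.Keys`, `NewImage`, `OldImage`, `ApproximateCreationDateTime`) follow the DynamoDB Streams event format that Lambda receives. The output field names and both function names are assumptions, not the platform's actual audit schema, and the write to the Object Lock bucket is deliberately left out.

```python
from datetime import datetime, timezone


def to_audit_record(stream_record: dict) -> dict:
    """Flatten one DynamoDB Streams record into a hypothetical audit record."""
    ddb = stream_record["dynamodb"]
    created = datetime.fromtimestamp(
        ddb["ApproximateCreationDateTime"], tz=timezone.utc
    )
    return {
        "event_id": stream_record["eventID"],
        "action": stream_record["eventName"],  # INSERT, MODIFY, or REMOVE
        "occurred_at": created.isoformat(),
        "keys": ddb["Keys"],
        # Images are present only if the stream view type includes them.
        "new_image": ddb.get("NewImage"),
        "old_image": ddb.get("OldImage"),
    }


def audit_records_from_event(event: dict) -> list:
    """Convert every record in a stream batch; nothing is filtered out."""
    return [to_audit_record(r) for r in event["Records"]]
```

A real processor would serialize each record and write it to the WORM bucket. Keeping the mapping a pure function makes it easy to unit-test separately from the S3 write.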

Dashboards & saved queries

One CloudWatch dashboard per environment (<name_prefix>-platform) with proxy, control-plane, and data-pipeline sections. Saved Logs Insights queries (backend/saved_queries.tf): errors-recent, recent-logs, slow-requests, tenant-activity, billing-events.
