Skills Platform

The skills platform lets tenants deploy and schedule Python functions (skills) built with the Claude Agent SDK. Skills run on Amazon Bedrock AgentCore Runtime, which runs each session in an isolated microVM, bills only for active compute, and supports long-running sessions.

Concepts

A skill has metadata (name, description, input/output schema) and one or more stages (e.g. dev, staging, prod, or custom names).
Each stage has independent code, configuration, secrets, schedule, and execution history.
On-demand invocations return a run_id; clients poll GET /runs/{run_id} or subscribe via SSE for live updates.
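
A minimal client-side polling sketch for the run_id flow described above. The fetch_run callable stands in for an HTTP GET /runs/{run_id} request; the "status" field name and the set of terminal statuses are assumptions (only failed and throttled are named in this document).

```python
import time
from typing import Callable

# Assumed terminal statuses; "succeeded" is a guess at the success value.
TERMINAL_STATUSES = {"succeeded", "failed", "throttled"}


def poll_run(
    fetch_run: Callable[[str], dict],
    run_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a run until it reaches a terminal status or the timeout expires.

    fetch_run must return the decoded JSON body of GET /runs/{run_id}
    as a dict with a "status" key (assumed response shape).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run(run_id)
        if run.get("status") in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run {run_id} still {run.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Injecting fetch_run and sleep keeps the loop independent of any HTTP library and easy to exercise without a network. For live updates, the SSE subscription avoids polling altogether.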

Package format


skill.zip/
├── skill.yaml         # metadata descriptor (auto-generated on upload if absent)
├── main.py            # entry point: def handler(input: dict) -> dict
└── requirements.txt

The skill’s metadata (name, description, input_schema) maps directly onto the fields of a Claude tool definition, so any deployed skill can be exposed as a Claude tool.
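
As an illustration, a main.py might look like the sketch below. The word-count skill, its SKILL_METADATA dict, and the to_claude_tool helper are hypothetical; only the handler signature and the three field names come from the text above.

```python
def handler(input: dict) -> dict:
    """Entry point called by the platform; takes and returns JSON-style dicts."""
    text = input["text"]
    return {"word_count": len(text.split())}


# Hypothetical metadata for this skill, using the fields named above.
SKILL_METADATA = {
    "name": "word_count",
    "description": "Count the words in a piece of text.",
    "input_schema": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}


def to_claude_tool(metadata: dict) -> dict:
    """Keep only the three fields a Claude tool definition uses."""
    return {key: metadata[key] for key in ("name", "description", "input_schema")}
```

The resulting dict could be passed as one entry in the tools list of a Claude API request, with the platform invoking the skill whenever Claude emits a matching tool_use block.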

Upload → validate → deploy

The owner requests a presigned S3 URL and PUTs the ZIP directly to S3.
A skill_validator Lambda (S3 trigger) extracts metadata, runs an AST security scan, bundles dependencies with uv, and marks the stage ready_to_deploy.
Once the owner confirms the version bump and change level, the stage becomes active.
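
The validator's actual scan rules are not documented here. The sketch below only shows the general shape of an AST-based scan using Python's standard ast module; the deny-lists are assumptions chosen for illustration.

```python
import ast

# Assumed deny-lists; the real skill_validator rules may differ.
BLOCKED_MODULES = {"subprocess", "ctypes"}
BLOCKED_CALLS = {"eval", "exec"}


def scan_source(source: str) -> list[str]:
    """Return findings for blocked imports and blocked built-in calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            # Compare the top-level package, so "subprocess.x" is also caught.
            if module.split(".")[0] in BLOCKED_MODULES:
                findings.append(f"line {node.lineno}: import of {module}")
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings
```

Because the scan parses the source without executing it, it can run safely on untrusted uploads. A static scan like this is easy to evade (for example through getattr or importlib), so it works as a first filter alongside the runtime isolation rather than as a replacement for it.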

At deploy time the stage snapshots the current input_schema and handler. Invocation-time validation uses this frozen copy, so editing the skill metadata later never breaks running stages.

Invocation authorization

Owners can invoke any active stage in their tenant.
A non-owner must hold a grant whose allowed_skills contains the skill’s ID (or the wildcard ["*"]); otherwise the invocation returns 403.
The request body is validated against the stage’s frozen input_schema; a mismatch returns 400 and never creates a run record.
If the SQS enqueue fails, the run is marked failed/runtime_error and the API returns 503, so no row is left stuck in queued.
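
The ordering of these checks can be sketched as follows. This is an illustration under assumptions, not the platform's code: the caller "role" field, the 202 success status, the run-ID format, and the "error" field are hypothetical, tenant scoping is assumed to have happened already, and validate and enqueue stand in for the schema validator and the SQS send.

```python
from typing import Callable


def is_authorized(caller: dict, skill_id: str, grants: list[dict]) -> bool:
    """Owners pass; others need a grant listing the skill ID or "*"."""
    if caller.get("role") == "owner":
        return True
    return any(
        "*" in g.get("allowed_skills", []) or skill_id in g.get("allowed_skills", [])
        for g in grants
    )


def invoke(
    caller: dict,
    skill_id: str,
    body: dict,
    grants: list[dict],
    validate: Callable[[dict], bool],  # checks body against the frozen input_schema
    enqueue: Callable[[dict], None],   # stands in for the SQS send; raises on failure
    runs: dict,                        # stands in for the run table
) -> tuple[int, dict]:
    if not is_authorized(caller, skill_id, grants):
        return 403, {"error": "forbidden"}
    if not validate(body):
        # Rejected before any run record is written.
        return 400, {"error": "invalid input"}
    run_id = f"run-{len(runs) + 1}"
    runs[run_id] = {"status": "queued", "skill_id": skill_id}
    try:
        enqueue({"run_id": run_id, "input": body})
    except Exception:
        # Never leave the row in "queued" if the message did not reach the queue.
        runs[run_id].update(status="failed", error="runtime_error")
        return 503, {"run_id": run_id}
    return 202, {"run_id": run_id}
```

Validating before the run record is created is what keeps rejected requests out of the execution history, and marking the row failed inside the except branch is what prevents orphaned queued runs.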

Scheduling & quotas

Stages with schedule_enabled use EventBridge Scheduler → SQS with a 5-field cron and an IANA-validated timezone. L/XL stages route to a long-run queue so AgentCore sessions don’t time out.
A skill_quota Lambda runs synchronously before every execution. The increment uses an atomic DynamoDB UpdateItem with ADD plus a ConditionExpression, so concurrent invocations can’t bypass the cap. Throttled runs record status=throttled and consume no billable minutes.
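
One way to express such a conditional increment is sketched below as the keyword arguments for a low-level DynamoDB UpdateItem call. The table name, key layout (pk/sk), and the "used" attribute are assumptions; only the ADD-plus-ConditionExpression pattern comes from the text above.

```python
def build_quota_update(
    table: str, tenant_id: str, period: str, cap: int, amount: int = 1
) -> dict:
    """Build UpdateItem kwargs that add `amount` to a usage counter only
    while the resulting total stays at or below `cap`."""
    if amount > cap:
        raise ValueError("amount exceeds the cap")
    return {
        "TableName": table,
        # Hypothetical key layout: one counter item per tenant and period.
        "Key": {
            "pk": {"S": f"tenant#{tenant_id}"},
            "sk": {"S": f"quota#{period}"},
        },
        "UpdateExpression": "ADD #used :amount",
        # Passes on the first write (no counter yet) or while there is headroom.
        "ConditionExpression": "attribute_not_exists(#used) OR #used <= :headroom",
        "ExpressionAttributeNames": {"#used": "used"},
        "ExpressionAttributeValues": {
            ":amount": {"N": str(amount)},
            ":headroom": {"N": str(cap - amount)},
        },
    }
```

With boto3, these kwargs would be passed to the DynamoDB client's update_item method, and a ConditionalCheckFailedException would be the signal to record the run as throttled. Because DynamoDB evaluates the condition and applies the update as one atomic operation on the item, two concurrent invocations cannot both consume the last unit of quota.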

Webhooks & isolation

Stages can configure a webhook_url plus a webhook_secret_arn (HMAC-SHA256). The dispatcher validates the URL against an SSRF blocklist and confirms the secret belongs to the caller’s tenant. If the signing secret can’t be resolved, the webhook is rejected — never sent unsigned.
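
A minimal sketch of the two webhook checks, using only the Python standard library. The HTTPS-only rule, the function names, and the hex encoding of the signature are assumptions; the document specifies only HMAC-SHA256 and an SSRF blocklist.

```python
import hashlib
import hmac
import ipaddress
from urllib.parse import urlsplit


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Constant-time comparison, as a webhook receiver would do."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


def is_url_allowed(url: str) -> bool:
    """Reject non-HTTPS URLs, localhost, and IP literals in non-public ranges.

    A production check would also resolve hostnames and test every resolved
    address; that step is omitted from this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    if parts.hostname == "localhost":
        return False
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        return True  # a hostname rather than an IP literal
    return ip.is_global
```

The receiver recomputes the HMAC over the exact bytes it received and compares it with hmac.compare_digest, which avoids leaking information through comparison timing.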
The AgentCore container receives a curated environment. Reserved-prefix keys (AWS_*, AGENT_RUNNER_*) and dynamic-loader keys (PYTHONPATH, LD_PRELOAD, LD_LIBRARY_PATH, DYLD_INSERT_LIBRARIES) supplied by the tenant are stripped, so tenants cannot inject AWS credentials or hijack module resolution.
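
The filtering rule can be sketched in a few lines. The prefixes and key names are taken from the text above; the function name and the exact-match (case-sensitive) comparison are assumptions.

```python
RESERVED_PREFIXES = ("AWS_", "AGENT_RUNNER_")
LOADER_KEYS = {"PYTHONPATH", "LD_PRELOAD", "LD_LIBRARY_PATH", "DYLD_INSERT_LIBRARIES"}


def curate_env(tenant_env: dict[str, str]) -> dict[str, str]:
    """Drop tenant-supplied keys that are reserved or that steer the loader."""
    return {
        key: value
        for key, value in tenant_env.items()
        if not key.startswith(RESERVED_PREFIXES) and key not in LOADER_KEYS
    }
```

For example, a tenant configuration containing AWS_SECRET_ACCESS_KEY, LD_PRELOAD, and LOG_LEVEL would reach the container with only LOG_LEVEL.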

Secrets

Owners create named secrets stored in AWS Secrets Manager under agent-runner/<env>/tenant/<tenant_id>/secrets/<secret_id>. Only metadata is stored in DynamoDB — values are never returned by the API. Secrets are assigned to a stage as secret_refs and injected as environment variables at execution time. Developers can use secrets assigned to a stage they can invoke, but cannot list, create, or modify them.
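
The naming scheme and injection step might look like the sketch below. The shape of secret_refs (environment variable name mapped to secret_id) and the fetch callable, which stands in for a Secrets Manager lookup, are assumptions made for illustration.

```python
from typing import Callable


def secret_name(env: str, tenant_id: str, secret_id: str) -> str:
    """Build the Secrets Manager name using the path layout given above."""
    return f"agent-runner/{env}/tenant/{tenant_id}/secrets/{secret_id}"


def resolve_secret_env(
    env: str,
    tenant_id: str,
    secret_refs: dict[str, str],   # assumed shape: env var name -> secret_id
    fetch: Callable[[str], str],   # stands in for a Secrets Manager read
) -> dict[str, str]:
    """Resolve a stage's secret_refs into environment variables."""
    return {
        var: fetch(secret_name(env, tenant_id, secret_id))
        for var, secret_id in secret_refs.items()
    }
```

Deriving the name from the caller's own tenant_id, rather than accepting a full path from the request, means a stage can only ever reference secrets under its own tenant prefix.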