AAM v0.1
A working draft on the agent web. Composed in Praha. MIT.
What this is
AAM (Agent Action Manifest) is a small open standard. It lets any website tell AI agents - the ones acting on a person's behalf - what they can do on the site, and how.
Think of it as robots.txt, but for AI agents instead of search crawlers. robots.txt tells crawlers which pages they may fetch. AAM tells Claude or any other agent what it can do on the site.
It's one JSON file at a fixed URL. No platform sign-up. No proprietary library. No vendor between you and the agent.
The problem in plain language
Soon, ordinary people will ask AI to do things on websites for them. Book a flight. Find a babysitter. Buy birthday gifts. Cancel a subscription. The AI will visit websites the person has never opened.
Today, when an AI agent visits your website, it has to guess. It clicks blindly through HTML, fills forms by trial and error, gets blocked, fails, retries. Often it can't get past your login page. When it does succeed, you have no record of what it did, who authorized it, or how to revoke that authorization.
This is bad for everyone:
- For the user - their agent is unreliable.
- For the site - anonymous traffic that may or may not be legitimate, no way to monetize.
- For the agent - a success rate of roughly 30% on real flows.
AAM fixes this with a small contract between the site and the agent.
The protocol in three primitives
1. The manifest
Every AAM-aware site publishes a JSON file at:
https://yoursite.com/.well-known/agent-actions.json
This file declares everything an agent needs to know:
- What the site is called
- What actions are available (book a table, purchase a book, list issues, etc.)
- What parameters each action takes (date, time, party_size, etc.)
- Whether the action is free or has a price
- Where to send the user for authorization
The agent fetches this file on demand, only when it needs to. No pre-installation, no pollution of the agent's memory, no platform between the agent and the site.
A real example:
{
  "aam_version": "0.1",
  "site": {
    "name": "Cafe Rosso",
    "domain": "caferosso.com"
  },
  "auth": {
    "type": "delegated_oauth",
    "authorize_url": "/agent/authorize",
    "required": true
  },
  "actions": [
    {
      "id": "check_availability",
      "pricing": "free",
      "params": {
        "date": { "type": "string", "format": "date" },
        "time": { "type": "string", "format": "HH:MM" },
        "party_size": { "type": "integer", "min": 1, "max": 12 }
      }
    },
    {
      "id": "make_reservation",
      "pricing": {
        "type": "x402",
        "amount": "0.05",
        "currency": "USDC",
        "network": "base"
      },
      "params": {
        "date": { "type": "string", "format": "date" },
        "time": { "type": "string", "format": "HH:MM" },
        "party_size": { "type": "integer" },
        "name": { "type": "string" }
      }
    }
  ]
}
That's the entire on-site surface for a small business.
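On the agent side, reading a manifest like the one above could look roughly like this. It is a minimal sketch, not the reference SDK: the type names and the helpers parseManifest and isPaid are hypothetical, and only the top-level shape is checked.

```typescript
// Minimal types mirroring the fields used in the example manifest above.
type Pricing =
  | "free"
  | { type: string; amount: string; currency: string; network?: string };

interface ParamSpec {
  type: string;
  format?: string;
  min?: number;
  max?: number;
}

interface Action {
  id: string;
  pricing: Pricing;
  params: Record<string, ParamSpec>;
}

interface Manifest {
  aam_version: string;
  site: { name: string; domain: string };
  auth?: { type: string; authorize_url?: string; required?: boolean };
  actions: Action[];
}

// The fixed discovery location described in this section.
function manifestUrl(domain: string): string {
  return `https://${domain}/.well-known/agent-actions.json`;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: checks only the top-level shape and throws otherwise.
function parseManifest(raw: unknown): Manifest {
  if (!isRecord(raw)) throw new Error("manifest must be a JSON object");
  if (typeof raw.aam_version !== "string") throw new Error("missing aam_version");
  const site = raw.site;
  if (!isRecord(site) || typeof site.name !== "string" || typeof site.domain !== "string") {
    throw new Error("missing site.name or site.domain");
  }
  const actions = raw.actions;
  if (!Array.isArray(actions)) throw new Error("actions must be an array");
  for (const action of actions) {
    if (!isRecord(action) || typeof action.id !== "string") {
      throw new Error("every action needs a string id");
    }
  }
  return raw as unknown as Manifest;
}

// Lets an agent decide up front whether an action will trigger a payment.
function isPaid(manifest: Manifest, actionId: string): boolean {
  const action = manifest.actions.find((a) => a.id === actionId);
  if (!action) throw new Error(`unknown action: ${actionId}`);
  return action.pricing !== "free";
}
```

An agent would fetch manifestUrl("caferosso.com"), pass the decoded JSON to parseManifest, and then consult isPaid before calling an action.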
2. Authorization handoff
Agents should not pretend to be users. They should act on behalf of identified users, with explicit, scoped permission.
When an agent wants to use an action that requires auth, it stops. It tells its user: "This site needs you to sign in. Open this URL once." The user opens the URL on their device, signs in to the site like any human visitor would, and explicitly approves the agent with limits they choose: how much it can spend per transaction, how often per day.
The site issues an agent_token scoped to a triple of (this user, this agent vendor, this set of actions). The user can revoke it at any time. Every action under it is logged.
This mirrors how OAuth delegates access between websites. AAM borrows the pattern for the human-to-agent relationship. Sites already have user registration, so AAM does not invent a new identity layer; it hands off to whatever the site already does.
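To make the scope triple and the user-chosen limits concrete, here is a minimal sketch of the check a site might run before executing an action. The AgentGrant and ActionRequest shapes and the checkGrant function are assumptions for illustration; the text above does not specify how a site stores grants.

```typescript
// Hypothetical shapes: the text defines the (user, vendor, actions) triple and
// user-chosen limits, but not how a site represents them.
interface AgentGrant {
  userId: string;
  agentVendor: string;
  actions: string[];
  maxAmountPerTransaction: number; // in the site's pricing currency
  maxCallsPerDay: number;
  revoked: boolean;
}

interface ActionRequest {
  agentVendor: string;
  actionId: string;
  amount: number; // 0 for free actions
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Checks one request against the grant. callsToday is assumed to come from
// the site's own audit log.
function checkGrant(grant: AgentGrant, request: ActionRequest, callsToday: number): Decision {
  if (grant.revoked) return { allowed: false, reason: "grant revoked by user" };
  if (request.agentVendor !== grant.agentVendor) {
    return { allowed: false, reason: "vendor mismatch" };
  }
  if (!grant.actions.includes(request.actionId)) {
    return { allowed: false, reason: "action not in scope" };
  }
  if (request.amount > grant.maxAmountPerTransaction) {
    return { allowed: false, reason: "over per-transaction limit" };
  }
  if (callsToday >= grant.maxCallsPerDay) {
    return { allowed: false, reason: "daily limit reached" };
  }
  return { allowed: true };
}
```

Returning a reason alongside the refusal gives the agent something specific to relay to its user, for example that the daily limit needs raising.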
3. Action invocation
With an agent_token in hand, the agent invokes an action with a regular HTTP POST:
POST /api/aam/actions/make_reservation
Authorization: Bearer agt_2354dd550089d210bd2b...
X-Agent-Vendor: anthropic.com
X-Agent-Run-Id: run_2wd2k7j3
Content-Type: application/json

{
  "date": "2026-05-02",
  "time": "19:00",
  "party_size": 4,
  "name": "Tadeas"
}
The site validates the token, checks the agent is allowed to perform this action under the user's limits, runs the action, and returns the result. Everything is logged with the run ID so the user can audit later.
If the action is paid (like make_reservation above), the first call returns HTTP 402 with a payment challenge. The agent signs an authorization with the user's wallet, retries with an X-Payment header, and the action runs.
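The invoke-then-pay loop described above can be sketched from the agent's side as follows. This is an illustration under assumptions: HttpPost and PaymentSigner are hypothetical abstractions standing in for the agent's HTTP client and wallet, and the shape of the payment challenge is left opaque because this section does not define it.

```typescript
// Minimal HTTP abstractions so the sketch does not depend on a particular runtime.
interface HttpResponse {
  status: number;
  json(): Promise<unknown>;
}
type HttpPost = (
  url: string,
  headers: Record<string, string>,
  body: string,
) => Promise<HttpResponse>;

// Hypothetical signer: turns the site's payment challenge into an X-Payment value.
type PaymentSigner = (challenge: unknown) => Promise<string>;

interface InvokeOptions {
  baseUrl: string;
  actionId: string;
  params: Record<string, unknown>;
  agentToken: string;
  agentVendor: string;
  runId: string;
}

async function invokeAction(
  post: HttpPost,
  opts: InvokeOptions,
  signPayment?: PaymentSigner,
): Promise<unknown> {
  const url = `${opts.baseUrl}/api/aam/actions/${encodeURIComponent(opts.actionId)}`;
  const headers: Record<string, string> = {
    "Authorization": `Bearer ${opts.agentToken}`,
    "X-Agent-Vendor": opts.agentVendor,
    "X-Agent-Run-Id": opts.runId,
    "Content-Type": "application/json",
  };
  const body = JSON.stringify(opts.params);

  let response = await post(url, headers, body);
  if (response.status === 402) {
    if (!signPayment) throw new Error("action requires payment but no signer was provided");
    const challenge = await response.json();
    const payment = await signPayment(challenge);
    // Retry exactly once with the signed payment attached.
    response = await post(url, { ...headers, "X-Payment": payment }, body);
  }
  if (response.status < 200 || response.status >= 300) {
    throw new Error(`action failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Retrying only once is a deliberate choice in this sketch: a second 402 surfaces as an error rather than looping, so a misconfigured site cannot make the agent sign repeatedly.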
A concrete example, end to end
Maria asks Claude: "Book me a table at Cafe Rosso for Friday 7pm, party of four."
- Claude searches the web, finds caferosso.com.
- Claude fetches caferosso.com/.well-known/agent-actions.json. It sees make_reservation exists, costs 0.05 USDC, requires auth.
- Claude tells Maria: "Cafe Rosso requires you to sign in once. Open this URL: caferosso.com/agent/authorize?vendor=anthropic.com."
- Maria opens the URL. She signs in to Cafe Rosso (or registers if new), sees a consent screen: "Anthropic's Claude wants to make reservations on your behalf. Limit: $50/transaction, 5/day." She approves with Face ID via her wallet.
- Cafe Rosso issues an agent_token scoped to (Maria, anthropic.com, [check_availability, make_reservation]). Limits attached. Audit log started.
- Claude exchanges the auth code for the token. Calls make_reservation. First response: HTTP 402 with payment challenge.
- Claude signs the EIP-3009 authorization on Maria's wallet. Retries.
- Cafe Rosso verifies the signature on-chain (or off-chain, sites choose). Action runs. Reservation made. Response returns to Claude.
- Claude tells Maria: "Reserved. Confirmation RES-AB12X9. $0.05 USDC paid on Base. You can see this in your dashboard at caferosso.com/account/agents."
Total time: 60-90 seconds, mostly the human steps. Server-to-server steps complete in a few hundred milliseconds.
What you implement, as a site owner
Three things, in this order.
1. Drop a JSON file at /.well-known/agent-actions.json
Static if you want, or generated. The reference SDK does the latter.
2. Add three thin route handlers
Using @aam/server:
// app/api/aam/manifest/route.ts
import { createHandlers } from "@aam/server/next";
import { aam } from "@/lib/aam";

export const { GET } = createHandlers(aam).manifest;

// app/api/aam/token/route.ts
import { createHandlers } from "@aam/server/next";
import { aam } from "@/lib/aam";

export const { POST } = createHandlers(aam).token;

// app/api/aam/actions/[action]/route.ts
import { createHandlers } from "@aam/server/next";
import { aam } from "@/lib/aam";

export const { POST } = createHandlers(aam).action;
And one rewrite so /.well-known/agent-actions.json serves your manifest:
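// next.config.ts
import type { NextConfig } from "next";

const nextConfig: NextConfig = {
  async rewrites() {
    return [
      { source: "/.well-known/agent-actions.json", destination: "/api/aam/manifest" },
    ];
  },
};

export default nextConfig;
3. Run your existing business logic in the action's execute function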
// lib/aam.ts
import { AAM } from "@aam/server";

export const aam = new AAM({
  site: { name: "Cafe Rosso", domain: "caferosso.com" },
  actions: [
    {
      id: "make_reservation",
      pricing: "free",
      params: {
        date: { type: "string", format: "date" },
        time: { type: "string", format: "HH:MM" },
        party_size: { type: "integer", min: 1, max: 12 },
        name: { type: "string" },
      },
      execute: async (input, ctx) => {
        // yourReservationService stands in for the booking code your app already has.
        const reservation = await yourReservationService.create(input, ctx.userId);
        return { ok: true, data: reservation };
      },
    },
  ],
});
The SDK handles the protocol: token issuance, signature validation, scope enforcement, audit logging. You provide the business logic that already exists in your app for human users.
That's it. ~20 lines of config plus three thin routes. Nothing more.
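For a sense of what checking input against the declared params involves, here is a minimal sketch. It is not the SDK's validator. It assumes every declared parameter is required, that the "date" format means YYYY-MM-DD (as in the request example), and that undeclared parameters are rejected; none of these rules is stated in this document.

```typescript
interface ParamSpec {
  type: "string" | "integer";
  format?: string;
  min?: number;
  max?: number;
}

// Returns a list of human-readable problems; an empty list means the input is acceptable.
function validateParams(
  schema: Record<string, ParamSpec>,
  input: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const [name, spec] of Object.entries(schema)) {
    const value = input[name];
    if (value === undefined) {
      errors.push(`${name}: missing`); // assumption: all declared params are required
      continue;
    }
    if (spec.type === "integer") {
      if (typeof value !== "number" || !Number.isInteger(value)) {
        errors.push(`${name}: expected integer`);
        continue;
      }
      if (spec.min !== undefined && value < spec.min) errors.push(`${name}: below minimum ${spec.min}`);
      if (spec.max !== undefined && value > spec.max) errors.push(`${name}: above maximum ${spec.max}`);
    } else {
      if (typeof value !== "string") {
        errors.push(`${name}: expected string`);
        continue;
      }
      if (spec.format === "date" && !/^\d{4}-\d{2}-\d{2}$/.test(value)) {
        errors.push(`${name}: expected YYYY-MM-DD`);
      }
      if (spec.format === "HH:MM" && !/^([01]\d|2[0-3]):[0-5]\d$/.test(value)) {
        errors.push(`${name}: expected HH:MM`);
      }
    }
  }
  // Assumption: parameters the manifest does not declare are rejected.
  for (const name of Object.keys(input)) {
    if (!(name in schema)) errors.push(`${name}: unexpected parameter`);
  }
  return errors;
}
```

Collecting every problem instead of failing on the first lets the agent correct its whole request in one retry.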
What's optional
Payments
Most actions are free. Search a catalog, fill a form, get availability, list articles. The protocol works without any payment layer.
When an action does cost money, AAM specifies the price in the manifest:
{
  "id": "make_reservation",
  "pricing": {
    "type": "x402",
    "amount": "0.05",
    "currency": "USDC",
    "network": "base"
  }
}
The agent presents a signed payment authorization, either:
- x402 / USDC - default. The agent's wallet (Coinbase Smart Wallet, MetaMask, any ERC-1271-compliant smart account) signs an EIP-3009 TransferWithAuthorization. The site verifies the signature on-chain.
- Stripe Checkout - an adapter returns a Stripe-hosted URL. Useful for fiat-only merchants and high-value transactions where human approval is desirable.
- Future rails - the protocol is method-agnostic. Stripe Link, ACH, SEPA, anything pluggable.
Payments are an optional layer. The protocol is useful for free actions alone.
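For the x402 / USDC rail, the manifest's decimal price has to become an EIP-3009 TransferWithAuthorization message before the wallet can sign it. The sketch below shows only that conversion. The helper names are hypothetical, the five-minute validity window is an assumption, and signing and the x402 wire format are left out.

```typescript
interface X402Pricing {
  type: "x402";
  amount: string; // decimal string, e.g. "0.05"
  currency: string;
  network: string;
}

// Field names follow the EIP-3009 TransferWithAuthorization struct.
interface TransferWithAuthorization {
  from: string;
  to: string;
  value: string;       // token amount in atomic units, as a decimal string
  validAfter: number;  // unix seconds
  validBefore: number; // unix seconds
  nonce: string;       // 32-byte hex value, unique per authorization
}

// Converts a decimal amount to the token's smallest unit without floating point.
function toAtomicUnits(amount: string, decimals: number): string {
  if (!/^\d+(\.\d+)?$/.test(amount)) throw new Error(`invalid amount: ${amount}`);
  const [whole, fraction = ""] = amount.split(".");
  if (fraction.length > decimals) {
    throw new Error(`amount has more than ${decimals} decimal places`);
  }
  const digits = whole + fraction.padEnd(decimals, "0");
  return digits.replace(/^0+(?=\d)/, ""); // strip leading zeros, keep at least one digit
}

function buildAuthorization(
  pricing: X402Pricing,
  from: string,
  to: string,
  nonce: string,
  nowSeconds: number,
  ttlSeconds = 300, // assumption: a short validity window
): TransferWithAuthorization {
  if (pricing.currency !== "USDC") throw new Error(`unsupported currency: ${pricing.currency}`);
  return {
    from,
    to,
    value: toAtomicUnits(pricing.amount, 6), // USDC uses 6 decimals
    validAfter: 0, // valid immediately
    validBefore: nowSeconds + ttlSeconds,
    nonce,
  };
}
```

String arithmetic is used on purpose: parsing "0.05" as a floating-point number and multiplying can introduce rounding errors in money amounts.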
Identity (federated, optional)
Per-site OAuth-style consent (the baseline above) works, but it makes the user reauthenticate at every new site they encounter. After 5 sites most users give up. AAM ID is an optional federated identity layer that fixes this.
A site declares which identity proofs it accepts in its manifest:
{
  "auth": {
    "type": "delegated_oauth",
    "required_for": ["book_appointment", "cancel_appointment"],
    "accepted_identity_proofs": [
      {
        "type": "aam_id",
        "issuer": "https://aam-platform-gamma.vercel.app",
        "jwks_url": "https://aam-platform-gamma.vercel.app/.well-known/aam-jwks.json",
        "connect_url_pattern": "https://aam-platform-gamma.vercel.app/id/connect?agent={agent}&scopes={scopes}&site={site}"
      },
      {
        "type": "per_site_consent",
        "authorize_url": "https://example.com/api/aam/authorize",
        "token_url": "https://example.com/api/aam/token"
      }
    ]
  }
}
The agent picks any acceptable proof. AAM ID credentials are signed JWTs (RS256) that sites validate locally against the issuer's JWKS, with no callback to the issuer per request.
The flow (out-of-band, works for chat agents that have no callback URL):
- Agent attempts an authenticated action without a credential → site responds 401 with accepted_identity_proofs.
- Agent constructs /id/connect?agent={vendor}&scopes={a,b}&site={site}, shows the URL to the user.
- User opens it, sees the requested permissions, signs in once (Google OIDC, Apple, BankID, whatever the issuer supports), approves. The page displays a JWT credential.
- User pastes the credential back to the agent. Agent retries the action with the credential in an Authorization: Bearer header.
- Site validates signature against the issuer's JWKS (cached 24h locally). On future calls the agent reuses the same credential, with no re-auth per site.
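The second step of the flow, filling in the issuer's connect_url_pattern, might look like this. It is a small sketch with a hypothetical helper name. Joining scopes with commas is an assumption based on the {a,b} notation above.

```typescript
interface ConnectRequest {
  agent: string;
  scopes: string[];
  site: string;
}

// Fills the {agent}, {scopes} and {site} placeholders of a connect_url_pattern,
// percent-encoding each value so it is safe inside a query string.
function buildConnectUrl(pattern: string, request: ConnectRequest): string {
  const values: Record<string, string> = {
    agent: request.agent,
    scopes: request.scopes.join(","), // assumption: comma-separated scopes
    site: request.site,
  };
  return pattern.replace(/\{(agent|scopes|site)\}/g, (_match, key: string) =>
    encodeURIComponent(values[key]),
  );
}
```

The agent shows the resulting URL to the user as-is; nothing in it is secret, so it can be displayed in a chat transcript.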
JWT shape:
{
  "iss": "https://aam-platform-gamma.vercel.app",
  "sub": "alice@example.com",
  "aud": "your-site.com",
  "iat": 1709913600,
  "exp": 1712505600,
  "agent_vendor": "claude-code",
  "scopes": ["book:appointment", "cancel:appointment"],
  "email_verified": true,
  "verification_method": "google_oidc"
}
Federation, not gatekeeping. AAM ID is open spec. We run the reference provider at aam-platform-gamma.vercel.app for convenience and demos. Anyone (Anthropic, Google, BankID, your own self-hosted instance) can run a conformant provider: it just exposes JWKS at /.well-known/aam-jwks.json and issues JWTs in the shape above. Sites pick which providers they trust by listing them in accepted_identity_proofs. If we go down, sites switch issuer in their manifest in minutes.
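Once the RS256 signature has been verified against the issuer's JWKS, a site still has to check the claims themselves. The sketch below covers only that second step and assumes signature verification has already happened elsewhere. SitePolicy and rejectReason are hypothetical names. Treating aud as bound to the site's domain is an assumption suggested by the JWT shape.

```typescript
// Claims as they appear in the JWT shape above.
interface AamIdClaims {
  iss: string;
  sub: string;
  aud: string;
  iat: number;
  exp: number;
  agent_vendor: string;
  scopes: string[];
  email_verified: boolean;
  verification_method: string;
}

// Hypothetical site-side policy: the issuers listed in accepted_identity_proofs.
interface SitePolicy {
  siteDomain: string;
  trustedIssuers: string[];
}

// Returns null when the claims are acceptable, otherwise the reason for rejection.
// Assumes the signature was already verified against the issuer's JWKS.
function rejectReason(
  claims: AamIdClaims,
  policy: SitePolicy,
  requiredScope: string,
  nowSeconds: number,
): string | null {
  if (!policy.trustedIssuers.includes(claims.iss)) return "untrusted issuer";
  if (claims.aud !== policy.siteDomain) return "wrong audience";
  if (claims.exp <= nowSeconds) return "credential expired";
  if (claims.iat > nowSeconds) return "credential issued in the future";
  if (!claims.scopes.includes(requiredScope)) return "missing scope";
  return null;
}
```

Checking the issuer first keeps the trust decision with the site: a well-formed credential from a provider the site has not listed is rejected before anything else is examined.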
Cross-site reputation (optional network effect). Conformant providers MAY expose:
POST <issuer>/api/aam-id/track
GET <issuer>/api/aam-id/reputation/<sha256(email)>
Sites report successful actions back to the issuer; reputation accumulates per identity across all sites that adopt the same issuer. Sites can use the reputation snapshot to apply trust bands (new identity = stricter rate limits, established identity = smooth flow). Identities are hashed by sites before lookup so the issuer doesn't see who's asking about whom.
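A site-side sketch of the hashed lookup, using Node's built-in crypto module. The helper names are hypothetical. Trimming and lowercasing the email before hashing is an assumption: the endpoint above only says sha256(email), and both sides would need to agree on one normalization.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the normalized email address.
function identityHash(email: string): string {
  const normalized = email.trim().toLowerCase(); // assumption: not specified above
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// Builds the reputation lookup URL for an issuer, tolerating a trailing slash.
function reputationUrl(issuer: string, email: string): string {
  const base = issuer.replace(/\/+$/, "");
  return `${base}/api/aam-id/reputation/${identityHash(email)}`;
}
```

The site would issue a GET to reputationUrl(issuer, email) and map the returned snapshot onto its own rate-limit bands.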
Full spec at /spec/identity.
On-chain settlement
When a paid action is invoked, the site can either:
- Trust the signature off-chain and settle later in a batch (saves gas, requires patience)
- Submit on-chain immediately via transferWithAuthorization (real-time guarantee, costs gas)
Both are valid. The SDK supports a relayer mode that auto-settles on-chain after every paid action. Configure it with the AAM_RELAYER_PRIVATE_KEY environment variable.
Why we built it like this
A few decisions worth making explicit.
How AAM relates to MCP
MCP (Anthropic's Model Context Protocol) and AAM optimize for different use cases. The MCP spec, read carefully, describes a protocol designed for trusted, stateful, often pre-configured connections to tool servers - your IDE talking to your filesystem, your agent talking to your own GitHub server. It supports a rich feature set for that case: bidirectional sampling, resource subscriptions, OAuth flows, structured outputs, list-changed notifications, paginated tool discovery.
AAM is designed for a narrower case MCP does not optimize for: an agent meeting a public website it has never seen, owned by someone the agent has never trusted, where the agent needs to perform a small set of well-defined actions on a person's behalf. There is no prior trust, no pre-installed server, no expected long-running connection.
Three concrete differences:
- Static file vs running server. An AAM manifest is a JSON file at a well-known URL. An MCP server is a long-running process. For most websites, publishing a static file is trivial; running an MCP server is not.
- No prior trust vs prior trust. AAM assumes the agent and the site are meeting for the first time, in the wild. MCP assumes the connection has been configured ahead of time by someone the agent runtime already trusts.
- Stateless POST vs stateful JSON-RPC. AAM is one HTTP POST per action with a bearer token. MCP is a session-oriented protocol with bidirectional features.
The two are complementary, not competing. A future agent could use MCP for installed long-running services and AAM for ad-hoc encounters on the open web.
Where AAM and MCP overlap (both can describe agent-callable actions), AAM is intentionally smaller. We avoid features that do not fit the runtime-discovery use case (no sampling, no resource subscriptions, no stateful connections). For untrusted public sites the cost of installation and the maintenance of trust matter more than feature richness.
Why not a centralized agent platform?
Because it would put one company in the middle of every agent commerce transaction. That's a chokepoint, not a protocol. We don't want to be the chokepoint. We don't think site owners want one to exist.
AAM runs in your stack, with your credentials. Tokens, audit logs, payments live with you. We do not need to exist for AAM to work.
Why on-chain payment as a default?
For agents acting autonomously, traditional cards require human approval each transaction (the Stripe Link Agent Wallet pattern). x402 + USDC is the only payment primitive today that lets agents pay autonomously, server-to-server, with no human in the loop after the initial consent.
Cards remain available via adapters when human approval is desirable - large purchases, regulated transactions, fraud-sensitive flows.
Why so small?
Because the success of a protocol depends on adoption, and adoption depends on the cost of implementation. AAM is ~20 lines of config to add to an existing site. That's deliberate.
The hardest part of any protocol is the second half of v1: rejecting features. We built a small thing on purpose.
Bootstrap, and the end state
A protocol only works if something on the agent side can read it. AAM is just HTTP + JSON, so anything that has fetch() and a few minutes of code can do it. But "anything" is a vague subject. There are three concrete paths to the agent reading AAM today, and they sit on a maturity curve.
1. Reference MCP adapter (today)
A small MCP server, @aam/mcp-bridge, exposes four generic tools to any MCP-aware agent (Claude Desktop, Cody, Continue, etc.):
- aam_discover(site_url) - fetches the manifest at /.well-known/agent-actions.json
- aam_authorize_url(site_url) - returns the URL the user should open to consent
- aam_exchange_token(site_url, auth_code) - turns an auth code into an agent token
- aam_invoke(site_url, action_id, params, agent_token?, payment_proof?) - calls the action
Install it once, and from that point on every AAM-compliant site on the open web is reachable. Source in examples/mcp-bridge/.
This is explicitly not what AAM is fighting against. AAM's argument is against installing an MCP server per site — the context-window pollution problem that breaks at scale. A single, generic, protocol-aware bridge is an N=1 install whose context cost is constant regardless of how many AAM sites the agent encounters.
2. Lab-native browser tool detection (the real win)
Claude.ai's web browsing tool, ChatGPT browse, Operator, Gemini's web access — all of these already fetch HTML pages on the agent's behalf. The smallest possible extension is:
When fetching a new domain, opportunistically fetch /.well-known/agent-actions.json. If a valid manifest exists, switch from pixel-clicking the page to invoking the typed actions instead.
This is a 50-line change in an existing feature, not a new protocol roadmap item. The win is measurable: success rate on instrumented sites goes from ~30% (pixel-click) to ~99% (typed). When this lands, the bridge becomes obsolete — that is the goal.
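The detection step could be sketched like this. It is an illustration, not any lab's actual browsing tool: createDetector and the ManifestFetcher type are hypothetical, validation is reduced to a shape check, and caching the outcome per domain for the life of the process is an assumption.

```typescript
interface DetectedManifest {
  aam_version: string;
  actions: unknown[];
}

// Hypothetical fetcher: resolves to parsed JSON, or rejects on network or parse errors.
type ManifestFetcher = (url: string) => Promise<unknown>;

type Strategy =
  | { mode: "typed"; manifest: DetectedManifest }
  | { mode: "fallback" };

function looksLikeManifest(raw: unknown): raw is DetectedManifest {
  if (typeof raw !== "object" || raw === null) return false;
  const candidate = raw as Record<string, unknown>;
  return typeof candidate.aam_version === "string" && Array.isArray(candidate.actions);
}

// Returns a lookup function that probes each domain at most once.
function createDetector(fetchJson: ManifestFetcher) {
  const cache = new Map<string, Strategy>();
  return async function strategyFor(domain: string): Promise<Strategy> {
    const cached = cache.get(domain);
    if (cached) return cached;

    let strategy: Strategy = { mode: "fallback" };
    try {
      const raw = await fetchJson(`https://${domain}/.well-known/agent-actions.json`);
      if (looksLikeManifest(raw)) strategy = { mode: "typed", manifest: raw };
    } catch {
      // A missing or unreadable manifest is not an error: browse the page as before.
    }
    cache.set(domain, strategy);
    return strategy;
  };
}
```

Because every failure path lands on "fallback", adding the probe cannot make an existing browsing flow worse; it only adds one small request per new domain.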
3. Native runtime support in agent frameworks
LangChain, LlamaIndex, the OpenAI Agents SDK, the Anthropic Agent SDK — any framework that orchestrates LLMs and tool calls can ship an aam package whose client knows how to fetch, parse, and invoke. No MCP layer at all; the framework just understands the protocol natively. We expect this to follow #1 and precede #2 by months.
Why we are honest about this
The marketing pitch — no per-site installation, no context pollution, no platform lock-in — describes the destination. The reference adapter is scaffolding while we get there. We say so plainly because every successful open protocol has gone through the same trajectory: RSS readers preceded native browser RSS support; OAuth client libraries preceded "Sign in with Apple"; Service Workers preceded native cache APIs. The bridge's eventual obsolescence is what success looks like, not a defeat.
What's in v0.1
Stable and shipped:
- Manifest format (this document)
- Auth handoff via OAuth-style consent
- Token issuance and verification with scope triples (user, vendor, actions)
- Action invocation with bearer tokens
- Optional x402 / EIP-3009 signature verification, full ERC-1271 support
- Optional on-chain settlement via relayer
- Reference implementation in @aam/server (Node SDK)
- Coinbase Smart Wallet onboarding component (@aam/onboarding-coinbase)
- Stripe Checkout adapter (@aam/adapter-stripe)
Not yet, sketched only:
- Multi-action consent (one signature for N actions in a quest)
- Stripe Link adapter (when their API stabilizes for third parties)
- Manifest registry (optional public index of AAM-aware sites)
- Fine-grained scope language beyond per-action
Deliberately not in scope:
- Buyer wallets (use Coinbase Smart Wallet, Stripe Link, MetaMask, anything)
- Identity providers (use the site's existing user system)
- Agent runtime (Claude, OpenAI, custom: protocol is agnostic)
How to read this
Curious passer-by - stop here. The above is the whole story.
Site owner / dev - jump to the SDK at github.com/. Clone the starter, run it locally, add your first action. Five minutes from clone to working endpoint on Base Sepolia testnet.
Agent vendor - look at the auth handoff section. The token format and scope triple are designed to plug into your runtime cleanly. Send a PR if anything is awkward.
Payment infrastructure - look at the executor adapter interface. New rails are one TypeScript file away. Reference: packages/adapter-stripe/src/index.ts.
v0.1 working draft. APIs may change between minor versions until v1.0. Comments and pull requests welcome.