AWS Releases Serverless A2A Gateway Replacing 190 Point-to-Point Connections with a Central Registry
Amazon Web Services has published a reference serverless architecture for an A2A gateway that centralizes discovery, routing, and access control between AI agents. Twenty agents without coordination can create up to 190 mutual connections — the gateway reduces that to a single entry point.
This article was generated using artificial intelligence from primary sources.
As the number of AI agents inside an enterprise grows, so does the operational chaos that arises without central coordination. Amazon Web Services has published an open reference architecture for a serverless A2A gateway that solves that problem at the infrastructure level.
Why Does a Growing Number of Agents Create a Connectivity Problem?
The problem is simple to describe but serious in practice. Every direct connection between two agents requires its own authentication, its own routing logic, and its own access control. With N agents, the maximum number of such point-to-point connections is N×(N−1)/2.
For 10 agents that is 45 connections. For 20 agents — a situation that is not unusual in an enterprise deployment — the system can require up to 190 direct connections. Each carries its own credentials that must be rotated, its own error-handling logic, and its own authorization code. The fragmentation becomes so complex that adding a new agent becomes a project in itself.
The A2A gateway reduces all of that to a single entry point.
Three-Layer Architecture
Management Layer
The management layer holds a central agent registry and enables agent discovery. Each agent registers once, and the gateway automatically generates and caches an agent-card.json accessible at the standardized path GET /agents/{agentId}/.well-known/agent-card.json. URLs in the card are rewritten to the gateway domain so clients never communicate directly with backends.
Agent discovery works in two ways: exact name matching or semantic natural-language search. Agent descriptions are indexed using Amazon Bedrock Titan Text Embeddings and stored in Amazon S3 Vectors. A client can submit a query like “agent that analyzes contracts in English” and receive a list of semantically relevant agents without knowing their exact names.
Control Layer
The control layer manages authorization and rate limiting. Authentication is based on the OAuth 2.0 client credentials flow through Amazon Cognito — the client receives a JWT token containing scopes tied to the specific agents it is allowed to access.
A Lambda authorizer evaluates JWT scopes on every request. Rate limiting is applied per user-agent combination and is implemented with atomic counters in Amazon DynamoDB — meaning even parallel requests cannot bypass the limit thanks to DynamoDB’s atomic operations.
Execution Layer
The execution layer is a proxy that receives the authorized request, fetches backend credentials from AWS Secrets Manager, authenticates against the target agent, and forwards the response to the client. Server-Sent Events streaming is also supported via the Lambda Web Adapter, meaning clients can receive partial responses in real time without blocking.
Zero-Secret Model on the Client Side
The security design starts from the principle that a client must never have access to any password or OAuth secret for backends. All backend credentials are held exclusively by Secrets Manager; clients communicate only with the gateway using a scoped JWT token valid for a limited time.
Combined with an optional private VPC deployment (the gateway and all AWS services accessible only inside a private network, with no public internet) and AWS Direct Connect links for on-premises agents, the architecture meets enterprise requirements with high security demands.
Implementation Supports the Open A2A Protocol
The gateway explicitly implements the A2A (Agent-to-Agent) protocol — an open standard for interoperability between AI agents from different vendors. Both JSON-RPC and HTTP+REST bindings are supported, meaning agents built on different platforms can communicate through the same gateway without changes on the client side.
The entire infrastructure code is available as a Terraform configuration (version 1.5.0 or newer). Terraform automatically provisions DynamoDB tables for the agent registry, permissions, and rate-limit counters, the Cognito user pool, Lambda functions for all operations, API Gateway, and IAM roles. The Docker build proxy for the Lambda container runs automatically as part of the terraform apply process.
Practical Context
The reference architecture is not a production-ready service but a documented pattern that teams can adapt. AWS emphasized that the gateway is intended as a foundation to which organizations add their own security layers, monitoring, and domain-specific logging. Backends must implement their own defenses against prompt injection attacks — the gateway controls who may access an agent, but does not analyze message content.
Frequently Asked Questions
- How many point-to-point connections arise without an A2A gateway in a system with 20 agents?
- Without central coordination, 20 agents can require up to 190 direct connections (formula N×(N−1)/2), which becomes operationally unsustainable due to scattered credentials and custom routing logic.
- How does the gateway find the right agent without knowing its exact name?
- Agent descriptions are indexed with Bedrock Titan Text Embeddings vectors in Amazon S3 Vectors, so clients can send a natural-language query and find the appropriate agent via semantic search.
- What does the zero-secret model mean for clients using the gateway?
- Clients receive only scoped JWT tokens; all backend OAuth credentials are held by AWS Secrets Manager, and clients never see or manage passwords to individual agents.
Related news
Claude Code v2.1.198: Background Agents Now Open PRs Automatically, /dataviz Skill Arrives
AWS AgentCore Memory Gains Metadata Filtering — Accuracy Jumps from 40% to 64%
LangChain Introduces RLM Agents: Recursive Models Achieve 79% Better Results on Long Contexts