AWS: Bedrock AgentCore pool-model multi-tenancy — shared infrastructure, isolated tenants

AWS Bedrock AgentCore introduces a pool-model multi-tenancy architecture with three-tier isolation (Tier → Tenant → User), Cedar policies for tool boundaries, and a Token Vending Machine for memory isolation — a reference SaaS design for production AI agents.

This article was generated using artificial intelligence from primary sources.

AWS has published a reference architectural pattern for production SaaS AI agents — pool-model multi-tenancy within the Amazon Bedrock AgentCore platform.

What is multi-tenancy and why is it critical for SaaS AI?

Multi-tenancy is an architecture in which multiple independent customers, called tenants, share the same infrastructure while their data, permissions, and resources remain strictly isolated from one another. A tenant is typically an organization with many users of its own. For AI agents in a SaaS context this is especially demanding: the agent must know who is calling it, which tools it is authorized to access, and which data it may return, and it must resolve all of this separately for each tenant, in real time.

Three-tier isolation hierarchy

The AgentCore solution introduces three clear isolation levels: Tier → Tenant → User. At the tier level, two service classes are distinguished. The Basic tier uses the Mistral 3 8B Instruct model with a limit of 2 requests per second and a maximum of 50 requests per day. The Premium tier offers the OpenAI GPT OSS 120B model at 10 requests per second and 500 per day: five times the request rate and ten times the daily quota, with a significantly more powerful model.
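
To make the tier limits concrete, here is a minimal Python sketch of how the two service classes could be expressed as configuration and enforced per tenant. It is an illustration under stated assumptions, not the AgentCore implementation: the names `TierLimits`, `TIERS`, and `TenantUsage` are hypothetical, the model strings are the display names quoted above rather than real model identifiers, and the counters live in process memory instead of a shared store.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TierLimits:
    model: str                # display name from the article, not a real model ID
    requests_per_second: int
    requests_per_day: int


# Figures taken from the tier description above.
TIERS = {
    "basic": TierLimits("Mistral 3 8B Instruct", 2, 50),
    "premium": TierLimits("OpenAI GPT OSS 120B", 10, 500),
}


@dataclass
class TenantUsage:
    """Tracks one tenant's requests against its tier limits (in memory only)."""
    limits: TierLimits
    recent: deque = field(default_factory=deque)  # timestamps from the last second
    day: int = -1                                 # index of the current 24-hour window
    used_today: int = 0

    def allow(self, now: float) -> bool:
        today = int(now // 86_400)
        if today != self.day:                     # new day: reset the daily counter
            self.day, self.used_today = today, 0
        while self.recent and now - self.recent[0] >= 1.0:
            self.recent.popleft()                 # drop timestamps older than one second
        if self.used_today >= self.limits.requests_per_day:
            return False
        if len(self.recent) >= self.limits.requests_per_second:
            return False
        self.recent.append(now)
        self.used_today += 1
        return True
```

A caller would keep one `TenantUsage` per tenant and pass the current time, for example `usage.allow(time.time())`, before invoking the model. Passing the clock in explicitly keeps the sketch deterministic and easy to test.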

Mechanisms ensuring strong isolation

Tool boundaries per tier are defined through Cedar authorization policies — a declarative language describing what each tier is allowed to do, without hard-coded logic in application code.
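
The idea of declarative tool boundaries can be sketched in a few lines of Python. This is not Cedar syntax and not the policy set AWS ships: it is a stand-in that mirrors two evaluation rules Cedar is known for (requests are denied unless some policy permits them, and a matching `forbid` overrides any `permit`). The tool names and the `Policy` and `is_authorized` helpers are invented for illustration.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Policy:
    effect: Literal["permit", "forbid"]
    tier: str   # principal attribute the rule applies to
    tool: str   # resource the rule applies to; "*" matches any tool


# Hypothetical tier-to-tool rules, kept as data rather than application logic.
POLICIES = [
    Policy("permit", "basic", "search_docs"),
    Policy("permit", "premium", "*"),
    Policy("forbid", "premium", "delete_records"),
]


def is_authorized(tier: str, tool: str, policies=POLICIES) -> bool:
    """Default deny; any matching forbid overrides every matching permit."""
    matching = [p for p in policies if p.tier == tier and p.tool in (tool, "*")]
    if any(p.effect == "forbid" for p in matching):
        return False
    return any(p.effect == "permit" for p in matching)
```

Because the rules are plain data, changing what a tier may do means editing the policy list, not the agent code, which is the property the declarative approach is meant to provide.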

For memory isolation the system uses a Token Vending Machine (TVM) combined with an ABAC model (Attribute-Based Access Control). The TVM issues short-lived tokens with embedded tenant attributes, so the memory layer automatically knows which data each tenant may access.
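
The following Python sketch shows the shape of that flow: a vending function issues a short-lived token carrying tenant attributes, and the memory layer compares those attributes with the attributes on each record. It rests on several assumptions: the token is a simple HMAC-signed payload instead of real cloud credentials, the signing key is hard-coded for demonstration only, and `issue_token`, `verify_token`, and `read_memory` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-only-signing-key"  # assumption: a real TVM would never hard-code a key


def issue_token(tenant_id: str, tier: str, now: float, ttl_seconds: int = 300) -> str:
    """Issue a short-lived token whose claims carry the tenant attributes."""
    claims = {"tenant_id": tenant_id, "tier": tier, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_token(token: str, now: float) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("invalid signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if now >= claims["exp"]:
        raise PermissionError("token expired")
    return claims


def read_memory(store: dict, token: str, record_id: str, now: float) -> str:
    """ABAC check: the record's tenant attribute must match the token's."""
    claims = verify_token(token, now)
    record = store[record_id]
    if record["tenant_id"] != claims["tenant_id"]:
        raise PermissionError("cross-tenant access denied")
    return record["text"]
```

The access decision depends only on attributes (the tenant on the token versus the tenant on the record), so onboarding a new tenant requires no new rule, and an expired or tampered token fails before any data is read.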

The third key element is the OpenTelemetry baggage mechanism, which propagates tenant metadata — tenant identifier, tier level, permission scope — throughout the entire request lifecycle, from the incoming API call to the agent response. In this way every microservice in the chain knows the context without additional database calls.
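
On the wire, baggage travels between services as a `baggage` HTTP header holding comma-separated `key=value` members with percent-encoded values, as defined by the W3C Baggage specification. The standard-library sketch below only illustrates that shape; a real service would rely on the OpenTelemetry SDK's baggage API and propagators instead. The key names `tenant.id`, `tenant.tier`, and `tenant.scope` are assumptions, and optional member properties (the part after `;`) are not handled.

```python
from urllib.parse import quote, unquote


def inject_baggage(headers: dict, entries: dict) -> dict:
    """Return a copy of headers with entries serialized into a `baggage` header."""
    members = [f"{key}={quote(str(value), safe='')}" for key, value in entries.items()]
    return {**headers, "baggage": ",".join(members)}


def extract_baggage(headers: dict) -> dict:
    """Parse the `baggage` header back into a dict; a missing header gives {}."""
    entries = {}
    for member in headers.get("baggage", "").split(","):
        key, sep, value = member.strip().partition("=")
        if sep:  # skip empty or malformed members
            entries[key] = unquote(value)
    return entries


# Hypothetical tenant context attached at the API entry point.
tenant_context = {
    "tenant.id": "hospital-a",
    "tenant.tier": "premium",
    "tenant.scope": "records:read notes:write",
}
outgoing = inject_baggage({"content-type": "application/json"}, tenant_context)
```

A downstream service calls `extract_baggage` on the incoming headers and recovers the same tenant context without querying a database, then forwards it unchanged on its own outgoing calls.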

Reference example: healthcare platform

AWS describes a healthcare SaaS as the primary example: hospitals (tenants) share the same AI agents for processing medical data, but Cedar policies guarantee that patient records of one institution are never accessible to another — even within the same agent call.

AWS presents this pattern as a production baseline for multi-tenant AI agents. It replaces the ad-hoc approach in which each tenant receives its own isolated instance, so that infrastructure costs multiply with every new tenant.

Frequently Asked Questions

What is multi-tenancy and why does it matter for AI agents?
Multi-tenancy is an architecture in which multiple independent customers (tenants) share the same infrastructure while their data and permissions remain strictly isolated. This is essential for SaaS platforms that want to reduce costs without compromising privacy.

How does AgentCore ensure isolation between tenants?
Through a combination of Cedar authorization policies for per-tier tool boundaries, a Token Vending Machine with an ABAC model for memory isolation, and an OpenTelemetry baggage mechanism that propagates tenant metadata throughout the entire request lifecycle.