Agents Are the Architecture — Three Things Google Doesn't Tell You That SMEs Must Fix First

Google Cloud Next 2026 plastered "Agents are the architecture now" across every slide. Sounds great — until you point an agent at production data. Three foundations SMEs need to fix first: inbound auth with short-lived JWTs, end-to-end tracing with cost attribution, fine-grained policies. From our live pipeline.

Vittorio Emmermann, CEO of cierra, building AI systems that actually work.

June 5, 2026 · 9 min read

Google Cloud Next 2026 dropped a keynote slide last week that's been on every other LinkedIn profile since: "Agents are the architecture now." Sounds visionary. We're hearing the line from clients too, ever since AWS and Microsoft followed up with identical slides. The problem: an SME that buys into this and points an agent at production data without nailing three foundations first is flying blind. We're watching it play out live, because we've built exactly these foundations with several clients. Here are the three things no keynote mentions.

What Google says — and what it leaves out

The big cloud providers are aligned in 2026: agentic AI is the next architectural step. Instead of monolithic apps, you build pipelines of specialised agents that autonomously use tools, dispatch sub-tasks, and aggregate results. Vertex, Bedrock and Foundry deliver the runtime, the identity backbone, and the marketplace.

That's not wrong. It's just incomplete.

What the keynotes leave out: the moment your agent first touches real data — writes an invoice, comments on a ticket, reads a contract — you switch from prototype mode to compliance mode. And different rules apply. You need answers to three questions that no demo ever asks:

Who just invoked this agent, and with what permission?
What has this agent actually done in the last 24 hours, and what did it cost?
Which tools is this agent allowed to use in which context — and which not?

We call these the three foundations. Without them, "agents are the architecture" is a slide. With them, it's a production environment.

Foundation 1: Inbound auth with short-lived JWTs and scopes

The most common architecture we see in the wild in 2026 looks like this: an agent runs as a cloud service. Whoever calls it sends an API key in the header. The API key has been unchanged for twelve months, lives in three different repos and one Slack channel. Whoever has the key is "the agent" — whether it's the marketing team, an external consultant, or the office intern.

That works for a prototype. For production, it's a time bomb. What you need instead:

Short-lived tokens. Every call against your agent comes with a JWT that's valid for at most an hour. Ideally issued by the same identity provider that authenticates your employees.
Explicit scopes. The token declares what the caller may do: read-only, only project X, only certain actions. The agent checks scope before doing anything.
Audit per call. Who called which agent with which scope and when — that belongs in a central log, not in a CloudWatch corner nobody reads.
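
The audit requirement above could be met with one structured record per call. This is a minimal sketch, not our production logger: the `AuditRecord` fields and the `audit_line` helper are hypothetical names, and shipping the resulting line to a central log store is left out.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    caller: str       # who invoked the agent (token subject)
    agent: str        # which agent was invoked
    scope: str        # scope the caller presented
    decision: str     # "allow" or "deny"
    timestamp: float  # Unix time of the call


def audit_line(caller: str, agent: str, scope: str, allowed: bool,
               now: Optional[float] = None) -> str:
    """Serialise one call as a single JSON line, ready for a central log."""
    record = AuditRecord(
        caller=caller,
        agent=agent,
        scope=scope,
        decision="allow" if allowed else "deny",
        timestamp=time.time() if now is None else now,
    )
    return json.dumps(asdict(record), sort_keys=True)


print(audit_line("ticket-webhook", "triage-agent", "agents/invoke", True))
```

Writing denied calls as well as allowed ones is a deliberate choice here: a leaked or misused credential usually shows up first as a run of denials.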

Concretely: we use a Cognito user pool for the identity layer, which issues short-lived JWTs via the OAuth 2.0 client-credentials flow. Every one of our agents requires a valid token, checks the scope (`agents/invoke`), and only then routes the call into business logic. Sounds like vanilla OAuth, and that's exactly what it is. But we see clients still using static bearer tokens after three months of agent operation. If one of those leaks, it's not one account that's compromised; it's the entire agent pipeline.
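
A minimal sketch of the token check described above, using only the Python standard library. To stay self-contained it signs with a shared secret (HS256); a Cognito setup would instead verify RS256 signatures against the user pool's published keys, ideally with a maintained JWT library. `issue_token` is a hypothetical stand-in for the identity provider, and the function names, the one-hour `max_ttl` default and the `ticket-webhook` caller are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def issue_token(secret: bytes, client_id: str, scope: str, ttl_seconds: int = 3600) -> str:
    """Hypothetical stand-in for the identity provider, for local testing only."""
    now = int(time.time())
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": client_id, "scope": scope, "iat": now, "exp": now + ttl_seconds}
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def authorize(token: str, secret: bytes, required_scope: str, max_ttl: int = 3600) -> dict:
    """Return the claims if the token is authentic, unexpired, short-lived and in scope."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:  # wrong segment count, bad base64 or bad JSON
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")
    if header.get("alg") != "HS256":  # never trust the token to pick the algorithm
        raise PermissionError("unexpected algorithm")

    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("bad signature")

    exp, iat = claims.get("exp"), claims.get("iat")
    if not isinstance(exp, (int, float)) or not isinstance(iat, (int, float)):
        raise PermissionError("missing exp or iat claim")
    if exp <= time.time():
        raise PermissionError("token expired")
    if exp - iat > max_ttl:
        raise PermissionError("token lifetime exceeds policy")
    # Scopes arrive as one space-separated string, as in OAuth 2.0 access tokens.
    if required_scope not in str(claims.get("scope", "")).split():
        raise PermissionError("missing scope")
    return claims


token = issue_token(b"demo-secret", "ticket-webhook", "agents/invoke")
claims = authorize(token, b"demo-secret", "agents/invoke")
print(claims["sub"], "may invoke the agent")
```

Rejecting tokens whose lifetime exceeds `max_ttl` enforces the "at most an hour" rule on the agent side too, so a misconfigured issuer cannot quietly hand out long-lived credentials.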

Foundation 2: End-to-end tracing with cost and token attribution

Observability for agents is not a logging problem. It's a tracing problem. The difference: logs tell you that something happened. Traces tell you why, in what order, triggered by whom, at what cost.

What an agent does in practice is never a single LLM call. A typical workflow:

Inbound call from a SaaS webhook (say, a new ticket in your project management tool).
Triage agent classifies the ticket — one LLM call, maybe 800 input tokens, 200 output tokens.
Triage agent calls the CRM tool to load customer context — two tool calls, maybe 1,500 additional tokens.
Triage agent escalates to a specialist agent — which makes two more LLM calls of its own.
The final answer gets posted as a ticket comment — another tool call.

That's one ticket. Multiply by a hundred tickets a day and you're at roughly five hundred LLM calls, three hundred tool calls, and a six-digit daily token count: the triage steps above alone come to about 2,500 tokens per ticket, or 250,000 a day. If you don't trace this chain end-to-end, you can't answer three critical questions:

Which agent causes 80 percent of my Bedrock bill?
Which step in which pipeline hallucinates most often?
When a customer complains: what did the agent actually tell them, and on what basis?

Concretely, we work with a single tracing platform that captures every LLM call, every tool call, every agent hop. We tag each trace with project, customer, agent version, and model identifier. That lets us see not just that something got expensive — we see which agent version, in which client project, on which weekday. It's the foundation for eval pipelines, cost control, and the honest pricing conversation with the client.
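
To make the tagging idea concrete, here is a minimal in-memory sketch of spans that carry tags and roll up cost per tag. It is not the tracing platform we use: the `Tracer` and `Span` classes, the `example-model` identifier and the per-token prices are all hypothetical, and a real setup would export spans to a tracing backend instead of keeping them in a list.

```python
import time
import uuid
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; real prices depend on model and provider.
PRICE_PER_1K = {"example-model": {"input": 0.003, "output": 0.015}}


@dataclass
class Span:
    trace_id: str
    name: str
    kind: str  # "llm", "tool" or "agent"
    tags: dict
    input_tokens: int = 0
    output_tokens: int = 0
    duration_s: float = 0.0

    @property
    def cost_usd(self) -> float:
        price = PRICE_PER_1K.get(self.tags.get("model"), {"input": 0.0, "output": 0.0})
        return (self.input_tokens * price["input"] + self.output_tokens * price["output"]) / 1000


class Tracer:
    def __init__(self) -> None:
        self.spans = []

    @contextmanager
    def span(self, trace_id: str, name: str, kind: str, **tags):
        record = Span(trace_id, name, kind, tags)
        start = time.perf_counter()
        try:
            yield record
        finally:
            # Record the span even if the wrapped step raised.
            record.duration_s = time.perf_counter() - start
            self.spans.append(record)

    def cost_by(self, tag: str) -> dict:
        """Sum span cost per value of one tag, e.g. per agent or per project."""
        totals = defaultdict(float)
        for span in self.spans:
            totals[span.tags.get(tag, "untagged")] += span.cost_usd
        return dict(totals)


tracer = Tracer()
trace_id = str(uuid.uuid4())  # one id ties every hop of a ticket together
tags = {"project": "client-a", "customer": "acme",
        "agent_version": "triage-1.4", "model": "example-model"}

with tracer.span(trace_id, "classify_ticket", "llm", agent="triage", **tags) as s:
    s.input_tokens, s.output_tokens = 800, 200  # figures from the workflow above
with tracer.span(trace_id, "crm_lookup", "tool", agent="triage", **tags):
    pass  # tool calls carry no token cost of their own in this sketch

print(tracer.cost_by("agent"))
print(tracer.cost_by("project"))
```

Because every span shares the same tag set, the question "which agent causes most of the bill" becomes a single group-by rather than a log search.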

Foundation 3: Fine-grained policies — "tool X in project Y"

Most agent frameworks have a binary permission model: an agent either has access to a tool or it doesn't. That's too coarse for 2026. What you actually need in practice are statements like:

"The support agent can read and comment on tickets in project A, read-only in project B, not at all in project C."
"The DevOps agent can push to repository X, but only on branches starting with `agent/`."
"The verifier agent can read the CRM but must not return personally identifiable fields."

This is not a theoretical exercise. The moment you run multiple agents across multiple client projects, you need this granularity. Otherwise you have exactly two options: too restrictive (agent does nothing, is useless) or too permissive (agent does anything, and the first slip-up costs you a customer).

We use a policy engine that expresses rules as declarative statements, separate from agent code. That means we can change a policy without redeploying the agent. Policy changes get their own audit trail, code review, and rollback, independent of agent releases.
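
A minimal sketch of what "rules as data, separate from agent code" can look like. This is not the policy engine we run: the JSON rule format, the field names and the `is_allowed` function are assumptions for illustration. Anything not explicitly allowed is denied, and field-level rules such as hiding personally identifiable fields are not covered here.

```python
import json

# Rules live in data (here a JSON string; in practice a reviewed file or a
# policy store), so they can change without touching agent code.
POLICY_JSON = """
[
  {"agent": "support", "tool": "tickets", "resource": "project-a",
   "actions": ["read", "comment"]},
  {"agent": "support", "tool": "tickets", "resource": "project-b",
   "actions": ["read"]},
  {"agent": "devops", "tool": "git", "resource": "repo-x",
   "actions": ["push"], "condition": {"branch_prefix": "agent/"}}
]
"""


def is_allowed(rules: list, agent: str, tool: str, resource: str,
               action: str, **context) -> bool:
    """Default deny: return True only if some rule explicitly permits the call."""
    for rule in rules:
        if (rule["agent"], rule["tool"], rule["resource"]) != (agent, tool, resource):
            continue
        if action not in rule["actions"]:
            continue
        prefix = rule.get("condition", {}).get("branch_prefix")
        if prefix is not None and not str(context.get("branch", "")).startswith(prefix):
            continue
        return True
    return False


rules = json.loads(POLICY_JSON)

print(is_allowed(rules, "support", "tickets", "project-a", "comment"))
print(is_allowed(rules, "support", "tickets", "project-b", "comment"))
print(is_allowed(rules, "support", "tickets", "project-c", "read"))
print(is_allowed(rules, "devops", "git", "repo-x", "push", branch="agent/fix-typo"))
print(is_allowed(rules, "devops", "git", "repo-x", "push", branch="main"))
```

Project C needs no rule of its own: with default deny, the absence of a rule is the "not at all" case.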

Practical side effect: when a client asks what our agent is allowed to do in their environment and what not, we don't show them a marketing slide. We show them an auditable ruleset. Compliance officers relax a lot faster looking at that than at any demo.

What this means for SMEs

There are two ways to start with agents in 2026:

The hype path. You book Vertex or Bedrock, build a demo agent, show it to leadership, get applause. You integrate it into your CRM, your ticketing system, your accounting. Six weeks later you have no overview of who's calling what, what it costs, or what the agent is allowed to do in which context. The first incident is months away, not years.
The architecture path. You start with the same tools, but build the three foundations first: inbound auth, tracing, policies. Only then do you let the first agent near production data. You'll be two weeks slower — and six months later considerably faster, because you don't have to clean up a single incident.

We've seen both with clients. The difference isn't the model, the framework, or the cloud. It's these three foundations.

Three steps for the next 30 days

If your company is putting its first agents into production in 2026, do these three things before any agent sees real data:

Settle the identity question first. Which identity provider issues the tokens? How long do they live? Which scopes do you need? That's an architecture decision, not an implementation question.
Pick your tracing platform before building the second agent. In our experience, retrofitting tracing costs about three times as much as building it in. It doesn't matter which platform; just pick one, and make sure it does cost attribution.
Write your first policy before you deploy your first agent. Even if it starts trivial ("this agent may only use tool X in project Y"). The first sentence in the policy document matters more than the hundredth prompt tweak.

Bottom line

"Agents are the architecture now" works as a slogan. As a roadmap it's dangerous, because it suggests the platforms handle the rest. They don't.

Identity, observability, and policy are not detail topics you bolt on after the first win. They're the foundations without which any agent architecture collapses in its first production weeks — or, worse, quietly erodes trust.

We're building this live for our clients, and we're talking about it honestly, because the keynote slides aren't.

Tags: AI, AI Agents, Enterprise AI, Mittelstand, Business, Behind the Scenes
