Most cloud security architectures do not fail because the controls don't exist. They fail because decisions were never made deliberately — they accumulated, organically, over time, until a quiet structural problem became permanent.

In two decades of reviewing enterprise cloud security architectures across AWS, Azure, and Microsoft 365, the same patterns recur. Not because the engineers were careless — they almost never were — but because the architecture grew faster than the architectural authority around it. Decisions got made under pressure, by people without the context to see how they would compound, and once a structural choice is six months old in a fast-moving estate, the cost of reversing it has usually become political rather than technical.

What follows is a working list. Not a CIS benchmark, not a Well-Architected pillar, not a SOC 2 control reference. Patterns we have seen, in regulated and unregulated estates alike, repeatedly enough to be confident they generalise. Each comes with the symptom, the underlying cause, and the architectural posture that prevents it.

01
Landing Zone

The shared-everything landing zone.

A single AWS account, Azure subscription, or M365 tenant holds production workloads, non-production workloads, shared tooling, sandbox experiments, and the security team's own logging — all sharing the same identity boundary, the same network space, and the same blast radius.

The organisation moved fast in early cloud adoption. One account was simpler than ten. Then production launched, then the security team needed somewhere to put detection tooling, then a sandbox got promoted to "kind of production," and nobody ever stopped to split it apart.

A deliberate multi-account, multi-subscription, or multi-tenant topology with clear separation between production, non-production, shared services, and security operations — managed through AWS Organizations, Azure management groups, or, within a single M365 tenant, Entra administrative units. The boundary is policy-defined, not vibes-defined. New workloads inherit security posture from the structure, not from whoever last touched the resource.
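
A policy-defined boundary can be made concrete as a guardrail attached at the organisational level, so every account created under it inherits the restriction. The sketch below builds an AWS-style service control policy as plain data; the approved-region list and the exempted global services are illustrative assumptions, not a recommendation.

```python
# Sketch: a policy-defined boundary expressed as an AWS-style service
# control policy (SCP). Attached at the OU level, every new account
# under that OU inherits the restriction automatically.

APPROVED_REGIONS = ["eu-west-1", "eu-west-2"]  # hypothetical approved regions

def region_guardrail_scp(approved_regions):
    """Deny all actions outside approved regions, except for global
    services that have no regional endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # Global services exempted from the region condition:
                "NotAction": ["iam:*", "organizations:*",
                              "route53:*", "cloudfront:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": approved_regions}
                },
            }
        ],
    }

scp = region_guardrail_scp(APPROVED_REGIONS)
```

The point is not the region list itself but where the rule lives: in the structure, not in any individual account.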

02
Identity

Privileged identity sprawl.

Hundreds of IAM users, service accounts, application registrations, and break-glass roles accumulated over years. Nobody can credibly say which ones are still needed. Half of them were created for projects that ended in 2022. Some have access keys that were issued before the current CISO was hired.

Identity creation is cheap. Identity revocation is political. Every account was originally legitimate; nobody owns the question of when it should stop being legitimate. There is no joiner-mover-leaver process for non-human identities.

A documented identity model that distinguishes human, workload, and service principals, with explicit ownership and lifecycle for each. Just-in-time elevation for human privileged access. Managed identities and federated workload credentials where the platform supports them. A standing review cadence — quarterly, not "when an auditor asks" — that retires what is no longer needed.
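
The standing review is mechanical once the inventory exists. A minimal sketch, assuming an identity inventory with owner and last-used fields (the field names, identities, and 90-day threshold are invented for illustration, not a platform API):

```python
from datetime import date, timedelta

# Sketch of the standing review cadence: flag any non-human identity
# with no owner of record, or unused beyond a cutoff. Threshold and
# field names are assumptions for illustration.

STALE_AFTER = timedelta(days=90)

def review(identities, today):
    findings = []
    for ident in identities:
        if ident.get("owner") is None:
            findings.append((ident["name"], "no owner of record"))
        last_used = ident.get("last_used")
        if last_used is None or today - last_used > STALE_AFTER:
            findings.append((ident["name"], "unused beyond cutoff"))
    return findings

inventory = [
    {"name": "svc-etl", "owner": "data-team", "last_used": date(2025, 6, 1)},
    {"name": "svc-legacy-2022", "owner": None, "last_used": date(2022, 3, 9)},
]
findings = review(inventory, today=date(2025, 6, 15))
```

The hard part is not the query; it is ensuring the owner and last-used data exist to query at all.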

03
Identity

The conditional access maze.

Entra ID has forty conditional access policies, each added in response to a specific incident or audit finding, layered on top of each other for years. Nobody understands the precedence. MFA prompts fire inconsistently across the user population. Exceptions exist that nobody remembers granting. A senior executive has a permanent break-glass exemption that has been there for eighteen months.

Conditional access policies are easy to add and emotionally hard to remove — each one was justified at the moment of creation. There is no architectural model holding the whole thing together, just an archaeology of past decisions.

A small, principled set of policies organised around persona-based access patterns — privileged user, standard user, external collaborator, service identity — with explicit precedence and a documented design intent for each. Policies are reviewed as a coherent system, not individually. Exceptions are time-bound by default. The whole thing fits on one architectural diagram.
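
"Time-bound by default" is enforceable only if every exception carries an expiry. A sketch of the review aid, with invented personas, policy names, and users (this models the design intent, not the Entra API):

```python
from datetime import date

# Sketch: a persona-based policy set, plus exceptions that must carry
# an expiry date. Lapsed exceptions are surfaced rather than quietly
# surviving. All names are illustrative.

POLICIES = [
    {"persona": "privileged user", "name": "require-phishing-resistant-mfa"},
    {"persona": "standard user", "name": "require-mfa"},
    {"persona": "external collaborator", "name": "require-mfa-and-terms"},
]

EXCEPTIONS = [
    {"user": "exec-01", "policy": "require-mfa", "expires": date(2024, 12, 1)},
    {"user": "contractor-7", "policy": "require-mfa", "expires": date(2026, 1, 1)},
]

def lapsed_exceptions(exceptions, today):
    """Return every exception whose expiry has already passed."""
    return [e for e in exceptions if e["expires"] < today]

overdue = lapsed_exceptions(EXCEPTIONS, today=date(2025, 6, 1))
```

An exception without an `expires` field should fail validation at creation time, which is what makes the eighteen-month executive exemption structurally impossible rather than merely discouraged.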

04
Network

The permissive default network.

Every workload in the estate can reach every other workload. East-west traffic is unrestricted because "we're behind the perimeter." Egress goes to anywhere on the internet because outbound filtering was deferred. "We'll segment later" turned into "we never segmented."

Segmentation is annoying to design and even more annoying to retrofit. Permissive defaults let workloads ship on time. The cost of the open network is invisible until an incident happens — and by then it is too expensive to fix in a hurry.

Workload segmentation designed before the workloads exist. Default-deny security groups, NSGs, or equivalent, with explicit allow lists. Egress controls that govern which inference endpoints, package registries, and external APIs the estate can reach. Private connectivity for everything that does not strictly need to be public. Microsegmentation where the regulatory or threat model justifies it.
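
The default-deny posture reduces to a simple evaluation model: a flow is reachable only if an explicit allow rule covers it. A sketch, with invented segment names and ports:

```python
# Sketch of default-deny segmentation: reachability exists only where
# an explicit allow rule permits the flow. Segments and ports are
# illustrative, not a reference ruleset.

ALLOW = {
    ("web", "app", 8443),   # front end to application tier
    ("app", "db", 5432),    # application tier to database
}

def allowed(src, dst, port):
    """Default deny: a flow is permitted only by an explicit rule."""
    return (src, dst, port) in ALLOW
```

Note the asymmetry with the anti-pattern: here the web tier cannot reach the database at all, because nobody wrote a rule saying it could.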

05
Access

The bastion that forgot why it existed.

A jump host, RDP gateway, or "temporary" VPN endpoint that someone stood up in year one of cloud adoption. Three years later it is the de facto admin path into production. Nobody remembers who owns it. It probably runs an unpatched OS. It almost certainly has standing credentials.

The bastion was the lowest-friction answer to "we need to administer this thing." Once it worked, it became invisible. Replacing it requires admitting it exists, which requires admitting it is risky, which requires somebody to own the replacement.

Modern privileged access patterns — AWS Systems Manager Session Manager, Azure Bastion, just-in-time access via PIM, identity-aware proxies — that eliminate persistent admin endpoints altogether. Every administrative session is identity-driven, logged, time-bound, and replayable. The phrase "jump host" should be rare and slightly embarrassing.
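
The property that replaces the bastion is not a product but an invariant: every administrative session carries an identity, a reason, and an expiry, and validity is checked on every use. A minimal sketch of that invariant (field names and the one-hour cap are assumptions for illustration):

```python
from datetime import datetime, timedelta

# Sketch: an identity-driven, time-bound admin session grant. There is
# no standing credential; validity is a function of the clock.

MAX_SESSION = timedelta(hours=1)  # assumed cap, policy-dependent

def grant(identity, reason, issued_at):
    """Issue a session grant that records who, why, and until when."""
    return {"identity": identity, "reason": reason,
            "issued_at": issued_at,
            "expires_at": issued_at + MAX_SESSION}

def is_valid(session, now):
    return session["issued_at"] <= now < session["expires_at"]

s = grant("alice", "ticket-4812: db maintenance", datetime(2025, 6, 1, 9, 0))
```

The grant record itself is the audit trail; the jump host offered neither the identity nor the expiry.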

06
Governance

Tagging theatre.

Tagging policies exist on paper. Resources have tags. The tags are inconsistent, half-empty, mis-spelled, or applied by different people meaning different things. Cost attribution still falls back to "ask the FinOps team." Security ownership still falls back to "ask Steve." Incident response still wastes thirty minutes figuring out who runs the affected workload.

Tagging looks like an administrative concern, not a security one, so it never gets architectural attention. It is enforced at the wrong layer — usually a wiki page — rather than at the policy layer where it could actually compel compliance.

A small, mandatory tag schema — owner, cost-centre, environment, data-classification, criticality — enforced through Azure Policy, AWS Service Control Policies, or equivalent. Untagged resources are blocked at creation, not corrected after the fact. Tags drive automation: incident routing, cost reporting, access policy, retention. They are infrastructure, not metadata.
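
Enforcement at creation means a simple validation gate: a resource missing any mandatory tag is rejected, not queued for later cleanup. A sketch of that gate, with tag names taken from the schema above and illustrative values:

```python
# Sketch: the mandatory tag schema as a creation-time gate. A resource
# with missing or empty mandatory tags is blocked, not corrected later.

REQUIRED_TAGS = {"owner", "cost-centre", "environment",
                 "data-classification", "criticality"}

def validate_tags(tags):
    """Return the set of missing or empty mandatory tags."""
    return {t for t in REQUIRED_TAGS if not tags.get(t)}

compliant = {"owner": "payments-team", "cost-centre": "cc-412",
             "environment": "prod", "data-classification": "confidential",
             "criticality": "high"}
missing = validate_tags({"owner": "payments-team", "environment": "prod"})
```

In practice this check lives in Azure Policy or an SCP condition rather than application code; the sketch only shows the shape of the rule.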

07
Assurance

Compliance-by-screenshot.

An auditor asks for evidence that a control is operating. Someone opens the Defender for Cloud or Security Hub dashboard, screenshots the green checkmark, pastes it into a Word document, and uploads it to a SharePoint folder named "Audit 2026." Six months later, nobody can prove the control was operating during the period in question.

Audit cycles are infrequent and painful, so they get treated as projects rather than as outputs of a continuous process. The team optimises for surviving the next audit, not for being continuously auditable. Evidence collection is manual because building it into the pipeline felt like over-engineering.

Continuous evidence collection — exporting control state from Defender, Security Hub, or equivalent into an immutable evidence store, automatically, on a defined cadence. Compliance reporting becomes a query against the evidence store, not a screenshot exercise. The auditor's questions become trivial because the answers were already being collected.
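
One way to make the evidence store tamper-evident is a hash chain: each exported control-state record is appended with a hash over the previous entry, so a later query can show both what the control state was and that the record was not altered. A sketch, with invented control IDs and export format:

```python
import hashlib
import json

# Sketch: an append-only, hash-chained evidence log. Each entry's hash
# covers the previous hash plus its own record, so tampering with any
# past record breaks the chain. Record fields are illustrative.

def append_evidence(log, record):
    prev = log[-1]["hash"] if log else "genesis"
    body = json.dumps(record, sort_keys=True)
    entry = {"record": record,
             "hash": hashlib.sha256((prev + body).encode()).hexdigest()}
    log.append(entry)
    return log

def state_during(log, control, up_to):
    """Answer the auditor's question as a query, not a screenshot."""
    return [e["record"] for e in log
            if e["record"]["control"] == control
            and e["record"]["collected"] <= up_to]  # ISO dates sort lexically

log = []
append_evidence(log, {"control": "mfa-enforced", "state": "pass",
                      "collected": "2025-06-01"})
append_evidence(log, {"control": "mfa-enforced", "state": "pass",
                      "collected": "2025-06-08"})
```

"Was this control operating in June?" becomes `state_during(log, "mfa-enforced", "2025-06-30")`, with the chain proving the records predate the question.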

A closing note

Why these matter more now.

Each of these anti-patterns is survivable when the workloads they affect are conventional cloud applications. The shared-everything landing zone holds a few CRUD apps; the permissive network has some east-west traffic between microservices; the bastion provides admin access to a database.

When AI workloads enter the same estate, the cost of each anti-pattern compounds. Models call tools. Agents traverse identity boundaries. RAG pipelines pull from data stores that the existing segmentation never anticipated. Prompt injection finds reachability that nobody knew was there. The shared-everything account that was tolerable for a CRM workload becomes architecturally indefensible the moment an autonomous agent runs inside it.

The cloud foundation does not have to be perfect before AI adoption. It does have to be deliberate. Each of these patterns can be addressed in weeks, not years — but only if someone owns the architectural decision to address them, and someone owns the design of what replaces them.