Home
News
Hosting
Cloudflare's Outage Shows Why One Script Can Still Break the Internet

Hosting 24 February 2026 9 min read

Cloudflare's Outage Shows Why One Script Can Still Break the Internet

A cleanup script with an empty API filter deleted 1,100 BYOIP prefixes from Cloudflare's global routing table on 20 February. It's the third time in four weeks that a major cloud provider has had automation destroy production infrastructure.

Mark McNeece Founder & Managing Director, 365i

Close-up of data centre network cabling with dramatic lighting, overlaid with bold text reading Cloudflare Outage

On 20 February 2026, a routine cleanup script at Cloudflare went wrong in the worst possible way. An empty string in an API filter caused the system to select every BYOIP (Bring Your Own IP) prefix in the account, not just the pending ones marked for removal. Within seconds, 1,100 of the company's 4,306 customer IP prefixes were withdrawn from the global routing table. BGP routes vanished. Traffic stopped. Services including Uber Eats, Bet365, Wikipedia, Workday, Minecraft, and Microsoft Outlook went dark.

The outage lasted 6 hours and 7 minutes. For a company that processes requests for roughly 21% of all websites on the internet, that's not a blip. It's a lesson.

And it's the third time in a single month that a major cloud provider has had automation delete production infrastructure.

What Actually Happened at Cloudflare

Cloudflare's BYOIP service lets customers announce their own IP address ranges through Cloudflare's network. Businesses that own IP space use this to get Cloudflare's DDoS protection and CDN performance without changing their addresses. (Cloudflare's own 2026 threat report shows DDoS attacks doubled to 47.1 million last year, which explains why that protection matters.)

The cleanup script was supposed to remove only prefixes in a "pending deletion" state. But the API query used to filter those prefixes contained an empty string, which matched everything. According to Cloudflare's post-mortem, the script ran with production credentials that had full write access to the BYOIP system. No dry-run. No confirmation step. No scope limit on how many prefixes could be deleted in a single operation.

The result: 25% of all BYOIP prefixes were withdrawn from BGP in one go. Customer traffic that relied on those IP ranges had nowhere to go.

Cloudflare's engineering team acknowledged the failure directly. "We let you down today," the post-mortem reads, before detailing the root cause and the remediations now in place.

Vector illustration of a deployment pipeline with stages from code to deploy, showing a red failure point at the automated cleanup stage with blast radius expanding outward — When an automated cleanup script has no scope limit, the blast radius is the entire system.

Three Providers, Three Automation Failures, One Month

Cloudflare's outage didn't happen in isolation. February 2026 has been a brutal month for cloud reliability, and the pattern is impossible to ignore.

Azure, 2-3 February. Microsoft's health monitoring system detected an issue in its backend and triggered an automated remediation. That remediation made the problem worse. The health checker detected the worsened state and triggered more remediation. This feedback loop cascaded across Azure regions for over 24 hours. Microsoft 365 services, including Teams and Outlook, were disrupted for businesses worldwide.

AWS, 16 February. Amazon's AI coding tool Kiro, running with overly broad permissions, autonomously deleted and rebuilt a live AWS environment. The incident, first reported by the Financial Times, took down AWS Cost Explorer in China for 13 hours. Amazon disputes the "AI autonomy" framing, but the outcome is the same: automated tooling destroyed production infrastructure because permissions weren't scoped correctly.

Cloudflare, 20 February. An empty string in an API query caused a cleanup script to delete 1,100 BYOIP prefixes, knocking out traffic for 25% of BYOIP customers for over six hours.

Three separate providers. Three separate root causes. But the underlying failure is identical in each case: an automated process with too much access, no safety check, and no limit on how much damage it could do in a single operation.

Illustration of three damaged cloud server towers with February 2026 calendars, each showing a different automation failure icon for Azure, AWS, and Cloudflare — Three major cloud providers hit by automation failures in a single month. The pattern is clear.

The Root Cause Behind the Root Cause

Every post-mortem focuses on the technical trigger. The empty string. The feedback loop. The misconfigured permissions. But the real problem sits one level deeper: automation runbooks are being deployed without blast-radius controls.

Research from CyberSecurity News cites industry data suggesting roughly a third of cloud outages now involve some form of automated remediation or maintenance script going wrong. That number has been climbing as providers push more operational tasks from human-led to fully automated.

The logic makes sense: automation reduces human error and speeds up routine tasks. But when the automation itself fails, it fails at machine speed. A human technician withdrawing BGP routes might do ten, notice something odd, and stop. A script with production credentials does 1,100 before anyone can react.

"When a digital service depends entirely on a single cloud platform, that single point of failure extends to everything built on top of it. The February incidents are a strong signal that businesses need to think harder about redundancy."
Mayur Upadhyaya, CEO of APIContext

That's the quiet part said out loud. The cloud was supposed to eliminate single points of failure. Instead, concentration has created new ones. Cloudflare handles traffic for over 41 million websites. When its automation misfires, the blast radius isn't one company or one region. It's a chunk of the internet.

What This Means for Your Business

If you run a website, you almost certainly depend on at least one of these providers, even if you don't know it. Your WordPress hosting company uses a CDN. Your DNS provider routes through a major network. Your email transits through infrastructure that sits on top of the same platforms that failed in February.

The question isn't whether your stack has exposure to a hyperscaler outage. It does. The question is what happens when that exposure gets triggered.

Illustration of a business owner seeing a 503 error on a laptop screen, with a window view showing the concept of building redundancy into web infrastructure — When your hosting provider's provider goes down, what keeps your business online?

Here's what's worth checking:

DNS resilience. If your DNS provider uses a single upstream network and that network drops routes, your domain stops resolving. Use the 365i DNS Lookup tool to check your current nameserver configuration and email authentication records. Consider whether your DNS setup has a fallback path.

CDN dependency. A CDN that sits in front of your site is also a single point of failure. If the CDN goes down, your origin server should still be reachable. Ask your hosting provider whether your site falls back to origin or goes offline entirely. Our CDN is designed to fail open, so sites remain accessible even if edge nodes are disrupted.

Monitoring and alerts. The Cloudflare outage lasted six hours. How quickly would you know if your site went down? Services like UptimeRobot or Pingdom can alert you in minutes. The earlier you know, the faster you can switch DNS or take other recovery steps. Use our HTTP Header Inspector to verify your site's response headers and redirect chains are healthy right now.

Multi-provider awareness. You don't need to run your own data centre. But you should know where your single points of failure are. Which providers does your hosting company depend on? What's their failover strategy? At 365i, we use managed cloud infrastructure across multiple providers (365i, AWS, Google Cloud) specifically because concentration risk is real.

What Cloudflare Is Doing About It

Credit where it's due: Cloudflare's post-mortem is thorough. They've committed to several remediations including rate-limiting deletion operations, adding mandatory dry-run steps for bulk changes, reducing the scope of service account permissions, and implementing a "two-person rule" for changes affecting large numbers of prefixes.

These are the right fixes. They're also the fixes that should have been in place before the incident. Every one of them is a standard practice in production server security: principle of least privilege, blast-radius controls, confirmation steps for destructive operations.

Azure and AWS have issued their own remediations after their February incidents. The pattern across all three is the same: add the guardrails that were missing. The uncomfortable truth is that these guardrails were well understood long before February 2026. They just weren't applied to the automation layer.

Five Questions to Ask Your Hosting Provider

You don't need to understand BGP routing tables or BYOIP prefix management. But you should be able to get straight answers to these questions:

What's your CDN and DNS failover? If your CDN provider has an outage, does my site go offline or fall back to origin?
Where are my single points of failure? Which upstream providers does your infrastructure depend on?
How do you handle provider outages? Is there a documented incident response process?
What monitoring do you run? How quickly would you detect a BGP withdrawal or DNS resolution failure?
Do you use multiple data centre locations? Geographic redundancy protects against regional failures. 365i's platform runs across UK, US, and Asia data centres for this reason.

If your provider can't answer these clearly, that itself is an answer.

A Pattern That's Getting Worse

The web is more concentrated than it's ever been. A handful of companies, Cloudflare, AWS, Azure, Google Cloud, carry the majority of internet traffic. That concentration delivers real benefits: better performance, easier management, lower costs. It also creates systemic risk that didn't exist when the web was more distributed.

February 2026 is a preview of what happens when automation at that scale goes wrong. Not once, but three times in four weeks. The businesses that came through it best were the ones that already had redundancy built in: fallback DNS, multi-CDN setups, hosting platforms designed for resilience rather than minimum cost.

If you're reviewing your hosting setup after reading this, that's the right response. Check your SSL configuration, verify your DNS records, and make sure you know what happens when one link in your chain breaks. The next outage won't wait for you to get ready.

Frequently Asked Questions

What caused the Cloudflare outage on 20 February 2026?

A cleanup script designed to remove unused BYOIP prefixes contained an empty string in its API filter. Instead of selecting only prefixes marked for deletion, it matched all 4,306 prefixes and deleted 1,100 of them. This withdrew the associated BGP routes, causing traffic to those IP ranges to stop globally.

How long did the Cloudflare outage last?

6 hours and 7 minutes from initial impact to full recovery. Cloudflare's engineering team had to re-announce the deleted BGP prefixes and wait for global routing tables to propagate the corrected information.

Which websites and services were affected?

Directly affected services included Uber Eats, Bet365, Wikipedia, Workday, Minecraft, and Microsoft Outlook, among others. Any business using Cloudflare's BYOIP service with a prefix in the deleted batch lost connectivity for the duration of the outage. Sites using Cloudflare's standard CDN (without BYOIP) were not affected.

What is BYOIP and why does it matter?

BYOIP (Bring Your Own IP) lets businesses announce their own IP address blocks through Cloudflare's network. Large organisations use it to keep their existing IP addresses while gaining Cloudflare's DDoS protection and CDN performance. When those prefixes were withdrawn from BGP, traffic for those IPs had no route to follow.

Were there other major cloud outages in February 2026?

Yes. Azure had a 24-hour outage on 2-3 February caused by an automated health checker entering a remediation feedback loop. AWS had a 13-hour outage on 16 February when Amazon's AI coding tool Kiro deleted and rebuilt a live environment with overly broad permissions. All three incidents involved automated processes failing without adequate safety controls.

Was my website affected by the Cloudflare outage?

If your site uses Cloudflare's standard CDN (the free or paid plans most small businesses use), you were not directly affected. The outage specifically hit BYOIP customers, which are typically larger organisations that own their own IP address space. However, if any service your site depends on (email, payment processing, third-party APIs) used an affected IP range, you may have experienced indirect disruption.

How can I protect my business from cloud provider outages?

Focus on removing single points of failure. Use DNS providers that support failover. Check whether your CDN fails open (falling back to your origin server) or fails closed (taking your site down entirely). Set up uptime monitoring so you're alerted within minutes. And ask your hosting provider which upstream providers they depend on and what their failover strategy looks like.

What is Cloudflare doing to prevent this from happening again?

Cloudflare's post-mortem commits to rate-limiting deletion operations, mandatory dry-run steps for bulk changes, reduced permissions for service accounts, and a two-person approval rule for changes affecting large numbers of prefixes. These are standard production safety practices that weren't applied to the automation that caused the incident.

Check your site's infrastructure health

Use our free DNS Lookup, HTTP Header Inspector, and SSL Scanner to verify your site's configuration. No sign-up needed.

View Free Tools

Sources

Published: 24 February 2026 · Last reviewed: 22 April 2026 · Written by: Mark McNeece, Founder & Managing Director, 365i

Editorially reviewed by: Mark McNeece on 22 April 2026 · Our editorial standards

About the Author

Mark McNeece is an industry leader in AI Visibility. He developed and published the AI Discovery File Specifications, the emerging open standard for making websites discoverable by large language models such as ChatGPT, Claude, and Gemini. Mark founded 365i in 2002, runs 365i Web Design for sites and AI visibility, and founded Press Forge for specialist WordPress services.

Every article on this site is drawn from real client work across more than 20 years of UK hosting and WordPress experience, not from release-note reruns. His WordPress plugins are published on wordpress.org. Get in touch if you'd like Mark to look at your site.

Mark also reviews every post on this site against our editorial standards before it publishes and again whenever it is substantively updated.

Tags:

Cloud Hosting Security Industry News Web Hosting Server Management UK Business

WordPress 17 Jul 2026

WordPress 7.1 Could Reduce the Need for Page Builders. Here's What's Changing

WordPress 7.1 lands on 19 August with responsive styling and hover states built into the editor, two jobs that have pushed people towards Elementor for years. We check what actually shipped in Beta 1, what quietly didn't, and run the UK maths on whether your page builder site should care.

13 min read Read

WordPress 16 Jul 2026

WordPress AI Agents Are Coming. Is Your WordPress Hosting Ready?

WordPress is turning from software humans drive into software AI agents can inspect and operate. Giving an agent access without backups, permissions, logs and easy rollback is like giving a very enthusiastic apprentice the server password and going to lunch. Here is what changed in WordPress 7.0, what goes wrong, and what AI-ready WordPress hosting actually needs to include.

12 min read Read

WordPress 12 Jul 2026

Your WordPress Backup Isn't a Backup Until You've Restored It

Nearly every host promises "daily backups included", but that phrase tells you almost nothing about whether you could actually recover your site on a bad day. Here is what really gets backed up, why retention matters more than frequency, whether the backup can be restored, and how 365i's Timeline Backups handle files, databases and mailboxes without a plugin.

11 min read Read

SEO 16 Jun 2026

We Deleted Every Blog Post on This Removals Site. Then ChatGPT Scored It 99/100

The old Brewood Removals site was buried under AI-slop blog posts and around 160 doorway landing pages, and ranked for almost nothing. We deleted the lot, rebuilt it as an advice hub written from real jobs, and commissioned an independent ChatGPT assessment that scored it 99/100, ahead of Pickfords and every national chain. An early case study, with the search data still to come.

13 min read Read