Cloudflare Outage: 6-Hour Failure Disrupts ChatGPT, Claude and Millions of Websites Worldwide
Image Credit: Jacky Lee
Cloudflare experienced a significant global outage on 18 November 2025, disrupting internet traffic across its network and affecting many of the world’s most widely used AI platforms, social media sites, and consumer services. Millions of websites that rely on Cloudflare for routing, security, DNS, and content delivery showed elevated HTTP 5xx errors for several hours.
What Happened and Timeline
According to Cloudflare’s incident report, the disruption began at 11:28 UTC, when its global edge network started returning abnormal levels of 500-series errors. Shortly before this, at 11:05 UTC, Cloudflare engineers applied a routine permissions update to a ClickHouse database, unintentionally setting the stage for the incident.
The outage unfolded as follows:
11:28 UTC — Error rates spike globally; customer traffic becomes unstable
14:30 UTC — Core traffic flow largely stabilised
17:06 UTC — All systems marked fully recovered
While the most severe disruption lasted roughly three hours, complete restoration took just under six hours.
Outage tracker Downdetector recorded over 11,000 problem reports at the peak, including reports from services attempting to report Cloudflare’s own downtime.
Engineers initially explored whether the symptoms resembled a large-scale DDoS attack but quickly ruled out any malicious activity.
Root Cause: A Bot-Management Configuration Failure
Cloudflare attributed the outage to a flaw introduced during a permissions change in a ClickHouse database schema. The update altered the behaviour of a query used to generate a Bot Management feature file, a critical machine-learning input used to classify automated versus human traffic.
As a result:
The feature file began including duplicate metadata rows, expanding from roughly 60 entries to over 200, far beyond normal levels.
This oversized file was automatically propagated across Cloudflare’s edge network.
FL2, Cloudflare’s newer Rust-based proxy engine, attempted to load the file but exceeded internal limits, triggering 5xx responses.
FL, the legacy proxy engine, did not error but instead assigned a bot score of zero to all requests, causing widespread misclassification of traffic.
Because Cloudflare refreshes this feature file frequently, the faulty configuration was rapidly distributed across a majority of global regions before deployment was halted. Cloudflare emphasised that the outage was caused entirely by an internal configuration error, not a cyberattack.
Impact on AI Services
The outage had a heavy effect on AI applications that depend on Cloudflare for DNS resolution, DDoS protection, and edge routing. The most affected services included:
OpenAI’s ChatGPT
Anthropic’s Claude
Perplexity AI
Other generative-AI tools with web-served inference endpoints
Cloudflare’s own Workers KV — a storage system used for distributing configuration — was also listed among the impacted components. Although Workers AI was not explicitly affected on this date, earlier outages (such as the June 12, 2025 Workers KV incident) showed how storage-layer disruptions can cascade into AI workloads.
Real-time inference services saw degraded responsiveness or queuing due to routing failures across Cloudflare’s edge.
Wider Internet Disruption
Beyond AI tools, many mainstream platforms experienced downtime or reduced functionality, including:
X (formerly Twitter)
Spotify
Canva
Shopify
League of Legends and other game services
Downdetector, which itself briefly went offline due to Cloudflare dependency
With Cloudflare estimated to underpin about 20% of global web traffic, the ripple effects were felt across industries, regions, and device types.
How It Compares to Recent Cloud and AI Infrastructure Outages
The Cloudflare failure arrives amid a sequence of infrastructure incidents that disrupted AI and cloud services in late 2025, including:
An AWS outage in October 2025 linked to failures in DynamoDB’s DNS subsystem
A Microsoft Azure Front Door configuration failure later the same month
Multiple Google Cloud Vertex AI degradations in 2024–2025, particularly the June 12, 2025 IAM/API misconfiguration incident that affected AI prediction services across regions
Unlike cloud-provider disruptions driven by infrastructure or network routing issues, Cloudflare’s outage was triggered by a software-level configuration error inside its bot-management pipeline.
Looking Ahead
Cloudflare issued an apology, calling the outage “unacceptable”, and announced steps to strengthen:
validation of configuration updates,
hardening of ingestion systems, and
kill-switch mechanisms to rapidly halt propagation of faulty files.
As AI continues to drive surging global traffic, analysts highlight the need for:
multi-CDN or multi-provider redundancy,
automatic failover routing, and
stronger isolation between machine-learning safety components and core proxy systems.
The incident underscores how even a subtle metadata change in shared infrastructure can cascade into a global service disruption, especially as AI-powered applications become central to everyday digital activity.
We are a leading AI-focused digital news platform, combining AI-generated reporting with human editorial oversight. By aggregating and synthesizing the latest developments in AI — spanning innovation, technology, ethics, policy and business — we deliver timely, accurate and thought-provoking content.
