To dispel doubts and stop rumors about a possible cyberattack or BGP hijack, Cloudflare provided a clear explanation: the recent outage of the 1.1.1.1 DNS service was not caused by external actors, but by an internal configuration error.
The outage occurred on July 14 and affected users around the world, leaving millions without access to websites and essential Internet services.
“The root cause was an internal configuration error, not an attack or BGP hijack,” Cloudflare confirmed in its post-incident report.
Cloudflare's clarification comes after many people began speculating on social media that the 1.1.1.1 outage had been caused by a BGP hijack, one of the most feared scenarios in the world of Internet routing. But the reality was much less dramatic (though just as impactful): it all came down to an internal configuration error.
What Really Happened?
To understand the origin of the problem, we have to go back to June 6, when Cloudflare made a change to its configuration system as part of an upcoming integration with its new data localization suite, known as DLS (Data Localization Suite). The issue: the IP prefixes of the public DNS 1.1.1.1 were mistakenly assigned to an inactive (and non-production-ready) version of DLS.
Everything seemed to be working fine… until July 14 at 21:48 UTC, when a new update was deployed that included a test location within that DLS service. The change applied the incorrect configuration globally, causing the IP prefixes of 1.1.1.1 to be disconnected from production data centers and redirected to a single offline location. The result: Cloudflare’s DNS service became inaccessible worldwide.
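Cloudflare has not published its internal configuration format, but the failure mode it describes can be illustrated with a deliberately simplified sketch: prefixes tied to an inactive service topology behave harmlessly until that topology gains a location, at which point the prefixes follow it. Every name in the snippet below (SERVICE_TOPOLOGIES, PREFIX_ASSIGNMENTS, "dls-test") is invented for illustration.

```python
# Purely illustrative sketch of the failure mode described above.
# Cloudflare has not published its internal configuration format;
# all names here are invented for the example.

# Hypothetical mapping of service topologies to the data-center locations
# that should announce a given set of prefixes.
SERVICE_TOPOLOGIES = {
    "resolver-production": {
        "locations": ["global-anycast"],  # all production data centers
        "active": True,
    },
    "dls-test": {
        "locations": [],                  # inactive, pre-production DLS topology
        "active": False,
    },
}

# June 6 change: resolver prefixes mistakenly attached to the DLS test topology.
PREFIX_ASSIGNMENTS = {
    "1.1.1.0/24": "dls-test",
    "1.0.0.0/24": "dls-test",
}

def locations_for(prefix: str) -> list[str]:
    """Return the locations that would announce this prefix."""
    topology = SERVICE_TOPOLOGIES[PREFIX_ASSIGNMENTS[prefix]]
    return topology["locations"]

# July 14 change: a single test location is added to the DLS topology and the
# configuration is applied globally, so the resolver prefixes are now announced
# from one offline location instead of every production data center.
SERVICE_TOPOLOGIES["dls-test"]["locations"] = ["offline-test-location"]

print(locations_for("1.1.1.0/24"))  # ['offline-test-location'] -> global outage
```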
This Is How the Outage Unfolded
- 21:48 UTC: The update that activates the incorrect configuration is deployed globally.
- Less than 4 minutes later: Traffic to 1.1.1.1 begins to drop sharply.
- 22:01 UTC: Cloudflare detects the issue and makes a public announcement.
- 22:20 UTC: The configuration error is reversed and the affected BGP prefixes are re-announced.
- 22:54 UTC: Service is fully restored across all regions.
Which Services Were Affected?
The impact was significant. The outage affected multiple key IP ranges of Cloudflare’s DNS system:
- 1.1.1.1: Its primary DNS resolver.
- 1.0.0.1: Its secondary DNS resolver.
- 2606:4700:4700::1111 and 2606:4700:4700::1001: Their IPv6 equivalents.
- Along with other IP blocks used within Cloudflare’s global infrastructure.
In other words, millions of devices, home networks, businesses, and applications that relied on these resolvers experienced outages, slowness, or connection failures.
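For readers who want to verify resolver reachability themselves, here is a minimal check against the addresses listed above. It is a quick sketch rather than a monitoring tool, and it assumes the third-party dnspython package and an arbitrary test domain.

```python
# Quick reachability check against the resolvers listed above.
# Requires the third-party dnspython package (pip install dnspython);
# the test domain and timeout are arbitrary choices for the example.
import dns.resolver

RESOLVERS = ["1.1.1.1", "1.0.0.1", "2606:4700:4700::1111", "2606:4700:4700::1001"]

def check_resolver(address: str, name: str = "example.com") -> bool:
    """Return True if the resolver answers an A-record query within 2 seconds."""
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [address]
    try:
        resolver.resolve(name, "A", lifetime=2.0)
        return True
    except Exception:
        return False

for address in RESOLVERS:
    status = "OK" if check_resolver(address) else "UNREACHABLE"
    print(f"{address}: {status}")
```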
Outages Affecting Key IP Ranges (Source: Cloudflare)
Regarding the technical impact of the incident, most DNS query protocols were severely affected. Requests made via UDP, TCP, and DNS over TLS (DoT) to the affected addresses experienced a significant drop in volume. However, DNS over HTTPS (DoH) traffic remained relatively stable.
Why? DoH uses a different path to reach its destination, routing through cloudflare-dns.com, which largely shielded it from the widespread disruption.
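That path can be inspected directly: Cloudflare exposes a documented JSON API on cloudflare-dns.com for DoH lookups. The minimal sketch below queries it using the third-party requests package; the test domain and timeout are arbitrary choices for the example.

```python
# Minimal DNS-over-HTTPS lookup through cloudflare-dns.com, the hostname
# mentioned above, using Cloudflare's application/dns-json endpoint.
# Requires the third-party requests package (pip install requests).
import requests

def doh_lookup(name: str, record_type: str = "A") -> list[str]:
    """Resolve a name via Cloudflare's DoH endpoint and return the answer data."""
    response = requests.get(
        "https://cloudflare-dns.com/dns-query",
        params={"name": name, "type": record_type},
        headers={"Accept": "application/dns-json"},
        timeout=5,
    )
    response.raise_for_status()
    return [answer["data"] for answer in response.json().get("Answer", [])]

print(doh_lookup("example.com"))
```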
Impact of the Incident by Protocol (Source: Cloudflare)
What Will Cloudflare Do to Prevent This from Happening Again?
After analyzing the incident, Cloudflare acknowledged that the configuration error could have been avoided if they had used a system with progressive rollouts. In other words, the failure slipped through because they still rely on legacy systems that don’t allow gradual testing of changes before applying them globally.
That’s why one of the main steps they will take is to accelerate the migration to a more modern infrastructure, based on abstract service topologies instead of static IP routes. This new architecture will allow them to do the following (illustrated in the sketch after the list):
- Deploy changes in stages
- Monitor system health at each step
- Quickly roll back any modifications that cause issues
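Cloudflare has not shared the implementation details of that pipeline, but the general pattern of a staged rollout with health checks and automatic rollback looks roughly like the sketch below. Every stage name and helper function in it is invented for illustration.

```python
# Illustrative sketch of a progressive rollout with health checks and rollback.
# All stage names and helpers are invented; this shows the general pattern,
# not Cloudflare's actual deployment tooling.

STAGES = ["canary", "one-region", "half-of-fleet", "global"]

def apply_config(stage: str, config: dict) -> None:
    """Push the configuration to the data centers in this stage (stubbed)."""
    print(f"applying config to {stage}")

def stage_is_healthy(stage: str) -> bool:
    """Check traffic levels and error rates for the stage (stubbed)."""
    return True

def rollback(stage: str) -> None:
    """Restore the previous configuration for this stage (stubbed)."""
    print(f"rolling back {stage}")

def progressive_rollout(config: dict) -> bool:
    """Deploy stage by stage; stop and roll back on the first unhealthy stage."""
    completed = []
    for stage in STAGES:
        apply_config(stage, config)
        if not stage_is_healthy(stage):
            # Roll back everything deployed so far, newest first.
            for done in reversed(completed + [stage]):
                rollback(done)
            return False
        completed.append(stage)
    return True

progressive_rollout({"prefixes": ["1.1.1.0/24", "1.0.0.0/24"]})
```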
Additionally, Cloudflare admitted that the misconfiguration passed peer review without being detected, largely due to a lack of clear documentation on how their service topologies are structured and how internal routing behaves. Improving that documentation and the review process is also on their to-do list, to minimize similar risks in the future.