Amazon Web Services (AWS) has resolved a massive outage that on Monday disrupted more than 1,000 apps and websites, including Snapchat, Lloyds Bank, and Halifax. The outage affected platforms across the globe and left millions of users unable to access services for much of the day.
Downdetector, a site that monitors platform outages, reported over 11 million user complaints during the incident. Experts said the outage highlighted the risks of relying heavily on a single cloud provider for critical infrastructure.
Professor Alan Woodward from the University of Surrey said the incident exposed how interdependent online services have become. Small errors, often human-made, can ripple across multiple industries and have major economic impacts.
The problems began around 07:00 BST, with users reporting issues accessing a range of services, from popular games like Fortnite to apps like Duolingo. Within hours, Downdetector received more than four million reports from 500 sites, more than double the normal daily count. The total later surpassed 11 million as more services, including Reddit and Lloyds Bank, struggled to recover.
Amazon confirmed around 23:00 BST that all AWS services had returned to normal. During the outage, parts of Amazon’s system were throttled to address the root problem. Experts suggested that a series of cascading failures may have followed the initial issue, similar to power outages where systems flicker as crews work to restore service.
Amazon stated the problem “appears to be related to DNS resolution of the DynamoDB API endpoint in US-EAST-1.” DNS, or Domain Name System, acts like the internet’s phone book by translating website names into the numeric IP addresses that computers use to find one another. Disruptions to DNS can prevent web browsers and apps from locating the servers that hold the content they need.
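For readers who want to see what a DNS lookup involves, the following is a minimal Python sketch, not a reconstruction of anything AWS runs. It uses the standard library's `socket.getaddrinfo` to turn a name into IP addresses and returns an empty list when resolution fails, which is roughly the situation an app faces when DNS breaks. The function name `resolve` and its `lookup` parameter are illustrative assumptions.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, unique IP addresses for hostname, or [] on failure.

    `lookup` is a hypothetical hook so a failing resolver can be simulated;
    by default it is the standard library's socket.getaddrinfo.
    """
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # Name resolution failed: the caller has no address to connect to.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({info[4][0] for info in results})
```

Calling `resolve("example.com")` on a machine with working DNS should return one or more addresses. When resolution fails, the empty result means the program cannot reach the service even if the service itself is running.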
Matthew Prince, CEO of Cloudflare, said the outage showed the immense power cloud providers have over the internet. “The cloud allows incredible scalability, but outages can bring down many services we depend on,” he said.
Cori Crider, head of the Future of Technology Institute, described the event as “like a bridge collapsing,” noting that when major cloud providers fail, a large portion of the economy is affected. She warned that relying on a few dominant providers, such as Amazon, Microsoft, and Google, is unsustainable. Crider recommended investing in more local services to reduce risks to security, sovereignty, and economic stability.
Some experts argue that companies using AWS share responsibility. Ken Birman, a computer science professor at Cornell University, said businesses must build backup systems and protections for cloud-hosted applications. Outages like Monday’s are not uncommon, though not always at this scale.
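One common form of the backup protection described here is failover: if a request to the primary location fails, the application retries against a standby. The Python sketch below illustrates the idea only. The function `call_with_failover`, the `request` callable, and the region names in the usage note are hypothetical, and real deployments also need data replication, health checks, and timeouts that this sketch leaves out.

```python
def call_with_failover(backends, request):
    """Try request(backend) on each backend in order.

    Returns (backend, result) for the first backend that succeeds.
    Raises RuntimeError if every backend fails with a network-style error.
    """
    failures = []
    for backend in backends:
        try:
            return backend, request(backend)
        except OSError as exc:
            # ConnectionError and socket errors are subclasses of OSError.
            failures.append((backend, exc))
    raise RuntimeError(f"all backends failed: {failures}")
```

For example, `call_with_failover(["us-east-1", "eu-west-1"], fetch_profile)`, with `fetch_profile` standing in for an application's own request function, would fall back to the second region if the first raised a connection error.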
The question of accountability could have legal implications. After a major CrowdStrike outage last year, Delta Air Lines is still seeking over $500 million in damages. Even after CrowdStrike fixed the issue, Delta had to manually reset 40,000 servers, causing significant flight delays over several days.
The AWS outage on Monday demonstrates both the power and the vulnerability of centralised cloud infrastructure. While cloud services offer efficiency and scalability, experts warn that companies must invest in resilience to avoid widespread disruptions.
