September 22, 2022

Amazon Web Services Says Overwhelmed Network Devices Triggered Outage

Amazon Web Services (AWS) has explained the cause of the outage that took down parts of its own services, as well as third-party websites and online platforms that run on AWS. In a post on the AWS website, the company says an automated process caused the outage, which began around 10:30 a.m. ET in the Northern Virginia region (US-EAST-1).

“An automated activity to scale the capacity of one of the AWS services hosted in the main AWS network triggered unexpected behavior from a large number of clients inside the internal network,” the report states. “This resulted in a sharp increase in connection activity that overwhelmed the networking devices between the internal network and the main AWS network, causing communication delays between these networks.”

According to the report, the issue even impaired Amazon’s ability to see exactly what was wrong with the system. It prevented the company’s operations team from using the real-time monitoring systems and internal controls they typically rely on, which is why the outage took so long to fix. Amazon notes that service didn’t start to improve until 4:34 p.m. ET, and the issue was fully resolved by 5:22 p.m. ET.

Because Amazon’s support contact center also runs on the AWS network, customers were unable to create support requests for seven hours during the outage. Amazon’s Service Health Dashboard, which the platform uses to provide status updates, was also affected, which delayed Amazon’s acknowledgment of the issue. The company says it is working to improve its outage response and plans to release a revamped version of the Service Health Dashboard that should help customers receive timely updates in the event of an outage.

In addition to taking down popular services like Venmo, Tinder, Disney Plus, and even Roomba, the December 7 outage also halted some Amazon deliveries. Amazon experienced its last major outage around this time last year, which knocked a number of sites and apps offline for hours.
