Amazon Explains S3 Outage, Blames Employee

If you’re wondering what happened this week when an outage on Amazon’s servers took down hundreds of popular websites and services, Amazon has finally released an overview of the situation:

Summary of the Amazon S3 Service Disruption in the Northern Virginia (US-EAST-1) Region

The crux of the message is human error.

At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.

Amazon goes on to apologize and offers next steps to avoid a similar situation in the future.

According to Business Insider, AWS’s S3 supports more than 150,000 websites (54 of the top 100 Internet retailers). The disruption caused an estimated loss of $150 million to $160 million.


