Update March 8th, 2017
We’ve been contacted by representatives from Express.com, who stated that their S3 instance was affected by the outage but that this did not impact their website or their ability to transact commerce, thanks to their failover and disaster recovery plans. Further investigation showed that our checks of their site were failing because of third-party services on the website, mostly advertising and analytics partners.
Update March 3rd, 2017
Unfortunately, in the haste of analyzing our IR 100 checks during the outage, our analyst picked up a false positive for victoriassecret.com. A video on the victoriassecret.com home page frequently times out for one of our checks in Las Vegas. The issue is a result of the way our agent processes HTTP OPTIONS requests. Please see the screenshots below that detail the error with this file.
Thank you to the L Brands (Victoria’s Secret parent company) infrastructure employee who contacted us about this error. We sincerely apologize to the L Brands infrastructure and engineering teams. Their website showed no sign of degradation during the AWS outage, while most of the world’s top retailers did. Kudos to them!
On Tuesday, February 28th, 2017, at 11:53 AM PST, Amazon’s S3 web-based storage service went down across the US-EAST-1 region. This 4-hour outage caused major problems for millions of companies across the US, especially those that host a large portion of their site on Amazon. We used Apica’s synthetic monitoring tool to determine the effects of this outage on the top 100 e-commerce sites.
Damage Caused by the Amazon S3 Outage
Back in November, Apica created the top 100 Web Performance Cyber Monday Index. From this same list, we’ve evaluated how these companies were hit during the Amazon S3 outage.
Top 3 Findings from the Amazon S3 Outage
• 54 of the IR 100 sites were affected (a performance decrease of 20% or more)
• For affected websites, the average slowdown was 29.7 seconds, with pages taking 42.7 seconds to load
Disney Store – 94 seconds slower (1165%)
Target – 41.6 seconds slower (991%)
Nike – 12.3 seconds slower (642%)
Nordstrom – 29.8 seconds slower (592%) [Due to 3rd-party resource]
Apple, Walmart, Newegg, Best Buy, Costco, and, surprisingly, Amazon/Zappos were not affected.
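To make the 20% threshold and the percentage figures above concrete, here is a minimal Python sketch. It assumes the percentage is the increase in load time relative to a normal (baseline) load time; the post does not spell out how the figures were calculated, so the function names and that formula are illustrative assumptions, not Apica's actual method.

```python
def slowdown_percent(baseline_s: float, outage_s: float) -> float:
    """Percentage increase in load time relative to the baseline (assumed formula)."""
    if baseline_s <= 0:
        raise ValueError("baseline load time must be positive")
    return (outage_s - baseline_s) / baseline_s * 100.0


def is_affected(baseline_s: float, outage_s: float, threshold_pct: float = 20.0) -> bool:
    """True when load time degraded by at least the threshold (20% in this post)."""
    return slowdown_percent(baseline_s, outage_s) >= threshold_pct


def implied_baseline(extra_seconds: float, percent: float) -> float:
    """Baseline load time implied by 'N seconds slower (P%)' under the assumed formula."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return extra_seconds / (percent / 100.0)


# Hypothetical page: normally loads in 4.0 s, took 5.0 s during the outage.
print(slowdown_percent(4.0, 5.0))
print(is_affected(4.0, 5.0))

# Under the same assumption, "94 seconds slower (1165%)" implies this baseline in seconds.
print(round(implied_baseline(94.0, 1165.0), 2))
```

Under this reading, a site counts as affected once its load time is at least 1.2 times its usual value.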
Why did some companies survive the Amazon S3 outage?
The effects varied depending on how heavily each company relies on Amazon S3. Newer websites pull data from various cloud databases stored all over the world, so they saw only partial outages, such as slow image rendering or missing data that was stored on Amazon.
Some companies store their data locally and were able to pull images from their own servers when the outage happened.
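The fallback behavior described above can be sketched in Python as follows. This is a hypothetical illustration, not any retailer's actual implementation: `fetch_from_s3` and `fetch_from_local` are stand-in functions invented for the example, and a real site would more likely handle this at the CDN or load-balancer level.

```python
from typing import Callable, Iterable


def fetch_with_fallback(path: str, sources: Iterable[Callable[[str], bytes]]) -> bytes:
    """Try each source in order and return the first successful result."""
    last_error = None
    for fetch in sources:
        try:
            return fetch(path)
        except OSError as exc:  # covers ConnectionError and TimeoutError
            last_error = exc
    raise RuntimeError(f"all sources failed for {path!r}") from last_error


# Hypothetical stand-ins for a remote object store and a local copy.
def fetch_from_s3(path: str) -> bytes:
    raise ConnectionError("simulated S3 outage")


def fetch_from_local(path: str) -> bytes:
    return b"local copy of " + path.encode()


# The simulated outage makes the first source fail, so the local copy is served.
print(fetch_with_fallback("images/hero.jpg", [fetch_from_s3, fetch_from_local]))
```

Passing the sources as an ordered list keeps the primary store first and the local copy as a last resort, so the fallback is used only when the primary fetch fails.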
What caused the Amazon S3 Outage?
Amazon announced at 4:45 PM ET that there was a problem with one of its main storage systems. S3 is Amazon’s largest service, used by more than half of its million+ customers and holding more than 3-4 trillion pieces of data, so even a small problem turns into a big issue. Amazon’s update stated the following:
“Update at 12:52 PM PST: We are seeing recovery for S3 object retrievals, listing and deletions. We continue to work on recovery for adding new objects to S3 and expect to start seeing improved error rates within the hour.
Update at 1:12 PM PST: S3 object retrieval, listing and deletion are fully recovered now. We are still working to recover normal operations for adding new objects to S3.
Update at 2:08 PM PST: As of 1:49 PM PST, we are fully recovered for operations for adding new objects in S3, which was our last operation showing a high error rate. The Amazon S3 service is operating normally.”
Summary
Catching outages is difficult, and the outages themselves are almost uncontrollable, but implementing performance monitoring can pinpoint whether images, video, third-party resources, or servers go down, and why. Some of the companies above were able to spot-check and evaluate the performance of their site while it was down. This helped them create a workaround or quickly switch to images from a local server. To learn how your company can withstand an outage, contact Apica today.