Downtime

Web application, Public Content, and 31 other services are down

Oct 19 at 11:50pm PDT
Affected services
Web application
Public Content
Original Assets Distribution
Get Bulk Assets Download Job
Create Bulk Assets Download Job
Bulk Asset Retrieval
Get Asset by ID
Generate an Inference
Inference txt2img-ip-adapter
Inference txt2img-texture
Inference img2img
Inference img2img-ip-adapter
Inference img2img-texture
Inference controlnet
Inference controlnet-img2img
Inference controlnet-ip-adapter
Inference controlnet-inpaint
Inference inpaint
Inference inpaint-ip-adapter
Restyle
Texture conversion
Get All Models
Get Recommended Models
Get Model by ID
Get Model Categories
Get Model Classes
Get Model Presets
Get Model Examples
Get Model Description
Update Model Description
Update Model Examples
Search Images
Search Models

Resolved
Oct 21 at 01:32am PDT

AWS Outage Impact Report

Overview

Between 6:49 AM and 9:24 AM GMT on October 20, AWS experienced increased error rates and latencies across multiple services in the US-EAST-1 Region.

Services relying on US-EAST-1 endpoints—including IAM (Identity and Access Management) and DynamoDB Global Tables—were also affected.

Initial Impact on Scenario

For Scenario, this resulted in a sharp rise in HTTP 500 (Internal Server Error) and 404 (Not Found) responses from the API, increasing from near zero to tens of thousands starting at 6:55 AM GMT.

This was caused by the platform’s inability to access its database and process incoming requests.
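The status updates on this page are stamped in PDT, while this report uses GMT. As a rough aid for lining the two up, the sketch below converts a status-page timestamp to GMT. It assumes a fixed UTC-7 offset for PDT and assumes the year 2025, which the page does not state; the function name `status_time_to_gmt` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7; a fixed offset is enough for these October timestamps.
PDT = timezone(timedelta(hours=-7), "PDT")

# Assumption: the page does not state a year, so one is supplied here.
ASSUMED_YEAR = 2025


def status_time_to_gmt(stamp: str, year: int = ASSUMED_YEAR) -> datetime:
    """Convert a stamp such as 'Oct 19 at 11:50pm' (PDT) to GMT."""
    local = datetime.strptime(stamp, "%b %d at %I:%M%p")
    local = local.replace(year=year, tzinfo=PDT)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    for label, stamp in [
        ("Incident created", "Oct 19 at 11:50pm"),
        ("Web application recovered", "Oct 20 at 02:38am"),
        ("Incident resolved", "Oct 21 at 01:32am"),
    ]:
        gmt = status_time_to_gmt(stamp)
        print(f"{label}: {gmt.strftime('%b %d %H:%M')} GMT")
```

Under these assumptions, the first status entry (Oct 19 at 11:50pm PDT) corresponds to Oct 20 at 06:50 GMT, a few minutes before the 6:55 AM GMT error spike described above.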

Root Cause and Initial Recovery

At 7:26 AM GMT on October 20, AWS identified the root cause as DNS resolution issues affecting regional DynamoDB endpoints.

Once the DNS issue was resolved at 9:24 AM GMT, most services began to recover.

Once DynamoDB was reachable again, Scenario’s API operations recovered immediately: requests were processed successfully and error rates returned to zero.

Secondary Outage and Extended Impact

However, as AWS continued mitigation efforts, Network Load Balancer (NLB) health checks became impaired, degrading network connectivity across Lambda, DynamoDB, and CloudWatch.

AWS restored NLB health checks by 4:38 PM GMT, gradually reduced throttling on EC2 and Lambda operations, and fully recovered all services by 10:01 PM GMT.

This final chain of events triggered a secondary outage for Scenario between 2:00 PM and 10:10 PM GMT, during which all processing queues were blocked, impacting:

- Prompt-based edits
- Model training
- Image, video, audio, and 3D generation
- Indexing and search
- Compute Unit management
- Notifications
- Socket updates (preventing web app refreshes)
- Email invitations
- SSO and OTP authentication

Additional Contributing Factors

Several of Scenario’s GPU sub-processors, including Replicate and Fal, were also affected throughout the day, which constrained GPU scalability during the incident.

Updated
Oct 20 at 03:13am PDT

Inference controlnet-ip-adapter recovered.

Updated
Oct 20 at 03:06am PDT

Bulk Asset Retrieval, Inference img2img-texture, Inference controlnet, and 1 other resource recovered.

Updated
Oct 20 at 03:05am PDT

Inference txt2img-ip-adapter, Inference txt2img-texture, Texture conversion, and 1 other resource recovered.

Updated
Oct 20 at 03:01am PDT

Update Model Examples recovered.

Updated
Oct 20 at 03:01am PDT

Update Model Description and Search Models recovered.

Updated
Oct 20 at 03:01am PDT

Original Assets Distribution, Create Bulk Assets Download Job, Generate an Inference, and 14 other resources recovered.

Updated
Oct 20 at 02:43am PDT

Get Bulk Assets Download Job and Get Asset by ID recovered.

Updated
Oct 20 at 02:38am PDT

Web application and Public Content recovered.

Updated
Oct 20 at 01:40am PDT

Web application and Public Content went down.

Updated
Oct 20 at 12:25am PDT

Create Bulk Assets Download Job, Bulk Asset Retrieval, Inference txt2img-ip-adapter, and 6 other resources went down.

Updated
Oct 20 at 12:12am PDT

Inference inpaint-ip-adapter went down.

Updated
Oct 19 at 11:59pm PDT

Get Model by ID went down.

Updated
Oct 19 at 11:56pm PDT

Inference inpaint went down.

Updated
Oct 19 at 11:54pm PDT

Get Model Presets went down.

Updated
Oct 19 at 11:54pm PDT

Inference img2img-ip-adapter went down.

Updated
Oct 19 at 11:54pm PDT

Inference img2img and Get Model Classes went down.

Updated
Oct 19 at 11:54pm PDT

Generate an Inference, Inference controlnet-ip-adapter, Get Recommended Models, and 1 other resource went down.

Updated
Oct 19 at 11:53pm PDT

Get All Models went down.

Updated
Oct 19 at 11:52pm PDT

Inference controlnet-inpaint went down.

Updated
Oct 19 at 11:52pm PDT

Original Assets Distribution, Inference controlnet, and Inference controlnet-img2img went down.

Updated
Oct 19 at 11:51pm PDT

Get Asset by ID went down.

Updated
Oct 19 at 11:51pm PDT

Get Bulk Assets Download Job went down.

Updated
Oct 19 at 11:50pm PDT

Inference img2img-texture and Get Model Description went down.

Created
Oct 19 at 11:50pm PDT

Restyle and Get Model Examples went down.