On Thursday, January 21, 2021, from 07:00 UTC to 10:05 UTC, a small number of Shotgun sites were partially unavailable.
The incident was caused by a faulty application component that returned errors instead of the expected responses.
A small number of clients (under 5%) whose sites were associated with the failed application component received HTTP 502 responses for one in every few requests made to Shotgun during the incident.
We are investigating why the affected application component entered a bad state and how to prevent this from happening again.
To further improve our resiliency, we plan to implement enhanced health checks that automatically identify and remove faulty components.
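
As an illustration only, the general idea of a health check that removes a faulty component from rotation can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of Shotgun's actual infrastructure: the class name `HealthCheckedPool`, the `probe` callback, the component names, and the threshold of three consecutive failures are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HealthCheckedPool:
    """Hypothetical pool that ejects components after repeated failed probes."""

    probe: Callable[[str], bool]  # assumed callback: True means the component is healthy
    failure_threshold: int = 3    # assumed number of consecutive failures before removal
    healthy: List[str] = field(default_factory=list)
    _failures: Dict[str, int] = field(default_factory=dict)

    def run_checks(self) -> List[str]:
        """Probe every component once and return the ones removed in this round."""
        removed = []
        for component in list(self.healthy):  # copy, since we mutate the list below
            if self.probe(component):
                self._failures[component] = 0  # a success resets the failure streak
                continue
            self._failures[component] = self._failures.get(component, 0) + 1
            if self._failures[component] >= self.failure_threshold:
                self.healthy.remove(component)
                removed.append(component)
        return removed


# Usage: "app-2" always fails its probe and is removed on the third round.
pool = HealthCheckedPool(
    probe=lambda name: name != "app-2",
    healthy=["app-1", "app-2", "app-3"],
)
for _ in range(3):
    pool.run_checks()
print(pool.healthy)  # ['app-1', 'app-3']
```

Requiring several consecutive failures before removal is one common way to avoid ejecting a component because of a single transient error; the right threshold and probe interval depend on the deployment.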