On November 11th, 2021 at 14:10 UTC, an update to the ShotGrid service introduced intermittent performance degradation for some clients. The issue was later identified and fixed on November 18th, 2021 at 13:45 UTC.
On November 16th, an investigation was launched into a cluster of support requests in which clients reported receiving HTTP 502 and 503 responses. Because the errors were sporadic, our monitoring tools did not trigger alerts. We identified that a code change introduced into ShotGrid on November 11th increased the volume of requests to our feature flag service, resulting in degraded performance. A fix implemented on November 18th addressed this performance degradation.
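The details of the fix are not described here. Purely as an illustration of how request volume to a feature flag service can be reduced in general, the sketch below caches each flag value in process for a short time, so that repeated checks of the same flag do not each result in a remote call. The class name `CachedFlagClient`, the `fetch` callable, and the 30-second TTL are all hypothetical and are not taken from ShotGrid.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFlagClient:
    """Hypothetical wrapper that caches feature flag lookups for a short TTL."""

    def __init__(
        self,
        fetch: Callable[[str], bool],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # assumed remote lookup: flag name -> bool
        self._ttl = ttl_seconds      # how long a cached value stays fresh
        self._clock = clock          # injectable clock, useful for testing
        self._cache: Dict[str, Tuple[float, bool]] = {}

    def is_enabled(self, flag: str) -> bool:
        now = self._clock()
        entry = self._cache.get(flag)
        # Serve from the cache while the stored value is still fresh.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise make one remote call and remember the result.
        value = self._fetch(flag)
        self._cache[flag] = (now, value)
        return value
```

With a wrapper like this, a request handler that checks the same flag many times within the TTL issues a single call to the flag service. The trade-off is that a flag change can take up to the TTL to become visible.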
During the investigation we identified additional conditions that could degrade performance due to inconsistencies in memory utilization. A fix containing memory allocation optimizations was implemented on November 18th, 2021 at 18:55 UTC. This further reduced the number of errors experienced by clients.
Clients sporadically received HTTP 502 or 503 responses when making requests to ShotGrid.
Increased compute capacity has been allocated to the feature flag service to improve its performance and redundancy.
Improvements to our monitoring tools are being implemented to track and alert us to unusual patterns of 5XX errors.
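As a rough sketch of what alerting on unusual 5XX patterns could look like, the example below tracks the share of 5XX responses over a sliding time window and signals when that share exceeds a threshold. It is an assumption-laden illustration, not a description of our monitoring stack: the class name `ErrorRateMonitor`, the window length, the threshold, and the minimum request count are all hypothetical values.

```python
from collections import deque
from typing import Deque, Tuple


class ErrorRateMonitor:
    """Hypothetical sliding-window monitor for the share of 5XX responses."""

    def __init__(
        self,
        window_seconds: float = 300.0,
        threshold: float = 0.01,
        min_requests: int = 100,
    ) -> None:
        self._window = window_seconds      # how far back to look
        self._threshold = threshold        # alert above this error share
        self._min_requests = min_requests  # ignore windows with too little traffic
        self._events: Deque[Tuple[float, bool]] = deque()
        self._errors = 0

    def record(self, timestamp: float, status: int) -> None:
        """Record one response; timestamps are assumed to be non-decreasing."""
        is_error = 500 <= status <= 599
        self._events.append((timestamp, is_error))
        if is_error:
            self._errors += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that have fallen out of the window.
        while self._events and now - self._events[0][0] > self._window:
            _, was_error = self._events.popleft()
            if was_error:
                self._errors -= 1

    def should_alert(self) -> bool:
        total = len(self._events)
        if total < self._min_requests:
            return False
        return self._errors / total > self._threshold
```

Measuring a ratio over a window, rather than counting consecutive failures, is one way to catch errors that are sporadic and interleaved with successful responses.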