Service Degradation - ShotGrid
Incident Report for Flow Production Tracking
Postmortem

AUTODESK EVENT ANALYSIS

Incident Number: INC107309

Incident Date: March 4, 2024

Summary

On the evening of March 4, 2024, from 08:40 PM PST to 11:30 PM PST, Autodesk encountered a service disruption that affected customers' ability to access the ShotGrid website.

Impacted Services

  • Autodesk ShotGrid

Root Cause

The disruption occurred during a system enhancement initiative by Autodesk involving the consolidation of multiple proxies. In this process, ShotGrid was unintentionally excluded from an Autodesk-maintained list of authorized clients.

This led to authentication failures, presenting most ShotGrid users with a blank page during sign-in attempts. We have now resolved this issue and are diligently implementing measures to avoid similar incidents in the future.
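
As an illustration only, the kind of omission described above can be caught before rollout by comparing the consolidated list of authorized clients against the set of clients known to depend on it. The sketch below is not Autodesk's tooling; the function name and the client identifiers are hypothetical and chosen purely for the example.

```python
def missing_clients(required, allowlist):
    """Return the required client IDs absent from the allowlist, sorted."""
    return sorted(set(required) - set(allowlist))


# Hypothetical identifiers: clients that must be able to authenticate,
# and the list produced by a (hypothetical) proxy consolidation.
required_clients = {"shotgrid-web", "accounts-portal"}
consolidated_allowlist = {"accounts-portal"}

missing = missing_clients(required_clients, consolidated_allowlist)
if missing:
    # A real pipeline would fail the deployment here instead of printing.
    print("Blocking rollout; missing clients: " + ", ".join(missing))
```

A check of this shape is cheap to run on every configuration change, and it fails loudly when a client is dropped rather than leaving the gap to surface as sign-in errors.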

Autodesk Actions

Autodesk has completed a post-incident analysis of the event and identified actions to be taken. These include the following:

  • Upgrading the monitoring systems and alert mechanisms to detect proxy-related issues more effectively.
  • Refining the incident escalation process for rapid issue ownership and response.
  • Optimizing deployment strategies across all proxy components, incorporating methods like canary-based approaches and feature flags.
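
To make the canary-based approach named in the list above concrete, here is a minimal sketch of a promotion gate: a proxy change first serves a small share of traffic, and it is promoted only if its failure rate stays close to the existing baseline. The function name, thresholds, and parameters are assumptions for illustration and do not describe Autodesk's actual deployment process.

```python
def canary_is_healthy(canary_failures, canary_total, baseline_rate,
                      tolerance=0.01, min_requests=100):
    """Decide whether a canary may be promoted.

    Returns False when the canary has served too few requests to judge,
    or when its failure rate exceeds the baseline rate plus a tolerance.
    All thresholds are illustrative defaults, not recommended values.
    """
    if canary_total < min_requests:
        return False  # not enough traffic to draw a conclusion
    canary_rate = canary_failures / canary_total
    return canary_rate <= baseline_rate + tolerance


# Example: half of the canary's sign-in requests fail against a 0.1% baseline,
# so the change is held back instead of reaching all users.
promote = canary_is_healthy(canary_failures=500, canary_total=1000,
                            baseline_rate=0.001)
```

With a gate like this, a configuration that breaks authentication would affect only the canary's share of traffic. A feature flag serves a similar purpose by allowing the change to be switched off without a redeploy.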

We are committed to continually improving our systems and processes to serve you better.

Thank you for your patience and understanding.

Posted Mar 13, 2024 - 19:30 UTC

Resolved
This incident has been resolved.
Posted Mar 05, 2024 - 07:38 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Mar 05, 2024 - 07:37 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 05, 2024 - 07:35 UTC
Investigating
We are observing a high number of failed requests to the ShotGrid service which may impact site availability for some clients. Email notifications may also be delayed. This issue is under investigation.
Posted Mar 05, 2024 - 06:14 UTC
This incident affected: Flow Production Tracking and Notification Service.