Service Degradation - ShotGrid
Incident Report for Flow Production Tracking
Postmortem

Summary

On January 18, 2024, Autodesk experienced a service outage that affected customers using multiple Autodesk products and services between 9:02 AM PST and 10:15 AM PST.

Root Cause

During maintenance on one service in Autodesk's user management infrastructure, an invalid configuration was applied, which took that internal service down. As a result, ShotGrid users were served a blank page when their sessions expired.

Autodesk Actions

Autodesk has completed a post-incident analysis of the event and identified the following corrective actions:

  • Ensuring ShotGrid handles user management service downtime gracefully
  • Improving infrastructure change validation on Autodesk's user management service
  • Reviewing the incident process for upstream service failures

Thank you for your patience and understanding.

Posted Feb 02, 2024 - 22:01 UTC

Resolved
This incident has been resolved.
Posted Jan 18, 2024 - 18:58 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 18, 2024 - 18:37 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Jan 18, 2024 - 18:29 UTC
Investigating
We are observing a high number of failed requests to the ShotGrid service, which may impact site availability for some clients. Email notifications may also be delayed. This issue is under investigation.
Posted Jan 18, 2024 - 17:21 UTC
This incident affected: Flow Production Tracking and Notification Service.