- Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
- The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
- Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”
Crowdstrike offers layered rollouts, but some executive declined this because they want the most up to date software at all times.
Not for the rapid update that broke everything.
See post incident report:
How Do We Prevent This From Happening Again?
Software Resiliency and Testing
Improve Rapid Response Content testing by using testing types such as:
Local developer testing
Content update and rollback testing
Stress testing, fuzzing and fault injection
Stability testing
Content interface testing
Add additional validation checks to the Content Validator for Rapid Response Content.
A new check is in process to guard against this type of problematic content from being deployed in the future.
Enhance existing error handling in the Content Interpreter.
Rapid Response Content Deployment
Implement a staggered deployment strategy for Rapid Response Content in which updates are gradually deployed to larger portions of the sensor base, starting with a canary deployment.
Improve monitoring for both sensor and system performance, collecting feedback during Rapid Response Content deployment to guide a phased rollout.
Provide customers with greater control over the delivery of Rapid Response Content updates by allowing granular selection of when and where these updates are deployed.
Provide content update details via release notes, which customers can subscribe to.
Source: https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/