Public admission of failure takes courage. In trying to limit reputational damage to his cyber security company, CrowdStrike president Michael Sentonas certainly demonstrated chutzpah by accepting an award for the Most Epic Fail at the recent Pwnie Awards. The tactic seems to have worked: Sentonas was cheered by those attending the DEF CON event for publicly owning his company’s mistakes.
“Definitely not the award to be proud of receiving,” Sentonas told delegates in his acceptance speech. “I think the team was surprised when I said straight away that I’d come and get it because we got this horribly wrong. We’ve said that a number of different times and it’s super important to own it when you do things well. It’s super important to own it when you do things horribly wrong, which we did in this case.”
But beyond this astute PR move, the legacy of the CrowdStrike incident is deadly serious. On 19 July, the world experienced one of the biggest ever IT outages when a faulty software update to CrowdStrike’s endpoint security agent, Falcon Sensor, caused 8.5 million systems running Microsoft Windows to crash. Globally, IT infrastructure malfunctioned, creating havoc and financial loss for individuals and organisations.
The most serious such event since the NotPetya cyber attack in 2017, it had an enormous impact: the faulty update caused global computer outages that disrupted air travel, banking, broadcasting, hotels, hospitals and other vital services. Insured losses are estimated to be more than $10bn; actual losses may be far greater, since thousands of SMEs had no cover.
Central to determining where liability rests will be the question of foreseeability. Numerous individuals would have known that this software was critical for interconnected and dependent organisations worldwide, and that they would be seriously affected by a faulty update. It is therefore self-evident that vendors should have adequate procedures in place for updating software, which include how each update is developed and tested before it is distributed to users.
Was CrowdStrike a ‘Black Swan’ event?
So, was this a Black Swan event – unpredictable beyond what could reasonably be expected? Such events are usually characterised by their rarity, the severity of their impact, and the general perception that they were obvious in hindsight.
Opinion is divided on whether events like CrowdStrike are, in fact, becoming more common, and therefore more predictable. Certainly, innovators who experiment in a haphazard fashion are likely to increase the incidence of such events, making them more predictable. Restrictions may stifle creativity, but innovators who fail to take adequate precautionary steps to prevent predictable events may also face serious legal consequences.
Debate will rage about what testing processes should be mandatory for those launching cyber security updates, especially when issuing these updates at speed is necessary to protect against new cyber threats. When explaining the potential vulnerability of different systems, IT industry commentators invariably point out that such updates may need to be launched multiple times a day.
Similarly, other interdependent systems may also be updated multiple times a day with devices receiving updates in a different order or timescale. Commentators argue that the real world cannot deliver a perfect test environment, and if updates go wrong, third and fourth-party exposure can be expected alongside potential supply chain fallout. From a lawyer’s perspective, this “guinea pig” approach to technology creates a nightmare scenario of potential class actions.
Risk of single points of failure
Risks are further amplified by any technology that has a prominent or dominant market share. Here, potential single points of failure can result in systemic events that ultimately produce simultaneous claims from a very large number of claimants: one deficient small cog can bring global IT infrastructure to a halt.
Such a single point of failure can have an extraordinarily wide impact with potentially catastrophic cumulative losses. From a legal perspective, questions arise about mitigating the risks of a single point of failure in a complex, global IT supply chain, and whether these risks are adequately assessed.
Issues of agency and delegation also arise. The represented security of a system when interfacing may not only block or freeze the system, but also open it up to attack. In scope and scale, the net effect of the CrowdStrike outage was equivalent to an attack on a global supply chain by a malicious actor.
Perhaps the issues faced as a result of NotPetya and other malicious cyber attacks simply foreshadow the impact that future cyber events might deliver.
Could Microsoft have rejected the update?
It is also important to consider the link between CrowdStrike and Microsoft. In particular, there is the question of whether Microsoft’s operating system was capable of rejecting the update, and reverting to a previous version. If it could, why did that not happen?
Although it is unclear precisely how the Microsoft system could revert to the previous version in order to achieve this outcome, AI experts constantly remind us that the system can be compared to a super brain that calibrates itself to resolve problems. If that is true, is the super brain still engaged or are we listening to the wrong AI experts?
In a recent blog, industry commentators refer to Microsoft’s comments on the challenge of third-party vendors pushing out updates which operate in the low-level operating system. They suggest that changes could be made so that third-party applications operate higher in the operating system, easing the management challenge of such issues: for example, the ability to reject updates which cause blue screens and the need to roll back to the previous version.
Predictable failures
Across the IT sector, some argue that disaster could have been averted by more rigorous testing of security updates and the staggering of update releases to smaller groups or upgrade “rings”. From a legal perspective, it is impossible to ignore the fact that everything seems too predictable, especially given the endless discussions over many years about the dreaded blue screens.
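For readers less familiar with the idea, the staggered release to "rings" described above can be illustrated with a minimal sketch in Python. It is an illustration only, not a description of how CrowdStrike, Microsoft or any other vendor actually deploys updates: the function names (`staged_rollout`, `apply_update`, `is_healthy`, `rollback`), the host names and the failure threshold are all hypothetical. The update goes to one small group at a time, and the release halts and reverts as soon as a group shows failures, so later and larger groups are never touched.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RolloutResult:
    updated: List[str]             # hosts left running the new version
    rolled_back: List[str]         # hosts reverted after their ring failed
    halted_at_ring: Optional[int]  # index of the failing ring, or None


def staged_rollout(
    rings: List[List[str]],
    apply_update: Callable[[str], None],   # hypothetical: installs the update
    is_healthy: Callable[[str], bool],     # hypothetical: post-update check
    rollback: Callable[[str], None],       # hypothetical: restores old version
    max_failure_rate: float = 0.0,
) -> RolloutResult:
    """Release an update ring by ring, stopping at the first unhealthy ring."""
    updated: List[str] = []
    for index, ring in enumerate(rings):
        for host in ring:
            apply_update(host)

        failures = [host for host in ring if not is_healthy(host)]
        if ring and len(failures) / len(ring) > max_failure_rate:
            # Revert the whole failing ring and stop before wider rings.
            for host in ring:
                rollback(host)
            return RolloutResult(updated, list(ring), index)

        updated.extend(ring)
    return RolloutResult(updated, [], None)


if __name__ == "__main__":
    versions = {}
    rings = [["canary-1"], ["early-1", "early-2"], ["broad-1", "broad-2"]]
    crashing_hosts = {"early-2"}  # simulated fault, for illustration only

    result = staged_rollout(
        rings,
        apply_update=lambda host: versions.__setitem__(host, "new"),
        is_healthy=lambda host: host not in crashing_hosts,
        rollback=lambda host: versions.__setitem__(host, "old"),
    )
    print(result)
```

In this simulated run the fault surfaces in the second ring, so the release halts there and the largest group never receives the update. The trade-off is speed: each ring adds delay, which is the tension with rapid security updates discussed earlier.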
Given the complexity, novelty (and predictability) of industry practices, together with the scale of attendant risks (including legal rights and obligations), the IT sector must give full consideration to its responsibilities in preventing further catastrophic losses resulting from systemic failure and cyber security risks.
Hermès Marangos is a Partner at Signature Litigation