In the early hours of 19 July, cybersecurity firm CrowdStrike released an update that hit Windows operating systems, disrupting IT systems in healthcare, airlines, small retailers, payroll and more. The root cause has been identified as a driver update relating to CrowdStrike’s Falcon Sensor security software.
In what Tesla and X CEO Elon Musk has described as the ‘biggest IT fail ever’, more than 3,300 flights around the world, from Brisbane to Luton, have been cancelled, GP appointments have been postponed, UK broadcasters including Sky News and the BBC have been unable to air some programmes, and people have been unable to pay for things like coffee and taxis by card, having to resort to cash instead.
CrowdStrike boss George Kurtz has confirmed it was not a cyberattack, but has acknowledged that it could be some time before things are resolved. Microsoft has published recommendations on its website for affected users, which in some cases include rebooting up to 15 times.
Experts from City and Bayes have shared their thoughts on the story as it developed.
Muttukrishnan Rajarajan, Professor of Security Engineering and Director of the Institute for Cyber Security at City, University of London, explained:
“The issue is due to a software upgrade from CrowdStrike. It is not a well-known name in the security industry. However, it has grown quite aggressively and now has more than 24,000 customers.
“This is the challenge of digital transformation and far too much dependency on third-party vendors for business-critical applications. As cyber threats are evolving at a rapid pace, these companies are also under a lot of pressure to upgrade their systems. However, they have limited resources to scale at the level they need to manage such upgrades carefully, as there are a lot of interdependencies in the supply chain. This is a classic example of the cascading impact a simple upgrade can have on multiple business sectors and, in this case, some critical infrastructure providers.
“Hopefully the new Cyber Security and Resilience Bill proposed this week during the King’s Speech will put more controls in place to improve infrastructure resilience and avoid such issues affecting the critical IT infrastructure of major industries at a larger scale in future.”
Airlines will need more efficient solutions
One of the more eye-opening aspects of the situation for airlines was the quick shift from checking in passengers through IT systems to taking down their names and information manually, with pen and paper.
Dr Amit Rawal, Lecturer in Management (Education), added:
“The IT outages reflect the issues with a cybersecurity update on complex networks. The aviation and transport industries in particular are impacted due to their reliance on outdated software. Consequently, their systems have not been able to display flight and train information or check people in on flights as per the usual processes. Further implications of this IT outage are expected given the various networks that rely on an update from CrowdStrike.
“Over the course of the next few days, this will cause a number of delays and further cancellations, as flights will not all be able to depart at their scheduled times. Airlines are likely to have several customers seeking compensation, so will have to find more efficient solutions than manual approaches.”
Robust operating systems needed
Professor Feng Li, Associate Dean of Research and Innovation, explained the wider issues surrounding technology and why it’s so surprising this happened.
“The implications of this IT update are being described as the ‘Windows blue screen of death’ for companies that use CrowdStrike.
“CrowdStrike is a big name in cybersecurity, worth around $80 billion, and they lead the market in ‘endpoint protection’, which basically means running security software or antivirus on Windows machines. Businesses rely on CrowdStrike to keep their Windows clients secure.
“This reflects poorly on both Windows and CrowdStrike, and it is shocking this could happen with a Microsoft operating system (OS) in 2024. What’s especially surprising is that CrowdStrike didn’t carry out staged rollouts of this update. Usually, you’d roll out to a small percentage first, then a bigger group, and so on until everyone has it. That way, any problems can be spotted, and things can be paused or rolled back before they cause massive damage.
“It is surprising that lessons from the past haven’t been learned and that this could happen today, at such a massive scale around the world. It’s not just CrowdStrike’s fault. Although it is sensible to give an antivirus company privileges to update their systems, a robust OS shouldn’t let things like this happen.”
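The staged rollout Professor Li describes can be illustrated with a short sketch. This is only an illustration in Python under stated assumptions, not how CrowdStrike or Microsoft actually deploy updates: the function `staged_rollout`, its parameters, the stage fractions and the health check are all hypothetical names chosen for the example.

```python
from typing import Callable, List, Sequence, Tuple


def staged_rollout(
    hosts: Sequence[str],
    stages: Sequence[float],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_rate: float = 0.0,
) -> Tuple[bool, List[str]]:
    """Hypothetical sketch: update hosts in growing waves, stopping on failure.

    `stages` holds cumulative fractions of the fleet, e.g. [0.01, 0.1, 0.5, 1.0].
    Returns (completed, hosts_touched).
    """
    touched: List[str] = []
    for fraction in stages:
        # Cumulative number of hosts that should have the update after this wave.
        target = max(1, round(len(hosts) * fraction))
        wave = list(hosts[len(touched):target])
        for host in wave:
            apply_update(host)
        touched.extend(wave)

        # Check the newly updated wave before widening the rollout.
        failures = [host for host in wave if not is_healthy(host)]
        if wave and len(failures) / len(wave) > max_failure_rate:
            # Pause here and roll back every host updated so far.
            for host in touched:
                rollback(host)
            return False, touched
    return True, touched
```

With stages of 1%, 10%, 50% and 100% on a fleet of 100 machines, a faulty update that fails its health check would reach only one machine in this sketch before being rolled back, rather than the whole fleet at once.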
Lessons to learn – don’t put all your eggs in one basket
ManMohan Sodhi, Professor of Operations and Supply Chain Management, discussed the risks involved and lessons to take away. He said:
“The most basic principle of risk management is to not concentrate your risk, like not putting all your eggs in one basket. And do not connect your risks so they build on each other. Yet, in the world of IT, both occur, with nearly all the eggs held by just one or two companies running centralized operations while their software is used globally. Now disruptions are affecting a wide range of businesses and operations worldwide, and not for the first time, all boiling down to a single software update from a security firm (CrowdStrike) for a single update of Windows. Billions of people across the world will bear the eventual cost.
“Microsoft and CrowdStrike did not coordinate their changes, but who should we blame for all the impact? After all, policymakers are allowing all these eggs in a single basket. Companies are not fully considering the fragility of their IT systems: even with their data backed up, all their systems are connected for efficiency rather than resilience. The big tech firms are better at lobbying Washington than writing software, and Washington lobbies other countries on their behalf.
“What lessons can we learn? Use software from different companies to put your eggs in different baskets. Use public-domain software that can be vetted. Use separate networks and isolated systems for critical operations. The executives in the big tech companies are not compensated for bringing peace and prosperity to this planet, so we need not treat them as new Roman gods. But we will forget these lessons by next Friday.”
All comments attributed to the academic experts.