On July 19, 2024, CrowdStrike, a US-based cybersecurity organization, released a faulty update to its security software. This update affected millions of Windows computers globally, causing them to crash and display the blue screen of death (BSOD).
Many organizations across industries were unable to reboot their systems and could not work for hours. The CrowdStrike outage affected important businesses, including banks, hospitals, airlines, and airports. While the error was identified within hours and a fix was released soon after, several computers had to be fixed manually, which prolonged the outages. The CrowdStrike outage is estimated to have caused a global financial loss of USD 10 billion.
This blog explores when the CrowdStrike outage started, its root causes, impacts, and more.
The CrowdStrike cybersecurity platform is used by several organizations of various sizes and across various industries. Due to its wide use and integration with significant operations, the CrowdStrike outage had an amplified effect. The outage was not caused by Microsoft directly. A flaw in CrowdStrike triggered the issue.
CrowdStrike Falcon integrates with Microsoft’s operating system (OS), Windows, and runs close to its kernel. This enables CrowdStrike to monitor operations as they occur in the OS. However, a logic error in Falcon led to the outage. Because Falcon works so closely with Windows, the error resulted in the global Windows crash.
The logic error in CrowdStrike occurred within a sensor configuration update. The sensor is updated regularly, often several times a day, to keep its protection against new threats current.
The erroneous update was contained in a channel file. These files contain configuration updates to protect against behavioral threats. Channel file 291 was the update intended to be deployed. It aimed to enhance how CrowdStrike Falcon assesses “named pipe execution” on the Windows OS. Named pipes are a widely used type of communication mechanism. They facilitate interprocess communication for Windows.
CrowdStrike unknowingly introduced the logic error through channel file 291, resulting in the crash of the Falcon sensor. Subsequently, the Windows OS with which it was integrated crashed as well. However, the logic error was not present in all versions of channel file 291. As soon as CrowdStrike identified the error, a different version with the fix was introduced. Regardless, the reversal of the initial update came too late for certain users as they had already updated. This led to the system crash and BSOD screens.
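To make this failure mode more concrete, here is a minimal Python sketch of how a component that consumes configuration updates might validate a file before acting on it, and fall back to the last known good configuration if the check fails. It is not CrowdStrike's code: the `ChannelUpdate` structure, the `EXPECTED_FIELDS` count, and the function names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical number of fields the consumer was built to handle.
EXPECTED_FIELDS = 4


@dataclass
class ChannelUpdate:
    """Hypothetical stand-in for a configuration update."""
    version: int
    fields: List[str]


def validate(update: ChannelUpdate) -> Optional[str]:
    """Return an error message if the update is malformed, else None."""
    if update.version <= 0:
        return "version must be positive"
    if len(update.fields) != EXPECTED_FIELDS:
        return f"expected {EXPECTED_FIELDS} fields, got {len(update.fields)}"
    if any(not field for field in update.fields):
        return "fields must not be empty"
    return None


def apply_update(current: ChannelUpdate, candidate: ChannelUpdate) -> ChannelUpdate:
    """Keep the current configuration when the candidate fails validation."""
    error = validate(candidate)
    if error is not None:
        print(f"rejected update v{candidate.version}: {error}")
        return current
    return candidate
```

The design choice here is that a malformed update is rejected and reported, while the system keeps running on the previous configuration instead of crashing.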
CrowdStrike released a Post Incident Review (PIR) on July 24, 2024, followed by a detailed 12-page root cause analysis report on August 6, 2024.
According to Microsoft, 8.5 million devices running the Windows OS were affected by the CrowdStrike outage. Surprisingly, that is less than 1% of all Windows devices worldwide. Although only a small percentage of systems were affected, the operations they carried out were critical. Some of the services and businesses that the outage affected are described below:
Financial institutions and banking systems functioning online were affected by the outage. Several payment platforms were directly affected as well. There were also individuals who did not receive their paychecks due to banking services coming to a halt.
More than 10,000 flights were canceled and significant delays were reported around the world as the outage grounded several flights. In the US alone, the airlines that were affected include United, American Airlines, and Delta. Globally, multiple airlines and airports were forced to cancel flights until systems were back online.
Public transit in various cities, such as Minneapolis, New York City, Chicago, Washington DC, and Cincinnati, was affected by the outage.
Several media and broadcast outlets went off air due to the outage, including the British broadcaster Sky News.
Hospitals and clinics faced significant challenges in their appointment systems and networks due to the outage. This led to delays and cancellations in appointments. A few states, including New Hampshire, Alaska, and Indiana, reported that their 911 emergency services were also affected.
Other negative consequences of the CrowdStrike outage are the legal repercussions that followed. These include the following:
The outage cost Delta Air Lines approximately USD 500 million. The airline reportedly said that it would seek damages from CrowdStrike, since the outage caused immense disruption to its operations. CrowdStrike, however, rejected all claims of gross negligence. It struck back at the airline and further stated that Delta had neither modernized nor updated its security infrastructure.
This lawsuit alleges that the claims made by CrowdStrike about the adequacy of its security software were misleading and false. It further claims that the share price of CrowdStrike declined after the outage. The class action suit seeks damages on behalf of CrowdStrike’s investors who held shares from November 29, 2023 to July 29, 2024.
CrowdStrike aims to prevent further outages like these by:
CrowdStrike was able to identify the error within 79 minutes and deploy another update to fix it. Regardless, the recovery process for businesses was long and complicated. Another issue was that, once the faulty update had been installed, Windows would trigger the BSOD, which kept the system inoperative for longer. Some affected systems were brought back into operation manually by IT operators, while other businesses took a few days to deploy the fix.
Outages on a scale such as this can be alarming and have several negative consequences. This makes it crucial to prevent them in the long run. Below are a few tips to reduce and mitigate the risk of such outages that impact businesses across the globe:
The CrowdStrike outage shed light on the pervasive vulnerabilities that exist in our modern-day technologies and on the reliance we place on them. Manual procedures, alongside automated processes and system backups, significantly improve business continuity during such outages. Tech organizations, especially the ones that have a global and critical presence, must test all updates before they are deployed to production, develop and document manual procedures, and perform disaster recovery and business continuity planning at regular intervals.
Imagine IT, a leader in technology and cybersecurity solutions, plays an important role in helping businesses navigate the complexities of digital disruptions, such as the recent CrowdStrike outage. With a focus on proactive IT management and innovative strategies, Imagine IT enables organizations to build resilient systems that can withstand unexpected challenges.
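One common way to limit the reach of a bad update is a staged (canary) rollout, in which an update goes to a small share of machines first and is widened only if those machines stay healthy. The Python sketch below illustrates the idea under stated assumptions: the stage sizes, the failure threshold, and the `deploy` callback are hypothetical and do not describe any vendor's actual process.

```python
from typing import Callable, List, Sequence

# Hypothetical rollout stages: cumulative share of the fleet at each step.
STAGES = (0.01, 0.10, 0.50, 1.00)
# Hypothetical limit: halt if more than 2% of a batch fails its health check.
MAX_FAILURE_RATE = 0.02


def staged_rollout(hosts: Sequence[str], deploy: Callable[[str], bool]) -> List[str]:
    """Deploy to hosts stage by stage; return the hosts that received the update.

    `deploy` applies the update to one host and returns True if the host
    is healthy afterwards. The rollout stops after the first unhealthy stage.
    """
    updated: List[str] = []
    done = 0
    for share in STAGES:
        # Cumulative number of hosts that should be updated after this stage.
        target = max(1, int(len(hosts) * share))
        batch = hosts[done:target]
        if not batch:
            continue
        failures = sum(0 if deploy(host) else 1 for host in batch)
        updated.extend(batch)
        done = target
        if failures / len(batch) > MAX_FAILURE_RATE:
            print(f"halting rollout at {share:.0%}: {failures} of {len(batch)} hosts failed")
            break
    return updated
```

With a fleet of 100 hosts and an update that fails everywhere, only the first stage (one host) receives the update before the rollout halts, and the remaining 99 hosts are never touched.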
Our approach includes robust cybersecurity measures, efficient data management, and reliable support services. They are designed to minimize the impact of outages and ensure business continuity. Contact Imagine IT today to better prepare for and respond to incidents like the CrowdStrike outage, safeguard your operations, and maintain customer trust.