Microsoft CrowdStrike outage – A sneak peek

Shruti Govil July 22, 2024
Updated 2024/07/22 at 2:05 PM

In the early hours (IST) of July 19, reports emerged that Microsoft’s Azure cloud service was facing an outage, affecting users in the Central U.S. area. Within the next few hours, the service outage spread like wildfire to several other countries, including India, disrupting flight operations and air traffic, forcing airports to shift to manual operations. Brokerages and stock exchanges were also hit, throwing the digital lives of many out of gear.

The Indian Computer Emergency Response Team, CERT-In, has issued a severity rating of “Critical” for the incident.

What caused the Microsoft outage?

Microsoft, while acknowledging the outage in a blog post, noted that “Virtual Machines running Windows Client and Windows Server, running the CrowdStrike Falcon agent, may encounter a bug check”.

The tech giant estimated the approximate time of impact to be as early as 4.09 a.m. UTC (9.39 a.m. IST) on July 19, when this update started rolling out.
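The UTC-to-IST conversion quoted above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and IST is modelled as a fixed UTC+05:30 offset rather than looked up from a time zone database.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed offset of UTC+05:30 (India does not observe daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

# Approximate start of impact reported by Microsoft: 04:09 UTC on July 19, 2024.
impact_utc = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)

# Express the same instant in Indian Standard Time.
impact_ist = impact_utc.astimezone(IST)

print(impact_ist.strftime("%Y-%m-%d %H:%M %Z"))
```

Running it prints the IST equivalent of the reported time, which should match the 9.39 a.m. figure given in the article.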

While Microsoft has not fully revealed what caused the outage on its cloud service, one incident seems to have triggered this cyber event: a glitch in a software update to CrowdStrike’s endpoint protection programme, Falcon Sensor.

CrowdStrike is a cybersecurity firm that deploys a unified security programme to stop breaches in real time. Its Falcon Sensor runs with high privileges and is built to protect endpoints (basically, any devices connected to a computer network). A fault in such software can crash the operating system, as several users worldwide experienced on July 19 when they were met with the Blue Screen of Death (BSOD). Once the BSOD appears, an affected device can get caught in a boot loop, crashing again each time it restarts, which means users cannot access devices running CrowdStrike’s Falcon platform. This is because the sensor runs as a kernel-level driver that loads early in the Windows startup process, not as an ordinary application.

What does BSOD mean?

The BSOD is an error screen, shown on a blue background, that Windows displays when it is forced to halt operations. This is what many users saw on July 19 when they tried to access their affected devices during the outage.

Officially referred to as the “Stop Error”, the warning is issued when a critical problem forces Windows to reboot. Before rebooting the system, the Windows operating system saves a file on the computer, carrying some data about the error. This file is called a ‘minidump’ and is crucial in determining the cause of the error.

BSODs have many possible causes, including faulty hardware drivers and incompatible apps or programs whose conflicts crash the system.

Additionally, faults in hardware such as the RAM, hard disk drive (HDD), solid-state drive (SSD), motherboard, or other physical components in a system can also lead to a BSOD.

Malware injected by threat actors could also corrupt system files in a computer, causing it to show the Blue Screen of Death.

To fix the error, it is essential to identify the cause and troubleshoot the issue based on the alphanumeric code shown in the message.
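As a rough illustration of how the code on the screen narrows down the cause, the sketch below maps a handful of well-known Windows stop codes to their symbolic names. The names follow Microsoft's bug check reference; the "first place to look" hints are a simplifying assumption made for illustration, not an official classification, and `describe_stop_code` is a hypothetical helper.

```python
# A few well-known Windows bug check (stop) codes and their symbolic names.
# The second element of each tuple is a simplified hint, assumed for illustration.
STOP_CODES = {
    0x0000000A: ("IRQL_NOT_LESS_OR_EQUAL", "driver or memory"),
    0x0000001A: ("MEMORY_MANAGEMENT", "memory"),
    0x0000003B: ("SYSTEM_SERVICE_EXCEPTION", "driver or system service"),
    0x00000050: ("PAGE_FAULT_IN_NONPAGED_AREA", "driver or memory"),
    0x0000007E: ("SYSTEM_THREAD_EXCEPTION_NOT_HANDLED", "driver"),
    0x000000D1: ("DRIVER_IRQL_NOT_LESS_OR_EQUAL", "driver"),
    0x000000EF: ("CRITICAL_PROCESS_DIED", "system process"),
}


def describe_stop_code(code: str) -> str:
    """Return a one-line description for a hexadecimal stop code string."""
    value = int(code, 16)  # accepts "0x50" as well as "50"
    name, hint = STOP_CODES.get(value, ("UNKNOWN", "unknown"))
    return f"0x{value:08X}: {name} (first place to look: {hint})"


print(describe_stop_code("0x50"))
print(describe_stop_code("0xD1"))
```

A real diagnosis would go further and inspect the minidump file itself, but a lookup like this is usually the first step in deciding whether to suspect a driver, memory, or something else.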

What could have caused this malfunction?

According to CyberArk’s CIO, Omer Grossman, there is a range of possibilities, starting from “human error – for instance, a developer who downloaded an update without sufficient quality control – to the complex and intriguing scenario of a deep cyberattack, prepared ahead of time and involving an attacker activating a ‘doomsday command’ or ‘kill switch’.”

And it is anyone’s guess until CrowdStrike’s own analysis and updates are out in the coming days. Most cybersecurity experts will be keen on understanding what could have gone wrong here.

Alternatively, the software update made by CrowdStrike could have conflicted with the changes introduced in the latest Windows update, CYFIRMA’s CEO, Kumar Ritesh, pointed out.

The latter could be a good area to probe, as other cloud service providers, like Google Cloud or Amazon Web Services (AWS), did not suffer any outage. It is also important to note that both Google and Amazon have built their cloud platforms on Linux.

How did the outage impact people?

Thousands of users turned on their devices to find the BSOD, while many travellers faced delays and disruptions at airports.

In the past when businesses faced such outages and cybersecurity breaches, often carried out by foreign attackers, Microsoft addressed them with confident optimism. However, on July 19, as users across the world struggled with simple tasks such as making digital payments or found themselves stranded in airports, there was growing fear and frustration. Computer emergency response teams worldwide quickly tried to ascertain if the IT outage was the work of cyber-criminals or even state-backed hackers. The outage seriously impacted Microsoft’s users ranging from airports, airlines, financial institutions, and hospitals down to office workers and casual Internet users trying to log into their Microsoft apps or devices.

Traders and investors in India complained that their transactions were not being processed, while airports and airlines moved back to manual processes, such as hand-written boarding passes, even as flights were delayed. Some hospitals also reported disruptions, raising concerns that patient data could be lost and that crucial treatments might be delayed.

Downdetector recorded spiking outage reports from different parts of the world, with complaints surrounding Microsoft’s login, Outlook, server, and app experiences. In India, many major cities such as Chennai, Bengaluru, Delhi, and Mumbai were impacted, per the platform.

What is the current status of the outage?

The situation is gradually returning to normal, with CrowdStrike sharing that a fix has been deployed. Both U.S. and Indian airlines confirmed that they were working to get passengers to their destinations.

“CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed,” said CrowdStrike CEO George Kurtz as part of a longer post on X.

However, with even the U.S. White House tracking the situation, both Microsoft and CrowdStrike will have many difficult questions to answer in the coming days.
