By March 2025, firms in scope of PS21/3: Building operational resilience must make sure they can deliver important business services in “severe but plausible scenarios” – like the CrowdStrike outage – to help minimize the impact on consumers and markets.
The FCA has outlined key lessons following this incident, including examples of how firms’ compliance with PS21/3 allowed them to respond effectively, and areas firms should strengthen.
On July 19, 2024, CrowdStrike released a Falcon content update for Microsoft Windows hosts that contained a defect, causing affected systems to crash. Many firms use CrowdStrike for device protection, threat intelligence and response services. CrowdStrike’s core technology, the Falcon Platform, detects and responds to malicious threats.
The FCA saw varying degrees of operational impact on regulated firms, with no one sector more affected than others, and minimal consumer harm.
General observations
By acting in accordance with the FCA’s operational resilience rules, firms were able to identify consumer and market impacts and prioritize important business services.
The FCA found:
- Firms that had mapped their important business services and resources were able to prioritize getting vital services back online, reducing the overall impact of the incident on their operations.
- Firms that had tested “severe but plausible” scenarios saw beneficial outcomes.
- Firms with clearly defined and tested communications strategies were able to quickly and efficiently communicate with customers and stakeholders.
How firms responded
Infrastructure resilience
The FCA reported that firms recognized that it was essential “to identify single points of failure within their infrastructure and technology stack, and identify the changes, investment and actions needed to ensure resilience of these.”
Furthermore, firms considered “various ways to ensure resilience in their infrastructure, such as procuring systems on different builds and devices with different Operating Systems (OS), and some have considered updating change management processes for third parties with deep-level system access.” Also, some firms identified the need to review change management processes for software and content updates.
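As an illustration of what identifying single points of failure can look like in practice, here is a minimal Python sketch. It assumes a firm already holds a map from each important business service to the resources it needs, where each requirement lists the interchangeable resources that can satisfy it. The service names, resource names and the single_points_of_failure function are all hypothetical and do not come from the FCA.

```python
from collections import defaultdict

# Hypothetical map: service -> list of requirements. Each requirement is the
# set of interchangeable resources that can satisfy it.
SERVICE_MAP = {
    "payments": [{"windows-fleet-a"}, {"db-primary", "db-replica"}],
    "customer-portal": [{"windows-fleet-a", "linux-fleet-b"},
                        {"db-primary", "db-replica"}],
    "trade-reporting": [{"windows-fleet-a"}, {"vendor-gateway"}],
}


def single_points_of_failure(service_map):
    """Return {resource: [services]} for resources that have no alternative.

    A resource counts as a single point of failure for a service when it is
    the only option listed for one of that service's requirements.
    """
    spofs = defaultdict(set)
    for service, requirements in service_map.items():
        for alternatives in requirements:
            if len(alternatives) == 1:
                (resource,) = alternatives  # unpack the only element
                spofs[resource].add(service)
    return {resource: sorted(services) for resource, services in spofs.items()}


if __name__ == "__main__":
    for resource, services in single_points_of_failure(SERVICE_MAP).items():
        print(f"{resource}: sole dependency for {', '.join(services)}")
```

In this made-up data, the customer portal can fall back to a second fleet running a different operating system, so the Windows fleet is flagged only for the two services that have no such alternative.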
Third party management
The FCA found that some regulated firms affected by the outage also provided services that supported other regulated firms’ important business services. This increased the impact of the disruption. However, firms that had conducted detailed mapping of third and nth party relationships were able to quickly understand exposure and take mitigating actions to manage the impact.
The FCA also noted that firms with existing relationships and pathways for sharing information with third party providers were able to respond more quickly during the outage.
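The mapping of third and nth party relationships described above can be sketched as a walk over a dependency graph. The following Python example is only an illustration under assumed data: the DEPENDS_ON dictionary, the party names and the party_depths function are hypothetical. It uses a breadth-first search to record how many steps away each provider sits from the firm, so that a depth of 1 is a direct supplier and a depth of 2 or more is a supplier's supplier.

```python
from collections import deque

# Hypothetical supplier graph: party -> the providers it relies on directly.
DEPENDS_ON = {
    "our-firm": ["payment-processor", "cloud-host"],
    "payment-processor": ["endpoint-security-vendor"],
    "cloud-host": [],
    "endpoint-security-vendor": [],
}


def party_depths(depends_on, firm):
    """Breadth-first search returning {provider: shortest distance from firm}."""
    depths = {}
    seen = {firm}
    queue = deque([(firm, 0)])
    while queue:
        party, depth = queue.popleft()
        for provider in depends_on.get(party, []):
            if provider not in seen:
                seen.add(provider)
                depths[provider] = depth + 1
                queue.append((provider, depth + 1))
    return depths


def is_exposed(depends_on, firm, vendor):
    """True if the vendor appears anywhere in the firm's supply chain."""
    return vendor in party_depths(depends_on, firm)


if __name__ == "__main__":
    print(party_depths(DEPENDS_ON, "our-firm"))
    print(is_exposed(DEPENDS_ON, "our-firm", "endpoint-security-vendor"))
```

Here the firm has no direct contract with the endpoint security vendor, yet the search still surfaces it at depth 2 through the payment processor, which is the kind of indirect exposure the FCA describes.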
Incident communications
The FCA said that firms, when reflecting on the incident, acknowledged “the need to ensure staff and management are aware of, and familiar with, incident response and crisis management processes. Aligned to this, firms reflected on the importance of ensuring stakeholder contact details are updated and readily available (online and offline).”
The FCA found that the timing and completeness of incident notifications to the regulator varied widely across affected firms. However, effective engagement “was timely and clearly defined the impact of the incident on the firm’s important business services.”
It was noted that firms that had pre-defined communication plans were able to respond more quickly.
FCA advice
The FCA made several recommendations, including that firms should:
- ensure current testing scenarios are adequate to minimize operational disruptions;
- ensure adequate testing of updates before deployment and consider phasing releases across user groups to contain potential failures;
- review third-party management frameworks regularly, especially after significant events or incidents;
- use pre-approved communication templates, which make communications more efficient;
- ensure third-party contracts clearly set out responsibilities for service monitoring, incident notification and timely updates, both during and after incidents.
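The recommendation to phase releases across user groups can be sketched as a ring-based rollout that stops as soon as one group shows too many failures. This Python example is a simplified illustration, not a description of any vendor's or firm's actual process. The phased_rollout function, the ring names, the 5% threshold and the deploy_and_check callback are all assumptions made for the example.

```python
def phased_rollout(rings, deploy_and_check, max_failure_rate=0.05):
    """Deploy ring by ring, halting when a ring's failure rate is too high.

    rings: ordered list of (ring_name, [host, ...]) pairs, smallest first.
    deploy_and_check: hypothetical callback that applies the update to one
        host and returns True if the host is healthy afterwards.
    """
    completed = []
    for name, hosts in rings:
        failures = sum(1 for host in hosts if not deploy_and_check(host))
        rate = failures / len(hosts) if hosts else 0.0
        if rate > max_failure_rate:
            # Stop here so later, larger rings never receive the update.
            return {"completed": completed, "halted_at": name,
                    "failure_rate": rate}
        completed.append(name)
    return {"completed": completed, "halted_at": None, "failure_rate": 0.0}


if __name__ == "__main__":
    rings = [
        ("canary", ["c1", "c2"]),
        ("early-adopters", ["e1", "e2", "e3", "e4"]),
        ("everyone-else", ["h1", "h2", "h3", "h4", "h5", "h6"]),
    ]
    # Stand-in health check: pretend the update breaks hosts named "e*".
    print(phased_rollout(rings, lambda host: not host.startswith("e")))
```

The design choice is to order rings from smallest to largest, so a defective update is contained to a small group of hosts before it can reach the rest.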
The FCA also suggested that firms should consider conducting a post-incident review following a significant disruption or any event that affects the market. In the FCA’s words, “this would include a review of the overall effects to determine if any changes are needed to your important business services or impact tolerances, for example, the need to classify a service as an important business service, or revise impact tolerances.”
The FCA encourages all firms to consider these lessons and advice to improve their ability to respond to, and recover from, future disruptions.