Operational Resilience That Delivers When It Matters Most
- Julien Haye
- Jan 23, 2023
Updated: Sep 7

Introduction: When Resilience Fails, Boards Are the First to Answer
In the past five years, operational disruptions have wiped out billions in market value, forced public apologies from CEOs, and triggered direct regulatory intervention to protect the financial system. The 2018 TSB IT migration failure cost more than £400 million, caused weeks of customer disruption, and became the subject of a parliamentary inquiry. The 2021 Colonial Pipeline cyberattack shut down critical fuel supplies across the US East Coast, escalating to a national security emergency within hours.
For boards, the stakes are personal as well as organisational. Under the UK’s Senior Managers and Certification Regime (SMCR), the EU’s Digital Operational Resilience Act (DORA), and similar frameworks worldwide, directors are expected to oversee and assure the resilience of critical services. Failure to do so can result in personal accountability, regulatory sanction, and permanent reputational damage.
“The operational resilience policy is outcomes-based. There are many roads to resilience.” (Duncan Mackinnon, Executive Director, Prudential Regulation Authority)
Organisations need the ability to continue delivering their most important services when systems fail, information is incomplete, and multiple problems demand attention at once.
Operational resilience is the capacity to anticipate disruption, protect what matters most, maintain continuity under pressure, and adapt as conditions change. It is as much about safeguarding customers and markets as it is about protecting the organisation’s own viability.
This article examines operational resilience from a strategic, regulatory, and organisational perspective. It brings together the priorities of boards, CROs, and operational leaders, and outlines practical tools, governance approaches, and a phased roadmap to move from compliance to competitive advantage.
Operational Resilience: A definition
I like the definition used by Megan Butler, FCA Executive Director of Supervision – Investment, Wholesale and Specialists:
“We define operational resilience as the ability of firms and FMIs and the financial sector as a whole to prevent, adapt, respond to, recover and learn from operational disruptions.”
So, building resilient firms is about considering both prevention and recovery. It is about a sustainable business model that aligns scalability with resilience and efficiency.
It is also about looking at both financial and operational resilience. The former has long been a focus of regulators through capital and liquidity management, and I will not cover it in any detail in this series of articles. Operational resilience, on the other hand, is a fairly new area of policy, though firms should already have implemented many of the requirements by now. It came about after multiple major operational incidents in the banking sector (see next section), when regulators realised a simple truth:
It is always possible to bail out a financial firm if it runs out of cash, even though this is not a desired outcome. But it is not possible to “bail out” a firm that is unable to run its operations. There is no substitute for a failed database somewhere in a datacentre of a major bank; it has to be fixed by the same firm.
Regulatory Landscape and Industry Standards
The UK has been a leading jurisdiction in shaping operational resilience policy.
The UK regulators’ outcomes-based approach recognises that each firm’s business model, service portfolio, and operating environment are unique, and that resilience should be judged by whether critical services can be maintained, not by rigidly prescribed processes.
The UK Treasury Select Committee began pressing for stronger resilience oversight after a series of high-profile disruptions in the financial sector. The April 2018 TSB IT migration failure became a catalyst, accelerating regulatory development and reinforcing the need for board-level accountability over operational continuity.

Today, multiple jurisdictions have formalised operational resilience expectations:
UK FCA/PRA Rules – Require identification of important business services (IBS), mapping dependencies, setting and testing impact tolerances, and ensuring recovery within tolerance levels in severe but plausible scenarios.
EU Digital Operational Resilience Act (DORA) – Extends resilience requirements to all financial entities and their ICT service providers, including critical third parties. Focus areas include ICT risk management, incident reporting, and testing frameworks.
Basel Committee Principles – Global guidance on operational resilience, emphasising governance, risk management, incident response, and interconnections with financial resilience.
US OCC/FDIC Guidance – Focuses on business continuity, third-party risk, and cyber resilience, with increasing emphasis on scenario testing and impact assessment.
Across these frameworks, several core requirements consistently emerge:
Critical Business Services – Identify and prioritise the services that, if disrupted, would cause intolerable harm to customers or threaten market stability.
Impact Tolerance – Define measurable thresholds for maximum tolerable disruption and design operations to stay within them.
Scenario Testing – Plan and execute severe but plausible scenarios to validate recovery capabilities.
Dependency Mapping – Understand the technology, people, processes, and third-party relationships each critical service relies on.
For many organisations, the challenge is not just meeting one regulator’s rules but operating across multiple regimes with different terminology, timelines, and reporting expectations. This cross-jurisdiction harmonisation requires a clear governance model, consistent definitions, and integrated reporting so that resilience is embedded globally while still satisfying local requirements.
Core Pillars of Operational Resilience
Operational resilience is fundamentally an outcome: the ability to keep delivering your most important services even in the face of a severe operational incident, so that harm to consumers is prevented. This comes hand-in-hand with orderly wind-down planning. Achieving this requires a set of interconnected organisational capabilities that work together across prevention, recovery, culture, and governance, all underpinned by a clear strategic direction.
Identification of Critical Services and Dependencies
The starting point is mapping the Important Business Services (IBS) that, if disrupted, could cause intolerable harm to customers or threaten market stability. This requires an end-to-end view of how services are delivered, including technology platforms, operational processes, physical sites, and people.
Every dependency must be understood:
Technology: infrastructure, applications, and data
People: critical roles and skills, succession planning
Processes: key operational workflows and control points
Third parties: suppliers, outsourcing partners, and critical service providers
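One way to make such a map usable is to hold it as structured data and query it for dependencies that several services share, since those are candidate single points of failure. The sketch below is illustrative only: the service names, dependency names, and the `shared_dependencies` helper are hypothetical and not taken from any regulatory template.

```python
from collections import defaultdict

# Hypothetical services and dependencies, for illustration only.
DEPENDENCIES = {
    "retail payments": {
        "technology": ["core banking platform", "payments gateway"],
        "people": ["payments operations team"],
        "processes": ["payment screening"],
        "third_parties": ["cloud provider A"],
    },
    "online account access": {
        "technology": ["core banking platform", "identity service"],
        "people": ["service desk"],
        "processes": ["customer authentication"],
        "third_parties": ["cloud provider A"],
    },
}


def shared_dependencies(dependency_map):
    """Return dependencies used by more than one service, with the services using them."""
    users = defaultdict(set)
    for service, categories in dependency_map.items():
        for category, items in categories.items():
            for item in items:
                users[(category, item)].add(service)
    return {key: sorted(services) for key, services in users.items() if len(services) > 1}


for (category, item), services in sorted(shared_dependencies(DEPENDENCIES).items()):
    print(f"{category}: {item} supports {', '.join(services)}")
```

In this made-up example the core banking platform and the cloud provider each support both services, so a failure in either would disrupt both at once.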
Setting and Testing Impact Tolerances
Impact tolerance is the measurable threshold of disruption an organisation is prepared to accept for a critical service. Setting impact tolerances involves:
Defining maximum tolerable outage durations
Agreeing recovery objectives with leadership and the board
Considering regulatory expectations, customer needs, and market stability
These tolerances are not theoretical. They become the benchmark against which resilience capabilities are tested and measured.
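As a minimal sketch of how a tolerance becomes a benchmark, the snippet below compares the recovery time observed in a test with the maximum tolerable outage for the same service. The class names, the `assess` function, and the figures in the usage note are assumptions made for illustration. Real tolerances may also be expressed in other metrics, such as the number of customers affected.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ImpactTolerance:
    service: str
    max_outage: timedelta  # maximum tolerable outage agreed with the board


@dataclass(frozen=True)
class ScenarioResult:
    service: str
    scenario: str
    time_to_recover: timedelta  # recovery time observed in the test


def assess(tolerance, result):
    """Compare one test result with the tolerance set for the same service."""
    if tolerance.service != result.service:
        raise ValueError("tolerance and test result refer to different services")
    headroom = tolerance.max_outage - result.time_to_recover
    return {"within_tolerance": headroom >= timedelta(0), "headroom": headroom}
```

For example, with a hypothetical 24-hour tolerance, a test that recovers in 18 hours leaves six hours of headroom, while one that takes 30 hours is a breach that needs remediation.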
Scenario Testing
Resilience is proven under pressure, not on paper. Severe but plausible scenarios are designed to stress the organisation’s recovery capabilities and identify hidden vulnerabilities.
Examples include:
Technology outages or data centre failures
Cyber attacks or ransomware incidents
Simultaneous compound events combining multiple stressors
Testing is only effective if it covers both technical recovery and decision-making under pressure, including escalation, communication, and governance pathways.
Prevention and Recovery Capabilities
Resilience depends on both sets of capabilities:
Prevention: robust cyber security, disciplined change management, proactive vendor oversight
Recovery: incident response planning, crisis management playbooks, and tested internal and external communication strategies
Communication and Escalation Protocols
Clear, actionable escalation routes ensure the right people are mobilised quickly. This includes:
Defined roles and responsibilities across business, risk, technology, and operations
Board and executive notification thresholds
Pre-approved stakeholder and regulatory communication templates
Culture, Strategy, and Oversight as Enablers
Culture: resilience is embedded when employees feel safe to escalate issues early, leadership demonstrates accountability, and resilience is valued in decision-making
Strategy: the organisational blueprint must align with resilience objectives, for example through location strategy, simplification initiatives, or digitalisation programmes
Oversight: board and executive committees must assure themselves that prevention and recovery mechanisms are effective, supported by policies, standards, and risk management tools
Integration with Broader Risk Management
Operational resilience cannot exist in isolation from the wider risk management framework. It must be connected to the way the organisation identifies, assesses, and manages risk across all levels. This connection ensures that resilience is not just a compliance requirement but a strategic capability that informs investment, transformation, and day-to-day decision-making.
Link to Enterprise Risk Management (ERM)
Resilience insights should feed directly into the organisation’s ERM processes. This means:
Aligning resilience activities with the overall risk taxonomy and reporting structures
Using resilience testing results to update risk registers and inform the risk profile
Including operational resilience scenarios in the risk assessment cycle
Alignment with Risk Appetite, Capacity, and Tolerance
Resilience objectives must be consistent with the organisation’s risk appetite statement, capacity to absorb disruption, and specific tolerance thresholds. This alignment helps ensure:
Impact tolerances are realistic given available resources and operational capabilities
Board and executive oversight focuses on the right recovery targets
Escalation triggers are clear and linked to pre-defined governance pathways
Informing Strategic Investment and Transformation Decisions
Operational resilience analysis should be a routine input to strategic planning. This includes:
Assessing resilience implications of major change programmes, acquisitions, or market entries
Prioritising investments that address the most critical vulnerabilities
Evaluating whether proposed efficiency measures might erode resilience capacity
Embedding in Governance and Reporting
For resilience to be sustainable it must be part of business-as-usual governance and reporting:
Integrating resilience metrics into board and executive dashboards
Including resilience readiness in internal audit and assurance programmes
Ensuring committees with risk oversight explicitly review resilience capabilities and gaps
The Role of Third Parties and Supply Chains
No organisation operates in complete isolation. Critical services often depend on third parties and complex supply chains, which means resilience is only as strong as the weakest link in those external relationships.
Understanding Third-Party Dependencies
The first step is to identify all third parties that support the delivery of important business services. This includes:
Direct outsourcing arrangements for core operational processes
Technology providers such as cloud hosting, data centres, and software platforms
Key suppliers that deliver physical goods or specialised services
Subcontractors used by your main suppliers, which may create hidden dependencies
Assessing Resilience of Third Parties
Once dependencies are mapped, the resilience of each provider must be assessed. This involves:
Reviewing their business continuity and disaster recovery plans
Understanding their own dependency chains and concentration risks
Evaluating their ability to operate within your organisation’s impact tolerances
Conducting due diligence at onboarding and regular assurance thereafter
Exit Strategies for Critical Third Parties
Regulators increasingly require firms to have credible exit strategies for critical third-party providers. This means:
Identifying alternative providers in advance
Establishing contractual rights to transition services if performance or resilience fails
Testing the feasibility of switching providers within the required timeframes
Ensuring data portability and interoperability between systems
Regulatory Requirements and Standards
Frameworks such as DORA in the EU and the UK’s Critical Third Party regime set explicit expectations for managing third-party resilience. Common requirements include:
Reporting on critical third-party arrangements
Stress-testing operational scenarios that involve third-party disruption
Demonstrating contractual provisions for resilience, oversight, and termination
Coordinating with regulators on systemic provider risks
Integrating Third-Party Oversight into Operational Resilience
Third-party and supply chain resilience must be part of the broader operational resilience framework:
Including third-party dependencies in impact tolerance setting and scenario testing
Aligning vendor management with the organisation’s resilience governance structures
Ensuring board and executive oversight includes external dependency risks
Technology and Cyber Resilience
Technology is the backbone of most critical business services, and its resilience is central to an organisation’s ability to operate through disruption. This applies not only to internal systems but also to the external technology services and platforms that organisations rely on.
IT Infrastructure Continuity
Ensuring that core infrastructure can continue operating during a disruption is essential. This involves:
Building redundancy into data centres, network connections, and hardware
Maintaining capacity to shift operations to alternative sites or systems
Regularly testing failover and recovery processes to validate they work in practice
Cloud Service Provider Risks
Cloud platforms offer scalability and flexibility but introduce new resilience challenges. These include:
Concentration risk when multiple critical services rely on the same provider
Data sovereignty issues if services span multiple jurisdictions
Dependency on the provider’s own resilience and incident response capabilities
Effective management requires a clear understanding of these risks, strong contractual provisions, and contingency plans for switching providers if necessary.
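Concentration risk can be given a rough first measure by counting how many critical services sit with each provider. The sketch below assumes a simple one-provider-per-service mapping and a 50% threshold. Both are hypothetical simplifications, as is every name in the code.

```python
from collections import Counter

# Hypothetical mapping of critical services to their primary cloud provider.
SERVICE_PROVIDERS = {
    "retail payments": "provider A",
    "online account access": "provider A",
    "trade settlement": "provider B",
    "customer onboarding": "provider A",
}


def provider_concentration(service_providers):
    """Return each provider's share of critical services, largest first."""
    counts = Counter(service_providers.values())
    total = len(service_providers)
    return [(provider, count / total) for provider, count in counts.most_common()]


def flag_concentration(service_providers, threshold=0.5):
    """List providers whose share of critical services exceeds the threshold."""
    return [p for p, share in provider_concentration(service_providers) if share > threshold]
```

Here provider A carries three of the four services, so it would be flagged for closer review of contractual provisions and switching plans.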
Cybersecurity Preparedness
Cyber threats are one of the most common and severe triggers of operational disruption. Building cyber resilience requires:
Continuous monitoring of networks and systems for potential breaches
Incident detection and response procedures that can contain and recover from attacks quickly
Employee awareness and training to reduce human error vulnerabilities
Alignment with recognised frameworks such as the NIST Cybersecurity Framework or ISO 27001
Data Integrity and Recovery
The ability to protect and recover data is essential to both operational and regulatory requirements. This includes:
Regular, secure backups stored in separate locations
Verification processes to ensure backups are complete and usable
Recovery time and recovery point objectives aligned with impact tolerances
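The alignment in the last point can be checked mechanically. The sketch below covers only two checks and is not a full assessment. It relies on two standard relationships: the recovery time objective (RTO) should not exceed the impact tolerance, and the interval between backups should not exceed the recovery point objective (RPO), because everything since the last backup can be lost. The function name and the example figures are hypothetical.

```python
from datetime import timedelta


def check_recovery_objectives(impact_tolerance, rto, rpo, backup_interval):
    """Return a list of findings; an empty list means the objectives are consistent."""
    findings = []
    if rto > impact_tolerance:
        findings.append("RTO exceeds the impact tolerance")
    if backup_interval > rpo:
        findings.append("backup interval exceeds the RPO")
    return findings


# Hypothetical figures: 24-hour tolerance, 8-hour RTO, 1-hour RPO, backups every 4 hours.
print(check_recovery_objectives(
    impact_tolerance=timedelta(hours=24),
    rto=timedelta(hours=8),
    rpo=timedelta(hours=1),
    backup_interval=timedelta(hours=4),
))
```

With these figures the RTO is comfortably inside the tolerance, but four-hourly backups cannot support a one-hour RPO, so the check reports that single finding.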
It is also important to check that data recovery is part of the package when engaging with a cloud provider. We are aware of instances where the main
Regulatory Expectations for Technology Resilience
Many regulators now set explicit expectations for technology resilience. Common requirements include:
Testing technology failure scenarios as part of operational resilience programmes
Demonstrating the ability to recover critical services within defined timeframes
Documenting and evidencing technology risk assessments and mitigation measures
Technology and cyber resilience are not stand-alone initiatives. They must be embedded within the wider operational resilience framework, linked to critical service mapping, impact tolerance setting, and scenario testing. This ensures technology capabilities are tested in realistic conditions and are aligned with the organisation’s overall resilience objectives.
Cultural and Leadership Enablers
Technology, processes, and frameworks provide the structure for operational resilience, but it is people who determine whether those capabilities work in practice. A resilient organisation is built on a culture that values preparedness, encourages transparency, and empowers decision-making at every level.
Leadership Accountability
Resilience starts at the top. Boards and executives must take ownership of resilience outcomes by:
Setting a clear vision for resilience and linking it to strategic objectives
Allocating resources to maintain and strengthen resilience capabilities
Holding senior managers accountable through defined responsibilities and performance measures
Ensuring resilience expectations are built into leadership scorecards and appraisal processes
Psychological Safety and Early Escalation
A resilient culture requires that employees feel safe to raise concerns without fear of blame or negative consequences. This is achieved by:
Encouraging open reporting of vulnerabilities and near misses
Rewarding proactive problem identification
Training leaders to respond constructively to bad news
Early escalation allows issues to be addressed before they escalate into major disruptions.
The Risk Within provides a roadmap for embedding psychological safety into risk management. It identifies critical touch points across the risk lifecycle and offers clear actions to align leadership, culture, and governance. It is designed to help risk functions integrate more deeply into the business and strengthen decision-making at every level.
Cross-Functional Collaboration
Operational resilience cannot be delivered by a single team. Business, risk, operations, technology, compliance, and supplier management functions must work together to:
Share information on potential risks and emerging threats
Align recovery priorities and resource allocation during incidents
Coordinate resilience activities across the entire organisation and its extended enterprise
Capability Building and Training
Resilience skills and knowledge need to be developed across all levels of the organisation. This includes:
Tailored training for boards, executives, and frontline teams
Simulation exercises to test crisis management and decision-making under pressure
Foundational multi-day operational resilience courses to build deep organisational understanding
Regular refreshers to keep skills current and aligned to regulatory expectations
Embedding Resilience in Everyday Decisions
Culture becomes an enabler when resilience considerations are part of routine decision-making. This means:
Assessing resilience impact for new projects and strategic changes
Weighing operational risk alongside financial and commercial factors
Making resilience a standard item in governance and investment discussions
Culture and leadership set the tone for resilience. When senior leaders prioritise it, employees are empowered to act, and collaboration is the norm, operational resilience moves from being a compliance requirement to a defining characteristic of the organisation’s identity.
Measurement, Monitoring, and Continuous Improvement
Operational resilience is not static. As services evolve, technologies change, and threats emerge, resilience capabilities must be regularly assessed and refined. This is where structured measurement, active monitoring, and a commitment to continuous improvement are essential.
A resilience maturity assessment provides a clear baseline and highlights where investment and corrective action are most needed. The most valuable assessments benchmark against FCA/PRA rules, DORA, and recognised global standards, and are refreshed periodically to track progress. These reviews are not only about compliance. They also inform strategic decisions by showing where the organisation is strong, where it is exposed, and where it stands compared with industry peers.
Clear decision triggers ensure that issues are escalated to executives, boards, and regulators at the right time. These triggers can be quantitative, such as a breach of impact tolerance, or qualitative, such as the discovery of a systemic weakness in a critical supplier relationship. The goal is to prevent delays in decision-making when a rapid response is needed.
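A minimal sketch of such triggers is shown below, assuming one quantitative rule (outage time against impact tolerance) and a free-text list of qualitative flags. The escalation levels, the 50% early-warning ratio, and all names are hypothetical. A real scheme would follow the organisation's own governance pathways and would also cover regulatory notification.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class ServiceStatus:
    service: str
    outage_so_far: timedelta
    impact_tolerance: timedelta
    # Qualitative concerns, e.g. "systemic weakness at a critical supplier".
    qualitative_flags: list = field(default_factory=list)


def escalation_level(status, warning_ratio=0.5):
    """Return 'board', 'executive', or 'monitor' for one service."""
    if status.outage_so_far >= status.impact_tolerance:
        return "board"      # quantitative trigger: tolerance breached
    if status.qualitative_flags:
        return "executive"  # qualitative trigger raised by a reviewer
    if status.outage_so_far >= warning_ratio * status.impact_tolerance:
        return "executive"  # early warning before a breach
    return "monitor"
```

Writing the rules down in this explicit form, whether in code or in a policy table, removes the need to debate who should be told while the incident is still unfolding.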
Every incident or near miss is an opportunity to strengthen resilience. Root cause analysis should focus on identifying the underlying issues that allowed the disruption to occur. Once these are understood, structured remediation plans with clear ownership and timelines can address the gaps. Tracking progress through governance forums ensures that lessons are acted on, not just documented.
An effective resilience framework operates on a test–learn–adapt cycle. Scenario testing and real-world events provide the test. Lessons are captured, shared, and embedded into processes. Adjustments are made, and the improved capabilities are tested again. This cycle keeps resilience aligned to the organisation’s risk profile and operating environment.
Oversight functions play a critical role in this process. Risk, compliance, and internal audit teams provide independent challenge, validate reporting accuracy, and give the board confidence that resilience arrangements are both effective and compliant.
When measurement, monitoring, and continuous improvement are embedded into governance, operational resilience becomes a dynamic capability that evolves alongside the organisation and the world in which it operates.
Immediate Intervention and Post-Incident Recovery
Even the most prepared organisations can experience a major disruption. In these situations, the priority shifts from prevention to immediate stabilisation and recovery. A structured intervention framework helps leadership act quickly and decisively while maintaining control of the situation.
Four-Step Recovery Framework
Step 1: Stabilise the Situation
Contain the disruption to prevent further damage. This may involve isolating affected systems, pausing specific operations, or activating business continuity arrangements. Clear communication to staff and key stakeholders is critical to reduce uncertainty and maintain trust.
Step 2: Diagnose the Root Causes
Undertake a rapid but thorough diagnostic to determine what happened and why. This includes technical, operational, and governance reviews, as well as mapping the impact on critical services, customers, and regulatory obligations. Early identification of root causes enables targeted corrective actions.
Step 3: Restore and Remediate
Develop and execute a recovery plan that brings critical services back within impact tolerances. This includes prioritising recovery actions based on customer needs, regulatory requirements, and operational dependencies. Remediation addresses not just the immediate technical issues but also the procedural and organisational gaps identified in the diagnosis.
Step 4: Embed Lessons Learned
Once operations are stabilised, capture lessons learned and integrate them into resilience frameworks, policies, and training. This step closes the loop between incident response and long-term improvement, ensuring similar disruptions are less likely in the future and can be managed more effectively if they occur.
Application Example
A financial services firm experienced a prolonged outage of a critical payment processing platform due to a failed software update. The firm activated its crisis management team immediately, rerouted some transactions to a backup system, and notified the regulator and major clients within the first hour (Step 1). A technical and governance review identified that the change management process lacked adequate pre-deployment testing for critical services (Step 2). The recovery plan prioritised restoring high-value transaction capability within 24 hours, while a parallel workstream rolled back the failed update and implemented a new testing protocol (Step 3). Post-incident, the firm integrated the new testing standard into its change management policy and updated its impact tolerance documentation to reflect lessons learned (Step 4).
This structured approach ensures that even in high-pressure situations, recovery is deliberate, transparent, and focused on both immediate operational continuity and future resilience.
Case Studies and Lessons Learned
Case studies provide a powerful way to see operational resilience in action. They show what works under pressure, what fails, and how organisations adapt when disruption strikes. Reviewing both failures and recoveries helps leaders apply lessons without having to experience the same setbacks themselves.
Case Study 1: TSB IT Migration Failure (2018)
When TSB migrated its customer data to a new platform, unexpected technical issues left customers locked out of accounts for weeks. The disruption affected millions, triggered regulatory investigations, and led to significant reputational damage.
Lesson: Complex change programmes require rigorous scenario testing, robust fallback plans, and early involvement of resilience specialists. Resilience is not just about responding to incidents but about designing change to prevent them.
Case Study 2: Boeing 737 Max Design Issues
The 737 Max’s flawed flight control system contributed to two fatal crashes, with investigations revealing gaps in safety governance and regulatory oversight.
Lesson: Operational resilience is not limited to financial services or IT. In high-risk industries, product design and organisational governance must work together to prevent catastrophic failures. Oversight must be independent, informed, and empowered to act.
Case Study 3: COVID-19 Adaptation in Financial Services
At the onset of the pandemic, many financial institutions quickly moved to remote operations. Those with strong digital infrastructure and tested remote access capabilities maintained service continuity with minimal disruption. Others struggled with system capacity, remote onboarding processes, and security risks.
Lesson: Resilience investment in technology, workforce flexibility, and remote governance frameworks can enable rapid adaptation when operating conditions change unexpectedly.
Case Study 4: Payment Platform Recovery
A major payment provider suffered a disruption when a core processing system failed during a high-volume trading day. The organisation applied a structured recovery framework, prioritised customer communications, and restored partial service within hours. A root cause analysis led to investment in infrastructure redundancy and enhanced change controls.
Lesson: A disciplined recovery process, backed by pre-defined roles and escalation routes, can limit operational impact and preserve trust even during severe disruptions.
Case Study 5: Supply Chain Disruption in Manufacturing
Manufacturers dependent on a single overseas supplier faced production halts when a natural disaster struck the supplier’s region. Those without alternative sourcing agreements faced prolonged downtime.
Lesson: Critical third-party and supply chain resilience must form part of wider third-party risk management and include exit strategies, diversified sourcing, and scenario planning for supplier failure.
The common thread across these cases is that resilience failures are rarely the result of a single event. They are often the outcome of untested assumptions, weak oversight, and a lack of integration between operational, strategic, and cultural factors. Organisations that learn from both their own and others’ experiences are better positioned to anticipate disruption, respond effectively, and recover stronger.
Future Outlook
Operational resilience is entering a new phase. The drivers of disruption are becoming more complex, less predictable, and more interconnected. Regulatory expectations are maturing, but the most resilient organisations will be those that prepare for the less visible, harder-to-model risks as well as the familiar ones.
Mainstream and Emerging Risks
Some risks are widely recognised and already embedded in board discussions – you are likely to have them on your radar:
Climate change and extreme weather events
Artificial intelligence errors, bias, and system dependencies
Geopolitical instability and economic fragmentation
Global supply chain fragility and resource scarcity
These are important, but they sit alongside less publicly discussed vulnerabilities that could have severe, cascading impacts:
Cascading infrastructure failures where one critical system outage ripples across sectors
Concentration of critical minerals and materials in a few geographies
Risks from bio-manufacturing and synthetic biology, whether accidental or deliberate
Physical vulnerabilities in undersea cables and space-based infrastructure
Manipulation of the information environment through AI-driven disinformation
Convergence attacks that combine cyber, physical, and human vectors
Action Matrix for Leadership
Risk Category | Priority Actions | Sector Relevance |
Climate and environmental | Test site, supply chain, and workforce continuity against severe climate scenarios. Build in physical redundancy and flexible working arrangements. | Energy, manufacturing, transport, agriculture, logistics |
Technology and AI dependency | Map AI use cases across the business. Test failure scenarios and embed human override protocols. Maintain alternatives for critical processes. | Financial services, healthcare, manufacturing, tech, public services |
Geopolitical and supply chain | Diversify critical suppliers. Secure long-lead materials. Build exit strategies for high-risk geographies. | Manufacturing, retail, defence, pharmaceuticals, construction |
Cascading infrastructure | Identify shared dependencies with other sectors. Develop “safe mode” operations that can run independently if a shared service fails. | Financial services, utilities, telecommunications, healthcare |
Critical minerals | Secure alternative sourcing or strategic stockpiles. Monitor geopolitical developments in resource-producing regions. | Energy transition sectors, defence, electronics, automotive |
Bio-manufacturing | Review dependencies on biological inputs or processes. Partner with sector bodies for early-warning intelligence. | Pharmaceuticals, agriculture, food production, biotech |
Undersea and space-based infrastructure | Build redundancy through multiple providers and routes. Establish contingency communication plans. | Telecommunications, financial services, logistics, defence |
Information environment | Enhance monitoring for misinformation risks that could damage trust or trigger market reaction. Train leadership on rapid-response communications. | All sectors with strong customer or public interface, especially financial services, government, healthcare |
Convergence threats | Conduct joint exercises that test cyber, physical, and human threat responses together rather than in isolation. | Critical national infrastructure, financial services, defence, healthcare |
Why This Matters
The move from process compliance to outcome-based resilience means regulators are increasingly asking, “Can you keep delivering your critical services when disruption is messy, information is incomplete, and multiple problems are happening at once?” rather than simply, “Do you have a policy?” Boards and executives who prepare for these wider, interconnected risks will be in a stronger position to answer yes.
The future belongs to organisations that treat resilience not as a defensive exercise but as a strategic advantage — one that protects trust, sustains operations under stress, and enables adaptation in a volatile environment.
The CRO Perspective – From Risk Function to Strategic Influence
For a Chief Risk Officer, operational resilience is more than a compliance requirement. It is an opportunity to shape strategic decisions, influence investment priorities, and ensure the organisation can deliver its critical services in the most challenging conditions.
CRO Operational Resilience Checklist
1. Link resilience to strategy
Ensure operational resilience objectives are embedded into transformation programmes, market entry decisions, and major technology investments. Flag where initiatives could weaken the ability to remain within impact tolerances.
2. Define decision-making triggers
Establish clear thresholds for escalation to executives and the board, based on both quantitative measures (e.g., breach of impact tolerances) and qualitative indicators (e.g., systemic weaknesses).
3. Maintain interdependency mapping
Keep an up-to-date map of people, technology, process, and third-party dependencies for all critical services. Use this to inform both crisis planning and day-to-day operational risk decisions.
4. Test beyond the minimum
Go beyond regulatory scenarios to test complex, compound, and concurrent events, including human factors that can affect crisis response.
5. Demonstrate ROI of resilience investment
Quantify the benefits of resilience initiatives through avoided losses, faster recovery, customer retention, and competitive advantage. Use these insights to build the case for continued investment.
6. Lead during disruption
Provide strategic advice on recovery priorities, maintain regulatory engagement, and deliver concise, accurate updates to the board throughout the incident.
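For teams that want to make the decision-making triggers in this checklist concrete, the thresholds can be written down as a simple rule check. The sketch below is illustrative only: the `ImpactTolerance` class, the `escalation_level` function, the 80% early-warning ratio, and the four-hour payments tolerance are all assumptions made for the example, not regulatory definitions or a prescribed method.

```python
from dataclasses import dataclass
from enum import Enum


class Escalation(Enum):
    NONE = "none"
    EXECUTIVE = "executive"
    BOARD = "board"


@dataclass
class ImpactTolerance:
    """Hypothetical impact tolerance for one critical service."""
    service: str
    max_outage_hours: float  # assumed metric: maximum tolerable disruption


def escalation_level(tolerance, outage_hours, systemic_weakness=False,
                     warning_ratio=0.8):
    """Return the escalation level for an ongoing disruption.

    Quantitative trigger: outage duration relative to the impact tolerance.
    Qualitative trigger: a systemic-weakness flag raised by the risk function.
    The warning_ratio of 0.8 is an assumed early-warning threshold.
    """
    if outage_hours >= tolerance.max_outage_hours:
        return Escalation.BOARD  # tolerance breached
    if systemic_weakness or outage_hours >= warning_ratio * tolerance.max_outage_hours:
        return Escalation.EXECUTIVE  # approaching tolerance or qualitative concern
    return Escalation.NONE


# Example figures are invented for illustration.
payments = ImpactTolerance(service="payments", max_outage_hours=4.0)
print(escalation_level(payments, outage_hours=1.0))  # Escalation.NONE
print(escalation_level(payments, outage_hours=3.5))  # Escalation.EXECUTIVE
print(escalation_level(payments, outage_hours=5.0))  # Escalation.BOARD
```

Writing the rule down in this form forces agreement on what counts as a breach and who is notified before an incident occurs. Real triggers would likely draw on more than outage duration, such as customer numbers affected or transaction values.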
The Board/NED Perspective – Oversight, Assurance, and Long-Term Positioning
For boards and non-executive directors, operational resilience oversight is about protecting customers, safeguarding the organisation’s reputation, and ensuring the business can sustain itself during and after disruption. This checklist distils the essential governance responsibilities for effective resilience oversight.
Board/NED Operational Resilience Checklist
Align resilience with strategy: Confirm that resilience priorities are directly linked to the organisation’s strategic objectives, brand commitments, and customer promises.
Seek assurance that resilience is embedded: Require evidence that resilience capabilities are built into operations, not just documented for compliance. Review independent assurance reports and challenge management’s conclusions.
Monitor decision triggers: Agree on clear escalation criteria for board intervention, ensuring they are aligned with the organisation’s impact tolerances and risk appetite.
Oversee third-party and systemic vulnerabilities: Understand the organisation’s critical third-party dependencies, exit strategies, and exposure to sector-wide risks such as infrastructure failures or regulatory changes.
Review scenario preparedness: Ensure management tests severe, plausible, and compound scenarios that include operational, cyber, and supply chain disruption. Participate in at least one strategic-level simulation annually.
Track long-term investment in resilience: Monitor whether budgets and resource allocation support sustained capability improvement. Challenge cuts that could weaken resilience over time.
Evaluate leadership accountability: Confirm that senior managers, including those with SMCR responsibilities, have clear resilience accountabilities in their statements of responsibility and performance objectives.
Practical Action Roadmap
Improving operational resilience is most effective when approached in a deliberate sequence. This roadmap provides a practical order of action that CROs, boards, and operational leaders can follow together.

Phase 1 – Establish the Baseline
Conduct a resilience maturity assessment benchmarked to FCA/PRA, DORA, Basel, and other relevant standards.
Map important business services (IBS) and identify dependencies across people, technology, processes, and third parties.
Capture current resilience capabilities and known vulnerabilities in a central, accessible record.
Phase 2 – Set the Direction
Define or update impact tolerances for all IBS and agree clear escalation triggers for executive and board intervention.
Align resilience priorities with the organisation’s strategic objectives, customer commitments, and risk appetite.
Assign executive ownership for each IBS, including SMCR or equivalent regulatory accountabilities.
Phase 3 – Strengthen Capabilities
Close identified gaps through targeted investments in prevention, recovery, and cultural enablers.
Improve third-party resilience by assessing critical suppliers, formalising contractual resilience requirements, and developing exit strategies.
Build organisational capability with scenario exercises, leadership simulations, and cross-functional training.
Phase 4 – Embed and Improve
Integrate resilience metrics, KRIs, and decision triggers into governance dashboards and committee reporting.
Conduct root cause analysis after incidents and monitor remediation until complete.
Refresh scenario testing annually to include emerging threats, compound events, and lessons learned from real disruptions.
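The dependency mapping in Phase 1 can start as a very simple data structure. The sketch below is one possible approach, not a prescribed method: the service names, the dependency names, and the `shared_dependencies` helper are hypothetical. It shows how a basic map of important business services can surface dependencies shared by several services, which are candidates for closer scrutiny as potential single points of failure.

```python
from collections import defaultdict

# Hypothetical map: important business service -> its dependencies
# (people, technology, and third parties mixed together for brevity).
dependencies = {
    "payments": {"core-banking", "cloud-provider-a", "fraud-team"},
    "online-banking": {"core-banking", "cloud-provider-a", "identity-service"},
    "customer-support": {"telephony-vendor", "identity-service"},
}


def shared_dependencies(dep_map, min_services=2):
    """Return each dependency relied on by at least `min_services` services."""
    users = defaultdict(set)
    for service, deps in dep_map.items():
        for dep in deps:
            users[dep].add(service)
    return {
        dep: sorted(services)
        for dep, services in users.items()
        if len(services) >= min_services
    }


for dep, services in sorted(shared_dependencies(dependencies).items()):
    print(f"{dep}: {', '.join(services)}")
# cloud-provider-a: online-banking, payments
# core-banking: online-banking, payments
# identity-service: customer-support, online-banking
```

In practice the map would typically live in a configuration management database or a dedicated resilience tool, but the same question applies at any scale: which dependencies, if lost, would take several important business services outside their impact tolerances at once?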
If you want to understand where your organisation stands on resilience, try the Operational Resilience Assessment. It helps you identify strengths, uncover hidden gaps, and reflect on how prepared your critical services are to withstand disruption.
Conclusion – Resilience as a Competitive Differentiator
Operational resilience is no longer a technical or compliance-led concept. It is a board-level capability that determines whether an organisation can protect its customers, maintain trust, and adapt to new realities when disruption strikes.
The organisations that excel are those that treat resilience as an ongoing investment, not a one-off exercise. They embed it into strategy, governance, and culture so it becomes part of how the business makes decisions, allocates resources, and leads in uncertain environments.
When resilience is approached in this way, it creates more than just regulatory compliance. It builds confidence among customers, regulators, investors, and employees. It also strengthens the organisation’s ability to seize opportunities while others are still recovering from the same event.
The competitive advantage is clear. In a world where disruption is inevitable, the organisations that can adapt quickly, operate under pressure, and emerge stronger will define the standards others try to follow.
Now is the time for boards, executives, and operational leaders to act together. Use the roadmap, apply the checklists, and make resilience a core part of your business model. The sooner it becomes business-as-usual, the sooner it becomes your edge.
About the Author: Julien Haye
Managing Director of Aevitium LTD and former Chief Risk Officer with over 26 years of experience in global financial services and non-profit organisations. Known for his pragmatic, people-first approach, Julien specialises in transforming risk and compliance into strategic enablers. He is the author of The Risk Within: Cultivating Psychological Safety for Strategic Decision-Making and hosts the RiskMasters podcast, where he shares insights from risk leaders and change makers.

