Network Resilience Incident Briefs — 2025-02-21
Five incident briefs covering the West Africa and Red Sea submarine cable outages, two historic BGP routing incidents, and 5G network slicing isolation advisories. Each brief pairs incident facts with an operator checklist.
Summary
- West Africa’s March 2024 multi-cable break (ACE, WACS, SAT-3) and February 2024 Red Sea anchor damage (AAE-1, EIG, SEACOM/TGN-EA, SMW5) underline how a single chokepoint can disrupt dozens of countries.
- Historic BGP route leaks—Verizon’s June 2019 redistribution of a customer’s leaked routes and the November 2018 Google traffic diversion via Nigeria’s MainOne—remain the reference failures for today’s RPKI and max-prefix controls.
- 5G network slicing relies on the isolation controls in 3GPP TS 33.501; insufficient access control around AMF/SMF northbound APIs and shared management planes remains a top advisory for operators piloting slicing.
- The playbooks below give incident commanders concrete detection signals, reroute sequencing, and control validation steps.
Brief 1 — West Africa multi-cable outage (March 2024)
On 14 March 2024, coincident faults on the Africa Coast to Europe (ACE), West Africa Cable System (WACS), SAT-3, and MainOne cables near Côte d’Ivoire cut capacity into several West and Central African landing points. Carriers re-routed traffic onto Google’s Equiano, Glo-1, and terrestrial microwave backhaul, but elevated latency and packet loss persisted for days until repair ships recovered slack and spliced replacement sections. IXPs in Ghana and Nigeria reported steep drops in international traffic while transit providers throttled non-essential routes to preserve enterprise circuits. Operator reports linked the faults to a suspected undersea landslide, possibly triggered by seasonal sediment shifts, highlighting how multiple systems can fail together when they share the same bathymetry corridor.
- Preparedness: Maintain commercial capacity on at least one geographically independent system (e.g., Equiano or a northbound terrestrial path) and require suppliers to disclose which wet segments share a trench.
- Detection: Track round-trip-time and packet loss from African POPs to European hubs with separate alert thresholds for each cable system so NOC graphs show which landing segment is degrading first.
- Response: Pre-stage BGP communities with transit providers to deprioritize the affected systems, and publish a customer notice that distinguishes power outages at CLS sites from wet-plant faults that need ship dispatch.
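The per-cable alerting idea in the detection step can be sketched as follows. This is a minimal illustration, not a monitoring product: the threshold values, the `THRESHOLDS` table, and the `degraded_systems` helper are all hypothetical and would need tuning against each operator's own baselines.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Threshold:
    rtt_ms: float    # alert when the median RTT exceeds this value
    loss_pct: float  # alert when the mean packet loss exceeds this value


# Hypothetical per-system thresholds (illustrative numbers only).
THRESHOLDS = {
    "ACE": Threshold(rtt_ms=120.0, loss_pct=1.0),
    "WACS": Threshold(rtt_ms=130.0, loss_pct=1.0),
    "SAT-3": Threshold(rtt_ms=140.0, loss_pct=1.5),
}


def degraded_systems(samples):
    """Return the sorted cable systems whose probes breach a threshold.

    samples: iterable of (cable_system, rtt_ms, loss_pct) probe results,
    where each probe is tagged with the system its path normally rides.
    """
    by_system = {}
    for system, rtt_ms, loss_pct in samples:
        by_system.setdefault(system, []).append((rtt_ms, loss_pct))

    breached = []
    for system, rows in by_system.items():
        limit = THRESHOLDS.get(system)
        if limit is None:
            continue  # no threshold configured for this system
        median_rtt = median(rtt for rtt, _ in rows)
        mean_loss = sum(loss for _, loss in rows) / len(rows)
        if median_rtt > limit.rtt_ms or mean_loss > limit.loss_pct:
            breached.append(system)
    return sorted(breached)


probes = [
    ("ACE", 110.0, 0.2), ("ACE", 115.0, 0.1),
    ("WACS", 300.0, 12.0), ("WACS", 280.0, 9.0),
    ("SAT-3", 135.0, 0.5),
]
print(degraded_systems(probes))
```

Keeping one threshold per system, rather than one per region, is what lets the NOC see which landing segment degrades first.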
Brief 2 — Red Sea and Suez chokepoint risk (February 2024)
Four long-haul systems—Asia-Africa-Europe-1 (AAE-1), Europe India Gateway (EIG), SEACOM/TGN-EA, and SEA-ME-WE 5—were cut in February 2024 in the southern Red Sea, reportedly by anchor drag during ongoing shipping disruptions. The incident removed a major east–west corridor that ordinarily offloads Suez capacity for Africa, the Middle East, and South Asia. Restoration required traffic to swing north through the Mediterranean, south around the Cape of Good Hope, or over terrestrial links through the Gulf, increasing latency by 80–150 ms on some routes. The disruption mirrored earlier 2013 and 2023 cable cuts in the same corridor, reinforcing the need for path diversity that does not rely solely on Suez-Red Sea spans.
- Preparedness: Map transit and IP backbones that cross both Suez and the Gulf of Aden, and validate that contracts include restoration rights onto alternative Mediterranean or southern Africa paths when Red Sea systems go dark.
- Detection: Create synthetic probes between Europe–Middle East and Asia–Europe POP pairs so operations teams can separate terrestrial MPLS congestion from outright submarine cuts.
- Response: When the Red Sea is impaired, coordinate with CDN partners to pin cache fills to Mediterranean or Sub-Saharan egress, and alert enterprise customers about increased jitter for UCaaS and trading circuits that cannot tolerate the longer Cape route.
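To sanity-check the latency figures quoted for rerouted paths, the added round-trip time can be estimated from the extra fiber distance. The sketch below covers propagation delay only and ignores equipment and queuing delay; the group index of 1.468 is an assumed typical value for single-mode fiber, and the function names are illustrative.

```python
C_VACUUM_KM_PER_S = 299_792.458
FIBER_GROUP_INDEX = 1.468  # assumption: typical single-mode fiber value


def added_rtt_ms(extra_path_km: float) -> float:
    """Round-trip delay (ms) added by extra_path_km of one-way fiber."""
    speed_km_per_s = C_VACUUM_KM_PER_S / FIBER_GROUP_INDEX
    return 2 * extra_path_km / speed_km_per_s * 1000.0


def extra_path_km(added_rtt: float) -> float:
    """Inverse: one-way fiber length (km) implied by an RTT increase in ms."""
    speed_km_per_s = C_VACUUM_KM_PER_S / FIBER_GROUP_INDEX
    return added_rtt / 1000.0 * speed_km_per_s / 2


for rtt in (80.0, 150.0):
    print(f"{rtt:.0f} ms extra RTT ~ {extra_path_km(rtt):,.0f} km extra fiber")
```

Under these assumptions, an 80 to 150 ms increase corresponds to roughly 8,000 to 15,000 km of additional one-way fiber, which gives a rough scale for how much longer the detour is than the Red Sea route.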
Brief 3 — Route optimizer leaks via Tier 1 transit (June 2019)
On 24 June 2019, a Noction route optimizer at DQE Communications generated more-specific prefixes that DQE’s customer Allegheny Technologies then leaked to its other upstream, Verizon, which propagated them globally. Cloudflare, Amazon, and others saw a prolonged outage because Verizon lacked max-prefix filters and RPKI-based origin validation on the session. The incident remains a canonical example of how non-malicious leaks can spread when large carriers accept customer advertisements without strict policy controls, and it continues to inform MANRS participation requirements for transit providers.
- Preparedness: Require upstreams to enforce Internet Routing Registry filters, max-prefix limits, and RPKI Route Origin Validation on all customer sessions; document these controls in MSAs and request quarterly attestations.
- Detection: Monitor RIPE RIS and RouteViews for unexpected origin-AS changes on owned prefixes and alert on sudden path length increases that indicate suboptimal routing through a leak.
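- Response: Keep pre-staged BGP communities ready to de-preference leaking upstreams. Because a leaked more-specific wins longest-prefix match over a covering aggregate, be prepared to advertise equally specific prefixes (within the ROA maximum length) from alternate providers until the faulty session is shut down.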
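The two session controls named in the preparedness step, Route Origin Validation and a max-prefix limit, can be sketched as below. The validation logic follows the valid / invalid / not-found states described in RFC 6811; the `Roa` tuple, the function names, and the documentation prefixes and private-use ASNs are illustrative assumptions, not any router vendor's API.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str      # ROA prefix, e.g. "203.0.113.0/24"
    max_length: int  # longest prefix length the ROA authorizes
    asn: int         # authorized origin AS


def validate_origin(prefix: str, origin_asn: int, roas) -> str:
    """Return 'valid', 'invalid', or 'not-found' for an announced route."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and roa.asn == origin_asn:
            return "valid"
    # Covered by at least one ROA but matched by none: invalid.
    return "invalid" if covered else "not-found"


def exceeds_max_prefix(received_count: int, limit: int) -> bool:
    """True when a session has sent more prefixes than its agreed limit."""
    return received_count > limit


roas = [Roa("203.0.113.0/24", 24, 64500)]
print(validate_origin("203.0.113.0/24", 64500, roas))  # authorized route
print(validate_origin("203.0.113.0/25", 64500, roas))  # more-specific beyond max_length
print(validate_origin("192.0.2.0/24", 64500, roas))    # no covering ROA
```

The second call shows why a tight maximum length matters for optimizer-style leaks: a more-specific announcement is rejected as invalid even though the origin AS is the legitimate one.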
Brief 4 — BGP hijack of Google traffic (November 2018)
On 12 November 2018, prefixes for Google services were briefly hijacked when Nigeria’s MainOne inadvertently announced them to China Telecom, which then propagated the routes further. Traffic from parts of North America, Europe, and Russia detoured through unauthorized AS paths for roughly 74 minutes, illustrating how a single unchecked announcement can redirect hyperscale traffic across multiple continents. MainOne attributed the event to a configuration error during a planned network upgrade, and Google said it saw no evidence of malicious intent, underscoring how procedural safeguards, not just threat intelligence, are needed to protect integrity.
- Preparedness: Publish ROAs for all owned prefixes and verify that upstreams reject invalid origins; for sensitive services, consider RPKI-based BGPsec pilots where both sides support it.
- Detection: Subscribe to BGPStream or commercial hijack detectors and alert incident commanders when route origin changes occur outside approved ASNs.
- Response: Coordinate with peers to advertise clean covering routes and request rapid withdrawal of the offending announcements; maintain prewritten customer comms explaining that traffic integrity—not only availability—was at risk.
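The origin-change alert from the detection step can be sketched as a filter over observed BGP updates. The input format (a prefix plus an AS path with the origin last), the `APPROVED_ORIGINS` inventory, and the `origin_alerts` helper are hypothetical; real feeds such as BGPStream have their own record formats, which this sketch does not attempt to reproduce.

```python
import ipaddress

# Hypothetical inventory: owned prefix -> ASNs allowed to originate it.
APPROVED_ORIGINS = {
    "192.0.2.0/24": {64500},
    "198.51.100.0/24": {64500, 64501},
}


def origin_alerts(updates):
    """Return (prefix, origin_asn) pairs announced from unapproved origins.

    updates: iterable of (prefix, as_path), where as_path is a list of
    ASNs with the origin AS as the last element.
    """
    owned = {ipaddress.ip_network(p): asns for p, asns in APPROVED_ORIGINS.items()}
    alerts = []
    for prefix, as_path in updates:
        if not as_path:
            continue  # nothing to check without an AS path
        announced = ipaddress.ip_network(prefix)
        origin = as_path[-1]
        for net, approved in owned.items():
            # Match the owned prefix itself and any more-specific inside it.
            if (announced.version == net.version
                    and announced.subnet_of(net)
                    and origin not in approved):
                alerts.append((prefix, origin))
                break
    return alerts


observed = [
    ("192.0.2.0/24", [64510, 64500]),    # approved origin
    ("192.0.2.128/25", [64510, 64666]),  # more-specific from an unapproved origin
    ("198.51.100.0/24", [64501]),        # approved secondary origin
    ("203.0.113.0/24", [64999]),         # not an owned prefix, ignored
]
print(origin_alerts(observed))
```

An origin check alone will not fire on a leak that preserves the legitimate origin AS, so it is best paired with checks on unexpected transit ASNs in the path.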
Brief 5 — 5G network slicing security advisories
5G slicing exposes separate logical networks on shared radio and transport. 3GPP TS 33.501 requires slice authentication, authorization, and traffic isolation, but operators piloting slices for enterprise and public-safety customers continue to publish advisories about weak isolation and exposed management APIs. GSMA’s Fraud and Security Group highlights risks around shared operations support systems, unsecured northbound APIs into the Access and Mobility Management Function (AMF) and Session Management Function (SMF), and inadequate tenant-level observability. Release 17 adds slice-assurance hooks, yet most operators still depend on traditional OSS tooling that does not map alarms to individual slice tenants.
- Preparedness: Enforce mutual TLS and per-slice authorization on AMF/SMF APIs, and ensure slice-specific Kubernetes clusters or virtualized network functions are pinned to dedicated control planes instead of shared management tenants.
- Detection: Instrument per-slice telemetry (NSSF logs, NEF API calls, and UPF flow records) to flag cross-slice lateral movement, unexpected QoS changes, or rogue policy updates to the Policy Control Function.
- Response: Predefine rollback plans that can tear down a compromised slice without impacting adjacent tenants, and rehearse joint exercises with enterprise customers that include SIM re-provisioning and MEC workload failover.
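The cross-slice detection step can be illustrated with a small filter over flow records. This is a sketch under strong assumptions: it supposes each slice's workloads sit in a known address range, and the `SLICE_SUBNETS` table, slice names, and `cross_slice_flows` helper are hypothetical rather than part of any 3GPP-defined interface or UPF export format.

```python
import ipaddress

# Hypothetical mapping of slice name -> address range used by that slice.
SLICE_SUBNETS = {
    "enterprise-a": ipaddress.ip_network("10.10.0.0/16"),
    "public-safety": ipaddress.ip_network("10.20.0.0/16"),
}


def slice_of(ip: str):
    """Return the slice name owning this address, or None if unmapped."""
    addr = ipaddress.ip_address(ip)
    for name, net in SLICE_SUBNETS.items():
        if addr in net:
            return name
    return None


def cross_slice_flows(flows):
    """Flag flows whose source and destination belong to different slices.

    flows: iterable of (src_ip, dst_ip) pairs taken from flow records.
    Returns a list of (src_ip, dst_ip, src_slice, dst_slice) tuples.
    """
    flagged = []
    for src, dst in flows:
        src_slice, dst_slice = slice_of(src), slice_of(dst)
        if src_slice and dst_slice and src_slice != dst_slice:
            flagged.append((src, dst, src_slice, dst_slice))
    return flagged


records = [
    ("10.10.1.5", "10.10.2.9"),  # stays inside enterprise-a
    ("10.10.1.5", "10.20.0.7"),  # crosses into public-safety
    ("10.20.0.7", "8.8.8.8"),    # external destination, not a slice
]
print(cross_slice_flows(records))
```

Flows to unmapped addresses are deliberately ignored here so the alert stays focused on tenant-to-tenant movement; egress policy would be checked separately.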
Operational checklist
- Confirm ROAs are published for all owned prefixes and that RPKI origin validation and max-prefix limits are active on all customer and peer sessions; review the quarterly attestations from upstreams.
- Document restoration rights and diverse landing stations for every submarine capacity contract; rehearse BGP community flips that prefer independent systems.
- Publish a one-page customer notice template that distinguishes integrity risks (route leaks/hijacks) from availability risks (cable faults) to keep messaging consistent during incidents.
- For 5G slicing, align SOC runbooks to TS 33.501 control points so security alerts map directly to the AMF/SMF/UPF functions responsible for each slice, and review GSMA FS.37/FS.40 guidance during quarterly tabletop exercises.