Between 11:41:00 UTC+2 and 12:03:00 UTC+2 on 2025-05-13, all traffic that traverses our on-prem firewall experienced severe to complete packet drops. The disruption stemmed from a previously unknown software defect in the firewall OS. The defect also caused fail-over to secondary firewall to break. Service was fully restored after a automated reboot of primary firewall.
Time | Event |
---|---|
11:41 | Person on-call receiving multiple DOWN alerts |
11:50 | Ticket raised with NOC, no upstream issues found |
12:03 | Network restores automatically as firewall reboots |
12:05 | Root cause analysis begins |
12:45 | An acknowledged software defect in our firewall identified as root cause |
14:20 | Scheduled maintenance planned for fireware upgrade, subscribers notified |
A verified firmware defect in our firewall OS triggers an out-of-memory crash during normal traffic, simultaneously breaking HA fail-over. Vender documents issue is fixed in a later release.
Service returned when the primary firewall auto-rebooted (12:03). Incident completed by 12:44 after correlating log entries with vendor advisory.
We’re upgrading both our firewalls next maintenance window due 29 May 2025.