GetWiza - Service Outage - [Frogfoot] - Incident details

Service Outage - [Frogfoot]

Resolved
Major outage
Started about 1 year ago
Lasted 3 days
Updates
  • Resolved

    The incident has been resolved

  • Update

    Frogfoot confirmed that this outage was resolved on 3 January 2025 at 14:54. Please restart your devices and contact our Support Team at 021 204 4878 or support@getwiza.com should you still experience connection issues.

  • Update

    Letter from Frogfoot:

    We would like to share the latest update on the network disruption affecting the Western and Eastern Cape regions.

    2 January 2025
    During preparations for a planned upgrade to our core hardware devices, it was determined that the upgrade would not address the underlying issue. Following additional testing, our team identified a solution to stabilize the network load caused by the software bug. This involved installing two additional core routers on the Cape Town leg of our network, which were successfully implemented overnight.

    3 January 2025
    The new core routers stabilized the network load, enabling our teams to pinpoint the exact trigger of the software bug. This was traced to a node in our network ring. A maintenance technician has been dispatched to the affected node, and with support from our core team, the triggering switch will be rebooted.

    We will continue to monitor the network closely and have all support teams on standby to address any further issues.

    We appreciate your patience and understanding as we work to resolve this issue fully and restore seamless connectivity.

    Regards,
    The Frogfoot Team

  • Update

    Latest update from Frogfoot: -- The on-site tech and our CORE team have rebooted the switch and are currently completing the required checks. Stability will be monitored over the next hour, and we will provide further updates.

  • Update

    Latest update from Frogfoot: -- A tech has been on site since 09:30 and the core team has been investigating remotely. We are awaiting further updates on the investigation.

  • Update

    Latest update from Frogfoot:
    -- We discovered an underlying fault, not previously identified, that triggered the Juniper issues. Our change last night resolved the Juniper issues, and with the Junipers stabilized we have managed to identify the underlying issue. The core team is now working on resolving it.
    -- We have identified a loop on the network which seems to originate from the Table View switch. The core team has disabled that backhaul as a temporary fix. We are monitoring for stability. A tech is en route to site with an ETA of 20 minutes. The core team will work remotely with the tech to implement a permanent fix.

  • Update

    Latest update from Frogfoot: 2025-01-03
    -- New hardware devices have been installed and the CORE teams are busy setting up and installing configs. Once completed, we will perform a reboot under a more balanced load and then perform checks.
    -- New device configurations are still underway. Some services have been moved over to the new routers, and we expect to perform a reboot once we have completed enough to balance the load. Further updates will be provided as we progress.
    -- The Core team has finished moving some services over to the new routers and is now rebooting the routers. Once the reboot has been completed, they will conduct some checks.
    -- The routers have been rebooted and the Core team is busy performing some checks. Further updates will be shared.

  • Update

    Latest update from Frogfoot: -- T3 is pulling logs to continue to investigate. More updates to follow.

  • Update

    Latest updates provided by Frogfoot:
    -- Teams are still finalizing the checks for the upgrade developed with the Vendor. Some risks have been identified which could cause further impact, and the Vendor has added a more senior engineering team to assist with this troubleshooting. More updates will be provided in the next 30 to 45 minutes.
    -- Additional logs collected by the Vendor indicate that there is an underlying issue on the FPCs that needs to be resolved before any firmware upgrade is implemented. It is highly likely that resolving this underlying issue will also resolve the intermittency issues and negate the need for any additional firmware upgrades. The Vendor and Frogfoot Engineering Teams are actively troubleshooting, and we will provide more updates as they progress.

  • Update

    Latest update from Frogfoot: The upgrade developed jointly with the Vendor has been prepared and is currently undergoing final checks. Once it is ready, we will begin the upload to the affected devices, which will take around an hour per device (2 devices in total). We will provide more accurate timeframes as we begin the process.

  • Update

    Frogfoot have replaced two line cards; however, the issue persists. Some services have been re-routed, which will result in degraded performance. Frogfoot are working with their vendor to have the fault resolved this morning.

  • Investigating

    Frogfoot are currently experiencing an outage affecting customers in the Western Cape (WC) and Eastern Cape (EC). Frogfoot engineers are attending.