Lightning strikes four times in Google’s cloud

In what is a cloud supplier’s nightmare, a lightning storm in Belgium knocked out Google’s St Ghislain data centre, causing power loss and damage to disk storage and leaving some customers without access to data.

The facility was hit directly by four successive lightning strikes, which disrupted the centre’s operations from Thursday until Monday, according to Google.

Damage to the Google Compute Engine (GCE) persistent disk storage housed in the data centre resulted in a four-day data outage for some European customers.

Writing on its blog, Google said: “At 09:19 PDT on Thursday 13 August 2015, four successive lightning strikes on the electrical systems of a European data centre caused a brief loss of power to storage systems which host disk capacity for GCE instances in the europe-west1-b zone.”

It continued: “Although automatic auxiliary systems restored power quickly, and the storage systems are designed with battery backup, some recently written data was located on storage systems which were more susceptible to power failure from extended or repeated battery drain.

“In almost all cases the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state. However, in a very few cases [less than 0.000001% of PD space in europe-west1-b], recent writes were unrecoverable, leading to permanent data loss on the Persistent Disk.”

Google has accepted full responsibility for the blackout and is not blaming Zeus, Jupiter or Thor. However, it stressed to customers that “GCE instances and Persistent Disks within a zone exist in a single Google data centre and are therefore unavoidably vulnerable to data centre-scale disasters”.

Google has promised to upgrade its data centre storage hardware, increasing its resilience against power outages. According to the search giant, research is already underway to improve cache data retention and response procedures for system engineers.