Procedures exist to make sure that things consistently get done the same way, and the right way. When good procedures for using technology are followed correctly, productivity and profitability can increase. But technology used the wrong way can cause business discontinuity, where operations and productivity grind to a halt. A recent outage in a major cloud provider’s IT service was caused not by a technical fault, but by a failure to follow the correct operating procedure. So why did things go wrong, and what lessons can other organisations learn from this case?
The service concerned was the Azure storage service provided by Microsoft. Like most IT installations, this service is updated from time to time. The procedure defined by Microsoft is to roll out a new update little by little, allowing time to run checks and make sure that everything is still working properly. However, a misunderstanding about the status of a recent update led an engineer to apply a change across the entire service all at once. The change made the service unavailable to users, and a number of systems had to be restarted manually. The key point is that although the correct procedure was defined, there was no safeguard to prevent employees from deviating from it (something that Microsoft has now fixed).
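To make the idea of a safeguard concrete, here is a minimal Python sketch of a rollout that refuses to move to a wider stage until every earlier stage has been checked. It is an illustration only and not a description of Microsoft's tooling: the class name `StagedRollout`, the stage names and the `RolloutError` exception are all hypothetical.

```python
from dataclasses import dataclass, field


class RolloutError(Exception):
    """Raised when a deployment would skip part of the staged procedure."""


@dataclass
class StagedRollout:
    # Hypothetical stage names, ordered from the smallest scope to the largest.
    stages: tuple = ("test", "small_slice", "full_service")
    verified: set = field(default_factory=set)

    def mark_verified(self, stage: str) -> None:
        """Record that the checks for a stage have passed."""
        if stage not in self.stages:
            raise ValueError(f"unknown stage: {stage}")
        self.verified.add(stage)

    def deploy(self, stage: str) -> str:
        """Deploy to a stage only if all earlier stages were verified."""
        if stage not in self.stages:
            raise ValueError(f"unknown stage: {stage}")
        earlier = self.stages[: self.stages.index(stage)]
        missing = [s for s in earlier if s not in self.verified]
        if missing:
            raise RolloutError(f"cannot deploy to {stage}; unverified: {missing}")
        return f"deployed to {stage}"
```

With a guard like this, an engineer who tries to call `deploy("full_service")` straight away gets an error instead of a service-wide change. The procedure is enforced by the tool rather than left to memory or interpretation.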
Organisations need to build in failsafe mechanisms to guard against human error. One of the simplest is the ‘four eyes principle’, in which two people must check and approve an action before it can be executed. Other failsafe devices may be mechanical or physical, such as automatic speed or rotation limits on machines. Information technology allows complex failsafe procedures to be programmed, but those automated procedures also need to be properly checked by humans before they are set in motion. In summary, take a good look at your technology and your business continuity, and make a complete list of things that must not happen. Then make sure that you have the appropriate preventive measures in place.
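As one example of such a preventive measure, the four eyes principle can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class `FourEyesAction`, the exception `ApprovalError` and the approver names are hypothetical, and a real system would also need to authenticate the people giving approval.

```python
from typing import Callable


class ApprovalError(Exception):
    """Raised when an action is executed without enough approvals."""


class FourEyesAction:
    """Wrap an action so it runs only after two different people approve it."""

    def __init__(self, action: Callable[[], str], required: int = 2):
        self._action = action
        self._required = required
        # A set ensures the same person cannot be counted twice.
        self._approvers = set()

    def approve(self, person: str) -> None:
        """Record one person's approval."""
        self._approvers.add(person)

    def execute(self) -> str:
        """Run the wrapped action, or refuse if approvals are missing."""
        if len(self._approvers) < self._required:
            raise ApprovalError(
                f"{len(self._approvers)} of {self._required} approvals given"
            )
        return self._action()


# Usage: the change runs only after two distinct people have approved it.
change = FourEyesAction(lambda: "change applied")
change.approve("alice")
change.approve("bob")
result = change.execute()
```

Because approvals are stored in a set, one person approving twice still counts as a single approval, which is the point of the principle.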