Data Centre Management: Avert Disaster Tomorrow With A UPS Health Check

A regular UPS health check can make all the difference between your critical power protection system protecting you or not.

Uninterruptible power supplies are essential equipment in a data centre’s fight against dreaded downtime. However, a UPS is only as good as its maintenance regime. Unqualified engineers working on your unit could spell disaster, while ambiguous emergency response times could leave you waiting for assistance if the worst were to happen.

Riello UPS General Manager Leo Craig talks to Data Centre Management (DCM) magazine about the importance of UPS maintenance.

He outlines how a regular UPS health check can keep a system operating at peak performance. Leo also highlights some of the pitfalls to look out for: basic human error is the most common cause of failure during UPS maintenance. He also advises due diligence before signing up for a UPS maintenance agreement.

Below we showcase the full article as featured in the summer edition of DCM.

Is It Time To Tighten Up Your UPS Maintenance Approach?

When it comes to ensuring your data centre is resilient, regular maintenance of your UPS is essential. The maintenance process is designed to minimise risk and keep your UPS operating in a fail-safe, efficient manner. So far, so good. But what happens if the very act of carrying out maintenance poses a risk itself? What checks and balances can you put in place to ensure peace of mind and a watertight approach?

Minimise Human Error During UPS Health Checks

As British Airways discovered to their cost in the summer of 2017, human error is the main cause of problems occurring during UPS maintenance procedures; engineers may throw a wrong switch, or carry out a procedure in the wrong order.

But, whilst it’s easy to lay the blame solely at the feet of the engineer in these instances, errors of this kind are often the result of poor operational procedures, poor labelling or even poor training. By ironing out these issues at the start of a UPS installation, risks can be avoided.

For example, if the system being installed is a critical system comprising large UPSs in parallel and a complex switchgear panel, Castell interlocks should be incorporated into the design. Castell interlocks force the user to switch in a controlled and safe sequence, but they are often left out of the design to save costs at the start of the project.

Simple things can make a difference. By ensuring that basic labelling and switching schematics are up-to-date, disaster can be averted. Having clearly documented switching procedures available is recommended. If the site is extremely critical, the procedure of Pilot – Co-Pilot (two engineers both check the procedure before carrying out each action) will prevent most human errors.

Embrace Thermal Imaging Technology

Any maintenance is typically intrusive into the UPS or switchgear, so reducing the amount of physical intervention is always a good thing. Most problems, including the failure of electrical components, are preceded by an increase in heat. If a connection point isn’t tightened properly, for example, it will start to heat up and eventually fail in some way. Short of checking every connection physically, the most effective solution is thermal imaging.

Thermal imaging technology can identify potential issues that wouldn’t necessarily be picked up using conventional techniques, without the need for physical intervention.
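As a rough illustration of how the results of a thermal survey might be screened, the short Python sketch below flags connection points whose temperature has risen noticeably since an earlier survey. Everything in it is an assumption made for illustration only: the point labels, the readings, and the 10 °C rise threshold are hypothetical and are not figures from the article or from any standard. Acceptable limits should come from the equipment manufacturer and a competent engineer.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    point: str         # label of the connection point (hypothetical names)
    temp_c: float      # temperature seen in the current thermal survey
    baseline_c: float  # temperature from an earlier survey at a similar load


def flag_hot_spots(readings, max_rise_c=10.0):
    """Return labels of points whose rise over baseline exceeds max_rise_c.

    The 10 degree default is an illustrative assumption, not a recommended limit.
    Results are ordered with the largest temperature rise first.
    """
    flagged = [r for r in readings if r.temp_c - r.baseline_c > max_rise_c]
    flagged.sort(key=lambda r: r.temp_c - r.baseline_c, reverse=True)
    return [r.point for r in flagged]


# Made-up survey data for three connection points
survey = [
    Reading("busbar A", 48.0, 35.0),
    Reading("breaker Q1", 36.0, 34.0),
    Reading("terminal T3", 70.0, 40.0),
]
hot_spots = flag_hot_spots(survey)
```

Comparing each point against its own earlier reading, rather than against a single absolute temperature, reflects the idea that a developing fault shows up as an increase in heat over time.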

Monitor Equipment And Competency

Round-the-clock equipment monitoring also offers robust protection and should be part of your maintenance package. Rigorous training is vital too, as is ensuring that the attending engineer can carry out the work competently.

Never be afraid to ask questions of your maintenance provider – it is your responsibility to request proof of competency levels – pertaining both to the company itself and the engineers it uses.

And always check ‘on the day’ that the engineer on site is competent and isn’t a last-minute sub-contractor sent in because the original engineer is off sick.

Read The Small Print – UPS Maintenance Agreements

A strong maintenance package should ensure that when the UPS does fail, the response is timely and effective. Service level agreements need to be appropriate to the criticality of the application.

There is no point having a UPS maintenance contract with 24/7 response if access to the UPS can only be gained during normal business hours. Conversely, if operations are 24/7 and very critical to the business, then 24/7 response is a must.

Be clear on exactly what the ‘response’ constitutes – will it just be a phone call or will it be someone coming to the site, and, if so, will that someone be a competent engineer?
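The checks described in this section can be written down as a simple review, which may help when comparing several contracts. The Python sketch below is only an illustration; the function name, parameters, and wording of the findings are all hypothetical and do not come from any real contract or tool.

```python
def review_response_cover(contract_24x7, access_24x7, operations_24x7,
                          response_is_site_visit):
    """Return a list of mismatches between a maintenance contract and the site.

    All four arguments are simple yes/no facts (booleans) that the reviewer
    supplies; the checks mirror the questions raised in the text above.
    """
    findings = []
    if contract_24x7 and not access_24x7:
        findings.append("24/7 response is contracted but the UPS is only "
                        "accessible during business hours")
    if operations_24x7 and not contract_24x7:
        findings.append("operations run 24/7 but the contract does not "
                        "include 24/7 response")
    if not response_is_site_visit:
        findings.append("'response' may be only a phone call, not an "
                        "engineer attending site")
    return findings


# Example: a 24/7 operation with a business-hours contract and phone-only response
findings = review_response_cover(
    contract_24x7=False,
    access_24x7=True,
    operations_24x7=True,
    response_is_site_visit=False,
)
```

An empty list means none of these particular mismatches were found; it does not confirm that the attending engineer is competent, which still has to be checked separately.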

Review Today, Protect Tomorrow

Undertaking a review of your current UPS maintenance procedure will help to identify and reduce risks to critical operations that you may not have previously anticipated. By applying an extra level of due diligence today, you can help to avert disaster tomorrow.