Every outage makes IT look bad!

Running an IT system is difficult: whenever an outage occurs, the IT department looks bad in front of its customers, and very often the first reaction is to look for someone to blame. This natural inclination toward finger pointing has to be fought as much as possible: it improves neither the relationship with the customer nor the system itself. Needless to say, from a management point of view it does even more damage: people make mistakes by definition, and finger pointing encourages them to cover themselves or even to hide facts.
Everyone has faced the very annoying situation in which the whole system is down but every subcomponent is reported to be working fine! This is a typical case where looking good as an individual matters more than solving the issue the company is facing. As a result, everyone defends themselves instead of collaborating to solve the issue.

Finger pointing does not help solve an issue!

This finger-pointing behaviour generates fear and anxiety, which block any improvement process. Who would be keen to change something if the change might cause a problem? Even worse, people very often hide failures, making it impossible to identify the root cause. I am sure this will ring a bell for many of us: how much time have we spent trying to understand implausible arguments that were meant to hide evidence?
Another negative effect of this lack of trust is that IT tends to rush to fix the outage, overreacting to prove how good it is: “of course, we had an outage but it has been fixed immediately…!”
All these tendencies lead to hiding, or at least not analysing, the root causes of the outage. No improvement is then possible, and the same outages will occur again and again.

Are we sure this will not happen again?

There is only one question that really matters: "Are we sure this will not happen again?" If management and users insist on this question, the focus shifts dramatically to understanding the root causes and improving the system.
IT chains are complex, made of hundreds of components linked one after the other, so an outage may stem from a root cause that is quite far from the component that appears faulty. First-level analysis is definitely not sufficient. Encouraging deep analysis is not only mandatory for improving overall reliability; it also breaks the vicious circle of suspicious relationships, which leads to better efficiency.