“A one-off human error”?

This news story in the Guardian grabbed my attention. The Nationwide Building Society mistakenly processed 704,426 payments a second time.

The Independent carried a less detailed report on the story, but it contained a quote that irked me (my emphases).

The society instead blamed an “inputting” error by an operative at its Swindon HQ. The phantom transactions were removed from customers accounts overnight, the bank said.

Jenny Groves, divisional director for customer experience said: “Nationwide wishes to apologise to those customers affected by an issue which has affected some of our debit card customers.”

She said those put into the red would have all charges “refunded in full and any costs associated with this error will be reimbursed in full. None of our customers will suffer financial loss as a result of this one-off error”.

Wow! This is 2012, and a big bank is making excuses that didn’t wash 30 years ago. I’ve worked extensively with big, batch financial systems. Here are some basic, utterly fundamental precepts that were well known by developers before I even knew what a computer was.

  • People screw up. Sometimes they do it in ways you expect, often in ways that surprise you. The only certainty is that they screw up.

  • You process every payment accurately, no exceptions.

  • You never, ever, process payments twice. It is a big deal. It’s not just about keeping your job, or staying out of jail. It’s about self-respect. It’s about going to sleep at night knowing you’re a competent professional, not an irresponsible cowboy who gets it right only some of the time.

  • The user requirements will not state every requirement that is absolute and non-negotiable. Some requirements are so fundamental and essential that the users will assume that “they go without saying”. If such requirements do not appear in specifications then you will look stupid if you subsequently pretend that they didn’t matter, or that you believe the users did not really require them. Processing payments accurately, once, and once only falls into this category.

  • It is the system designers’ responsibility to build these unstated, fundamental requirements into the application, even if the business analysts missed them.

  • It is the testers’ responsibility to test the application against these unstated, fundamental requirements.

All this means that financial applications need carefully designed controls to ensure that the right things always happen and the wrong things never do. It means that the application needs built-in checks to detect these “one-off human errors”. The techniques are ancient, at least in computing, and maybe that’s part of the problem. They’re boring, pedantic old-school stuff.

The main techniques are control files to keep track of files as they are being processed, hash counts and record counts to show that all records have been processed, and file version numbers so that the application can check that the right files are being processed and being processed only once.
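A minimal sketch of how these three controls fit together, under stated assumptions: the `BatchHeader` layout, the `ControlFile` class and the `process_batch` function are hypothetical names invented for illustration, and the hash total is taken to be a simple sum of account numbers. It is not a description of any real bank's system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BatchHeader:
    """Hypothetical header record supplied with each payment file."""
    file_id: str       # name of the feed, e.g. "debit_cards"
    version: int       # sequence number, one higher for every new file
    record_count: int  # number of payment records the sender wrote
    hash_total: int    # sum of account numbers; meaningless except as a check


@dataclass
class ControlFile:
    """Remembers the last file version successfully processed per feed."""
    last_version: dict = field(default_factory=dict)


class ControlError(Exception):
    """Raised when a batch fails one of its control checks."""


def process_batch(header, payments, control, post):
    """Check a batch against its controls, then post each payment once.

    payments is a list of (account_number, amount_in_pence) tuples.
    post is a callable that applies one payment.
    """
    # Version check: rejects a file that has already been processed,
    # and also a file that arrives out of sequence.
    expected = control.last_version.get(header.file_id, 0) + 1
    if header.version != expected:
        raise ControlError(
            f"{header.file_id}: expected version {expected}, got {header.version}"
        )
    # Record count and hash total: show that no record was lost or added.
    if len(payments) != header.record_count:
        raise ControlError("record count mismatch")
    if sum(account for account, _ in payments) != header.hash_total:
        raise ControlError("hash total mismatch")

    for account, amount in payments:
        post(account, amount)
    # Only after every payment is posted is the control file advanced.
    control.last_version[header.file_id] = header.version
```

With this in place, submitting the same file a second time raises `ControlError` before any payment is posted, whatever the operator typed. A real system would also need the posting and the control file update to succeed or fail together, which this sketch does not attempt.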

These techniques are boring and fiddly, but they work. Unfortunately they frequently trip up test runs. Control files and version numbers have to be reset after a run is halted. It’s easy to lose track and have to explain that the failure was an embarrassing test setup problem, rather than a genuine defect.

It’s much simpler to forget about these controls, or to switch them off for testing, or even switch them off in live running (it happens) when they complicate restarts after problems.

I said earlier that testers have a responsibility to test unstated, fundamental requirements. Actually, that was a slightly tricky one. Of course it is perfectly true, but sadly some project managers, and even whole organisations, prefer to put pressure on testers to script tests only against written requirements.

If you are testing a financial payment application and you’re not testing to see if every payment is processed accurately, once, and only once, then you’re not really testing. Such “testing” is an embarrassment to the testing profession.
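As a rough illustration of what such a check might look like, here is a small reconciliation sketch. It assumes each payment carries a unique `payment_id` and that the tester can extract both the submitted payments and the posted ledger entries as tuples; the function name and the data layout are hypothetical.

```python
from collections import Counter


def reconcile(submitted, posted):
    """Compare submitted payments with posted ledger entries.

    Both arguments are lists of (payment_id, account, amount_in_pence).
    Returns three sorted lists of payment ids: never posted, posted more
    than once, and posted with details that differ from the submission.
    """
    submitted_by_id = {pid: (account, amount) for pid, account, amount in submitted}
    posted_counts = Counter(pid for pid, _, _ in posted)

    missing = sorted(pid for pid in submitted_by_id if posted_counts[pid] == 0)
    duplicated = sorted(pid for pid, count in posted_counts.items() if count > 1)
    altered = sorted({
        pid for pid, account, amount in posted
        if pid in submitted_by_id and submitted_by_id[pid] != (account, amount)
    })
    return missing, duplicated, altered
```

A test run passes this check only when all three lists are empty. Rerunning the same input file, or restarting a halted run, and then reconciling again is the sort of test that would expose a double-processing fault before it reached customers.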

Organisations that skimp on effective testing, that don’t understand the value of thoughtful, risk-based controls, that blame “human error” when there is a management or systemic failure are placing their customers and reputation at risk. They are inviting humiliating press coverage and they deserve it.

I had to get that off my chest. People screw up. Human error is inevitable. Testers have to show how it can happen. It’s so much less embarrassing to read it in a test report than a national newspaper. That’s all.

Control and confusion – misunderstandings about governance

My recent blog about ISO 29119 and the discussion I started at the Software Testing Club sparked some very thought-provoking contributions.

I’ve decided to return to this topic because I think there is an unhealthy gulf between those who equate professionalism with formal standards, and those who know that in software testing professionalism and these formal standards aren’t comfortable allies; in fact they can be in opposition.

I fall into the latter camp, but I do believe that we should be making more effort to demonstrate that it is perfectly possible to move away from formal standards and a heavyweight approach to documentation while maintaining a highly responsible and professional approach. The intellectual case has been won convincingly, but I don’t think we have yet won the political, organisational battle. That will require clearer links to the needs of auditors and regulators than most testers currently have.

My experience tells me that vague guesses and unchallenged assumptions are often the sole basis for massively costly and time-consuming process models and governance regimes.

In this piece I want to start my journey into the field of testing governance by looking at some damaging confusion surrounding the terminology and the concepts.

Documenting Quality Governance Control for Management Accountability

Companies often mix up accountability, governance, control, quality and documentation. They all get jumbled up together without clear distinctions and without understanding how they are linked.

The words are often lumped together in ludicrous combinations that look impressive but mean nothing. The meaningless title of this section is a poor attempt at satire. It probably looks uncomfortably familiar to anyone who has worked in big corporate IT.

Companies act on the vague assumption that if they smother projects with governance and documentation they will produce accountability and control, thus generating quality. That assumption is illogical nonsense, but organisations produce vast reams of standards and processes (sorry, Process Models) based on the delusion. Sadly, the cost of generating these millstones is only a fraction of the cost in development time, lost opportunities and shoddy products that result.

Quality is a goal. Actually, it isn’t really a goal per se. It’s something you need to provide in order to satisfy whoever is paying the bills. The other stuff is just detail. Accountability, governance, control and documentation are various means you might employ to help achieve your goals, but the connections between them and the effect they have are widely misunderstood.

Confusing all these concepts makes it far harder to produce high quality products. So I want to get some things off my chest and tackle some of the myths of corporate IT that frequently go unchallenged.

Time to attack some myths

Governance is not the same as control and neither does governance produce control

Project governance is a framework for managing projects and integrating them into the wider business. Governance helps us control the use we make of time, money and assets, and to a certain extent it helps manage people. That’s all important, but it’s also relatively straightforward. Controlling events is infinitely more difficult, and that’s what we really yearn to do.

Governance has only a limited impact on future events, yet we pretend that if we get the governance right then we’ll be able to shape the future too. Our ability to control events is severely constrained because these events are the product of an uncertain future, with many influences and pressures that we cannot predict with a realistic level of confidence.

The danger is that we confuse governance and control then make bold, confident predictions based on the naïve belief that our mastery of governance will allow us to control the future.

We are using our regime of governance to control those things that we can control, and deceive ourselves that we are thus controlling the future. We end up controlling plans and documents. We shape our perception of reality to try and fit it to our plans. The underlying reality drifts further and further out of our control and we never acknowledge the problem, even when frantic re-planning is required. We kid ourselves that all that happened was our plans were wrong, people made mistakes and we were unlucky.

Achieving even the limited control over events that is possible requires us to be realistic and honest. The future is uncertain. Acknowledging the huge level of uncertainty in software development is realistic, and makes it easier to adapt. Pretending that governance provides true control has precisely the opposite effect. It introduces a layer of fog and documentation that can hide the reality. It dulls our ability to anticipate, detect and respond to changes in that reality.

Accountability does not entail tight control

Well, control and accountability should be independent, though they are often tied up together. Accountability does not require tight control of people. Weak managers, who lack confidence in themselves and their team, try to compensate by micro-managing the project.

The usual weapon of choice for micro-managers is a rigid application of governance. They obsess about the aspects of the project that they can control. The result is that the team is distracted by “managing” the managers, to keep them sweet, rather than focussing on the real work.

Accountability means being responsible for outcomes, and for the use made of people and resources. It should not mean beating people up. If one is managing responsibly, and getting the most out of the people, then that often means relaxing control. It means putting good people, with appropriate skills, in the right position to do a good job, then protecting them from external pressures and distractions. If managers are adding to those pressures and distractions they should question their approach.

I was recently asked whether I was confident that I knew exactly what my team was going to do. I replied that I had no idea, not in any detail anyway. There wasn’t time to familiarise myself with that level of detail. I trusted them and was entirely confident that they knew what they had to do, when they had to do it, and what they needed from each other. I was accountable for the outcome. Accountability does not mean being able to give a detailed account of everything that is going on.

Documentation and information are not synonyms

Managers need the right information at the right time to take the decisions they need. That doesn’t mean that the project should be constantly churning out documentation. Information does not have to be contained in documents, still less in formal “work products”. Likewise, a mass of verbiage, dubious metrics and irrelevant garbage do not add up to information.

Formal documentation is often produced mindlessly because it is mandated by the standards. I’ve been astonished at how much time is spent producing documents that run to 50 pages or more and contain virtually no information that will aid the project. Boilerplate documents are often the enemy, not the source, of information.

Documentation produces neither control nor accountability

If we accept that documentation is not the same as information then we can ease up on the documentation and take a more sceptical look at the justification for producing it.

Documentation is not an absolute good. In a sense it is a contingent requirement, like any other. If you are developing a safety critical application, or software associated with the production of drugs that will be covered by the US Food and Drug Administration, then you will need far more detailed documentation than will be necessary for a straightforward e-commerce application.

Companies often produce documentation in the belief that “the auditors will need to see it”. If that is the only justification for the documentation then it is not needed. I am serious.

Auditors report on whether the management, or application, has controls in place that are appropriate to the risk, and whether these controls are actually being applied. If the documentation is produced only for the auditors then it has no value at all.

Managers look daft in front of smart, professional auditors when they admit that they have been wasting time and money producing shelfware “for the auditors”. It is certainly possible that the managers ignored documentation that would have been valuable to them, but that is a separate failing.

I’ve written other blog posts that go into this in more detail. See “when documentation is a waste of time” and also “testers are like auditors” for a more general explanation of my perspective on auditing.

Neither governance nor control produces quality

The almost wilful confusion of quality with control really bugs me. Few managers ever say: “forget quality, we can live with an application that is a crock of crap provided we hit our dates and budget”.

However, they can get away with setting up a tough regime of project governance they hope will ensure they will get the dates and costs right, whilst assuming incorrectly that the rigid structure will produce a quality product.

I’ve seen these dreadful structures given Orwellian names like “Quality Management System” that mean the exact opposite of what they promise. They drive quality down because good people are producing shelfware rather than software, test plans rather than effective test preparation. The iteration that good software requires is suppressed in order to permit a neat, structured project plan.

Eventually the wheels come off such projects when the changes to requirements come in, or the users realise that the design doesn’t meet their real needs. Such disruption to the project is unacceptable, so it’s time to steamroller the users, and silently accept that this time quality wasn’t such a high priority; the costs and schedule were more important.

At the post mortem the dishonesty inherent in the process is ignored. There is a pretence that someone screwed up, or events conspired to defeat us, or that we were unlucky. There’s no admission that the approach was flawed, and that the consequences were an inevitable result of our decisions. Next time it will be ok because there’s no reason to assume they’ll be as unlucky again.

So what now?

That’s a good question. I’ve had a bit of a rant, but now I’m feeling the urge to be constructive. Please don’t misunderstand me. I do believe that project governance, control (and controls), documentation and accountability are all important. It matters when people don’t really think about the concepts and mix them all up. These concepts need to be applied intelligently and with a sensitive regard for the context. They need to be applied so that they help an organisation to use IT more effectively and efficiently, rather than act as a drag and distraction.

Traditional projects often seem to value governance, documentation and illusory control above all else. Such a mindset makes it natural to respond to the perceived needs of stakeholders, auditors and regulators by mandating an approach of extreme documentation and rigid adherence to inflexible standards.

I’ve started to think that many testers need clearer advice and guidance about what constitutes a responsible, professional link between development projects, testing, governance and control. I suspect that managers of developers and testers often feel uncomfortably exposed when they move away from traditional approaches, uncertain whether they are doing it right. The temptation is to take unhelpful elements of the heavyweight approach with them as a purely defensive measure.

Can they do real, effective testing without a mountain of documentation and rigid standardised processes, and also be confident they can demonstrate they are doing it right? Of course, but we’ve got to show them how, and clarify the links between testing, governance and accountability. It’s time for me to give this some serious thought.