The Post Office Horizon IT scandal, part 3 – audit, risk & perverse incentives

In the first post of this three-part series about the scandal of the Post Office’s Horizon IT system I explained the concerns I had about the approach to errors and accuracy. In the second post I talked about my experience working as an IT auditor investigating frauds, and my strong disapproval of the way the Post Office investigated and prosecuted the Horizon cases. In this, the final part, I will look at the role of internal audit and question the apparent lack of action by the Post Office’s internal auditors.

Independence and access to information

There’s a further aspect to the Horizon scandal that troubles me as an ex-auditor. In 2012, after some pressure from a Parliamentary committee, the Post Office commissioned the forensic IT consultancy Second Sight to review Horizon. Second Sight did produce a report that was critical of the system but they could not complete their investigation and issue a final report. They were stymied by the Post Office’s refusal to hand over crucial documents, and they were eventually sacked in 2015. The Post Office ordered Second Sight to hand over or destroy all the evidence it had collected.

An experienced, competent IT audit team should have the technical expertise to conduct its own detailed system review. It was a core part of our job. I can see why in this case it made sense to bring in an outside firm, “for the optics”. However, we would have been keeping a very close eye on the investigation, assisting and co-operating with the investigators as we did with our external auditors. We would have expected the investigators to have the same access rights as we had, and these were very wide ranging.

We always had the right to scope our audits and investigations and we had the right to see any documents or data that we considered relevant. If anyone ever tried to block us we would insist, as a matter of principle, that they should be overruled. This was non-negotiable. If it was possible to stymie audits or investigations by a refusal to co-operate then we could not do our job. This is all covered in the professional standards of the Institute of Internal Auditors. The terms of reference for the Post Office’s Audit, Risk and Compliance Committee make its responsibilities clear.

“The purpose of the charter will be to grant Internal Audit unfettered access to staff, data and systems required in the course of discharging its responsibilities to the Committee…

Ensure internal audit has unrestricted scope, the necessary resources and access to information to fulfil its mandate.”

I am sure that a good internal audit department, under the strong management that I knew, would have stepped in to demand access to the relevant records in the Horizon case on behalf of the external investigators, and would have pursued the investigation themselves if necessary. It’s inconceivable that we would have let the matter drop under management pressure.

Internal auditors must be independent of management, with a direct reporting line to the board to protect them from attempted intimidation. “Abdication of management responsibilities” was the nuclear phrase in our audit department. It was only to be used by the Group Chief Auditor. He put it in the management summary of one of my reports, referring to the UK General Manager. The explosion was impressive. It was the best example of audit independence I’ve seen. The General Manager stormed into the audit department and started aggressively haranguing the Chief Auditor, who listened calmly, then asked, “Have you finished? Ok. The report will not be changed. Goodbye”. I was in awe. You can’t intimidate good auditors. They tend to be strong-willed. The weak ones don’t last long, unless they’re part of a low-grade and weak audit department that has been captured by the management.

Risk and bonuses

The role of internal audit in the private sector recognises the divergent interests of the executives and the owners. The priority of the auditors is the long term security and health of the company, which means they will often look at problems from a different angle than executives whose priority might be shaped by annual targets, bonuses and the current share price. The auditors keep an eye on the executives, who will often face a conflict of interest.

Humans struggle to think clearly about risk. Mechanical risk matrices like this one (from the Health and Safety Executive, the UK Government inspectorate responsible for regulating workplace safety) serve only to fog thinking. A near certain chance of trivial harm isn’t remotely the same as a trivial chance of catastrophic damage.

UK HSE risk matrix

Senior executives may pretend they are acting in the interests of the company in preventing news of a scandal emerging but their motivation could be the protection of their jobs and bonuses. The company’s true, long term interests might well require early honesty and transparency to avoid the risk of massive reputational damage further down the line when the original scandal is compounded by dishonesty, deflection and covering up. By that time the executives responsible may have moved on, or profited from bonuses they might not otherwise have received.

A recurring theme in the court case was that the Post Office’s senior management, especially Paula Vennells, the chief executive from 2012 to 2019, simply wanted the problem to go away. Their perception seems to have been that the real problem was the litigation, rather than the underlying system problems and the lives that were ruined.

In an email, written in 2015 before she appeared in front of a Parliamentary committee, Vennells wrote.

“Is it possible to access the system remotely?

What is the true answer? I hope it is that we know it is not possible and that we are able to explain why that is. I need to say no it is not possible and that we are sure of this because of xxx [sic] and we know this because we had the system assured.”

Again, in 2015, Vennells instructed an urgent review in response to some embarrassingly well-informed blog posts, mainly about the Dalmellington Bug, by a campaigning former sub-postmaster, Tim McCormack. Vennells made it clear what she expected from the review.

“I’m most concerned that we/our suppliers appear to be very lax at handling £24k. And want to know we’ve rectified all the issues raised, if they happened as Tim explains.”

These two examples show the chief executive putting pressure on reviewers to hunt for evidence that would justify the answer she wants. It would be the job of internal auditors to tell the unvarnished truth. No audit manager would frame an audit in such an unprofessional way. Reviews like these would have been automatically assigned to IT auditors at the insurance company where I worked. I wonder who performed them at the Post Office.

When the Horizon court case was settled Vennells issued a statement, an apology of sorts.

“I am pleased that the long-standing issues related to the Horizon system have finally been resolved. It was and remains a source of great regret to me that these colleagues and their families were affected over so many years. I am truly sorry we were unable to find both a solution and a resolution outside of litigation and for the distress this caused.”

That is inadequate. Expressing regret is very different from apologising. I also regret that these lives were ruined, but I bear no responsibility for what happened. Vennells was “truly sorry” only for the litigation and its consequences, although that litigation was what offered the victims hope and rescue.

Vennells resigned from her post in the spring of 2019, eight months before the conclusion of the Horizon court case. In her last year as chief executive Vennells earned £717,500, only £800 less than the previous year. She lost part of her bonus because the Post Office was still mired in litigation, but it hardly seems to have been a punitive cut. Over the course of her seven years as chief executive, according to the annual reports, she earned £4.5 million, half of which came in the form of bonuses. In that last year when she was penalised for the ongoing litigation she still earned £389,000 in bonuses.

These bonuses are subject to clawback clauses (according to the annual reports, available at the last link);

“which provide for the return of any over-payments in the event of misstatement of the accounts, error or gross misconduct on the part of an Executive Director.”

Bonuses for normal workers reflect excellent performance. In the case of chief executives the criterion seems to be “not actually criminal”.

I have dismissed the risk matrix above for being too mechanical and simplistic. There’s a further criticism: it ignores the time it takes for risks to materialise into damage. A risk that is highly unlikely in any particular year might be almost certain over a longer period. It depends how you choose to frame the problem. To apply a crude probability calculation, if the chance of a risk blowing up in a single year is 3%, then there is a 53% chance it will happen at some point over 25 years. If a chief executive is in post for seven years, as Paula Vennells was, there is only a 19% chance of that risk occurring during their tenure.
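To make the framing concrete, here is the crude calculation spelled out, assuming (simplistically) that the risk is independent from year to year:

```python
# Crude illustration: the chance that a risk with a fixed annual probability
# materialises at least once over a period. Assumes each year is independent,
# which real risks rarely are, but it makes the framing point.

def chance_over_period(annual_probability: float, years: int) -> float:
    """Probability of at least one occurrence over the given number of years."""
    return 1 - (1 - annual_probability) ** years

print(f"{chance_over_period(0.03, 25):.0%}")  # ~53% over 25 years
print(f"{chance_over_period(0.03, 7):.0%}")   # ~19% over a seven-year tenure
```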

These are crude calculations, but there is an important and valid underlying point: a risk that would be intolerable to the organisation might be perfectly acceptable to a chief executive who is incentivised to maximise earnings through bonuses, and to push troubling risks down the line for someone else to worry about.

No organisation should choose to remain in the intolerable risk cell, yet Vennells took the Post Office there and it probably made financial sense for her. The Post Office was very likely to lose the Horizon litigation, with massive damage. It wouldn’t happen while she was in post, and it would be extremely unlikely that fighting the case aggressively would be regarded as gross misconduct.

Perverse incentives often tempt managers, and also politicians, to ignore the possibility of dreadful outcomes that are unlikely to arrive while they are in post, but that would require them to incur expense or unpopularity to prepare for. The odds are good that irresponsible management will be rewarded for being wrong and will have left with their hefty bonuses before disaster strikes. On the other hand you can get sacked for doing the right thing long before the justification becomes obvious.

This is, or at least it should be, a big issue for internal auditors who have to keep a sharp eye on risk and misaligned incentives. All too often the only people with a clear eyed, dispassionate understanding of risk are those who are gaming the corporate system. The Post Office’s internal auditors fell down on the job here. Even setting aside the human tragedies, the risks to the Post Office posed by the Horizon system and the surrounding litigation should have been seen as intolerable.

Role of internal audit when organisations move from the public to private sector

This all raises questions about corporate governance and the role of internal audit in bodies like the Post Office that sit between the public and private sectors. The Post Office is owned by the UK government, but with a remit of turning itself into a self-sustaining company without government subsidy. The senior executives were acting like private sector management, but with internal auditors who had a public sector culture, focusing on value for money and petty fraud. There are endless examples of private sector internal auditors losing sight of the big picture. However, a good risk-based audit department will always be thinking of those big risks that could take the company down.

Public bodies are backed by the government and can’t fail in the same way as a private company. When they move into the private sector, the management culture and remuneration system expose the organisation to a new world of risks. So how do their internal auditors respond? In the case of the Post Office the answer is: badly. The problems were so serious that the internal auditors would have had a professional responsibility to bypass the senior executives and escalate them to board level, and to the external auditors. There is no sign that this happened. The only conclusion is that the Post Office’s internal auditors were either complicit in the Horizon scandal, or negligent. At best, they were taking their salaries under false pretences.

Conclusion

At almost every step, over many years, the Post Office handled the Horizon scandal badly, inexcusably so. They could hardly have done worse. There will be endless lessons that can, and will, be drawn from detailed investigation in what must be the inevitable inquiry. However, for software testers and for IT auditors the big lesson they should take to heart is that bad software, and dysfunctional corporate practices, hurt people and damage lives. The Post Office’s sub-postmasters were hard-working, decent business people trying to make a living and provide for their families. They were ruined by a cynical, incompetent corporation. They will receive substantial compensation, but it’s hardly enough. They deserve better.

The Post Office Horizon IT scandal, part 2 – evidence & the “off piste” issue

In the first post of this three-part series about the scandal of the Post Office’s Horizon IT system I explained the concerns I had about the approach to errors and accuracy. In this post I’ll talk about my experience working as an IT auditor investigating frauds, and my strong disapproval of the way the Post Office investigated and prosecuted the Horizon cases.

Evidence, certainty and prosecuting fraud

Although I worked on many fraud cases that resulted in people going to prison I was never required to give evidence in person. This was because we built our case so meticulously, with an overwhelmingly compelling set of evidence, that the fraudsters always pleaded guilty rather than risk antagonising the court with a wholly unconvincing plea of innocence.

We always had to be aware of the need to find out what had happened, rather than simply to sift for evidence that supported our working hypothesis. We had to follow the trail of evidence, but remain constantly alert to the possibility we might miss vital, alternative routes that could lead to a different conclusion. It’s very easy to fall quickly into the mindset that the suspect is definitely guilty and ignore anything that might shake that belief. Working on these investigations gave me great sympathy for the police carrying out detective work. If you want to make any progress you can’t follow up everything, but you have to be aware of the significance of the choices you don’t make.

In these cases there was a clear and obvious distinction between the investigators and the prosecutors. We, the IT auditors, would do enough investigation for us to be confident we had the evidence to support a conviction. We would then present that package of evidence to the police, who were invariably happy to run with a case where someone else had done the leg work. The police would do some confirmatory investigation of their own, but it was our work that would put people in jail. The prosecution of the cases was the responsibility of the Crown Prosecution Service in England & Wales, and the Procurator Fiscal Service in Scotland. That separation of responsibilities helps to guard against some of the dangers that concerned me about bias during investigation.

This separation didn’t apply in the case of the Post Office, which, for anachronistic historical reasons, employs its own prosecutors. It also has its own investigation service. There’s nothing unusual about internal investigators, but when they are working with an in-house prosecution service that creates the danger of unethical behaviour. In the case of the Post Office the conduct of prosecutions was disgraceful.

The usual practice was to charge a sub-postmaster with theft and false accounting, even if the suspect had flagged up a problem with the accounts and there was no evidence that he or she had benefitted from a theft, or even committed one. Under pressure sub-postmasters would usually accept a deal. The more serious charge of theft would be dropped if they pleaded guilty to false accounting, which would allow the Post Office to pursue them for the losses.

What made this practice shameful was that the Post Office knew it had no evidence for theft that would secure a conviction. This doesn’t seem to have troubled them. They knew the suspects were guilty. They were protecting the interests of the Post Office and the end justified the means.

The argument that the prosecution tactics were deplorable is being taken very seriously. The Criminal Cases Review Commission has referred 39 Horizon cases for appeal, on the grounds of “abuse of process” by the prosecution.

The approach taken by Post Office investigators and prosecutors was essentially to try and ignore the weakest points of their case, while concentrating on the strongest points. This strikes me as fundamentally wrong. It is unprofessional and unethical. It runs counter to my experience.

Although I was never called to appear as a witness in court, when I was assembling the evidence to be used in a fraud trial I always prepared on the assumption I would have to face a barrister, or advocate, who had been sufficiently well briefed to home in on any possible areas of doubt, or uncertainty. I had to be prepared to face an aggressive questioner who could understand where weak points might lie in the prosecution case. The main areas of concern were where it was theoretically possible that data might have been tampered with, or where it was possible that someone else had taken the actions that we were pinning on the accused. Our case was only as strong as the weakest link in the chain of evidence. I had to be ready to explain why the jury should be confident “beyond reasonable doubt” that the accused was guilty.

Yes, it was theoretically possible that a systems programmer could have bypassed access controls and tampered with the logs, but it was vanishingly unlikely that they could have set up a web of consistent evidence covering many applications over many months, even years, and that they could have done so without leaving any trace.

In any case, these sysprogs lacked the deep application knowledge required. Some applications developers, and the IT auditors, did have the application knowledge, but they lacked the necessary privileges to subvert access controls before tampering with evidence.

The source code and JCL decks for all the fraud detection programs would have been available to the defence so that an expert witness could dissect them. We not only had to do the job properly, we had to be confident we could justify our code in court.

Another theoretical possibility was that another employee had logged into the accused’s account to make fraudulent transactions, but we could match these transactions against network logs showing that the actions had always been taken from the terminal sitting on the accused’s desk during normal office hours. I could sit at my desk in head office and use a network monitoring tool to watch what a suspect was doing hundreds of miles away. In one case I heard a colleague mention that the police were trailing a suspect around Liverpool that afternoon. I told my colleague to get back to the cops and tell them they were following the wrong guy. Our man was sitting at his desk in Preston and I could see him working. Half an hour later the police phoned back to say we were right.

In any case, fanciful speculation that our evidence had been manufactured hit the problem of motive; the accused was invariably enjoying a lifestyle well beyond his or her salary, whereas those who might have tampered with evidence had nothing to gain and a secure job, pension and mortgage to lose.

I’ve tried to explain our mindset and thought processes so that you can understand why I was shocked to read about what happened at the Post Office. We investigated and prepared meticulously in case we had to appear in court. That level of professional preparation goes a long way to explaining why we were never called to give evidence. The fraudsters always put their hands up when they realised how strong the evidence was.

Superusers going “off piste”

One of the most contentious aspects of the Horizon case was the prevalence of Transaction Corrections, i.e. corrections applied centrally by IT support staff to fix errors. The Post Office seems to have regarded these as a routine part of the system, in the wider sense of the word “system”, but as outside the scope of the technical Horizon system. They were just a routine administrative matter.

I came across an astonishing phrase in the judgment [PDF, opens in new tab, see page 117], lifted from an internal Post Office document. “When we go off piste we use APPSUP”. That is a powerful user privilege which allows users to do virtually anything. It was intended “for unenvisaged ad-hoc live amendment” of data. It had been used on average about once a day, and was assigned on a permanent basis to the IDs of all the IT support staff looking after Horizon.

I’m not sure readers will realise how shocking the phrase “off piste” is in that context to someone with solid IT audit experience in a respectable financial services company. Picture the reaction of a schools inspector coming across an email saying “our teachers are all tooled up with Kalashnikovs in case things get wild in the playground”. It’s not just a question of users holding a superuser privilege all the time, bad though that is. It reveals a lot about the organisation and its systems if staff have to jump in and change live data routinely. An IT shop that can’t control superusers effectively probably doesn’t control much. It’s basic.

Where I worked as an IT auditor nobody was allowed to have an account with which they could create, amend or delete production data. There were elaborate controls applied whenever an ad hoc or emergency change had to be made. We had to be confident in the integrity of our data. If we’d discovered staff having permanent update access to live data, for when they went “off piste”, we’d have raised the roof and wouldn’t have eased off till the matter was fully resolved. And if the company had been facing a court action that was centred on how confident we could be in our systems and data we’d have argued strongly that we should cut our losses and settle once we were aware of the “off piste” problem.
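To sketch what I mean by “elaborate controls”, in purely illustrative terms (the names and mechanism below are invented, not how any particular shop did it): emergency access should be tied to an approved change record, granted to a named individual, time-boxed, and fully logged, rather than held permanently.

```python
# Illustrative sketch only: a "break glass" grant for emergency changes to
# live data. Invented names; the point is that access is tied to an approved
# change record, authorised by someone else, expires automatically, and every
# action taken under it is logged for later review.

from datetime import datetime, timedelta, timezone

class EmergencyAccessGrant:
    def __init__(self, user_id: str, change_ref: str, approver: str,
                 duration_hours: int = 4):
        if approver == user_id:
            raise ValueError("emergency access must be approved by someone else")
        self.user_id = user_id
        self.change_ref = change_ref      # the approved change or incident record
        self.approver = approver
        self.expires_at = datetime.now(timezone.utc) + timedelta(hours=duration_hours)
        self.audit_log: list[str] = []

    def is_active(self) -> bool:
        return datetime.now(timezone.utc) < self.expires_at

    def record_action(self, action: str) -> None:
        if not self.is_active():
            raise PermissionError("grant expired; raise a new change record")
        self.audit_log.append(
            f"{datetime.now(timezone.utc).isoformat()} {self.user_id} "
            f"({self.change_ref}, approved by {self.approver}): {action}")
```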

Were the Post Office’s internal auditors aware of this? Yes, but they clearly did nothing. If I hadn’t discovered that powerful user privileges were out of control on the first day of a two-day, high-level IT installation audit I’d have been embarrassed. It’s that basic. However, the Post Office’s internal auditors don’t have the excuse of incompetence. The problem was flagged up by the external auditors Ernst & Young in 2011. If internal audit was unaware of a problem raised by the external auditors they were stealing their salaries.

The only times when work has ever affected my sleep have been when I knew that the police were going to launch dawn raids on suspects’ houses. I would lie in bed thinking about the quality of the evidence I’d gathered. Had I got it all? Had I missed anything? Could I rely on the data and the systems? I worried because I knew that people were going to have the police hammering on their front doors at 6 o’clock in the morning.

I am appalled that Post Office investigators and prosecutors could approach fraud investigations with the attitude “what can we do to get a conviction?”. They pursued the sub-postmasters aggressively, knowing the weaknesses in Horizon and the Post Office; that was disgraceful.

In the final post in this series I’ll look further at the role of internal audit, how it should be independent and its role in keeping an eye on risk. In all those respects the Post Office’s internal auditors have fallen short.

The Post Office Horizon IT scandal, part 1 – errors and accuracy

For the last few years I’ve been following the controversy surrounding the Post Office’s accounting system, Horizon. This controls the accounts of some 11,500 Post Office branches around the UK. There was a series of alleged frauds by sub-postmasters, all of whom protested their innocence. Nevertheless, the Post Office prosecuted these cases aggressively, pushing the supposed perpetrators into financial ruin, and even suicide. The sub-postmasters affected banded together to take a civil action against the Post Office, claiming that no frauds had taken place but that the discrepancies arose from system errors.

I wasn’t surprised to see that the sub-postmasters won their case in December 2019, with the judge providing some scathing criticism of the Post Office, and Fujitsu, the IT supplier, who had to pay £57.75 million to settle the case. Further, in March 2020 the Criminal Cases Review Commission decided to refer for appeal the convictions of 39 subpostmasters, based on the argument that their prosecution involved an “abuse of process”. I will return to the prosecution tactics in my next post.

Having worked as an IT auditor, including fraud investigations, and as a software tester the case intrigued me. It had many features that would have caused me great concern if I had been working at the Post Office and I’d like to discuss a few of them. The case covered a vast amount of detail. If you want to see the full 313 page judgment you can find it here [PDF, opens in new tab].

What caught my eye when I first heard about this case were the arguments about whether the problems were caused by fraud, system error, or user error. As an auditor who worked on the technical side of many fraud cases the idea that there could be any confusion between fraud and system error makes me very uncomfortable. The system design should incorporate whatever controls are necessary to ensure such confusion can’t arise.

When we audited live systems we established what must happen and what must not happen, what the system must do and what it must never do. We would ask how managers could know that the system would do the right things, and never do the wrong things. We then tested the system looking for evidence that these controls were present and effective. We would try to break the system, evading the controls we knew should be there, and trying to exploit missing or ineffective controls. If we succeeded we’d expect, at the least, the system to hold unambiguous evidence about what we had done.
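To give a flavour of what that looks like in practice, here is a minimal, invented sketch of “must / must never” checks for a branch accounting system; the record layout is hypothetical, but the idea is that the invariants are stated explicitly and every breach has to leave visible evidence:

```python
# Hypothetical sketch of "must happen / must never happen" control checks.
# The record layout is invented for illustration; the point is that the
# invariants are explicit and any breach must leave unambiguous evidence.

def check_controls(transactions, audit_trail):
    findings = []
    audited_ids = {entry["transaction_id"] for entry in audit_trail}
    seen_ids = set()

    for t in transactions:
        # Must: every posting balances.
        if t["debit_pence"] != t["credit_pence"]:
            findings.append(f"unbalanced posting {t['id']}")
        # Must never: a posting with no audit trail entry saying who made it.
        if t["id"] not in audited_ids:
            findings.append(f"posting {t['id']} has no audit trail entry")
        # Must never: the same posting applied twice (a replayed transaction).
        if t["id"] in seen_ids:
            findings.append(f"duplicate posting {t['id']}")
        seen_ids.add(t["id"])

    return findings
```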

As for user error, it’s inevitable that users will make mistakes and systems should be designed to allow for that. If a system can’t cope with predictable mistakes then that is a system failure; “user error” is an inadequate explanation for things going wrong. Mr Justice Fraser, the judge, took the same line. He expected the system “to prevent, detect, identify, report or reduce the risk” of user error. He concluded that controls had been put in place, but they had failed, and that Fujitsu had “inexplicably” chosen to treat one particularly bad example of system error as being the fault of a user.

The explanation for the apparently inexplicable might lie in the legal arguments surrounding the claim by the Post Office and Fujitsu that Horizon was “robust”. The rival parties could not agree even on the definition of “robust” in this context, never mind whether the system was actually robust.

Nobody believed that “robust” meant error free. That would be absurd. No system is perfect, and it was revealed that Horizon had a large and persistent number of bugs, some of them serious. The sub-postmasters’ counsel and IT expert argued that “robust” must mean that it was extremely unlikely the system could produce the sort of errors that had ruined so many lives. The Post Office confused matters by adopting different definitions at different times, which was made clear when they were asked to clarify the point and they provided an IT industry definition of robustness that sat uneasily with their earlier arguments.

The Post Office’s approach was essentially top-down. Horizon was robust because it could handle any risks that threatened its ability to perform its overall business role. They then took a huge logical leap to claim that because Horizon was robust by their definition it couldn’t be responsible for serious errors at the level of individual branch accounts.

Revealingly, the Post Office and Fujitsu named bugs using the branch where they had first occurred. Two of the most significant were the Dalmellington Bug, discovered at a branch in Ayrshire, and the Callendar Square Bug, also from a Scottish branch, in Falkirk. This naming habit linked bugs to users, not the system.

The Dalmellington Bug entailed a user repeatedly hitting enter when the system froze as she was trying to acknowledge receipt of a consignment of £8,000 in cash. Unknown to her, each time she struck the enter key she accepted responsibility for a further £8,000. The bug created a discrepancy of £24,000 for which she was held responsible.

Similarly, the Callendar Square Bug generated spurious, duplicate financial transactions for which the user was considered to be responsible, even though this was clearly a database problem.
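Both bugs boil down to the same basic failure: a repeated or replayed submission was posted as though it were a brand new transaction. A minimal sketch of the sort of duplicate-submission protection that prevents it (the names here are invented for illustration) looks like this:

```python
# Minimal, invented sketch of duplicate-submission protection. Each
# acknowledgement carries a reference generated when the screen is first
# presented, so hammering the enter key while the terminal is frozen posts
# the £8,000 receipt once, not once per key press.

posted_references: set[str] = set()
ledger: list[tuple[str, int]] = []

def acknowledge_receipt(reference: str, amount_pence: int) -> bool:
    """Post a cash receipt exactly once; repeated submissions are ignored."""
    if reference in posted_references:
        return False                      # duplicate: already posted
    posted_references.add(reference)
    ledger.append((reference, amount_pence))
    return True

# Four presses of enter for the same consignment still post it only once.
for _ in range(4):
    acknowledge_receipt("consignment-0001", 800_000)
assert sum(amount for _, amount in ledger) == 800_000
```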

The Horizon system processed millions of transactions a day and did so with near 100% accuracy. The Post Office’s IT expert therefore tried to persuade the judge that the odds were 2 in a million that any particular error could be attributable to the system.

Unsurprisingly the judge rejected this argument. If only 0.0002% of transactions were to go wrong then a typical day’s processing of eight million transactions would lead to 16 errors. It would be innumerate to look at one of those outcomes and argue that there was a 2 in a million chance of it being a system error. That probability would make sense only if one of the eight million were chosen at random. The supposed probability is irrelevant if you have chosen a case for investigation because you know it has a problem.
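The arithmetic behind the judge’s reasoning is easy to check, and it shows why quoting the headline error rate against a branch that was singled out precisely because its accounts were wrong is meaningless:

```python
# The judge's point in numbers. The per-transaction error rate tells you how
# many system errors to expect across the whole day's volume; it says nothing
# about a case that was selected for investigation because it already had a
# known discrepancy.

transactions_per_day = 8_000_000
error_rate = 2 / 1_000_000              # the "2 in a million" figure

errors_per_day = transactions_per_day * error_rate
print(errors_per_day)                   # 16 system errors in a typical day

# Over, say, a decade of processing that is tens of thousands of candidate
# errors, any one of which could lie behind a disputed branch shortfall.
print(errors_per_day * 365 * 10)        # 58,400
```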

It seemed strange that the Post Office persisted with its flawed perspective. I knew all too well from my own experience of IT audit and testing that different systems, in different contexts, demanded different approaches to accuracy. For financial analysis and modelling it was counter-productive to chase 100% accuracy. It would be too difficult and time consuming. The pursuit might introduce such complexity and fragility to the system that it would fail to produce anything worthwhile, certainly in the timescales required. 98% accuracy might be good enough to give valuable answers to management, quickly enough for them to exploit them. Even 95% could be good enough in some cases.

In other contexts, when dealing with financial transactions and customers’ insurance policies you really do need a far higher level of accuracy. If you don’t reach 100% you need some way of spotting and handling the exceptions. These are not theoretical edge cases. They are people’s insurance policies or claims payments. Arguing that losing a tiny fraction of 1% is acceptable would have been appallingly irresponsible, and I can’t put enough stress on the point that as IT auditors we would have come down hard, very hard, on anyone who tried to take that line. There are some things the system should always do, and some it should never do. Systems should never lose people’s data. They should never inadvertently produce apparently fraudulent transactions that could destroy small businesses and leave the owners destitute. The amounts at stake in each individual Horizon case were trivial as far as the Post Office was concerned, immaterial in accountancy jargon. But for individual sub-postmasters they were big enough to change, and to ruin, lives.

The willingness of the Post Office and Fujitsu to absolve the system of blame and accuse users instead was such a constant theme that it produced a three-letter acronym I’d never seen before: UEB, or user error bias. Naturally this arose on the claimants’ side. The Post Office never accepted its validity, but it permeated their whole approach; Horizon was robust, therefore any discrepancies must be the fault of users, whether dishonest or accidental, and they could proceed safely on that basis. I knew from my experience that this was a dreadful mindset with which to approach fraud investigations. I will turn to this in my next post in this series.

“You’re a tester! You can’t tell developers what to do!”

Michael Bolton posted this tweet to which I responded. When Michael asked for more information I started to respond in a tweet, which turned into a thread, which grew to the point where it became ridiculous, so I turned it into a blog post.

It was a rushed migration of insurance systems following a merger. I was test manager. The requirements were incomplete. That’s being polite. In a migration there are endless edge cases where conversion algorithms struggle to handle old, or odd data. The requirements failed to state what should happen if policies couldn’t be assigned to an appropriate pigeon hole in the new system. They didn’t say we should not lose policyholders’ data, and that became highly controversial.

The lack of detail in the requirements inevitably meant, given the complexities of a big migration, which are seldom appreciated by outsiders, that data was lost on each pass. Only a small percentage was dropped, but in the live migration these would be real policies protecting real people. My team’s testing picked this problem up. It was basic, pretty much the first thing we’d look for. The situation didn’t improve significantly in successive test cycles and it was clear that there would be dropped records in the live runs.

I approached the development team lead and suggested that migration programs should report on the unassigned records and output them to a holding file awaiting manual intervention. Letting them fall through a gap and disappear was amateurish. The testers could flag up the problem, but the fix should come from the users and developers. It wouldn’t be the testers’ job to detect and investigate these problems when it came to the live migration. My test team would not be involved. Our remit extended only to the test runs and the mess would have to be investigated by end users, who lacked our technical skills and experience. The contract wasn’t going to pay for us to do any work when live running started and we would be immediately assigned to other work.
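What I was suggesting was not exotic. A hedged sketch of the shape of it, with invented field, file and function names: anything the conversion rules cannot place goes to an exceptions file with a reason, and the run has to reconcile, so nothing silently disappears.

```python
# Illustrative sketch only; field, file and function names are invented.
# Records the conversion rules cannot place are written to a holding file,
# with a reason, for manual intervention, and the run reconciles:
# input count = migrated count + held count.

import csv

def migrate_policies(policies, assign_to_new_system):
    migrated, held = [], []
    for policy in policies:
        target = assign_to_new_system(policy)   # returns None for the edge cases
        if target is None:
            held.append({"policy_id": policy["policy_id"],
                         "reason": "no matching product/branch mapping"})
        else:
            migrated.append(target)

    # Holding file awaiting manual intervention, rather than a silent gap.
    with open("unassigned_policies.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["policy_id", "reason"])
        writer.writeheader()
        writer.writerows(held)

    # Reconciliation: every input record is accounted for, one way or the other.
    assert len(policies) == len(migrated) + len(held)
    return migrated, held
```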

The lead developer argued that her team couldn’t write code that hadn’t been specifically requested by users. The users’ priority was a rapid migration. Quality and accuracy were unimportant. The users were telling us, informally, that if a small proportion of policies was lost then it was no big deal.

The client didn’t offer a coherent argument to justify their stance and they refused to put in writing anything that would look incriminating in a post mortem, as if they thought that would protect them from a savaging from the auditors. They were too busy trying to force the migration through to get into a serious debate about quality, ethics, or how to handle the profusion of edge cases. The nearest they came to a defence of their position was a vague assertion that policyholders would eventually notice and complain that they’d heard nothing from the insurer about renewing their policies. There were some serious problems with that stance.

Failing to convert responsibly would have made us, as the supplier, complicit in the client’s negligence. There might not have been an explicit requirement not to lose data, but it was basic professionalism on the part of the developers to ensure they didn’t do so. It was absurd to assume that all the policyholders affected would notice before their policies expired. People would find they were uninsured only when they suffered an accident. The press would have loved the story.

We were dealing with motor insurance, which drivers legally must have. It would eventually have come to the attention of the police that the client was accidentally cancelling policies. I knew, from my audit experience, that the police took a very dim view of irresponsible system practices by motor insurers.

The development team leader was honest, but deep reflection about her role was not her style. She had a simplistic understanding of her job and the nature of the relationship with the client. She genuinely believed the developers couldn’t cut a line of code that had not been requested, even though that argument didn’t stand up to any serious scrutiny. Delivering a program for any serious business application involves all sorts of small, but vital coding decisions that are of no interest to those who write the requirements but which affect whether or not the program meets the requirements. The team leader, however, dug her heels in and thought I was exceeding my authority by pushing the point.

Frustratingly, her management backed her up, for reasons I could understand but disagreed with. A quick, botched migration suited everyone in the short term. It was my last test project for a few years. I received an approach asking if I was interested in an information security management role. Both the role and the timing were attractive. A break from testing felt like a good idea.

Edit: Reading this again I feel I should offer some defence for the attitude of the development team lead. The behaviour, and the culture, of the client meant that she was under massive pressure. There was not enough time to develop the migration routines, or to fix them in between test runs. Users were bypassing both her and me, the test manager, when they reported defects. They would go straight to the relevant developers and we were both struggling to keep track and enforce any discipline. The users also went behind her back to lean on developers to make last minute, untested changes to code that was already in the release pipeline into production. She made sure her team knew the correct response was, “**** off – it’s more than my job’s worth.”

Throughout the migration the users were acting appallingly, trying to pin the blame on her team for problems they had created. She was in a very difficult position, and that largely explains an attitude that would have seemed absurd and irrational at a calmer time. Her resistance to my requests was infuriating, but I had a lot of sympathy for her, and none whatsoever for the users who were making her life a misery.

There’s an important wider point here. When we’re dealing with “difficult” people it’s helpful to remember that they might be normal people dealing with very difficult problems. Don’t let it get personal. Think about the underlying causes, not the obvious victim.