The dragons of the unknown; part 3 – I don’t know what’s going on

Introduction

This is the third post in a series about problems that fascinate me, that I think are important and interesting. The series draws on important work from the fields of safety critical systems and from the study of complexity, specifically complex socio-technical systems. This was the theme of my keynote at EuroSTAR in The Hague (November 12th-15th 2018).

The first post was a reflection, based on personal experience, on the corporate preference for building bureaucracy rather than dealing with complex reality, “The dragons of the unknown; part 1 – corporate bureaucracies”. The second post was about the nature of complex systems, “part 2 – crucial features of complex systems”. This one follows on from part 2, which talked about the impossibility of knowing exactly how complex socio-technical systems will behave with the result that it is impossible to specify them precisely.

The starting point for a system audit

When we audited a live system the requirements and design specifications didn’t matter – they really didn’t. This was because:

  • specs were a partial and flawed picture of what was required at the time that the system was built,
  • they were not necessarily relevant to the business risks and problems facing the company at the time of the audit,
  • the system’s compliance, or failure to comply, with the specs told us nothing useful about what the system was doing or should be doing (we genuinely didn’t care about “compliance”),
  • we never thought it was credible that the specs would have been updated to reflect subsequent changes,
  • we were interested in the real behaviour of the people using the system, not what the analysts and designers thought they would or should be doing.

In a tightly time-boxed audit it was therefore a complete waste of time to wade through the specs. Context driven testers have been fascinated when I’ve explained that we started with a blank sheet of paper. The flows we were interested in were the things that mattered to the people who mattered.

We would identify a key person and ask them to talk us through the business context of the system, sketching out how it fitted into its environment. The interfaces were where we always expected things to go wrong. The scope of the audit was dictated by the sketches of the people who mattered, not the system documentation.

We might have started with a blank sheet but we were highly rigorous. We used a structured methods modelling technique called IDEF0 to make sense of what we were learning, and to communicate that understanding back to the auditees to confirm that it made sense.

We were constantly asking, “How do you know that the system will do what we want? How will you get the outcomes you need? What must never happen? How does the system prevent that? What must always happen? How does the system ensure that?” It’s a similar approach to the safety critical idea of “always events” and “never events”, which is particularly popular in medical safety circles.
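
Purely as an illustration (the audits themselves worked through interviews and IDEF0 models, not code), those “always” and “never” questions translate naturally into executable checks. The Payment record and both rules below are hypothetical, invented for this sketch:

```python
# A hypothetical sketch: expressing audit "never events" as executable checks.
# The Payment record and the two rules are invented for illustration.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Payment:
    amount: float
    requested_by: str
    authorised_by: Optional[str] = None

def never_event_violations(payment: Payment) -> list:
    """Return descriptions of any violated 'never' rules, empty if none."""
    violations = []
    # Must never happen: a payment is released without authorisation.
    if payment.authorised_by is None:
        violations.append("payment released without authorisation")
    # Must never happen: the requester authorises their own payment.
    elif payment.authorised_by == payment.requested_by:
        violations.append("requester authorised their own payment")
    return violations

# A properly authorised payment violates neither rule.
assert never_event_violations(
    Payment(amount=500.0, requested_by="alice", authorised_by="bob")
) == []
```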

We were dealing with financial systems. Our concern could be summarised as: how do we know that the processing is complete, accurate, authorised and timely? It was almost a mantra: complete, accurate, authorised and timely.

These are all constrained by each other, and informed by the context, i.e. sufficiently accurate for business objectives given the need to provide the information within an acceptable time. We had to understand the current context. Context was everything.

Once we had a good idea of the processes, the outputs, the key risks and the controls that were needed, we would attack the system to see if we could force it to do what it shouldn’t, or prevent it doing what it was required to do. We would try to approach the testing with the mindset of a dishonest or irresponsible user. At that time I had never heard of exploratory testing. Training in that would have been invaluable.

We would also speak to ordinary users and watch them in action. Our interviews, observations, and our own testing told us far more about the system and how it was being used than the formal system documentation could. It also told us more than we could learn from the developers who looked after the systems. They would often be taken by surprise by what we discovered about how users were really working with their systems.

We were always asking questions to help us identify the controls that would give us the right outcomes. This is very similar to the way experts look at safety critical systems. Safety is a control problem, a question of ensuring there are mechanisms or practices in place that will keep the system and its users from straying into dangerous territory. System developers cannot know how their systems will be used as part of a complex socio-technical system. They might think they do, but users will always take the system into unknown territory.

“System as imagined” versus “system as found”

The safety critical community makes an important distinction between the system as imagined and the system as found. The imagined system is neat and tidy. It is orderly, without noise, confusion and distraction. Real people are absent, or not meaningfully represented.

A user who has been working with a system for several hours a day, for years on end, will know all about the short cuts, hacks and vulnerabilities that are available. Users make the system do things the designers never imagined. They understand the gaps in the system, the gaps in the designers’ understanding, and they fill those gaps with their own ingenuity. These user variations are usually beneficial and help the system work as the business requires. They can also be harmful and provide opportunities for fraud, a big concern in an insurance company. Large insurers receive and pay out millions of pounds a day, with nothing tangible changing hands. They have always been vulnerable to fraud, both by employees and outsiders.

I investigated one careful user who stole over a million pounds, slice by slice, several thousand pounds a week, year after year, all without attracting any attention. He was exposed only by an anonymous tip-off. It was always a real buzz working on those cases, trying to work out exactly what the culprit had done and how they had exploited the systems (note the plural – the best frauds usually exploited loopholes between interfacing systems).

What shocked me about that particular case was that the fraudster hadn’t grabbed the money and run. He had settled in for a long term career looting systems we had thought were essentially sound and free of significant bugs. He was confident that he would never be caught. After piecing together the evidence I knew that he was right. There was nothing in the systems to stop him or to reveal what he had done, unless we happened to investigate him in detail.

Without the anonymous tip from someone he had double crossed he would certainly have got away with it. That forced me to realise that I had very little idea what was happening out in the wild, in the system as found.

The system as found is messy. People are distracted and working under pressure. What matters is the problems and the environment the people are dealing with, and the way they have to respond and adapt to make the system work in the mess.

There are three things you really shouldn’t say to IT auditors, in ascending facepalm order.

“But we thought audit would expect …”.

“But the requirements didn’t specify…”.

“But users should never do that”.

The last was the one that really riled me. Developers never know what users will do. They think they do, but they can’t know with any certainty. Developers don’t have the right mindset to think about what real users will do. Our (very unscientific and unevidenced) rule of thumb was as follows. 10% of people will never steal, regardless of the temptation. 10% will always try to steal, so systems must produce and retain the evidence to ensure they will be caught. The other 80% will be honest so long as we don’t put temptation in their way, so we have to explore the application to find the points of weakness that will tempt users.

Aside from what auditors see as their naivety regarding fraud and deliberate abuse of the system, developers, and even business analysts, don’t understand the everyday pressures users will be under when they are working with complex socio-technical systems. Nobody knows how these systems really work. That is nothing to be ashamed of. It is the reality, and we have to be honest about it.

One of the reasons I was attracted to working in audit and testing, and lost my enthusiasm for working in information security, was that these roles required me to think about what was really going on. How is this bafflingly complex organisation working? We can’t know for sure. It’s not a predictable, deterministic machine. All we can say confidently is that certain factors are more likely to produce good outcomes and others are more likely to give us bad outcomes.

If anyone does claim they do fully understand a complex socio-technical system then one of the following applies.

  • They’re bullshitting, which is all too common, and are happy to appear more confident than they have any right to be. Sadly it’s a good look in many organisations.
  • They’re deluded and genuinely have no idea of the true complexity.
  • They understand only part of the system – probably one of the less complex parts, and they’re ignoring the rest. In fairness, they might have made a conscious decision to focus only on the part that they can understand. However, other people might not appreciate the significance of that qualification, and no-one might spot that the self-professed expert has defined the problem in a way that is understandable but not realistic.
  • They did have a good understanding of the system once upon a time, when it was simpler, before it evolved into a complex beast.

It is widely believed that mapmakers in the Middle Ages would fill in the empty spaces with dragons. It’s not true, just a myth, but it is a neat analogy because the unknown is scary and potentially dangerous. That’s been picked up by people working with safety critical systems, specifically the resilience engineering community. They use phrases like “jousting with dragons” and “facing the dragons at the borderlands”.

Safety critical experts use this metaphor of dangerous dragons for reasons I have been outlining in this series. Safety critical systems are complex socio-technical systems. Nobody can specify how these systems will behave, or what people will have to do to keep them running, and running safely. The users will inevitably take these systems into unknown, and therefore dangerous, territory. That has huge implications for safety critical systems. I want to look at how the safety community has responded to the problem of trying to understand why systems can have bad outcomes when nobody can even know how the systems are supposed to behave. I will pick that up in later posts in this series.

In the next post I will talk about the mental models we use to try and understand failures and accidents, “part 4 – a brief history of accident models”.

The dragons of the unknown; part 2 – crucial features of complex systems

Introduction

This is the second post in a series about problems that fascinate me, that I think are important and interesting. The series draws on important work from the fields of safety critical systems and from the study of complexity, specifically complex socio-technical systems. This was the theme of my keynote at EuroSTAR in The Hague (November 12th-15th 2018).

The first post was a reflection, based on personal experience, on the corporate preference for building bureaucracy rather than dealing with complex reality, “The dragons of the unknown; part 1 – corporate bureaucracies”. This post is about the nature of complex systems and discusses some features that have significant implications for testing. We have been slow to recognise the implications of these features.

Complex systems are probabilistic (stochastic) not deterministic

A deterministic system will always produce the same output, starting from a given initial state and receiving the same input. Probabilistic, or stochastic, systems are inherently unpredictable and therefore non-deterministic. Stochastic is defined by the Oxford English Dictionary as “having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely.”
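
To make the distinction concrete, here is a minimal sketch in Python (my illustration, not anything from the systems described in these posts):

```python
import random

def deterministic(x):
    # Given the same input and initial state, the output never varies.
    return 2 * x + 1

def stochastic(x):
    # Outputs follow a distribution that can be analysed statistically,
    # but no individual call can be predicted precisely.
    return 2 * x + random.gauss(0, 1)

assert deterministic(5) == deterministic(5)          # always holds
print([round(stochastic(5), 2) for _ in range(3)])   # three different answers
```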

Traditionally, non-determinism meant a system was badly designed, inherently buggy, and untestable. Testers needed deterministic systems to do their job. It was therefore the job of designers to produce systems that were deterministic, and of testers to demonstrate whether or not the systems met that benchmark. Any non-determinism was treated as a bug to be removed.

Is that right, or is it nonsense? Well, neither; it depends on the context you choose, on what you choose to look at. You can restrict yourself to a context where determinism holds true, or you can expand your horizons. The traditional approach to determinism is correct, but only within carefully defined limits.

You can argue, quite correctly, that a computer program cannot have the properties of a true complex system. A program does what it’s coded to do: outputs can always be predicted from the inputs, provided you’re clever enough and you have enough time. For a single, simple program that is certainly true. A fearsomely complicated program might not be meaningfully deterministic, but we can respond constructively to that with careful design, and sensitivity to the needs of testing and maintenance. However, the wider we draw the context beyond individual programs, the weaker our confidence becomes that we can know what should happen.

Once you’re looking at complex socio-technical systems, i.e. systems where people interact with complex technology, any reasonable confidence that we can predict outcomes accurately evaporates. Here are the reasons.

Even if the system is theoretically still deterministic we don’t have brains the size of a planet, so for practical purposes the system becomes non-deterministic.

The safety critical systems community likes to talk about tractable and intractable systems. They know that the complex socio-technical systems they work with are intractable, which means that they can’t even describe with confidence how they are supposed to work (a problem I will return to). Does that rule out the possibility of offering a meaningful opinion about whether they are working as intended?

That has huge implications for testing artificial intelligence, autonomous vehicles and other complex technologies. Of course testers will have to offer the best information they can, but they shouldn’t pretend they can say these systems are working “as intended”. The danger is that we assume some artificial and unrealistic definition of “as intended” that fits the designers’ limited understanding of what the system will do. I will be returning to that. We don’t know what complex systems will do.

In a deeply complicated system things will change that we are unaware of. There will always be factors we don’t know about, or whose impact we can’t know about. Y2K changed the way I thought about systems. Experience had made us extremely humble and modest about what we knew, but there was a huge amount of stuff we didn’t even know we didn’t know. At the end of the lengthy, meticulous job of fixing and testing we thought we’d allowed for everything, in the high risk, date sensitive areas at least. We were amazed how many fresh problems we found when we got hold of a dedicated mainframe LPAR, effectively our own mainframe, and booted it up with future dates.

We discovered that there were vital elements (operating system utilities, old vendor tools etc) lurking in the underlying infrastructure that didn’t look like they could cause a problem but which interacted with application code in ways we could not have predicted when run with Y2K dates. The fixed systems had run satisfactorily with overrides to the system date in test environments that were built to mirror production, but they crashed when they ran on a mainframe running at future system dates. We were experts, but we hadn’t known what we didn’t know.
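
A toy illustration of the trap, in Python rather than the mainframe tools of the time, and with names invented for the sketch: the remediated application code honours an injected business date, while an untouched utility deep in the infrastructure reads the real system clock, so only a machine genuinely running at a future date exposes the interaction.

```python
import datetime

# Injected by a hypothetical test harness to simulate the millennium rollover.
BUSINESS_DATE_OVERRIDE = datetime.date(2000, 1, 3)

def application_date():
    # Remediated application code: honours the override when one is set,
    # so override-based tests exercise it at future dates.
    return BUSINESS_DATE_OVERRIDE or datetime.date.today()

def legacy_retention_days(created):
    # Untouched utility lurking in the infrastructure: it ignores the
    # override and reads the real clock, so it behaves correctly in an
    # override-based test and misbehaves only when the machine itself
    # is booted at a future system date.
    return (datetime.date.today() - created).days
```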

The behaviour of these vastly complicated systems was indistinguishable from complex, unpredictable systems. When a test passes with such a system there are strict limits to what we should say with confidence about the system.

As Michael Bolton tweeted;

“A ‘passing’ test doesn’t mean ‘no problem’. It means ‘no problem *observed*. This time. With these inputs. So far. On my machine’.”

So, even if you look at the system from a narrow technical perspective, the computerised system only, the argument that a good system has to be deterministic is weak. We’ve traditionally tested systems as if they were calculators, which should always produce the same answers from the same sequence of button presses. That is a limited perspective. When you factor in humans then the ideal of determinism disintegrates.

In any case there are severe limits to what we can say about the whole system from our testing of the components. A complex system behaves differently from the aggregation of its components; it is more than the sum. That brings us to an important feature of complex systems: their behaviour is emergent. I’ll discuss this in the next section.

My point here is that the system that matters is the wider system. In the case of safety critical systems, the whole, wider system decides whether people live or die.

Instead of thinking of systems as being deterministic, we have to accept that complex socio-technical systems are stochastic. Any conclusions we reach should reflect probability rather than certainty. We cannot know what will happen, just what is likely. We have to learn about the factors that are likely to tip the balance towards good outcomes, and those that are more likely to give us bad outcomes.

I can’t stress strongly enough that lack of determinism in socio-technical systems is not a flaw, it’s an intrinsic part of the systems. We must accept that and work with it. I must also stress that I am not dismissing the idea of determinism or of trying to learn as much as possible about the behaviour of individual programs and components. If we lose sight of what is happening within these it becomes even more confusing when we try to look at a bigger picture. Likewise, I am certainly not arguing against Test Driven Development, which is a valuable approach for coding. Cling to determinism whenever you can, but accept its limits – and abandon all hope that it will be available when you have to learn about the behaviour of complex socio-technical systems.

We have to deal with whole systems as well as components, and that brings me to the next point. It’s no good thinking about breaking the system down into its components and assuming we can learn all we need to by looking at them individually. Complex systems have emergent behaviour.

Complex systems are emergent; the whole is greater than the sum of the parts

It doesn’t make sense to talk of an H2O molecule being wet. Wetness is what you get from a whole load of them. The behaviour or the nature of the components in isolation doesn’t tell you about the behaviour or nature of the whole. However, the whole is entirely consistent with the elements. The H2O molecules are governed by the laws of the periodic table and that remains so regardless of whether they are combined. But once they are combined they become water, which is unquestionably wet and is governed by the laws of fluid dynamics. If you look at the behaviour of free surface water in the oceans under the influence of wind then you are dealing with a stochastic process. Individual waves are unpredictable, but reasonable predictions can be made about the behaviour of a long series of waves.

As you draw back and look at the wider picture, rather than the low-level components, you see them combining in ways that could not possibly have been predicted simply by examining the parts and trying to extrapolate.

Starlings offer another good illustration of emergence. These birds combine in huge flocks to form murmurations, amazing, constantly evolving aerial patterns that look as if a single brain is in control. The individual birds are aware of only seven others, rather than the whole murmuration. They concentrate on those neighbours and respond to their movements. Their behaviour isn’t any different from what they can do on their own. Yet however well you understood the individual starling and its behaviour, you could not possibly predict what these birds do together.
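
The mechanism lends itself to a simple simulation. The sketch below is my own illustration, loosely based on Craig Reynolds’ “boids” model; the rules and weights are invented for the example. Each bird follows only local rules about its nearest neighbours, yet running it produces coordinated flock movement that appears nowhere in the rules themselves.

```python
import math
import random

NEIGHBOURS = 7  # each bird tracks only this many flockmates

class Bird:
    def __init__(self):
        self.x, self.y = random.uniform(0, 100), random.uniform(0, 100)
        angle = random.uniform(0, 2 * math.pi)
        self.vx, self.vy = math.cos(angle), math.sin(angle)

def nearest(bird, flock, n=NEIGHBOURS):
    """The n closest flockmates -- the only birds this bird 'sees'."""
    others = [b for b in flock if b is not bird]
    return sorted(others, key=lambda b: (b.x - bird.x) ** 2 + (b.y - bird.y) ** 2)[:n]

def step(flock, alignment=0.05, cohesion=0.01):
    new_velocities = []
    for bird in flock:
        local = nearest(bird, flock)
        # Alignment: steer towards the average heading of the neighbours.
        avg_vx = sum(b.vx for b in local) / len(local)
        avg_vy = sum(b.vy for b in local) / len(local)
        # Cohesion: drift towards the centre of the local group.
        cx = sum(b.x for b in local) / len(local)
        cy = sum(b.y for b in local) / len(local)
        new_velocities.append((
            bird.vx + alignment * (avg_vx - bird.vx) + cohesion * (cx - bird.x),
            bird.vy + alignment * (avg_vy - bird.vy) + cohesion * (cy - bird.y),
        ))
    # Apply all updates at once so every bird reacts to the same snapshot.
    for bird, (vx, vy) in zip(flock, new_velocities):
        bird.vx, bird.vy = vx, vy
        bird.x += vx
        bird.y += vy

flock = [Bird() for _ in range(50)]
for _ in range(100):
    step(flock)
```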


Likewise with computer systems, even if all of the components are well understood and working as intended the behaviour of the whole is different from what you’d expect from simply looking at these components. This applies especially when humans are in the loop. Not only is the whole different from the sum of the parts, the whole system will evolve and adapt unpredictably as people find out what they have to do to make the system work, as they patch it and cover for problems and as they try to make it work better. This is more than a matter of changing code to enhance the system. It is about how people work with the system.

Safety is an emergent property of complex systems. The safety critical experts know that they cannot offer a meaningful opinion just by looking at the individual components. They have to look at how the whole system works.

In complex systems success & failure are not absolutes

Success & failure are not absolutes. A system might be flawed, even broken, but still valuable to someone. There is no right, simple answer to the question “Is it working? Are the figures correct?”

Appropriate answers might be “I don’t know. It depends. What do you mean by ‘working’? What is ‘correct’? Who is it supposed to be working for?”

The insurance finance systems I used to work on were notoriously difficult to understand and manipulate. 100% accuracy was never a serious, practicable goal. As I wrote in “Fix on failure – a failure to understand failure”;

“With complex financial applications an honest and constructive answer to the question ‘is the application correct?’ would be some variant on ‘what do you mean by correct?’, or ‘I don’t know. It depends’. It might be possible to say the application is definitely not correct if it is producing obvious garbage. But the real difficulty is distinguishing between the seriously inaccurate, but plausible, and the acceptably inaccurate that is good enough to be useful. Discussion of accuracy requires understanding of critical assumptions, acceptable margins of error, confidence levels, the nature and availability of oracles, and the business context of the application.”

I once had to lead a project to deliver a new sub-system that would be integrated into the main financial decision support system. There were two parallel projects, each tackling a different line of insurance. I would then be responsible for integrating the new sub-systems into the overall system, a big job in itself.

The other project manager wanted to do his job perfectly. I wanted to do whatever was necessary to build an acceptable system in time. I succeeded. The other guy delivered late and missed the implementation window. I had to carry on with the integration without his beautiful baby.

By the time the next window came around there were no developers available to make the changes needed to bring it all up to date. The same happened next time, and then the next time, and then… and eventually it was scrapped without ever going live.

If you compared the two sub-systems in isolation there was no question that the other man’s was far better than the one I lashed together. Mine was flawed but gave the business what they needed, when they needed it. The other was slightly more accurate but far more elegant, logical, efficient and lovingly crafted. And it was utterly useless. The whole decision support system was composed of sub-systems like mine, flawed, full of minor errors, needing constant nursing, but extremely valuable to the business. If we had chased perfection we would never have been able to deliver anything useful. Even if we had ever achieved perfection it would have been fleeting as the shifting sands of the operational systems that fed it introduced new problems.

The difficult lesson we had to learn was that flaws might have been uncomfortable but they were an inescapable feature of these systems. If they were to be available when the business needed them they had to run with all these minor flaws.

Richard Cook expanded on this point in his classic, and highly influential article from 1998 “How complex systems fail”. He put it succinctly.

“Complex systems run in degraded mode.”

Cook’s arguments ring true to those who have worked with complex systems, but the point hasn’t been widely appreciated in the circles of senior management where budgets, plans and priorities are set.

Complex systems are impossible to specify precisely

Cook’s 1998 paper is important, and I strongly recommend it, but it wasn’t quite ground-breaking. John Gall elaborated on the same themes back in 1975 in a slightly whimsical and comical book, “Systemantics: how systems work and especially how they fail”. Despite the jokey tone he made serious arguments about the nature of complex systems and the way that organisations deal, and fail to deal, with them. Here is a selection of his observations.

“Large systems usually operate in failure mode.”

“The behaviour of complex systems… living or non-living, is unpredictable.”

“People in systems do not do what the system says they are doing.”

“Failure to function as expected is an intrinsic feature of systems.”

John Gall wrote that fascinating and hugely entertaining book more than forty years ago. He nailed it when he discussed the problems we’d face with complex socio-technical systems. How can we say the system is working properly if we know neither how it is working nor how it is supposed to work? Or what the people are doing within the system?

The complex systems we have to deal with are usually socio-technical systems. They operate in a social setting, with humans. People make the systems work and they have to make decisions under pressure in order to keep the system running. Different people will do different things. Even the same person might act differently at different times. That makes the outcomes from such a system inherently unpredictable. How can we specify such a system? What does it even mean to talk of specifying an unpredictable system?

That’s something that the safety critical experts focus on. People die because software can trip up humans even when it is working smoothly as designed. This has received a lot of attention in medical circles. I’ll come back to that in a later post.

That is the reality of complex socio-technical systems. These systems are impossible to specify with complete accuracy or confidence, and certainly not at the start of any development. Again, this is not a bug, but an inescapable feature of complex socio-technical systems. Any failure may well be in our expectations, a flaw in our assumptions and knowledge, and not necessarily the system.

This reflected my experience with the insurance finance systems, especially for Y2K, and it was also something I had to think seriously about when I was an IT auditor. I will turn to that in my next post, “part 3 – I don’t know what’s going on”.

The dragons of the unknown; part 1 – corporate bureaucracies

Introduction

This is the first post in a series about problems that fascinate me, that I think are important and interesting. The series will draw on important work from the fields of safety critical systems and from the study of complexity, specifically complex socio-technical systems. I’m afraid I will probably dwell longer on problems than answers. One of the historical problems with software development and testing has been an eagerness to look for and accept easy, but wrong answers. We have been reluctant to face up to reality when we are dealing with complexity, which doesn’t offer simple or clear answers.

This was the theme of my keynote at EuroSTAR in The Hague (November 12th-15th 2018).

Complexity is intimidating and it’s tempting to pretend the world is simpler than it is. We’ve been too keen to try & reshape reality so that it looks like something we can manage neatly. That mindset dovetails with the pressures of corporate life, and it is possible to go far in large organisations while denying and evading reality. It is, however, bullshit.

A bit about my background

When I left university I went to work for one of the big, international accountancy firms as a trainee chartered accountant. It was a bewildering experience. I felt clueless. I didn’t understand what was going on, and I never did feel comfortable that I understood what we were doing. It wasn’t that I was dimmer than my colleagues; I was just the only one who seemed to question what was going on. Everyone else took it all at face value, but the work we were doing seemed to provide no value to anyone.

At best we were running through a set of rituals to earn a fee that paid our salaries. The client got a clean, signed off set of accounts, but I struggled to see what value the information we produced might have for anyone. None of the methods we used seemed designed to tell us anything useful about what our clients were doing. It all felt like a charade. I was being told to do things and I just couldn’t see how anything made sense. I may as well have been trying to manage the flamingo from Alice in Wonderland. That may seem a strange image, but it appeals to me; it sums up my confusion well. What on earth was the point of all these processes? They might as well have been flamingos for all the use they seemed. I hadn’t a clue.

I moved to a life assurance company, managing the foreign currency bank accounts. That entailed shuffling tens of millions of dollars around the world every day to maximise the overnight interest we earned. The job had a highly unattractive combination of stress and boredom. A single, simple mistake in my projections of the cash flowing through the accounts on one day would cost far more than my annual salary. The projections weren’t an arithmetical exercise. They required judgment and trade-offs of various factors. Getting it right produced a sense of relief rather than satisfaction.

The most interesting part of the job was using the computer systems to remotely manage the New York bank accounts (which was seriously cutting edge for the early 1980s) and discussing with the IT people how they worked. So I decided on a career switch into IT, a decision I never regretted, and arrived in the IT division of a major insurance company. I loved it. I think my business background got me more interesting roles in development, involving lots of analysis and design as well as coding.

After a few years I had the chance to move into computer audit, part of the group audit department. It was a marvellous learning experience, seeing how IT fitted into the wider business, and seeing all the problems of IT from a business perspective. That transformed my outlook and helped me navigate my way round corporate bureaucracies, but once I learned to see problems and irresponsible bullshit I couldn’t keep my mouth shut. I didn’t want to, and that’s because of my background, my upbringing, and training. I acquired the reputation for being an iconoclast, an awkward bastard. I couldn’t stand bullshit.

The rise of bullshit jobs

My ancestors had real jobs: tough, physical jobs as farmhands, stonemasons and domestic servants, until the 20th century when they managed to work their way up into better occupations, like shopkeeping, teaching and sales, interspersed with spells in the military during the two world wars. They were still real jobs, where it was obvious if you weren’t doing anything worthwhile, if you weren’t achieving anything.

I had a very orthodox Scottish Presbyterian upbringing. We were taught to revere books and education. We should work hard, learn, stand our ground when we know we are right and always argue our case. We should always respect those who have earned respect, regardless of where they are in society.

In the original Star Trek Scotty’s accent may have been dodgy, but that character was authentic. It was my father’s generation. As a Star Trek profile puts it; “rank means nothing to Scotty if you’re telling him how to do his job”.

A few years ago Better Software magazine introduced an article I wrote by saying I was never afraid to voice my opinion. I was rather embarrassed when I saw that. Am I really opinionated and argumentative? Well, probably (definitely, says my wife). When I think that I’m right I find it hard to shut up. Nobody does righteous certainty better than Scottish Presbyterians! In that, at least, we are world class, but I have to admit, it’s not always an attractive quality (and the addictive yearning for certainty becomes very dangerous when you are dealing with complex systems). However, that ingrained attitude, along with my experience in audit, did prepare me well to analyse and challenge dysfunctional corporate practices, to challenge bullshit and there has never been any shortage of that.

Why did corporations embrace harmful practices? A major factor was that they had become too big, complex and confusing for anyone to understand what was going on, never mind exercise effective control. The complexity of the corporation itself is difficult enough to cope with, but the problems it faces and the environment it operates in have also become more complex.

I’m not advocating some radical Year Zero destruction of corporate bureaucracy. Large organisations are so baffling and difficult to manage that without some form of bureaucracy nothing would happen. All would be confusion and chaos. But it is difficult to keep the bureaucracy under control and in proportion to the organisation’s real needs and purpose. There is an almost irresistible tendency for the bureaucracy to become the master rather than the servant.

Long and painful, if educational, experience has allowed me to distill the lessons I’ve learned into seven simple statements.

  • Modern corporations, the environment they’re operating in and the problems they face are too complex for anyone to control or understand.
  • Corporations have been taken over by managers and run for their own benefit, rather than customers, shareholders, the workforce or wider society.
  • Managers need an elaborate bureaucracy to maintain even a semblance of control, though it’s only the bureaucracy they control, not the underlying reality.
  • These managers struggle to understand the jobs of the people who do the productive work.
  • So the managers value protocol and compliance with the bureaucracy over technical expertise.
  • The purpose of the corporate bureaucracy therefore becomes the smooth running of the bureaucracy.
  • Hence the proliferation of jobs that provide no real value and exist only so that the corporate bureaucracy can create the illusion of working effectively.

I have written about this phenomenon in the blog series “Corporate bureaucracy and testing” and reflected, in “Testing: valuable or bullshit?”, on the specific threat to testing if it becomes a low skilled, low value corporate bullshit job.

The aspect of this problem that I want to focus on in this series is our desire to simplify complexity. We furnish simple explanations for complex problems. I did this as a child when I decided the wind was caused by trees waving their branches. My theory fitted what I observed, and it was certainly much easier for a five year old to understand than variations in atmospheric pressure. We also make convenient, but flawed, assumptions that turn a messy, confusing, complex problem into one that we are confident we can deal with. The danger is that in doing so we completely lose sight of the real problem while we focus on a construct of our own imagination. This is hardly a recent phenomenon.

The German military planners of World War One provide a striking example of this escape from reality. They fully appreciated what a modern, industrial war would be like, with huge armies and massively destructive armaments. The politicians didn’t get it, but according to Barbara Tuchman, in “The Guns of August”, the German military staff did understand. They just didn’t know how to respond. So they planned to win the sort of war they were already familiar with, a 19th century war.

(General Moltke, the German Chief of Staff,) said to the Kaiser in 1906, ‘It will be a long war that will not be settled by a decisive battle but by a long wearisome struggle with a country that will not be overcome until its whole national force is broken, and a war that will utterly exhaust our own people, even if we are victorious.’ It went against human nature, however – and the nature of General Staffs – to follow the logic of his own prophecy. Amorphous and without limits, the concept of a long war could not be scientifically planned for as could the orthodox, predictable and simple solution of decisive battle and short war. The younger Moltke was already Chief of Staff when he made his prophecy, but neither he nor his Staff, nor the Staff of any other country made any effort to plan for a long war.

The military planners yearned for a problem that allowed an “orthodox, predictable and simple solution”, so they redefined the problem to fit that longing. The results were predictably horrific.

There is a phrase for the mental construct the military planners chose to work with: an “envisioned world”. That paper, by David Woods, Paul Feltovich, Robert Hoffman, and Axel Roesler, is a fairly short and clear introduction to the dangers of approaching complex systems with a set of naively simplistic assumptions. Our natural, human bias towards over-simplification has various features. In each case the danger is that we opt for a simplified perspective, rather than a more realistic one.

  • We like to think of activities as a series of discrete steps that can be analysed individually, rather than as continuous processes that cannot meaningfully be broken down.
  • We prefer to see processes as separable and independent, rather than envisaging them all interacting with the wider world.
  • We are inclined to consider activities as if they were sequential when they actually happen simultaneously.
  • We instinctively want to assume homogeneity rather than heterogeneity, so we mentally class similar things as if they were exactly the same, losing sight of nuance and important distinctions; we assume regularity when the reality is irregular.
  • We look at elements as if there is only one perspective when there might be multiple viewpoints.
  • We like to assume any rules or principles are universal when they might really be local and conditional, relevant only to the current context.
  • We inspect the surface and shy away from deep analysis that might reveal awkward complications and subtleties.

These are all relevant considerations for testers, but there are three more that are all related and are particularly important when trying to learn how complex socio-technical systems work.

  • We look on problems as if they are static objects, when we should be thinking of them as dynamic, flowing processes. If we focus on the static then we lose sight of the problems or opportunities that might arise as the problem, or the application, changes over time or space.
  • We treat problems as if they are simple and mechanical, rather than organic with unpredictable, emergent properties. The implicit assumption is that we can know how whole systems will behave simply by looking at the behaviour of the components.
  • We pretend that the systems are subject to linear causes and effects, with the same cause always producing the same effect. The possibility of tipping points and cascading effects is ignored.

Complex socio-technical systems are not static, simple or linear. Testers have to recognise that and frame their testing to take account of the reality: these systems are dynamic, organic and non-linear. If they don’t, and if they try to restrict themselves to the parts of the system that can be treated as mechanical rather than truly complex, the great danger is that testing will become just another pointless, bureaucratic job producing nothing of any real value.

I have worked both as an external auditor and an internal auditor. Internal audit has a focus and a mindset that allows it to deliver great value, when it is done well. External audit has been plagued by a flawed business model that is struggling with the complexity of modern corporations and their accounts. The external audit model requires masses of inexperienced, relatively lowly paid staff, carrying out unskilled checking of the accounts and producing output of dubious value. The result can fairly be described as a crisis of relevance for external audit.

I don’t want to see testing suffer the same fate, but that is likely if we try to define the job as one that can be carried out by large squads of poorly skilled testers. We can’t afford to act as if the job is easy. That is the road to irrelevance. In order to remain relevant we must try to meet the real needs of those who employ us. That requires us to deal with the world as it is, not as we would like it to be.

My spell in IT audit forced me to think seriously about all these issues for the first time. The audit department in which I worked was very professional and enlightened, with some very good, very bright people. We carried out valuable, risk-based auditing when that was at the leading edge of internal audit practice. Many organisations have still not caught up and are mired in low-value, low-skilled compliance checking. That style of auditing falls squarely into the category of pointless, bullshit jobs. It is performing a ritual for the sake of appearances.

My spell as an auditor transformed my outlook. I had to look at, and understand the bigger picture, how the various business critical applications fitted together, and what the implications were of changing them. We had to confront bullshitters and “challenge the intellectual inadequates”, as the Group Chief Auditor put it. We weren’t just allowed to challenge bullshit; it was our duty. Our organisational independence meant that nobody could pull rank on us, or go over our heads.

I never had a good understanding of what the company was doing with IT till I moved into audit. The company paid me enough money to enjoy a good lifestyle while I played with fun technology. As an auditor I had to think seriously about how IT kept the company competitive and profitable. I had to understand how everything fitted together, understand the risks we faced and the controls we needed.

I could no longer just say “well, shit happens”. I had to think “what sort of shit?”, “how bad is it?”, “what shit can we live with?”, “what shit have we really, really got to avoid?”, “what are the knock-on implications?”, “can we recover from it?”, “how do we recover?”, “what does ‘happen’ mean anyway?”, “who does it happen to?”, “where does it happen?”.

Everything that mattered fitted together. If it was stand alone, then it almost certainly didn’t matter and we had more important stuff to worry about. The more I learned the more humble I became about the limits of my knowledge. It gradually dawned on me how few people had a good overall understanding of how the company worked, and this lesson was hammered home when we reached Y2K.

When I was drafted onto the Y2K programme as a test manager I looked at the plans drawn up by the Y2K architects for my area, which included the complex finance systems on which I had been working. The plans were a hopelessly misleading over-simplification. There were only three broad systems defined, covering 1,175 modules. I explained that it was nonsense, but I couldn’t say for sure what the right answer was, just that it was a lot more.

I wrote SAS programs to crawl through the production libraries, schedules, datasets and access control records to establish all the links and outputs. I drew up an overview that identified 20 separate interfacing applications with 3,000 modules. That was a shock to management because it had already been accepted that there would not be enough time to test the lower number thoroughly.
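
The crawl itself was conceptually simple, even if the mainframe plumbing was not. Here is a hypothetical sketch of the idea in Python (the real work was done in SAS against production libraries, schedules and access control records; the job names below are invented):

```python
# A hypothetical sketch: infer module-to-module dependencies by linking
# each dataset's producers to its consumers.
from collections import defaultdict

def build_dependency_graph(jobs):
    """jobs: iterable of (module, inputs, outputs), where inputs and
    outputs are the datasets each module reads and writes."""
    producers = defaultdict(set)
    for module, _, outputs in jobs:
        for dataset in outputs:
            producers[dataset].add(module)
    edges = set()
    for module, inputs, _ in jobs:
        for dataset in inputs:
            for producer in producers[dataset]:
                # One module feeds another whenever its output is read.
                edges.add((producer, module))
    return edges

# Two toy jobs: CLAIMS01 writes a dataset that REPORT07 reads.
print(build_dependency_graph([
    ("CLAIMS01", [], ["CLAIMS.MASTER"]),
    ("REPORT07", ["CLAIMS.MASTER"], ["REPORTS.MONTHLY"]),
]))  # -> {('CLAIMS01', 'REPORT07')}
```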

My employers realised I was the only available person who had any idea of the complexity of both the technical and business issues. They put me in charge of the development team as well as the testers. That was an unusual outcome for a test manager identifying a fundamental problem. I might not have considered myself an expert, but I had proved my value by demonstrating how much we didn’t know. That awareness was crucial.

That Y2K programme might have been 20 years ago, but it was painfully clear at the time that we had totally lost sight of the complexity of these finance applications. I was able to provide realistic advice only because of my deep expertise, and thereafter I was always uncomfortably aware that I never again had the time to acquire such deep knowledge.

These applications, for all their complexity, were at least rigidly bounded. We might not have known what was going on within them, but we knew where the limits lay. They were all internal to the corporation, with a tightly secured perimeter. That is a different world from today. The level of complexity has increased vastly. Web applications are built on layers of abstraction that render the infrastructure largely opaque. These applications aren’t even notionally under the control of organisations in the way that our complex insurance applications were. That makes their behaviour impossible to control precisely, and even to predict, as I will discuss in my next post, “part 2 – crucial features of complex systems”.

Has opposition to ISO 29119 really died down?

One of my concerns about the Stop 29119 campaign, ever since it was launched four years ago, was that ISO would try to win the debate by default, by ignoring the opposition. In my CAST 2014 talk, which kicked off the campaign, I talked about ISO’s attempt to define its opponents as being irrelevant. ISO breached its own rules requiring consensus from the profession, and in order to justify doing so they had to maintain a pretence that testers who opposed their efforts were a troublesome, old-fashioned faction that should be ignored.

That’s exactly what has happened. ISO have kept their collective heads down, tried to ride out the storm and emerged to claim that it was all a lot of fuss about nothing; the few malcontents have given up and gone away.

I have just come across a comment in the “talk” section of the Wikipedia article on ISO 29119, arguing for some warning flags on the article to be removed.

“…finally, the objection to this standard was (a) from a small but vocal group and (b) died down – the ballots of member National Bodies were unanimous in favour of publication. Furthermore, the same group objected to IEEE 829 as well.”

The opposition is significantly more than “a small but vocal group”, but I won’t dwell on that point. My concern here is point b. Have the objections died down? Yes, they have in the sense that the opponents of ISO 29119 have been less vocal. There have been fewer talks and articles pointing out the flaws in the principle and the detail of the standard.

However, there has been no change in the beliefs of the opposition. There comes a point when it feels unnecessary, even pointless, to keep repeating the same arguments without the other side engaging. You can’t have a one-sided debate. The Stop 29119 campaigners have other things to do. Frankly, attacking ISO 29119 is a dreary activity compared with most of the alternatives. I would prefer to do something interesting and positive rather than launching another negative attack on a flawed standard. However, needs must.

The argument that “ballots of member National Bodies were unanimous in favour of publication” may be true, but it is a circular point. The opponents of ISO 29119 argued convincingly that software testing is not an activity that lends itself to ISO style standardisation and that ISO failed to gain any consensus outside its own ranks. The fact that ISO are quite happy with that arrangement is hardly a convincing refutation of our argument.

The point about our opposition to the IEEE 829 standard is also true, but it’s irrelevant. Even ISO thought that standard was dated and inadequate for modern purposes. It decided to replace it rather than try to keep updating it. Unfortunately the creators of ISO 29119 repeated the fundamental mistakes that rendered IEEE 829 flawed and unsuitable for good testing.

I was pleased to discover that the author of the Wikipedia comment was on the ISO working group that developed ISO 29119 and wrote a blog defending the standard, or rather dismissing the opposition. It was written four years ago in the immediate aftermath of the launch of Stop 29119. It’s a pity it didn’t receive more attention at the time. The debate was far too one-sided and we badly needed contributions from ISO 29119’s supporters. So, in order to provide a small demonstration that opposition to the standard is continuing, I shall offer a belated response. I’ll quote Andrew’s arguments section by section and then respond.

“As a member of the UK Mirror Panel to WG26, which is responsible for the ISO 29119 standard, I am disappointed to read of the objection to the standard led by the International Society for Software Testing, which has resulted in a formal petition to ISO.

I respectfully suggest that their objections would be more effective if they engaged with their respective national bodies, and sought to overcome their objections, constructively.

People who are opposing ISO 29119 claim:

  1. It is costly.
  2. It will be seen as mandatory skill for testers (which may harm individuality and freedom).
  3. It may reduce the ability to experiment and try non-conventional ways.
  4. Once the standard is accepted, testers can be held responsible for project failures (or non-compliance).
  5. Effort will be more on documentation and process rather than testing.
    Let us consider each of these in turn.”

The International Society for Software Testing (ISST) launched the petition against ISO 29119, but this was merely one aspect of the campaign against the standard. Opposition was certainly not confined to ISST. The situation is somewhat confused by the fact that ISST disbanded in 2017. One of the prime reasons was that the “objectives set out by the founders have been met, or are in the capable hands of organisations that we support”. The main organisation referred to here is the larger and more established Association for Software Testing (AST), which can put more resources into the fight. I always felt the main differences between ISST and AST were in style and approach rather than principles and objectives.

The suggestion that the opponents of ISO 29119 should have worked through national ISO bodies is completely unrealistic. ISO’s approach is fundamentally wrong and opponents would have been seen as a wrecking crew preventing any progress. I know of a couple of people who did try to involve themselves in ISO groups and gave up in frustration. The debate within ISO about a standard like 29119 concerns the detail, not the underlying approach. In any case the commitment required to join an ISO working group is massive. Meetings are held all over the world. They take up a lot of time and require huge expenses for travel and accommodation. That completely excludes independent consultants like myself.

“Costly

Opponents object to this standard because it is not freely available.

While this is a fair point, it is no different from every other standard that is in place – and which companies follow, often because it gives them a competitive advantage.

Personally, I would like to see more standards placed freely in the public domain, but I am not in a position to do it!”

The cost of the standard is a minor criticism. As a member of the AST’s Committee on Standards and Professional Practice I am fortunate to have access to the documents comprising the standard. These cost hundreds of dollars and I would baulk at buying them for myself. The full set would cost as much as a family holiday. I know which would be more important!

However, the cost does hamper informed debate about the content, and that was the true concern. The real damage of a poorly conceived standard will be poorer quality testing and that will be far more costly than the initial cost of the documents.

“Mandatory

Opponents claim this standard will be seen as a mandatory skill for testers (which may harm individuality and freedom).

ISO 29119 replaces a number of IEEE and British standards that have been in place for many years. And while those standards are seen to represent best practice, they have not been mandatory.”

I have two big issues with this counter argument. Firstly, the standards that ISO 29119 replaced were emphatically not “seen to represent best practice”. If they were best practice there would have been no need to replace them. They were hopelessly out of date but IEEE 829 was unhelpful, even damaging, when it was new.

My second concern is about the way that people respond to a standard. Back in 2009 I wrote this article “Do standards keep testers in the kindergarten?” in Testing Experience magazine arguing against the principle of testing standards, the idea of best practice and the inevitable danger of an unhelpful and dated standard like IEEE 829 being imposed on unwilling testers.

Once you’ve called a set of procedures a standard the argument is over in many organisations; testers are required to use them. It is disingenuous to say that standards are not mandatory. They are sold on the basis that they offer reassurance and that the wise, safe option is to make them compulsory.

I made this argument nine years ago thinking the case against standards had been won. I was dismayed to discover subsequently that ISO was trying to take us back several decades with ISO 29119.

“Experimentation

A formal testing environment should be a place where processes and procedures are in place, and is not one where ‘experiment and non-conventional’ methods are put in place. But having said that, there is nothing within ISO 29199 that prevents other methods being used.”

There may be a problem over the word “experiment” here. Andrew seems to think that testers who talk of experimentation are admitting they don’t know what they’re doing and are making it up as they go along. That would be an unfortunate interpretation. When testers from the Context Driven School refer to experimentation they mean the act of testing itself.

Good testing is a form of exploration and experimentation to find out how the product behaves. Michael Bolton describes that well here. A prescriptive standard that focuses on documentation distracts from, and effectively discourages, such exploring and experimentation. We have argued that at length and convincingly. It would be easier to analyse Andrew’s case if he had provided links to arguments from opponents who had advocated a form of experimentation he disapproves of.

“Accountability

Opponents claim that, once the standard is accepted, testers can be held responsible for project failures (or non-compliance).

As with any process or procedure, all staff are required to ensure compliance with the company manual – and project managers should be managing their projects to ensure that all staff are doing so.

Whether complying with ISO 29119 or any other standard or process, completion of testing and signing off as ‘passed’ carries accountability. This standard does not change that.”

This is a distortion of the opponents’ case. We do believe in accountability, but that has to be meaningful. Accountability must be based on something to which we can reasonably sign up. We strongly oppose attempts to enforce accountability to an irrelevant, poorly conceived and damaging standard. Complying with such a standard is orthogonal to good testing; there is no correlation between the two activities.

At best ISO 29119 would be an irrelevance. In reality it is more likely to be a hugely damaging distraction. If a company imposed a standard requiring all testers to wear laboratory technicians’ white coats, it might look impressively professional, but complying with that standard would tell us nothing about the quality of the testing.

As a former auditor I have strong, well-informed views about accountability. One of ISO 29119’s serious flaws is that it fails to explain why we test. We need such clarity before we can have any meaningful discussion about compliance. I discussed this here, in “Do we want to be ‘compliant’ or valuable?”

The standard defines in great detail the process and the documents for testing, but fails to clarify the purpose of testing, the outcomes that stakeholders expect. To put it bluntly, ISO 29119 is vague about the ends towards which we are working, but tries to be precise about the means of getting there. That is an absurd combination.

ISO 29119 tries to set out rules without principles. Understanding the distinction between rules and principles is fundamental to the process of crafting professional standards that can hold practitioners meaningfully to account. I made this argument in the Fall 2015 edition of Better Software magazine. The article is also available on my blog, “Why ISO 29119 is a flawed quality standard”.

This confusion of rules and principles, means and ends, has led to an obsessive focus on delivering documentation rather than valuable information to stakeholders. That takes us on to Andrew’s next argument.

“Documentation

Opponents claim that effort will be more on documentation and process rather than testing.

I fail to understand this line of reasoning – any formal test regime requires a test specification, test cases and recorded test results. And the evidence produced by those results need argument. None of this is possible without documentation.”

Opponents of ISO 29119 have argued repeatedly and convincingly that a prescriptive standard which concentrates on documentation will inevitably lead to goal displacement; testers will concentrate on the documentation mandated by the standard and lose sight of why they are testing. That was our experience with IEEE 829. ISO 29119 repeats the same mistake.

Andrew’s second paragraph offers no refutation of the opponents’ argument. He apparently believes that we are opposed to documentation per se. That’s a straw man. Andrew justifies ISO 29119’s demand for documentation, which I believe is onerous and inappropriate, by asserting that it serves as evidence. Opponents argue that the standard places far too much emphasis on advance documentation and neglects evidence of what was discovered by the testing.

The statement that any formal test regime requires a test specification and test cases is highly contentious. Auditors would expect to see evidence of planning, but test specifications and test cases are just one way of doing it, the way that ISO 29119 advocates. In any case, advance planning is not evidence that good testing was performed any more than a neat project plan provides evidence that the project ran to time.

As for the results, the section of ISO 29119 covering test completion reports is woefully inadequate. It would be possible to produce a report that complied fully with the standard and offered nothing of value. That sums up the problems with ISO 29119. Testers can comply while doing bad testing. That is in stark contrast to the standards governing more established professions, such as accountants and auditors.

“Conclusion

Someone wise once said:

  1. Argument without Evidence is unfounded.
  2. Evidence without Argument is unexplained.

Having considered the argument put forward, and the evidence to support the case:

  • The evidence is circumstantial with no coherence.
  • The argument is weak, and seems only to support their vested interests.

For a body that represents test engineers, I would have expected better.”

The quote that Andrew uses from “someone wise” actually comes from the field of safety critical systems. There is much that we in conventional software testing can learn from that field. Perhaps the most important lessons are about realism and humility. We must deal with the world as it is, not as we would like it to be. We must accept the limitations of our knowledge and what we can realistically know.

The proponents of ISO 29119 are too confident in their ability to manage in a complex, evolving field using techniques rooted in the 1970s. Their whole approach tempts testers to look for confirmation of what they think they know already, rather than explore the unknown and explain what they cannot know.

Andrew’s verdict on the opposition to ISO 29119 should be turned around and directed at ISO and the standard itself. It was developed and launched in the absence of evidence that it would help testers to do a better job. The standard may have internal consistency, but it is incoherent when confronted with the complexities of the real world.

Testers who are forced to use it have to contort their testing to fit the process. Any good work they do is in spite of the standard and not because of it. It might provide a welcome route map to novice testers, but it offers a dangerous illusion. The standard tells them how to package their work so it appears plausible to those who don’t know any better. It defines testing in a way that makes it appear easier than it really is. But testing is not meant to be easy. It must be valuable. If you want to learn how to provide value to those who pay you then you need to look elsewhere.

Finally, I should acknowledge that some of the work I have cited was not available to Andrew when he wrote his blog in 2014. However, all of the underlying arguments and research that opponents of 29119 have drawn on were available long before then. ISO simply did not want to go looking for them. Our arguments about ISO 29119 being anti-competitive were at the heart of the Stop 29119 campaign. Andrew has not addressed those arguments.

If ISO wants to be taken seriously it must justify ISO 29119’s status as a standard. Principled, evidenced and coherent objections deserve a response. In a debate all sides have a duty to respond to their opponents’ strongest case, rather than evading difficult objections, setting up straw men and selectively choosing the arguments that are easiest to deal with.

ISO must provide coherent arguments and evidence that 29119 is effective. Simply labelling a set of prescriptive processes as a standard and expecting the industry to respect it for that reason will not do. That is the sign of a vested interest seeking to impose itself on a whole profession. No, the opposition has not died down; it has simply not had anything credible to oppose.

Risk mitigation versus optimism – Brexit & Y2K

The continuing Brexit shambles reminds me of a row in the run-up to Y2K at the large insurer where I was working for IBM. Should a business critical back office system on which the company accounts depended be replaced, or made Y2K compliant? I was one of the few people with a deep understanding of both the business and the technology, and I had extensive experience of explaining complex problems in a way that allowed senior management to take action, so I was brought in to review the problem and make a recommendation.

One camp insisted that, as an insurer, the company had to manage risk, so making the old system Y2K compliant and replacing it at a more leisurely pace was the only responsible option. The opposing camp consisted of business managers who had been assigned responsibility for managing new programmes. They would be responsible for a replacement, and they insisted they could deliver a new system on time, even though they had no experience of delivering such an application. My investigation showed me they had no grasp of the mix of business and technical complexities, but they firmly believed that waterfall projects could be forced through successfully by charismatic management. All the previous failures were put down to “weak management” and “bad luck”. I had seen their style of project management, which entailed bringing everyone together for massed weekly assemblies and shouting at the cynical, disbelieving developers about the need to “keep knocking down those milestones”. To them, making the old system compliant would have been an insult to their competence.

My report set out the relative risks and costs of the options. I sold Y2K compliance to the senior manager in charge of the UK Accounts Department, sketching out the implications of the various options on a flipchart in a 30-minute chat, so I had agreement before I’d even finished the report. The charismatic crew were furious, but silenced. The old system was Y2K compliant in time. The proposed new one could not have been delivered when it was needed; it would have been sunk by problems with upstream dependencies that I was aware of but which the charismatics refused to acknowledge as relevant.
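As an aside, the post says nothing about how the remediation was actually done, and I am not claiming this is how that system was fixed. But for readers who don’t remember Y2K, a minimal sketch of “date windowing”, one common remediation technique of the era, gives a flavour of the work. The pivot value here is an assumption purely for illustration.

    PIVOT = 30  # assumed pivot: two-digit years below 30 are read as 20xx

    def expand_year(yy: int) -> int:
        # Map a stored two-digit year onto a four-digit year using a
        # fixed window, so legacy records sort and compare correctly.
        if not 0 <= yy <= 99:
            raise ValueError("expected a two-digit year")
        return (2000 if yy < PIVOT else 1900) + yy

    assert expand_year(99) == 1999  # legacy data stays in the twentieth century
    assert expand_year(5) == 2005   # post-2000 data lands in the right century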

If the charismatics’ solution had been chosen, the company would have lost the use of a business critical application in late 1999. No contingency arrangements would have been possible, and the company would have been unable to produce credible reserves, vital for an insurance company’s accounts. The external auditors would have been unable to pass the accounts, the share price would have collapsed, and the company would have been sunk. I’m sure the charismatics would have blamed bad luck, and other people: “It was those dependencies, not us. We were let down.” That was a large public limited company. If my advice had been rejected, the people who wanted the old system made Y2K compliant would have brought in the internal auditors, who in turn would have escalated their concern to the board’s audit committee if necessary. If there had still been no action they would have taken the matter to the external auditors.

That’s how things should work in a big corporation. Of course they often don’t, and the auditors can lose their nerve, or choose to hope that things will work out well. But there is at least a mechanism that can be followed if people decide to do their jobs responsibly. With Brexit there is a cavalier unwillingness to think about risk and complexity that is reminiscent of those irresponsibly optimistic managers. We are supposed to trust politicians who can offer us nothing more impressive than “trust me” and “it’s their fault”, and who offer no clear contingency arrangements if their cheery optimism proves unfounded. There is a mechanism to hold them to account: that is the responsibility of Parliament. Will the House of Commons step up to the job? We’ll see.

Do we want to be “compliant” or valuable?

Periodically I am asked to write for a magazine or blog, and more often than not I agree. Two years ago I was asked by an online magazine to write about my opposition to the ISO 29119 testing standard. I agreed, but they didn’t use my article. I’ve just come across it and decided to post it on my blog. A warning! There’s nothing new here, but the arguments are still strong and relevant, and ISO have neither countered them nor attempted to do so. Clearly they hope to win the debate by default, lying low in the hope that opponents will give up and be forgotten.

In August 2014 I gave a talk in New York at CAST, the conference of the Association for Software Testing. I criticised software testing standards, and the new ISO 29119 in particular.

I thought I would talk for about 40 minutes, we’d have an interesting discussion, and that might be the end of it. Well, that happened, but it was only the start. My talk caught the mood of concern about the standard amongst context driven testers, and so the Stop 29119 campaign kicked off.

Life hasn’t been quite the same since I acquired a reputation for being anti-standards. My opposition to ISO 29119 has defined my public image. I can’t complain, but I’m slightly uncomfortable with that. I’d rather be seen as positive than negative.

I want to make it clear that I do approve of standards; not all standards, but those that have been well designed for their particular context. Good standards pool collective wisdom and ensure that everyone has the same understanding of what engineering products and activities should do. They make the economy work better by providing information and confidence to consumers, and by protecting responsible companies from unscrupulous competitors. Standards also increase professional discipline and responsibility, and this is where the International Organization for Standardization has gone wrong with ISO 29119.

The standard defines in great detail the process and the documents for testing, but fails to clarify the purpose of testing, the outcomes that stakeholders expect. To put it bluntly, ISO 29119 is vague about the ends towards which we are working, but tries to be precise about the means of getting there. That is an absurd combination. Obviously stakeholders hope for good news from testers, but what they really need is the unvarnished truth, however brutal that might be.

Remember, it’s not the job of testers to test quality into the product. It’s our job to shine a light on what is there so that the right people can take decisions about what to do next. The outcome of testing isn’t necessarily a high-quality product; there may be valid reasons for releasing a product that looks buggy to us, or it might even make sense to scrap the development. I once saw an 80 person-year project scrapped after testing. It’s not our call. The point is that the outcome of our testing must be the best information we can provide to stakeholders. ISO 29119 makes no mention of that. Instead it focuses in minute detail on the process and documentation.

Strict copyright protection means I can’t share the content of ISO 29119, but I can say that the sample Test Completion Reports in the standard epitomise what is wrong. They summarise the testing process with a collection of metrics that say nothing about the quality of the product. A persistent danger of standards and templates is that people simply copy the examples and fill in the templates without thinking deeply enough about what is needed on their project. It would be simple to comply with the ISO 29119 Test Completion Process and produce a report that provided no worthwhile information at all.
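To illustrate the point without quoting the standard, here is a hypothetical sketch of my own in Python: a test completion “report” assembled purely from process metrics, in the spirit of what a documentation-heavy standard asks for. Every name and number is invented for illustration; nothing is taken from ISO 29119 itself.

    from dataclasses import dataclass

    @dataclass
    class TestCompletionReport:
        # Process metrics only; note the absence of anything about product
        # risks, open questions, or what the testers actually learned.
        planned_cases: int
        executed_cases: int
        passed: int
        failed: int

        def summary(self) -> str:
            coverage = self.executed_cases / self.planned_cases * 100
            pass_rate = self.passed / self.executed_cases * 100
            return (f"Executed {self.executed_cases} of {self.planned_cases} "
                    f"planned cases ({coverage:.0f}%); pass rate {pass_rate:.0f}%.")

    # 100% execution and a 99% pass rate look reassuring, yet every
    # important product risk could remain untested and unreported.
    report = TestCompletionReport(planned_cases=200, executed_cases=200,
                                  passed=198, failed=2)
    print(report.summary())

A report like this can be produced for any project, good or bad; it measures the process, not the product.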

The Institute of Internal Auditors offers a worthwhile alternative approach with their mandatory performance standards, which in striking contrast to ISO 29119 are available to the public for scrutiny and discussion. The section covering audit reports says nothing about the process of reporting, or what an audit report should look like. But it stipulates brief, clear and very demanding requirements about the quality of the information in the report.

The difference between ISO 29119 and internal audit standards is that you can’t produce a worthless audit report that complies with the IIA’s standards. The outcome of the audit has to be useful information. Why couldn’t testing standards focus on such a simple outcome? Do we testers want to be zombies, blindly complying with a standard and failing to think about what our stakeholders need? Or do we want to offer a valuable service?

Precertification of low risk digital products by FDA

Occasionally I am asked why I use Twitter. “Why do you need to know what people have had for breakfast? Why get involved with all those crazies?” I always answer that it’s easy to avoid the bores and trolls (all the easier if one is a straight, white male, I suspect) and that Twitter is a fantastic way of keeping in touch with interesting people, ideas and developments.

A good recent example was this short series of tweets from Griffin Jones. This was the first I’d heard of the precertification program proposed by the Food and Drug Administration (FDA), the USA’s federal body regulating food, drugs and medical devices.

Griffin is worried that IT certification providers will rush to sell their services. My first reaction was to agree, but on consideration I’m cautiously more optimistic.

Precertification would be for organisations, not individuals; the certification controversy in software testing relates to certifying individuals, through ISTQB. The FDA scheme is aimed at organisations, which would need “an existing track record in developing, testing, and maintaining software products demonstrating a culture of quality and organizational excellence measured and tracked by Key Performance Indicators (KPIs) or other similar measures.” That quote is from the notification for the pilot program of the precertification scheme, so the same criteria won’t necessarily apply to the final scheme. However, the FDA’s own track record of highly demanding standards (no, not like ISO 29119), applied with pragmatism, provides grounds for optimism.

Sellers of CMMI and TMMi consultancy might hope this would give them a boost, but I’ve not heard much about those models in recent years. It could be a tough sell for consultancies to push them at the FDA when it wants to adopt more lightweight governance for products that pose relatively low risk to consumers.

The FDA action plan (PDF, opens in new tab) that announced the precertification program did contain a word that jumped out at me. The FDA will precertify companies “who demonstrate a culture of quality and organizational excellence based on objective criteria”.

“Objective” might provide an angle for ISO 29119 proponents to exploit. A standard can provide an apparently objective basis for reviewing testing. If you don’t understand testing you can check for compliance with the standard. In a sense that is objective; checkers are not bringing their own subjective opinions to the exercise. Or are they? The check is based on the assumption that the standard is relevant and that the exercise is useful. In the absence of any evidence of efficacy, and there is no such evidence for ISO 29119, using the standard as the benchmark is itself a subjective choice. It is used because it makes the job easier; it facilitates checking for compliance, but it has nothing to do with good testing.
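To see how narrow that kind of objectivity is, here is another hypothetical sketch of my own; the checklist and document names are invented, not drawn from ISO 29119.

    # A "compliance check" reduced to its essentials: verify that every
    # mandated document exists, regardless of what any of them contain.
    REQUIRED_DOCUMENTS = {
        "test plan",
        "test specification",
        "test cases",
        "test completion report",
    }

    def is_compliant(delivered_documents):
        # Objective in the narrow sense that any two checkers agree.
        return REQUIRED_DOCUMENTS <= set(delivered_documents)

    # Four empty templates would pass; excellent exploratory testing
    # recorded in a different form would fail.
    print(is_compliant(["test plan", "test specification",
                        "test cases", "test completion report"]))  # True

The check is repeatable and impartial, but the subjectivity hides in the choice of checklist.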

“Objective” should mean something different, and more constructive, to the FDA. They expect evidence of testing to be sufficient in quality and quantity that third parties reviewing it, without interpretation by the testers, would have to come to the same conclusion. Check out Griffin Jones’ talk about evidence on YouTube.


Incidentally, the FDA’s requirements are strikingly similar to the professional standards of the Institute of Internal Auditors (IIA). In order to form an audit opinion, auditors must gather sufficient information that is “factual, adequate, and convincing so that a prudent, informed person would reach the same conclusions as the auditor.” The IIA also has an interesting warning in its Global Technology Audit Guide, “Management of IT Auditing”. It warns IT auditors of the pitfalls of auditing against standards or benchmarks that might be dated or useless, chosen just because the auditors want something to “audit against”.

So will ISO, or some large consultancies, try to influence the FDA to endorse ISO 29119 on the grounds that it would provide an objective benchmark against which to assess testing? That wouldn’t surprise me at all. What would surprise me is if the FDA bought into it. I like to think they are too smart for that. I am concerned that some day external political pressure might force adoption of ISO 29119. There was a hint of that in the fallout from the problems with the US’s Healthcare.gov website. Politicians who are keen to see action, any action, in a field they don’t understand always worry me. That’s another subject, however, and I hope it stays that way.