Quantum computing: a whole new field of bewilderment

At primary school in London I was taught about different number bases, concentrating on binary arithmetic. This was part of an attempt in the 1960s to bring mathematics up to date. The introduction of binary to the curriculum was prompted by the realisation that computers would become hugely more significant. It was fascinating to realise that there was nothing inevitable or sacrosanct about using base 10 and I enjoyed it all.

When I started working in IT I quickly picked up the technical side. I understood binary, and with a bit of work and a clear head everything made sense. It was mostly based on Boolean logic. Everything was true or false, on or off, pass or fail, 1 or 0.

Now this simple state of affairs applied only to the technology. When you started to deal with humans, with organisations, with messy social reality it was important to step back from a simple binary worldview. You can’t develop worthwhile applications, or test them, if you assume that the job simply requires stepping through a process that is akin to a computer program, with every decision being a straightforward binary, yes/no, pass/fail. I’ve talked about that at length elsewhere, e.g. in “binary opinions – yes or no?“.

Recently I’ve been reading about quantum computing. I’d vaguely known about the subject before. I knew it was a radically different approach to computing, but I hadn’t thought through just how radical the differences are, or the implications for testing. It’s not just testing of course. When a completely new form of computing comes on the scene, one that leaves all our expectations, assumptions and beliefs in tatters, there will be huge ramifications throughout IT. Traditional computers won’t vanish; there will still be plenty of jobs working with them. People will continue to specialise in specific areas of computing. The problem for testers will be that quantum computers are going to be used for new applications, to do things that were previously impossible, and traditional testing techniques won’t necessarily be readily transferable.

What’s so different about quantum computing?

I’m not going to try to provide an introduction to quantum computers. If you think you understand them and you haven’t done some serious study of quantum physics then you’re kidding yourself. As Richard Feynman said:

“If you think you understand quantum physics, you don’t understand quantum physics.”

If a primary school introduction to binary arithmetic allowed me to get to grips with traditional, digital computers, then the equivalent for quantum computers would be at least an undergraduate degree in physics. Here is a good overview.

I’m just going to run through three features of quantum computers that make my head spin, that have persuaded me that everything I knew about computers will be useless and I will be as well prepared as a peasant trying to get to grips with Excel after stumbling through a time portal from the Middle Ages.

First, instead of bits we have qubits. Bits can be 0 or 1. Once set to a value they don’t change until we perform some operation. The possible states of a qubit are excited or relaxed, which could be considered equivalent to 0 or 1. The fundamental difference is that a qubit can be both at the same time, or rather a mixture of the two. This property is called superposition. However, when the qubit is measured it is frozen as one or the other.
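
To make superposition slightly more concrete, here is a toy classical simulation in Python. To be clear, this is my own illustrative sketch, under an assumed, hugely simplified model of a qubit as a pair of amplitudes; it is emphatically not how a quantum computer works internally. It just shows what “a mixture of 0 and 1 until measured” means.

```python
import random

# Toy classical simulation of a single qubit, assuming a simplified model:
# the state is a pair of amplitudes (alpha, beta) with alpha^2 + beta^2 = 1.
# A real quantum computer does not work this way internally.

def measure(alpha: float, beta: float) -> int:
    """Measurement freezes the qubit: it yields 0 with probability alpha^2
    and 1 with probability beta^2."""
    return 1 if random.random() < beta ** 2 else 0

# An equal superposition: both outcomes equally likely until we look.
alpha = beta = 2 ** -0.5   # 1/sqrt(2), so alpha^2 = beta^2 = 0.5

counts = [0, 0]
for _ in range(1000):
    counts[measure(alpha, beta)] += 1
print(counts)   # roughly [500, 500], though each single measurement is 0 or 1
```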

The second weird feature of qubits is that they can influence each other. In traditional computing, if we operate on a bit then no other bits will change; if they do, it’s because the logic of the program controls those changes. In quantum computing that doesn’t apply. Operating on one qubit can change the state of others, and measuring an entangled qubit instantly determines the state of its partners. Entangled qubits are not independent of each other. This is called entanglement.
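
Again, a toy sketch of my own, assuming the simplest possible case: a pair of qubits prepared so that their measurements always agree (what physicists call a Bell pair). This is a classical cartoon of entanglement, nothing more, but it shows why entangled qubits cannot be treated as independent bits.

```python
import random

# Cartoon of a Bell pair: two qubits prepared so their measurements agree.
# This classical stand-in ignores the genuinely quantum aspects; it only
# illustrates the loss of independence between entangled qubits.

def measure_bell_pair() -> tuple[int, int]:
    """Measuring one qubit of the pair fixes the other's outcome too."""
    outcome = random.randint(0, 1)   # each value with probability 0.5
    return outcome, outcome          # the partner's result is no longer free

for _ in range(5):
    a, b = measure_bell_pair()
    print(a, b)   # always equal: 0 0 or 1 1, never 0 1
```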

The third weird thing about quantum computing is that algorithms are probabilistic, not deterministic. A classical digital computer will run through an algorithm and produce the right answer, or rather it will produce a predictable answer that is consistent with the algorithm’s logic. A quantum computer won’t always give the right answer straight away. It might have to send the same input through the same process many times and the results should converge on the right answer, probably.
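
Here is a hypothetical illustration of what that means for checking results. I have assumed an imaginary machine that returns the right answer on any single run only 60% of the time; repetition and a simple majority vote let the right answer emerge from the noise.

```python
import random
from collections import Counter

# Toy illustration of a probabilistic algorithm, under an assumed model in
# which one run returns the right answer with probability 0.6 and an
# arbitrary wrong answer otherwise. The names and numbers are invented.

RIGHT_ANSWER = 42

def one_noisy_run() -> int:
    """Stand-in for a single execution on a probabilistic machine."""
    return RIGHT_ANSWER if random.random() < 0.6 else random.randint(0, 100)

results = Counter(one_noisy_run() for _ in range(1000))
print(results.most_common(1))   # almost certainly [(42, roughly 600)]
```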

I’m telling you this not because I think it will give you any sort of useful understanding of quantum computing. A spot of humility is in order. I pass on this information in the hope you realise you have as little idea as I do.

In summary, quantum computing is moving way beyond dear old familiar Boolean logic. Boolean algebra is still valid, but it is only relevant to one part of a wider reality and quantum computers will allow us to explore beyond the Boolean boundaries.

I have done my share of white box testing, delving into the guts of applications to understand what is going on. I know I will never be in a position to do that with quantum computers, even assuming that they are in widespread use before I retire. I doubt if there are many testers, if any, who will be better placed than I am.

What will quantum computers be used for?

If white box testing poses a tough challenge what about black box testing? As I said earlier, quantum computers will be used for problems that traditional computers can’t help with. One application will be the transformation of cryptography. Quantum computing will probably render existing cryptographic techniques obsolete. Other likely uses will be new ways to search and interrogate massive databases, and financial modelling that requires fiendishly intricate calculations on huge volumes of data. These are the applications that caught my eye, because of my background in financial services. A big problem for testers will be, how do you evaluate the output if the computer is doing something that can’t be replicated by other means? This isn’t a new problem, but quantum computers will massively increase the difficulty.

Blind computing guiding the blind?

I wrote about the problems I’d encountered working with big data a few years ago. Our approach could be summed up simply as:

  • build a deep understanding of the domain and the data, and the essential relationships or rules,
  • get the business or domain experts heavily involved and pick their brains,
  • ensure you influence the design so that it is testable, ideally so that the solution is built from separate segments that can be isolated and tested with the knowledge and understanding you have acquired (see the sketch after this list).
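
Here is a minimal sketch of what that third point might look like in practice. The stages and the rules are hypothetical, invented purely for illustration; the point is that each segment of the pipeline is small enough to be exercised in isolation with hand-crafted data, so the domain knowledge from the first two points can be brought to bear on each step.

```python
# Hypothetical big-data pipeline split into small, isolatable segments.
# Every stage name and business rule below is an invented illustration.

def clean(records):
    """Segment 1: discard records that break a known domain rule."""
    return [r for r in records if r.get("amount", 0) >= 0]

def enrich(records):
    """Segment 2: derive a field the domain experts say matters."""
    return [{**r, "high_value": r["amount"] > 10_000} for r in records]

def summarise(records):
    """Segment 3: aggregate for reporting."""
    return sum(r["amount"] for r in records)

# Each segment can be tested on its own against hand-crafted data,
# rather than only testing the opaque end-to-end result.
assert clean([{"amount": -5}, {"amount": 10}]) == [{"amount": 10}]
assert enrich([{"amount": 20_000}])[0]["high_value"] is True
assert summarise([{"amount": 1}, {"amount": 2}]) == 3
```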

The first two points will always be relevant regardless of the technology. The third one has been picked up by quantum computing scientists who have developed a technique called blind quantum computing.

The blind technique involves running a separate, simpler(!) quantum computer, called a verifier, alongside the server running the application under test. The verifier feeds the server small, discrete calculations to perform. It does this in such a way that only the verifier, and the testers, can know what the answers should be, while the server knows only what operations it should perform. This is the simplest explanation of blind quantum computing I’ve been able to find.
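
As a very loose classical analogy, and emphatically not real blind quantum computing, you can picture the verifier slipping small “trap” calculations, whose answers it already knows, in among the real work; a server that fails the traps can’t be trusted with the real results. Everything in this sketch, the names and the fake server included, is invented for illustration.

```python
import random

# Loose classical analogy for trap-based verification: the verifier mixes
# small calculations with known answers into the workload. The server sees
# only operations to perform; only the verifier knows which are traps.

def honest_server(a: int, b: int) -> int:
    return a + b                      # the operation the server is asked to run

def verify(server, n_traps: int = 100) -> bool:
    for _ in range(n_traps):
        a, b = random.randint(0, 9), random.randint(0, 9)
        if server(a, b) != a + b:     # the verifier knows the expected answer
            return False              # server caught misbehaving
    return True                       # no trap failed, so trust the real work

print(verify(honest_server))          # True
```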

An obvious observation is that this is hardly black box testing as we have known it. Who programs and controls the verifier? It is a quantum computer, and I doubt if ISTQB accreditation will ever get you very far here.

Fields of bewilderment

I don’t have any answers here, and I strongly suspect easy answers will never be available to testers who work with quantum computers. That, perhaps, is the point to take away from my article. Writing this has felt less like trying to spread some understanding of an immensely difficult subject, and more like an account of a bewildered ramble through my freshly expanded fields of ignorance.

I can’t state with confidence that software testers will ever move into quantum computing. Perhaps the serious testing will be done by quantum computing scientists. That raises its own dangers. Will they have the right instincts, an appropriate sceptical and questioning approach? There is already talk of quantum computers permitting the development of perfect software. Apparently they will be able to work their way through every conceivable permutation of data to establish that the application was built to its specification. Of course there is a huge, familiar old question being begged there: how does anyone know the spec was right?

[Image: opening of the poem “Youth and Hope”, from which the quote “perplexity is the beginning of knowledge” comes.]

I worry that good testers may be frozen out of working with quantum computers, and that the skills and questions they do have, and that will remain relevant, will be lost. Whatever happens I am confident that testers imbued with the principles of Context-Driven Testing will be better prepared than those who know only what they learned to pass ISTQB exams, or who are accustomed to following prescriptive standards. Whatever the future holds for us it will be interesting! After all, perplexity is the beginning of knowledge.

2017 Test Leadership Congress in New York

From May 1st to 4th I’ll be in New York at the Test Leadership Congress.


Why go all that way? Well, there’s an obvious answer: it’s in New York, for goodness’ sake, and there are interesting-looking talks by some cool people. But for me the clincher is a one-day masterclass that could hardly be better designed to lure me over the Atlantic.

Dave Snowden is presenting “A leader’s framework for decision making: managing complex projects using Cynefin“. That would be hugely interesting in any context, but when he’s addressing an audience of testers, with the added inducement that “a new collaborative project on software testing will be introduced to delegates”, well I’m checking out flights already.

For a few years now I’ve been fascinated by the understanding that Cynefin can help us gain of the environment we are working in and the problems with which we must grapple. I’ve written about this, providing an introduction to Cynefin, but that was aimed at the highest level of testing strategy, the worldview that shapes our approach to testing.

If you think about the world in the wrong way, if you make flawed and unchallenged assumptions about the nature of the problems you face then you won’t do much that is useful, and even less that is valuable. You can close your mind to all this and get on with working the process but the chances are you’ll be “faking the testing”, a phrase of James Bach’s that keeps coming back to me over the years.

Understanding Cynefin will help testers to choose an approach that will enable more effective testing; your approach will be aligned both to the way that software is developed, and also to the product or application under test. The nature of development means that we are mostly dealing with complicated and complex problems, with the most challenging problems taking us into the complex domain.

The essential feature of complex systems is that they are not predictable. There is no clear cause and effect. We might be able to discern them with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. In order to understand the system we need to probe it, make sense of what we discover, then respond. Cynefin is therefore likely to guide us towards more flexible, iterative development and exploratory testing. But exactly how should we use that understanding to conduct testing?

In particular I am keen to hear Dave’s ideas about how “safe to fail” experiments with complex systems can be applied in testing. Dave describes safe-fail experiments as follows.

“I can’t get it right in advance, so I need to experiment with different ways of approaching the journey that are safe-fail in nature. That is to say I can afford them to fail and critically, I plan them so that through that failure I learn more about the terrain through which I wish to travel.”

This cycle of probing and learning is obviously relevant to testing. You could say it is testing. I can’t wait for this class.

An added attraction that the Congress has for me is the class on the following day, led by Fiona Charles and Anna Royzman, “Strategic Leadership for Testers“. Test management is changing. Treating the job as a matter of managing the testing process simply won’t cut it in future. Both in testing and software development we’ve been poor at thinking strategically, and that is one of Fiona’s strengths. That, combined with her incisive style and wit, makes this a class I really don’t want to miss while I’m in New York. If you’re going to be there let me know. I look forward to seeing you.

Why ISO 29119 is a flawed quality standard

This article originally appeared in the Fall 2015 edition of Better Software magazine.

In August 2014, I gave a talk attacking ISO 29119 at the Association for Software Testing’s conference in New York. That gave me a reputation for being opposed to standards in general — and testing standards in particular. I do approve of standards, and I believe it’s possible that we might have a worthwhile standard for testing. However, it won’t be the fundamentally flawed ISO 29119.

Technical standards that make life easier for companies and consumers are a great idea. The benefit of standards is that they offer protection to vulnerable consumers or help practitioners behave well and achieve better outcomes. The trouble is that even if ISO 29119 aspires to do these things, it doesn’t.

Principles, standards, and rules

The International Organization for Standardization (ISO) defines a standard as “a document that provides requirements, specifications, guidelines or characteristics that can be used consistently to ensure that materials, products, processes and services are fit for their purpose.”

It might be possible to derive a useful software standard that fits this definition, but only if it focuses on guidelines rather than requirements, specifications, or characteristics. According to ISO’s definition, a standard doesn’t have to be all those things. A testing standard framed instead as high-level guidelines would be consistent with the widespread view among regulatory theorists that standards are conceptually like high-level principles. Rules, in contrast, are detailed and specific (see Frederick Schauer’s “The Convergence of Rules and Standards”: PDF opens in new tab). One of ISO 29119’s fundamental problems is that it is pitched at a level of detail consistent with rules, which will inevitably tempt people to treat it as a set of fixed rules.

Principles focus on outcomes rather than detailed processes or specific rules. This is how many professional bodies have defined standards. They often use the words principles and standards interchangeably. Others favor a more rules-based approach. If you adopt a detailed, rules-based approach, there is a danger of painting yourself into a corner; you have to try to specify exactly what is compliant and noncompliant. This creates huge opportunities for people to game the system, demonstrating creative compliance as they observe the letter of the law while trashing underlying quality principles (see John Braithwaite’s “Rules and Principles: A Theory of Legal Certainty”). Whether one follows a principles-based or a rules-based approach, regulators, lawyers, auditors, and investigators are likely to assume standards define what is acceptable.

As a result, there is a real danger that ISO 29119 could be viewed as the default set of rules for responsible software testing. People without direct experience in development or testing look for some form of reassurance about what constitutes responsible practice. They are likely to take ISO 29119 at face value as a definitive testing standard. The investigation into the HealthCare.gov website problems showed what can happen.

In its March 2015 report (PDF, opens in new tab) on the website’s problems, the US Government Accountability Office checked the HealthCare.gov project for compliance with the IEEE 829 test documentation standard. The agency didn’t know anything about testing; it just wanted a benchmark. IEEE 829 was last revised in 2008, and the standard itself said that contents more than five years old “do not wholly reflect the present state of the art”. Few testers would disagree that IEEE 829 is now hopelessly out of date.

[Image: IEEE 829’s obsolescence threshold – “when a document is more than five years old”.]

The obsolescence threshold for ISO 29119 has increased from five to ten years, presumably reflecting the lengthy process of creating and updating such cumbersome documents rather than the realities of testing. We surely don’t want regulators checking testing for compliance against a detailed, outdated standard they don’t understand.

Scary lessons from the social sciences

If we step away from ISO 29119, and from software development, we can learn some thought-provoking lessons from the social sciences.

Prescriptive standards don’t recognize how people apply knowledge in demanding jobs like testing. Scientist Michael Polanyi and sociologist Harry Collins have offered valuable insights into tacit knowledge, which is knowledge we possess and use but cannot articulate. Polanyi first introduced the concept, and Collins developed the idea, arguing that much valuable knowledge is cultural and will vary between different contexts and countries. Defining a detailed process as a standard for all testing excludes vital knowledge; people will respond by concentrating on the means, not the ends.

Donald Schön, a noted expert on how professionals learn and work, offered a related argument with “reflection in action” (see Willemien Visser’s article: PDF opens in new tab). Schön argued that creative professionals, such as software designers or architects, have an iterative approach to developing ideas—much of their knowledge is understood without being expressed. In other words, they can’t turn all their knowledge into an explicit, written process. Instead, to gain access to what they know, they have to perform the creative act so that they can learn, reflect on what they’ve learned, and then apply this new knowledge. Following a detailed, prescriptive process stifles learning and innovation. This applies to all software development—both agile and traditional methods.

In 1914, Thorstein Veblen identified the problem of trained incapacity. People who are trained in specific skills can lack the ability to adapt. Their response worked in the past, so they apply it regardless thereafter.

[Image: the “young woman or old woman” optical illusion. Means or ends? We can focus on only one at a time.]

Kenneth Burke built upon Veblen’s work, arguing that trained incapacity means one’s abilities become blindnesses. People can focus on the means or the ends, not both; their specific training makes them focus on the means. They don’t even see what they’re missing. As Burke put it, “a way of seeing is also a way of not seeing; a focus upon object A involves a neglect of object B”. This leads to goal displacement, and the dangers for software testing are obvious.

The problem of goal displacement was recognized before software development was even in its infancy. When humans specialize in organizations, they have a predictable tendency to see their particular skill as a hammer and every problem as a nail. Worse, they see their role as hitting the nail rather than building a product. Give test managers a detailed standard, and they’ll start to see the job as following the standard, not testing.

In the 1990s, British academic David Wastell studied software development shops that used structured methods, the dominant development technique at the time. Wastell found that developers used these highly detailed and prescriptive methods in exactly the same way that infants use teddy bears and security blankets: to give them a sense of comfort and help them deal with stress. In other words, a developer’s mindset betrayed that the method wasn’t a way to build better software but rather a defense mechanism to alleviate stress and anxiety.

Wastell could find no empirical evidence, either from his own research at these companies or from a survey of the findings of other experts, that structured methods worked. In fact, the resulting systems were no better than the old ones, and they took much more time and money to develop. Managers became hooked on the technique (the standard) while losing sight of the true goal. Wastell concluded the following:

Methodology becomes a fetish, a procedure used with pathological rigidity for its own sake, not as a means to an end. Used in this way, methodology provides a relief against anxiety; it insulates the practitioner from the risks and uncertainties of real engagement with people and problems.

Developers were delivering poorer results but defining that as the professional standard. Techniques that help managers cope with stress and anxiety but give an illusory, reassuring sense of control harm the end product. Developers and testers cope by focusing on technique, mastery of tools, or compliance with standards. In doing so they can feel that they are doing a good job, so long as they don’t think about whether they are really working toward the true ends of the organization or the needs of the customer.

Standards must be fit for their purpose

Is all this relevant to ISO 29119? We’re still trying to do a difficult, stressful job, and in my experience, people will cling to prescriptive processes and standards that give the illusion of being in control. Standards have credibility and huge influence simply from their status as standards. If we must have standards, they should be relevant, credible, and framed in a way that is helpful to practitioners. Crucially, they must not mislead stakeholders and regulators who don’t understand testing but who wield great influence and power.

The level of detail in ISO 29119 is a real concern. Any testing standard should be in the style favored by organizations like the Institute of Internal Auditors (IIA), whose principles-based professional standards cover the entire range of internal auditing but are only one-tenth as long as the three completed parts of ISO 29119. The IIA’s standards are light on detail but far more demanding in the outcomes required.

Standards must be clear about the purpose they serve if we are to ensure testing is fit for its purpose, to hark back to ISO’s definition of a standard. In my opinion, this is where ISO 29119 falls down. The standard does not clarify the purpose of testing, only the mechanism—and that mechanism focuses on documentation, not true testing. It is this lack of purpose, the why, that leads to teams concentrating on standards compliance rather than delivering valuable information to stakeholders. This is a costly mistake. Standards should be clear about the outcomes and leave the means to the judgment of practitioners.

A good example of this problem is ISO 29119’s test completion report, which is defined simply as a summary of the testing that was performed. The standard offers examples for traditional and agile projects. Both focus on the format, not the substance of the report. The examples give some metrics without context or explanation and provide no information or insight that would help stakeholders understand the product and the risk and make better decisions. Testers could comply with the standard without doing anything useful. In contrast, the IIA’s standards say audit reports must be “accurate, objective, clear, concise, constructive, complete, and timely.” Each of these criteria is defined briefly in a way that makes the standard far more demanding and useful than ISO 29119, in far less space.

It’s no good saying that ISO 29119 can be used sensibly and doesn’t have to be abused. People are fallible and will misuse the standard. If we deny that fallibility, we deny the experience of software development, testing, and, indeed, human nature. As Jerry Weinberg said (in “The Secrets of Consulting”), “no matter how it looks at first, it’s always a people problem”. Any prescriptive standard that focuses on compliance with highly detailed processes is doomed. Maybe you can buck the system, but you can’t buck human nature.

David Graeber’s “The Utopia of Rules: On Technology, Stupidity and the Secret Joys of Bureaucracy”

When I gave my talk at CAST 2014 in New York, “Standards – promoting quality or restricting competition?”, I was concentrating on the economic aspects of standards. They are often valuable, but they can be damaging and restrict competition if they are misused. A few months later I bought “The Utopia of Rules: On Technology, Stupidity, and the Secret Joys of Bureaucracy” by David Graeber, Professor of Anthropology at the London School of Economics. I was familiar with Graeber as a challenging and insightful writer; I drew on his work when I wrote “Testing: valuable or bullshit?“. The Utopia of Rules also inspired the blog article I wrote recently, “Frozen in time – grammar and testing standards”, in which I discussed the similarity between grammar textbooks and standards, both of which codify old usages and practices that no longer match the modern world.

What I hadn’t expected from The Utopia of Rules was how strongly it would support the arguments I made at CAST.

Certification and credentialism

Graeber makes the same argument I deployed against certification. It is being used increasingly to enrich special interests without benefiting society. On page 23 Graeber writes:

Almost every endeavor that used to be considered an art (best learned through doing) now requires formal professional training and a certificate of completion… In some cases, these new training requirements can only be described as outright scams, as when lenders, and those prepared to set up the training programs, jointly lobby the government to insist that, say, all pharmacists be henceforth required to pass some additional qualifying examination, forcing thousands already practicing the profession into night school, which these pharmacists know many will only be able to afford with the help of high-interest student loans. By doing this, lenders are in effect legislating themselves a cut of most pharmacists’ subsequent incomes.

To be clear, my stance on ISTQB training is that it educates testers in a legitimate, though very limited, vision of testing. My objection is to any marketing of the qualification as a certification of testing ability, rather than confirmation that the tester has passed an exam associated with a particular training course. I object even more strongly to any argument that possession of the certificate should be a requirement for employment, or for contracting out testing services. It is reasonable to talk of scams when the ability of good testers to earn a living is damaged.

What is the point of it all?

Graeber has interesting insights into how bureaucrats can be vague about the values of the bureaucracy: why does the organisation exist? Bureaucrats focus on efficient execution of rational processes, but what is the point of it all? Often the means become the ends: efficiency is an end in itself.

I didn’t argue that point at CAST, but I have done so many times in other talks and articles (e.g. “Teddy bear methods“). If people are doing a difficult, stressful job and you give them prescriptive methods, processes or standards then they will focus on ticking their way down the list. The end towards which they are working becomes compliance with the process, rather than helping the organisation reach its goal. They see their job as producing the outputs from the process, rather than the outcomes the stakeholders want. I gave a talk in London in June 2015 to the British Computer Society’s Special Interest Group in Software Testing in which I argued that testing lacks guiding principles (PDF, opens in a new tab) and ISO 29119 in particular does not offer clear guidance about the purpose of testing.

In a related argument Graeber makes a point that will be familiar to those who have criticised the misuse of testing metrics.

…from inside the system, the algorithms and mathematical formulae by which the world comes to be assessed become, ultimately, not just measures of value, but the source of value itself.

Rent extraction

The most controversial part of my CAST talk was my argument that the pressure to adopt testing standards was entirely consistent with rent seeking in economic theory. Rent seeking, or rent extraction, is what people do when they exploit failings in the market, or rig the market for their own benefit by lobbying for regulation that happens to benefit them. Instead of creating wealth, they take it from other people in a way that is legal, but which is detrimental to the economy, and society, as a whole.

This argument riled some people who took it as a personal attack on their integrity. I’m not going to dwell on that point. I meant no personal slur. Rent seeking is just a feature of modern economies. Saying so is merely being realistic. David Graeber argued the point even more strongly.

The process of financialization has meant that an ever-increasing proportion of corporate profits come in the form of rent extraction of one sort or another. Since this is ultimately little more than legalized extortion, it is accompanied by ever-increasing accumulation of rules and regulations… At the same time, some of the profits from rent extraction are recycled to select portions of the professional classes, or to create new cadres of paper-pushing corporate bureaucrats. This helps a phenomenon I have written about elsewhere: the continual growth, in recent decades, of apparently meaningless, make-work, “bullshit jobs” — strategic vision coordinators, human resources consultants, legal analysts, and the like — despite the fact that even those who hold such positions are half the time secretly convinced they contribute nothing to the enterprise.

In 2014 I wrote about “bullshit jobs“, prompted partly by one of Graeber’s articles. It’s an important point. It is vital that testers define their job so that it offers real value, and they are not merely bullshit functionaries of the corporate bureaucracy.

Utopian bureaucracies

I have believed for a long time that adopting highly prescriptive methods or standards for software development and testing places unfair pressure on people, who are set up to fail. Graeber makes exactly the same point.

Bureaucracies public and private appear — for whatever historical reasons — to be organized in such a way as to guarantee that a significant proportion of actors will not be able to perform their tasks as expected. It’s in this sense that I’ve said one can fairly say that bureaucracies are utopian forms of organization. After all, is this not what we always say of utopians: that they have a naïve faith in the perfectibility of human nature and refuse to deal with humans as they actually are? Which is, are we not also told, what leads them to set impossible standards and then blame the individuals for not living up to them? But in fact all bureaucracies do this, insofar as they set demands they insist are reasonable, and then, on discovering that they are not reasonable (since a significant number of people will always be unable to perform as expected), conclude that the problem is not with the demands themselves but with the individual inadequacy of each particular human being who fails to live up to them.

Testing standards such as ISO 29119, and its predecessor IEEE 829, don’t reflect what developers and testers do, or rather should be doing. They are at odds with the way people think and work in organisations. These standards attempt to represent a highly complex, sometimes chaotic, process in a defined, repeatable model. The end product is usually of dubious quality, late and over budget. Any review of the development will find constant deviations from the standard. The suppliers, and defenders, of the standard can then breathe a sigh of relief. The sacred standard was not followed. It was the team’s fault. If only they’d done it by the book! The possibility that the developers’ and testers’ apparent sins were the only reason anything was produced at all is never considered. This is a dreadful way to treat people, but in many organisations it has been normal for several decades.

Loss of communication

All of the previous arguments by Graeber were entirely consistent with my own thoughts about how corporate bureaucracies operate. It was fascinating to see an anthropologist’s perspective, but it didn’t teach me anything that was really new about how testers work in corporations. However, later in the book Graeber developed two arguments that gave me new insights.

Understanding what is happening in a complex, social situation needs effective two-way communication. This requires effort, “interpretive labor”. The greater the degree of compulsion, and the greater the bureaucratic regime of rules and forms, the less need there is for such two-way communication. Those who can simply issue orders that must be obeyed don’t have to take the trouble to understand the complexities of the situation they’re managing.

…within relations of domination, it is generally the subordinates who are effectively relegated the work of understanding how the social relations in question really work. … It’s those who do not have the power to hire and fire who are left with the work of figuring out what actually did go wrong so as to make sure it doesn’t happen again.

This ties in with the previous argument about utopian bureaucracies. If you impose an inappropriate standard then poor results will be attributed to the inevitable failure to comply. There is no need for senior managers to understand more, and no need to listen to the complaints, the “excuses”, of the people who do understand what is happening. Interestingly, Graeber’s argument about interpretive labor is consistent with regulatory theory. Good regulation of complex situations requires ongoing communication between the regulator and the regulated. I explained this in the talk on testing principles I mentioned above (slides 38 and 39).

Fear of play

My second new insight from Graeber arrived when he discussed the nature of play and how it relates to bureaucracies. Anthropologists draw a distinction between games and play, one that is easier to maintain in English than in languages like French and German, which use the same word for both. A game has boundaries, set rules and a predetermined conclusion. Play is more free-form and creative. Novelties and surprising results emerge from the act of playing. It is a random, unpredictable and potentially destructive activity. Graeber finishes his discussion of play and games with a striking observation.

What ultimately lies behind the appeal of bureaucracy is fear of play.

Put simply, and rather simplistically, Graeber means that we use bureaucracy to escape the terror of chaotic reality, to bring a semblance (an illusion?) of control to the uncontrollable.

This gave me a tantalising new insight into the reasons people build bureaucratic regimes in organisations. It sent me off into a whole new field of reading on the anthropology of games and play. This has fascinating implications for the debate about standards and testing. We shy away from play, but it is through play that we learn. I don’t have time now to do the topic justice, and it’s much too big and important a subject to be tacked on to the end of this article, but I will return to it. It is yet another example of the way anthropology can help us understand what we are doing as testers. As a starting point I can heartily recommend David Graeber’s book, “The Utopia of Rules”.

Frozen in time – grammar and testing standards

This recent tweet by Tyler Hayes caught my eye. “If you build software you’re an anthropologist whether you like it or not.”

It’s an interesting point, and it’s relevant on more than one level. By and large software is developed by people and for people. That is a statement of the obvious, but developers and testers have generally been reluctant to take on board the full implications. This isn’t a simple point about usability. The software we build is shaped by many assumptions about the users, and how they live and work. In turn, the software can reinforce existing structures and practices. Testers should think about these issues if they’re to provide useful findings to the people who matter. You can’t learn everything you need to know from a requirements specification. This takes us deep into anthropological territory.

What is anthropology?

Social anthropology is defined by University College London as follows.

Social Anthropology is the comparative study of the ways in which people live in different social and cultural settings across the globe. Societies vary enormously in how they organise themselves, the cultural practices in which they engage, as well as their religious, political and economic arrangements.

We build software in a social, economic and cultural context that is shaped by myriad factors, which aren’t necessarily conducive to good software, or a happy experience for the developers and testers, never mind the users. I’ve touched on this before in “Teddy Bear Methods“.

There is much that we can learn from anthropology, and not just to help us understand what we see when we look out at the users and the wider world. I’ve long thought that the software development and testing community would make a fascinating subject for anthropologists.

Bureaucracy, grammar and deference to authority

I recently read “The Utopia of Rules – On Technology, Stupidity, and the Secret Joys of Bureaucracy” by the anthropologist David Graeber.

Graeber has many fascinating insights and arguments about how organisations work, and why people are drawn to bureaucracy. One of his arguments is that regulation is imposed and formalised to try and remove arbitrary, random behaviour in organisations. That’s a huge simplification, but there’s not room here to do Graeber’s argument justice. One passage in particular caught my eye.

People do not invent languages by writing grammars, they write grammars — at least, the first grammars to be written for any given language — by observing the tacit, largely unconscious, rules that people seem to be applying when they speak. Yet once a book exists, and especially once it is employed in schoolrooms, people feel that the rules are not just descriptions of how people do talk, but prescriptions for how they should talk.

It’s easy to observe this phenomenon in places where grammars were only written recently. In many places in the world, the first grammars and dictionaries were created by Christian missionaries in the nineteenth or even twentieth century, intent on translating the Bible and other sacred texts into what had been unwritten languages. For instance, the first grammar for Malagasy, the language spoken in Madagascar, was written in the 1810s and ’20s. Of course, language is changing all the time, so the Malagasy spoken language — even its grammar — is in many ways quite different than it was two hundred years ago. However, since everyone learns the grammar in school, if you point this out, people will automatically say that speakers nowadays are simply making mistakes, not following the rules correctly. It never seems to occur to anyone — until you point it out — that had the missionaries come and written their books two hundred years later, current usages would be considered the only correct ones, and anyone speaking as they had two hundred years ago would themselves be assumed to be in error.

In fact, I found this attitude made it extremely difficult to learn how to speak colloquial Malagasy. Even when I hired native speakers, say, students at the university, to give me lessons, they would teach me how to speak nineteenth-century Malagasy as it was taught in school. As my proficiency improved, I began noticing that the way they talked to each other was nothing like the way they were teaching me to speak. But when I asked them about grammatical forms they used that weren’t in the books, they’d just shrug them off, and say, “Oh, that’s just slang, don’t say that.”

…The Malagasy attitudes towards rules of grammar clearly have… everything to do with a distaste for arbitrariness itself — a distaste which leads to an unthinking acceptance of authority in its most formal, institutional form.

Searching for the “correct” way to develop software

Graeber’s phrase “distaste for arbitrariness itself” reminded me of the history of software development. In the 1960s and 70s academics and theorists agonised over the nature of development, trying to discover and articulate what it should be. Their approach was fundamentally mistaken. There are dreadful ways, and there are better ways to develop software, but there is no natural, correct way that results in perfect software. The researchers assumed that there was and went hunting for it. Instead of seeking understanding they carried their assumptions about what the answer might be into their studies and went looking for confirmation.

They were trying to understand how the organisational machine worked and looked for mechanical processes. I use the word “machine” carefully, not as a casual metaphor. There really was an assumption that organisations were, in effect, machines. They were regarded as first order cybernetic entities whose behaviour would not vary depending on whether they were being observed. To a former auditor like myself this is a ludicrous assumption. The act of auditing an organisation changes the way that people behave. Even the knowledge that an audit may occur will shape behaviour, and not necessarily for the better (see my article “Cynefin, testing and auditing“). You cannot do the job well without understanding that. Second order cybernetics does recognise this crucial problem and treats observers as participants in the system.

So linear, sequential development made sense. The different phases passing outputs along the production line fitted their conception of the organisation as a machine. Iterative, incremental development looked messy and immature; it was just wrong as far as the researchers were concerned. Feeling one’s way to a solution seemed random, unsystematic – arbitrary.

Development is a difficult and complex job; people will tend to follow methods that make the job feel easier. If managers are struggling with the complexities of managing large projects they are more likely to choose linear, sequential methods that make the job of management easier, or at least less stressful. So when researchers saw development being carried out that way they were observing human behaviour, not a machine operating.

Doubts about this approach were quashed by pointing out that if organisations weren’t quite the neat machine that they should be this would be solved by the rapid advance in the use of computers. This argument looks suspiciously circular because the conclusion that in future organisations would be fully machine-like rests on the unproven premise that software development is a mechanical process which is not subject to human variability when performed properly.

Eliminating “arbitrariness” and ignoring the human element

This might all have been no more than an interesting academic sideline, but it fed back into software development. By the 1970s, when these studies into the nature of development were being carried out, organisations were moving towards increasingly formalised development methods. There was increasing pressure to adopt such methods. Not only were they attractive to managers, the use of more formal methods provided a competitive advantage. ISO certification and CMMI accreditation were increasingly seen as a way to demonstrate that organisations produced high quality software. The evidence may have been weak, but it seemed a plausible claim. These initiatives required formal processes. The sellers of formal methods were happy to look for and cite any intellectual justification for their products. So formal linear methods were underpinned by academic work that assumed that formal linear methods were correct. This was the way that responsible, professional software development was performed. ISO standards were built on this assumption.

If you are trying to define the nature of development you must acknowledge that it is a human activity, carried out by and for humans. These studies about the nature of development were essentially anthropological exercises, but the researchers assumed they were observing and taking apart a machine.

As with the missionaries who were codifying grammar, the point in time when these researchers were working shaped the result. If they had carried out their studies earlier in the history of software development they might have struggled to find credible examples of formalised, linear development. In the 1950s software development was an esoteric activity in which the developers could call the shots. Twenty years later it was part of the corporate bureaucracy, and iterative, incremental development was sidelined. If the studies had been carried out a few decades further on it would have been impossible to ignore Agile.

As it transpired, formal methods, CMM/CMMI and the first ISO standards concerning development and testing were all creatures of that era when organisations and their activities were seriously regarded as mechanical. Like the early Malagasy grammar books they codified and fossilised a particular, flawed approach at a particular time for an activity that was changing rapidly. ISO 29119 is merely an updated version of that dated approach to testing. It is rooted in a yearning for bureaucratic certainty, a reluctance to accept that ultimately good testing is dependent not on documentation, but on that most irrational, variable and unpredictable of creatures – the human who is working in a culture shaped by humans. Anthropology has much to teach us.

Further reading

That is the end of the essay, but there is a vast amount of material you could read about attempts to understand and define the nature of software development and of organisations. Here is a small selection.

Brian Fitzgerald has written some very interesting articles about the history of development. I recommend in particular “The systems development dilemma: whether to adopt formalised systems development methodologies or not?” (PDF, opens in new tab).

Agneta Olerup wrote this rather heavyweight study of what she calls the Langeforsian approach to information systems design. Börje Langefors was a highly influential advocate of the mechanical, scientific approach to software development. Langefors’ Wikipedia entry describes him as “one of those who made systems development a science”.

This paper gives a good, readable introduction to first and second order cybernetics (PDF, opens in new tab), including a useful warning about the distinction between models and the entities that they attempt to represent.

All our knowledge of systems is mediated by our simplified representations—or models—of them, which necessarily ignore those aspects of the system which are irrelevant to the purposes for which the model is constructed. Thus the properties of the systems themselves must be distinguished from those of their models, which depend on us as their creators. An engineer working with a mechanical system, on the other hand, almost always knows its internal structure and behavior to a high degree of accuracy, and therefore tends to de-emphasize the system/model distinction, acting as if the model is the system.

Moreover, such an engineer, scientist, or “first-order” cyberneticist, will study a system as if it were a passive, objectively given “thing”, that can be freely observed, manipulated, and taken apart. A second-order cyberneticist working with an organism or social system, on the other hand, recognizes that system as an agent in its own right, interacting with another agent, the observer.

Finally, I recommend a fascinating article in the IEEE’s Computer magazine by Craig Larman and Victor Basili, “Iterative and incremental development: a brief history” (PDF, opens in new tab). Larman and Basili argue that iterative and incremental development is not a modern practice, but has been carried out since the 1950s, though they do acknowledge that it was subordinate to the linear Waterfall in the 1970s and 80s. There is a particularly interesting contribution from Gerald Weinberg, a personal communication to the authors, in which he describes how he and his colleagues developed software in the 1950s. The techniques they followed were “indistinguishable from XP”.

Presentations

This page will carry slide decks for presentations I’ve given at conferences – as I get round to adding them. All the presentations are in PDF format and open in new tabs.

Farewell to “pass or fail”?

I gave this presentation, “Farewell to ‘pass or fail’” at Expo:QA 2014 in Madrid in May 2014. It shows how auditing and software testing have faced similar challenges and how each profession can learn from the other.

Standards – promoting quality or restricting competition?

This presentation is the one that attracted so much attention and controversy at CAST 2014 in New York. It is a critique of software testing standards from the perspective of economics.

A modest proposal for improving the efficiency of testing services

I would like to offer for your perusal a modest proposal for improving the efficiency of testing services whilst producing great benefits for clients, suppliers and testers (with a nod to Dr Jonathan Swift).

Lately I have been reading some fascinating material about the creative process, the ways that we direct our attention, and how these are linked. Whilst cooking dinner one evening I had a sudden insight into how I could launch an exciting and innovative testing service.

It was no accident that I had my eureka moment when I was doing something entirely unrelated to testing. Psychologists recognise that the creative process starts with two stages. First comes the preparation stage. We familiarise ourselves with a cognitively demanding challenge. We then have to step away from the problem and perform some activity that doesn’t require much mental effort. This is the incubation stage, which gives our brain the opportunity to churn away, making connections between the problem, our stored knowledge and past experience. Crucially, it gives us the chance to envisage future possibilities. Suddenly, and without conscious effort, the answer can come, as it did to Archimedes, whose original eureka moment arrived in the bath when he realised that the volume of an irregular object could be measured by the volume of water it displaced.

My modest proposal is to exploit this eureka principle in an entirely new way for testing. Traditionally, testers have followed the two stage approach to creativity. We have familiarised ourselves with the client, the business problem and the proposed application. We have then moved on to the vital incubation stage of mindless activity. This has traditionally been known as “writing the detailed test plans” and “churning out the test scripts”.

Now the trouble with these documents hasn’t been their negligible value for the actual testing. That’s the whole point of the incubation stage. We have to do something unrelated and mindless so that our brains can come up with creative ideas for testing. No, the real problem with the traditional approach is that there is no direct valuable output at all. The documents merely gather dust. They haven’t even been used to feed the heating.

I therefore intend to launch a start-up testing services company called CleanTest. CleanTest’s testers will familiarise themselves with the client and the application in the preparation stage. Then, for the incubation stage, they will move on to cleaning the data centre, the development shop and the toilets, whilst the creative ideas formulate. Once their creative ideas for testing have formed they will execute the testing.

Everyone will be a winner. The client will have testing performed to at least the same standard as before. They will also have clean offices and be able to save money by getting rid of their existing cleaning contractor. The testers will have increased job satisfaction from seeing shiny clean premises, instead of mouldering shelfware that no-one will ever read. And I will make a pile of money.

Of course it is vital for the credibility of CleanTest that the company is ISO compliant. We will therefore comply with the ISO 14644 cleanrooms standard, and ISO 12625 toilet paper standard. Compliance with two ISO standards will make us twice as responsible as those fly-by-night competitors who are compliant only with ISO 29119.

Anyone who wishes to join with me and invest in this exciting venture is welcome to get in touch. I also have some exciting opportunities that trusted contacts in Nigeria have emailed to me.