2017 Test Leadership Congress in New York

From May 1st to 4th I’ll be in New York, at the Test Leadership Congress.

Why go all that way? Well, there’s an obvious answer; it’s New York, for goodness sake, and there are interesting-looking talks by some cool people. But for me the clincher is a one-day masterclass that could hardly be better designed to lure me over the Atlantic.

Dave Snowden is presenting “A leader’s framework for decision making: managing complex projects using Cynefin”. That would be hugely interesting in any context, but when he’s addressing an audience of testers, with the added inducement that “a new collaborative project on software testing will be introduced to delegates”, well, I’m checking out flights already.

For a few years now I’ve been fascinated by the understanding that Cynefin can help us gain of the environment we are working in and the problems with which we must grapple. I’ve written about this, providing an introduction to Cynefin, but that was aimed at the highest level of testing strategy, the worldview that shapes our approach to testing.

If you think about the world in the wrong way, if you make flawed and unchallenged assumptions about the nature of the problems you face then you won’t do much that is useful, and even less that is valuable. You can close your mind to all this and get on with working the process but the chances are you’ll be “faking the testing”, a phrase of James Bach’s that keeps coming back to me over the years.

Understanding Cynefin will help testers to choose an approach that will enable more effective testing; your approach will be aligned both to the way that software is developed, and also to the product or application under test. The nature of development means that we are mostly dealing with complicated and complex problems, with the most challenging problems taking us into the complex domain.

The essential feature of complex systems is that they are not predictable. There is no clear cause and effect. We might be able to discern them with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. In order to understand the system we need to probe it, make sense of what we discover, then respond. Cynefin is therefore likely to guide us towards more flexible, iterative development and exploratory testing. But exactly how should we use that understanding to conduct testing?
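To make the probe–sense–respond cycle a little more concrete, here is a minimal sketch of how it might drive exploratory testing. Everything in it (the `explore` loop, the toy system, the charter names) is my own illustration, not part of Cynefin or of any real testing tool.

```python
# A hypothetical sketch of Cynefin's probe-sense-respond cycle driving
# exploratory testing: small, safe-to-fail sessions, with each result
# reshaping what we try next instead of following a fixed script.

def explore(system_under_test, charters, max_sessions=3):
    """Run short test sessions against the system, following up on
    surprising results before returning to the pre-planned charters."""
    findings = []
    for _ in range(max_sessions):
        if not charters:
            break
        charter = charters.pop(0)                 # probe: run a small, cheap experiment
        observation = system_under_test(charter)  # sense: observe what actually happens
        findings.append((charter, observation))
        if observation == "surprise":             # respond: amplify the interesting result
            charters.insert(0, charter + " (follow up)")
    return findings

def toy_system(charter):
    # Stand-in for the real system: it behaves oddly for exactly one charter.
    return "surprise" if charter == "login flow" else "as expected"

results = explore(toy_system, ["login flow", "search", "checkout"])
# The surprise in the first session pushes a follow-up charter to the
# front of the queue, ahead of the pre-planned "search" and "checkout".
```

The point of the sketch is that the test plan is not fixed in advance: the queue of charters is itself an output of the testing.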

In particular I am keen to hear Dave’s ideas about how “safe to fail” experiments with complex systems can be applied in testing. Dave describes safe-fail experiments as follows.

“I can’t get it right in advance, so I need to experiment with different ways of approaching the journey that are safe-fail in nature. That is to say I can afford them to fail and critically, I plan them so that through that failure I learn more about the terrain through which I wish to travel.”

This cycle of probing and learning is obviously relevant to testing. You could say it is testing. I can’t wait for this class.

An added attraction that the Congress has for me is the class on the following day, led by Fiona Charles and Anna Royzman, “Strategic Leadership for Testers”. Test management is changing. Treating the job as a matter of managing the testing process simply won’t cut it in future. Both in testing and software development we’ve been poor at thinking strategically, and that is one of Fiona’s strengths. That, combined with her incisive style and wit, makes this a class I really don’t want to miss while I’m in New York. If you’re going to be there let me know. I look forward to seeing you.

Why ISO 29119 is a flawed quality standard

This article originally appeared in the Fall 2015 edition of Better Software magazine.

In August 2014, I gave a talk attacking ISO 29119 at the Association for Software Testing’s conference in New York. That gave me the reputation for being opposed to standards in general — and testing standards in particular. I do approve of standards, and I believe it’s possible that we might have a worthwhile standard for testing. However, it won’t be the fundamentally flawed ISO 29119.

Technical standards that make life easier for companies and consumers are a great idea. The benefit of standards is that they offer protection to vulnerable consumers or help practitioners behave well and achieve better outcomes. The trouble is that even if ISO 29119 aspires to do these things, it doesn’t.

Principles, standards, and rules

The International Organization for Standardization (ISO) defines a standard as “a document that provides requirements, specifications, guidelines or characteristics that can be used consistently to ensure that materials, products, processes and services are fit for their purpose.”

It might be possible to derive a useful software testing standard that fits this definition, but only if it focuses on guidelines rather than requirements, specifications, or characteristics. According to ISO’s definition, a standard doesn’t have to be all of those things. A testing standard framed instead as high-level guidelines would be consistent with the widespread view among regulatory theorists that standards are conceptually like high-level principles. Rules, in contrast, are detailed and specific (see Frederick Schauer’s “The Convergence of Rules and Standards”: PDF opens in new tab). One of ISO 29119’s fundamental problems is that it is pitched at a level consistent with rules, which will undoubtedly tempt people to treat it as a set of fixed rules.

Principles focus on outcomes rather than detailed processes or specific rules. This is how many professional bodies have defined standards, and they often use the words principles and standards interchangeably. Others favor a more rules-based approach. If you adopt a detailed, rules-based approach there is a danger of painting yourself into a corner; you have to try to specify exactly what is compliant and what is not. This creates huge opportunities for people to game the system, demonstrating creative compliance as they observe the letter of the law while trashing the underlying quality principles (see John Braithwaite’s “Rules and Principles: A Theory of Legal Certainty”). Whether one follows a principles-based or a rules-based approach, regulators, lawyers, auditors, and investigators are likely to assume that standards define what is acceptable.

As a result, there is a real danger that ISO 29119 could be viewed as the default set of rules for responsible software testing. People without direct experience in development or testing look for some form of reassurance about what constitutes responsible practice. They are likely to take ISO 29119 at face value as a definitive testing standard. The investigation into the HealthCare.gov website problems showed what can happen.

In its March 2015 report (PDF, opens in new tab) on the website’s problems, the US Government Accountability Office checked the HealthCare.gov project for compliance with the IEEE 829 test documentation standard. The agency didn’t know anything about testing; it just wanted a benchmark. IEEE 829 was last revised in 2008, and by the standard’s own admission, the contents of standards more than five years old “do not wholly reflect the present state of the art”. Few testers would disagree that IEEE 829 is now hopelessly out of date.

IEEE 829’s obsolescence threshold: a document is considered obsolescent when it is more than five years old.

The obsolescence threshold for ISO 29119 has increased from five to ten years, presumably reflecting the lengthy process of creating and updating such cumbersome documents rather than the realities of testing. We surely don’t want regulators checking testing for compliance against a detailed, outdated standard they don’t understand.

Scary lessons from the social sciences

If we step away from ISO 29119, and from software development, we can learn some thought-provoking lessons from the social sciences.

Prescriptive standards don’t recognize how people apply knowledge in demanding jobs like testing. Scientist Michael Polanyi and sociologist Harry Collins have offered valuable insights into tacit knowledge, which is knowledge we possess and use but cannot articulate. Polanyi first introduced the concept, and Collins developed the idea, arguing that much valuable knowledge is cultural and will vary between different contexts and countries. Defining a detailed process as a standard for all testing excludes vital knowledge; people will respond by concentrating on the means, not the ends.

Donald Schön, a noted expert on how professionals learn and work, offered a related argument with “reflection in action” (see Willemien Visser’s article: PDF opens in new tab). Schön argued that creative professionals, such as software designers or architects, have an iterative approach to developing ideas—much of their knowledge is understood without being expressed. In other words, they can’t turn all their knowledge into an explicit, written process. Instead, to gain access to what they know, they have to perform the creative act so that they can learn, reflect on what they’ve learned, and then apply this new knowledge. Following a detailed, prescriptive process stifles learning and innovation. This applies to all software development—both agile and traditional methods.

In 1914, Thorstein Veblen identified the problem of trained incapacity: people who are trained in specific skills can lack the ability to adapt. A response that worked in the past gets applied regardless, ever after.

Young woman or old woman? Means or ends? We can focus on only one at a time.

Kenneth Burke built upon Veblen’s work, arguing that trained incapacity means one’s abilities become blindnesses. People can focus on the means or the ends, not both; their specific training makes them focus on the means. They don’t even see what they’re missing. As Burke put it, “a way of seeing is also a way of not seeing; a focus upon object A involves a neglect of object B”. This leads to goal displacement, and the dangers for software testing are obvious.

The problem of goal displacement was recognized before software development was even in its infancy. When humans specialize in organizations, they have a predictable tendency to see their particular skill as a hammer and every problem as a nail. Worse, they see their role as hitting the nail rather than building a product. Give test managers a detailed standard, and they’ll start to see the job as following the standard, not testing.

In the 1990s, British academic David Wastell studied software development shops that used structured methods, the dominant development technique at the time. Wastell found that developers used these highly detailed and prescriptive methods in exactly the same way that infants use teddy bears and security blankets: to give them a sense of comfort and help them deal with stress. In other words, a developer’s mindset betrayed that the method wasn’t a way to build better software but rather a defense mechanism to alleviate stress and anxiety.

Wastell could find no empirical evidence, either from his own research at these companies or from a survey of the findings of other experts, that structured methods worked. In fact, the resulting systems were no better than the old ones, and they took much more time and money to develop. Managers became hooked on the technique (the standard) while losing sight of the true goal. Wastell concluded the following:

Methodology becomes a fetish, a procedure used with pathological rigidity for its own sake, not as a means to an end. Used in this way, methodology provides a relief against anxiety; it insulates the practitioner from the risks and uncertainties of real engagement with people and problems.

Developers were delivering poorer results but defining that as the professional standard. Techniques that help managers cope with stress and anxiety but give an illusory, reassuring sense of control harm the end product. Developers and testers cope by focusing on technique, mastery of tools, or compliance with standards. In doing so they can feel that they are doing a good job, so long as they don’t think about whether they are really working toward the true ends of the organization or the needs of the customer.

Standards must be fit for their purpose

Is all this relevant to ISO 29119? We’re still trying to do a difficult, stressful job, and in my experience, people will cling to prescriptive processes and standards that give the illusion of being in control. Standards have credibility and huge influence simply from their status as standards. If we must have standards, they should be relevant, credible, and framed in a way that is helpful to practitioners. Crucially, they must not mislead stakeholders and regulators who don’t understand testing but who wield great influence and power.

The level of detail in ISO 29119 is a real concern. Any testing standard should be in the style favored by organizations like the Institute of Internal Auditors (IIA), whose principles-based professional standards cover the entire range of internal auditing yet are only one-tenth as long as the three completed parts of ISO 29119. The IIA’s standards are light on detail but far more demanding in the outcomes they require.

Standards must be clear about the purpose they serve if we are to ensure testing is fit for its purpose, to hark back to ISO’s definition of a standard. In my opinion, this is where ISO 29119 falls down. The standard does not clarify the purpose of testing, only the mechanism—and that mechanism focuses on documentation, not true testing. It is this lack of purpose, the why, that leads to teams concentrating on standards compliance rather than delivering valuable information to stakeholders. This is a costly mistake. Standards should be clear about the outcomes and leave the means to the judgment of practitioners.

A good example of this problem is ISO 29119’s test completion report, which is defined simply as a summary of the testing that was performed. The standard offers examples for traditional and agile projects. Both focus on the format, not the substance of the report. The examples give some metrics without context or explanation and provide no information or insight that would help stakeholders understand the product and the risk and make better decisions. Testers could comply with the standard without doing anything useful. In contrast, the IIA’s standards say audit reports must be “accurate, objective, clear, concise, constructive, complete, and timely.” Each of these criteria is defined briefly in a way that makes the standard far more demanding and useful than ISO 29119, in far less space.

It’s no good saying that ISO 29119 can be used sensibly and doesn’t have to be abused. People are fallible and will misuse the standard. If we deny that fallibility, we deny the experience of software development, testing, and, indeed, human nature. As Jerry Weinberg said (in “The Secrets of Consulting”), “no matter how it looks at first, it’s always a people problem”. Any prescriptive standard that focuses on compliance with highly detailed processes is doomed. Maybe you can buck the system, but you can’t buck human nature.

David Graeber’s “The Utopia of Rules: On Technology, Stupidity and the Secret Joys of Bureaucracy”

When I gave my talk at CAST 2014 in New York, “Standards – promoting quality or restricting competition?”, I was concentrating on the economic aspects of standards. They are often valuable, but they can be damaging and restrict competition if they are misused. A few months later I bought “The Utopia of Rules: On Technology, Stupidity, and the Secret Joys of Bureaucracy” by David Graeber, Professor of Anthropology at the London School of Economics. I was familiar with Graeber as a challenging and insightful writer, and I drew on his work when I wrote “Testing: valuable or bullshit?”. The Utopia of Rules also inspired the blog article I wrote recently, “Frozen in time – grammar and testing standards”, in which I discussed the similarity between grammar textbooks and standards, both of which codify old usages and practices that no longer match the modern world.

What I hadn’t expected from The Utopia of Rules was how strongly it would support the arguments I made at CAST.

Certification and credentialism

Graeber makes the same argument I deployed against certification: it is increasingly being used to enrich special interests without benefiting society. On page 23 Graeber writes:

Almost every endeavor that used to be considered an art (best learned through doing) now requires formal professional training and a certificate of completion… In some cases, these new training requirements can only be described as outright scams, as when lenders, and those prepared to set up the training programs, jointly lobby the government to insist that, say, all pharmacists be henceforth required to pass some additional qualifying examination, forcing thousands already practicing the profession into night school, which these pharmacists know many will only be able to afford with the help of high-interest student loans. By doing this, lenders are in effect legislating themselves a cut of most pharmacists’ subsequent incomes.

To be clear, my stance on ISTQB training is that it educates testers in a legitimate, though very limited, vision of testing. My objection is to any marketing of the qualification as a certification of testing ability, rather than confirmation that the tester has passed an exam associated with a particular training course. I object even more strongly to any argument that possession of the certificate should be a requirement for employment, or for contracting out testing services. It is reasonable to talk of scams when the ability of good testers to earn a living is damaged.

What is the point of it all?

Graeber has interesting insights into how bureaucrats can be vague about the values of the bureaucracy: why does the organisation exist? Bureaucrats focus on efficient execution of rational processes, but what is the point of it all? Often the means become the ends: efficiency is an end in itself.

I didn’t argue that point at CAST, but I have done so many times in other talks and articles (e.g. “Teddy bear methods”). If people are doing a difficult, stressful job and you give them prescriptive methods, processes or standards then they will focus on ticking their way down the list. The end towards which they are working becomes compliance with the process, rather than helping the organisation reach its goal. They see their job as producing the outputs from the process, rather than the outcomes the stakeholders want. I gave a talk in London in June 2015 to the British Computer Society’s Special Interest Group in Software Testing in which I argued that testing lacks guiding principles (PDF, opens in a new tab) and that ISO 29119 in particular does not offer clear guidance about the purpose of testing.

In a related argument Graeber makes a point that will be familiar to those who have criticised the misuse of testing metrics.

…from inside the system, the algorithms and mathematical formulae by which the world comes to be assessed become, ultimately, not just measures of value, but the source of value itself.

Rent extraction

The most controversial part of my CAST talk was my argument that the pressure to adopt testing standards was entirely consistent with rent seeking in economic theory. Rent seeking, or rent extraction, is what people do when they exploit failings in the market, or rig the market for their own benefit by lobbying for regulation that happens to benefit them. Instead of creating wealth, they take it from other people in a way that is legal, but which is detrimental to the economy, and society, as a whole.

This argument riled some people who took it as a personal attack on their integrity. I’m not going to dwell on that point. I meant no personal slur. Rent seeking is just a feature of modern economies. Saying so is merely being realistic. David Graeber argued the point even more strongly.

The process of financialization has meant that an ever-increasing proportion of corporate profits come in the form of rent extraction of one sort or another. Since this is ultimately little more than legalized extortion, it is accompanied by ever-increasing accumulation of rules and regulations… At the same time, some of the profits from rent extraction are recycled to select portions of the professional classes, or to create new cadres of paper-pushing corporate bureaucrats. This helps a phenomenon I have written about elsewhere: the continual growth, in recent decades, of apparently meaningless, make-work, “bullshit jobs” — strategic vision coordinators, human resources consultants, legal analysts, and the like — despite the fact that even those who hold such positions are half the time secretly convinced they contribute nothing to the enterprise.

In 2014 I wrote about “bullshit jobs”, prompted partly by one of Graeber’s articles. It’s an important point. It is vital that testers define their job so that it offers real value, and that they are not merely bullshit functionaries of the corporate bureaucracy.

Utopian bureaucracies

I have believed for a long time that adopting highly prescriptive methods or standards for software development and testing places unfair pressure on people, who are set up to fail. Graeber makes exactly the same point.

Bureaucracies public and private appear — for whatever historical reasons — to be organized in such a way as to guarantee that a significant proportion of actors will not be able to perform their tasks as expected. It’s in this sense that I’ve said one can fairly say that bureaucracies are utopian forms of organization. After all, is this not what we always say of utopians: that they have a naïve faith in the perfectibility of human nature and refuse to deal with humans as they actually are? Which is, are we not also told, what leads them to set impossible standards and then blame the individuals for not living up to them? But in fact all bureaucracies do this, insofar as they set demands they insist are reasonable, and then, on discovering that they are not reasonable (since a significant number of people will always be unable to perform as expected), conclude that the problem is not with the demands themselves but with the individual inadequacy of each particular human being who fails to live up to them.

Testing standards such as ISO 29119, and its predecessor IEEE 829, don’t reflect what developers and testers do, or rather should be doing. They are at odds with the way people think and work in organisations. These standards attempt to represent a highly complex, sometimes chaotic, process in a defined, repeatable model. The end product is usually of dubious quality, late and over budget. Any review of the development will find constant deviations from the standard. The suppliers, and defenders, of the standard can then breathe a sigh of relief. The sacred standard was not followed. It was the team’s fault. If only they’d done it by the book! The possibility that the developers’ and testers’ apparent sins were the only reason anything was produced at all is never considered. This is a dreadful way to treat people, but in many organisations it has been normal for several decades.

Loss of communication

All of the previous arguments by Graeber were entirely consistent with my own thoughts about how corporate bureaucracies operate. It was fascinating to see an anthropologist’s perspective, but it didn’t teach me anything really new about how testers work in corporations. However, later in the book Graeber developed two arguments that gave me new insights.

Understanding what is happening in a complex social situation needs effective two-way communication. This requires effort, what Graeber calls “interpretive labor”. The greater the degree of compulsion, and the greater the bureaucratic regime of rules and forms, the less need there is for such two-way communication. Those who can simply issue orders that must be obeyed don’t have to take the trouble to understand the complexities of the situation they’re managing.

…within relations of domination, it is generally the subordinates who are effectively relegated the work of understanding how the social relations in question really work. … It’s those who do not have the power to hire and fire who are left with the work of figuring out what actually did go wrong so as to make sure it doesn’t happen again.

This ties in with the previous argument about utopian bureaucracies. If you impose an inappropriate standard then poor results will be attributed to the inevitable failure to comply. There is no need for senior managers to understand more, and no need to listen to the complaints, the “excuses”, of the people who do understand what is happening. Interestingly, Graeber’s argument about interpretive labor is consistent with regulatory theory. Good regulation of complex situations requires ongoing communication between the regulator and the regulated. I explained this in the talk on testing principles I mentioned above (slides 38 and 39).

Fear of play

My second new insight from Graeber arrived when he discussed the nature of play and how it relates to bureaucracies. Anthropologists try to maintain a distinction between games and play, a distinction that is easier to maintain in English than in languages like French and German, which use the same word for both. A game has boundaries, set rules and a predetermined conclusion. Play is more free-form and creative; novelties and surprising results emerge from the act of playing. It is a random, unpredictable and potentially destructive activity. Graeber finishes his discussion of play and games with a striking observation:

What ultimately lies behind the appeal of bureaucracy is fear of play.

Put simply, and rather simplistically, Graeber means that we use bureaucracy to escape the terror of chaotic reality, to bring a semblance (an illusion?) of control to the uncontrollable.

This gave me a tantalising new insight into the reasons people build bureaucratic regimes in organisations. It sent me off into a whole new field of reading on the anthropology of games and play, which has fascinating implications for the debate about standards and testing. We shy away from play, but it is through play that we learn. I don’t have time now to do the topic justice, and it’s much too big and important a subject to be tacked on to the end of this article, but I will return to it. It is yet another example of the way anthropology can help us understand what we are doing as testers. As a starting point I can heartily recommend David Graeber’s book, “The Utopia of Rules”.

Frozen in time – grammar and testing standards

This recent tweet by Tyler Hayes caught my eye. “If you build software you’re an anthropologist whether you like it or not.”

It’s an interesting point, and it’s relevant on more than one level. By and large software is developed by people and for people. That is a statement of the obvious, but developers and testers have generally been reluctant to take on board the full implications. This isn’t a simple point about usability. The software we build is shaped by many assumptions about the users, and how they live and work. In turn, the software can reinforce existing structures and practices. Testers should think about these issues if they’re to provide useful findings to the people who matter. You can’t learn everything you need to know from a requirements specification. This takes us deep into anthropological territory.

What is anthropology?

Social anthropology is defined by University College London as follows.

Social Anthropology is the comparative study of the ways in which people live in different social and cultural settings across the globe. Societies vary enormously in how they organise themselves, the cultural practices in which they engage, as well as their religious, political and economic arrangements.

We build software in a social, economic and cultural context that is shaped by myriad factors, which aren’t necessarily conducive to good software, or a happy experience for the developers and testers, never mind the users. I’ve touched on this before in “Teddy Bear Methods”.

There is much that we can learn from anthropology, and not just to help us understand what we see when we look out at the users and the wider world. I’ve long thought that the software development and testing community would make a fascinating subject for anthropologists.

Bureaucracy, grammar and deference to authority

I recently read “The Utopia of Rules – On Technology, Stupidity, and the Secret Joys of Bureaucracy” by the anthropologist David Graeber.

Graeber has many fascinating insights and arguments about how organisations work, and why people are drawn to bureaucracy. One of his arguments is that regulation is imposed and formalised to try to remove arbitrary, random behaviour in organisations. That’s a huge simplification, but there isn’t room here to do Graeber’s argument justice. One passage in particular caught my eye.

People do not invent languages by writing grammars, they write grammars — at least, the first grammars to be written for any given language — by observing the tacit, largely unconscious, rules that people seem to be applying when they speak. Yet once a book exists, and especially once it is employed in schoolrooms, people feel that the rules are not just descriptions of how people do talk, but prescriptions for how they should talk.

It’s easy to observe this phenomenon in places where grammars were only written recently. In many places in the world, the first grammars and dictionaries were created by Christian missionaries in the nineteenth or even twentieth century, intent on translating the Bible and other sacred texts into what had been unwritten languages. For instance, the first grammar for Malagasy, the language spoken in Madagascar, was written in the 1810s and ’20s. Of course, language is changing all the time, so the Malagasy spoken language — even its grammar — is in many ways quite different than it was two hundred years ago. However, since everyone learns the grammar in school, if you point this out, people will automatically say that speakers nowadays are simply making mistakes, not following the rules correctly. It never seems to occur to anyone — until you point it out — that had the missionaries come and written their books two hundred years later, current usages would be considered the only correct ones, and anyone speaking as they had two hundred years ago would themselves be assumed to be in error.

In fact, I found this attitude made it extremely difficult to learn how to speak colloquial Malagasy. Even when I hired native speakers, say, students at the university, to give me lessons, they would teach me how to speak nineteenth-century Malagasy as it was taught in school. As my proficiency improved, I began noticing that the way they talked to each other was nothing like the way they were teaching me to speak. But when I asked them about grammatical forms they used that weren’t in the books, they’d just shrug them off, and say, “Oh, that’s just slang, don’t say that.”

…The Malagasy attitudes towards rules of grammar clearly have… everything to do with a distaste for arbitrariness itself — a distaste which leads to an unthinking acceptance of authority in its most formal, institutional form.

Searching for the “correct” way to develop software

Graeber’s phrase “distaste for arbitrariness itself” reminded me of the history of software development. In the 1960s and 70s, academics and theorists agonised over the nature of development, trying to discover and articulate what it should be. Their approach was fundamentally mistaken. There are dreadful ways, and there are better ways, to develop software, but there is no natural, correct way that results in perfect software. The researchers assumed that there was, and went hunting for it. Instead of seeking understanding they carried their assumptions about what the answer might be into their studies and went looking for confirmation.

They were trying to understand how the organisational machine worked and looked for mechanical processes. I use the word “machine” carefully, not as a casual metaphor. There really was an assumption that organisations were, in effect, machines. They were regarded as first order cybernetic entities whose behaviour would not vary depending on whether they were being observed. To a former auditor like myself this is a ludicrous assumption. The act of auditing an organisation changes the way that people behave. Even the knowledge that an audit may occur will shape behaviour, and not necessarily for the better (see my article “Cynefin, testing and auditing“). You cannot do the job well without understanding that. Second order cybernetics does recognise this crucial problem and treats observers as participants in the system.

So linear, sequential development made sense. The different phases passing outputs along the production line fitted their conception of the organisation as a machine. Iterative, incremental development looked messy and immature; it was just wrong as far as the researchers were concerned. Feeling one’s way to a solution seemed random, unsystematic – arbitrary.

Development is a difficult and complex job; people will tend to follow methods that make the job feel easier. If managers are struggling with the complexities of managing large projects they are more likely to choose linear, sequential methods that make the job of management easier, or at least less stressful. So when researchers saw development being carried out that way they were observing human behaviour, not a machine operating.

Doubts about this approach were quashed by pointing out that if organisations weren’t quite the neat machines they should be, the rapid advance in the use of computers would soon fix that. This argument looks suspiciously circular because the conclusion that organisations would in future be fully machine-like rests on the unproven premise that software development is a mechanical process, not subject to human variability when performed properly.

Eliminating “arbitrariness” and ignoring the human element

This might all have been no more than an interesting academic sideline, but it fed back into software development. By the 1970s, when these studies into the nature of development were being carried out, organisations were moving towards increasingly formalised development methods, and the pressure to adopt them was growing. Not only were formal methods attractive to managers, they also provided a competitive advantage. ISO certification and CMMI accreditation were increasingly seen as a way to demonstrate that organisations produced high quality software. The evidence may have been weak, but it seemed a plausible claim. These initiatives required formal processes. The sellers of formal methods were happy to look for and cite any intellectual justification for their products. So formal linear methods were underpinned by academic work that assumed formal linear methods were correct. This was the way that responsible, professional software development was performed. ISO standards were built on this assumption.

If you are trying to define the nature of development you must acknowledge that it is a human activity, carried out by and for humans. These studies about the nature of development were essentially anthropological exercises, but the researchers assumed they were observing and taking apart a machine.

As with the missionaries who were codifying grammar, the point in time when these researchers were working shaped the result. If they had carried out their studies earlier in the history of software development they might have struggled to find credible examples of formalised, linear development. In the 1950s software development was an esoteric activity in which the developers could call the shots. Twenty years later it was part of the corporate bureaucracy and iterative, incremental development was sidelined. If the studies had been carried out a few decades further on it would have been impossible to ignore Agile.

As it transpired, formal methods, CMM/CMMI and the first ISO standards concerning development and testing were all creatures of that era when organisations and their activities were seriously regarded as mechanical. Like the early Malagasy grammar books they codified and fossilised a particular, flawed approach at a particular time for an activity that was changing rapidly. ISO 29119 is merely an updated version of that dated approach to testing. It is rooted in a yearning for bureaucratic certainty, a reluctance to accept that ultimately good testing is dependent not on documentation, but on that most irrational, variable and unpredictable of creatures – the human who is working in a culture shaped by humans. Anthropology has much to teach us.

Further reading

That is the end of the essay, but there is a vast amount of material you could read about attempts to understand and define the nature of software development and of organisations. Here is a small selection.

Brian Fitzgerald has written some very interesting articles about the history of development. I recommend in particular “The systems development dilemma: whether to adopt formalised systems development methodologies or not?” (PDF, opens in new tab).

Agneta Olerup wrote this rather heavyweight study of what she calls the Langeforsian approach to information systems design. Börje Langefors was a highly influential advocate of the mechanical, scientific approach to software development. Langefors’ Wikipedia entry describes him as “one of those who made systems development a science”.

This paper gives a good, readable introduction to first and second order cybernetics (PDF, opens in new tab), including a useful warning about the distinction between models and the entities that they attempt to represent.

All our knowledge of systems is mediated by our simplified representations—or models—of them, which necessarily ignore those aspects of the system which are irrelevant to the purposes for which the model is constructed. Thus the properties of the systems themselves must be distinguished from those of their models, which depend on us as their creators. An engineer working with a mechanical system, on the other hand, almost always knows its internal structure and behavior to a high degree of accuracy, and therefore tends to de-emphasize the system/model distinction, acting as if the model is the system.

Moreover, such an engineer, scientist, or “first-order” cyberneticist, will study a system as if it were a passive, objectively given “thing”, that can be freely observed, manipulated, and taken apart. A second-order cyberneticist working with an organism or social system, on the other hand, recognizes that system as an agent in its own right, interacting with another agent, the observer.

Finally, I recommend a fascinating article in the IEEE’s Computer magazine by Craig Larman and Victor Basili, “Iterative and incremental development: a brief history” (PDF, opens in new tab). Larman and Basili argue that iterative and incremental development is not a modern practice, but has been carried out since the 1950s, though they do acknowledge that it was subordinate to the linear Waterfall in the 1970s and 80s. There is a particularly interesting contribution from Gerald Weinberg, a personal communication to the authors, in which he describes how he and his colleagues developed software in the 1950s. The techniques they followed were “indistinguishable from XP”.


This page will carry slide decks for presentations I’ve given at conferences – as I get round to adding them. All the presentations are in PDF format and open in new tabs.

Farewell to “pass or fail”?

I gave this presentation, “Farewell to ‘pass or fail’” at Expo:QA 2014 in Madrid in May 2014. It shows how auditing and software testing have faced similar challenges and how each profession can learn from the other.

Standards – promoting quality or restricting competition?

This presentation is the one that attracted so much attention and controversy at CAST 2014 in New York. It is a critique of software testing standards from the perspective of economics.

A modest proposal for improving the efficiency of testing services

I would like to offer for your perusal a modest proposal for improving the efficiency of testing services whilst producing great benefits for clients, suppliers and testers (with a nod to Dr Jonathan Swift).

Lately I have been reading some fascinating material about the creative process, the ways that we direct our attention, and how these are linked. Whilst cooking dinner one evening I had a sudden insight into how I could launch an exciting and innovative testing service.

It was no accident that I had my eureka moment when I was doing something entirely unrelated to testing. Psychologists recognise that the creative process starts with two stages. Firstly comes the preparation stage. We familiarise ourselves with a cognitively demanding challenge. We then have to step away from the problem and perform some activity that doesn’t require much mental effort. This is the incubation stage, which gives our brain the opportunity to churn away, making connections between the problem, our stored knowledge and past experience. Crucially, it gives us the chance to envisage future possibilities. Suddenly, and without conscious effort, the answer can come, as it did to Archimedes whose original eureka moment arrived in the bath when he realised that the volume of irregular objects could be calculated by the volume of water that they displaced.

My modest proposal is to exploit this eureka principle in an entirely new way for testing. Traditionally, testers have followed the two stage approach to creativity. We have familiarised ourselves with the client, the business problem and the proposed application. We have then moved on to the vital incubation stage of mindless activity. This has traditionally been known as “writing the detailed test plans” and “churning out the test scripts”.

Now the trouble with these documents hasn’t been their negligible value for the actual testing. That’s the whole point of the incubation stage. We have to do something unrelated and mindless so that our brains can come up with creative ideas for testing. No, the real problem with the traditional approach is that there is no direct valuable output at all. The documents merely gather dust. They haven’t even been used to feed the heating.

I therefore intend to launch a start-up testing services company called CleanTest. CleanTest’s testers will familiarise themselves with the client and the application in the preparation stage. Then, for the incubation stage, they will move on to cleaning the data centre, the development shop and the toilets, whilst the creative ideas formulate. Once their creative ideas for testing have formed they will execute the testing.

Everyone will be a winner. The client will have testing performed to at least the same standard as before. They will also have clean offices and be able to save money by getting rid of their existing cleaning contractor. The testers will have increased job satisfaction from seeing shiny clean premises, instead of mouldering shelfware that no-one will ever read. And I will make a pile of money.

Of course it is vital for the credibility of CleanTest that the company is ISO compliant. We will therefore comply with the ISO 14644 cleanrooms standard, and ISO 12625 toilet paper standard. Compliance with two ISO standards will make us twice as responsible as those fly-by-night competitors who are compliant only with ISO 29119.

Anyone who wishes to join with me and invest in this exciting venture is welcome to get in touch. I also have some exciting opportunities that trusted contacts in Nigeria have emailed to me.

A more optimistic conclusion?

This is the final post in a series about how and why so many corporations became embroiled in a bureaucratic mess in which social and political skills are more important than competence.

In my first post “Sick bureaucracies and mere technicians” I talked about Edward Giblin’s analysis back in the early 1980s of the way senior managers had become detached from the real work of many corporations. Not only did this problem persist, but it became far worse.

In my second post, “Digital Taylorism & the drive for standardisation“, I explained how globalisation and technical advances gave impetus to digital Taylorism and the mass standardisation of knowledge work. It was widely recognised that Taylorism damaged creativity, a particularly serious concern with knowledge work. However, that concern was largely ignored, swamped by the trends I discussed in my third post, “Permission to think“.

In this post I will try to offer a more constructive conclusion after three essays of unremitting bleakness!

Deskilling – a chilling future for testing?

When it comes to standardisation of testing the “talented managers” (see “Permission to think“) will tell themselves that they are looking at a bigger picture than the awkward squad (ok, I mean context driven testers here) who complain that this is disastrous for software testing.

Many large corporations are hooked into a paradigm that requires them to simultaneously improve quality and reduce costs, and to do so by de-skilling jobs below the elite level. Of course other tactics are deployed, but deskilling is what concerns me here. The underlying assumption is that standardisation and detailed processes will not only improve quality, but also reduce costs, either directly by outsourcing, or indirectly by permitting benchmarking against outsourcing suppliers.

In the case of testing that doesn’t work. You can do it, but at the expense of the quality of testing. Testing is either a thinking, reflective activity, or it is done badly. However, testing is a mere pawn; it’s very difficult for corporate bureaucrats to make an exception for testing. If they were to do that it would undermine the whole paradigm. If testing is exempt then how could decision makers hold the line when faced with special pleading on behalf of other roles they don’t understand? No, if the quality of testing has to be sacrificed then so be it.

The drive for higher quality at reduced cost is so powerful that its underlying assumption is unchallengeable. Standardisation produces simplicity which allows higher quality and lower costs. That is corporate dogma, and anyone who wants to take a more nuanced approach is in danger of being branded a heretic and denied a hearing. It is easier to fudge the issue and ignore evidence that applying this strategy to testing increases costs and reduces quality.

Small is beautiful

Perhaps my whole story has been unnecessarily bleak. I have been talking about corporations and organisations. I really mean large bodies. The gloomy, even chilling, picture that I’ve been painting doesn’t apply to smaller, more nimble firms. Start-ups, technology focused firms, and specialist testing services providers (or the good ones at least) have a clearer idea of what the company is trying to do. They’re far less likely to sink into a bureaucratic swamp. For one thing it would kill them quickly. Also, to hark back to my first post in this series, “Sick bureaucracies and mere technicians“, such firms are more likely to be task dependent, i.e. the more senior people will probably have a deep understanding of the core business. It is their job to apply that knowledge in the interests of the company, rather than merely to run the corporate bureaucracy.

My advice to testers who want to do good work would be to head for the smaller outfits, the task dependent companies. As a heuristic I’d want to work for a company that was small enough for me to speak to anyone, at any time, who had the power to change things. Then, I’d know that if I saw possible improvements I’d have the chance to sell my ideas to someone who could make a difference. One of the most dispiriting things I ever heard was a senior manager in the global corporation where I worked saying “you’re quite right – but you’d be appalled at how high you could go and find people who’d agree with you, but say that they couldn’t change anything”.

What’s to be done?

Nevertheless, many good testers are working for big corporations, and struggling to make things better. They’re not all going to walk out the door, and they shouldn’t just give up in despair. What can they do? Well, plain speaking will have a limited impact – except on their careers. Senior managers don’t like being told “we’re doing rubbish work and you’re acting like an idiot if you refuse to admit that”.

Corporate managers are under pressure to make the bureaucracy more efficient by standardising working practices and processes. In order to do so they have to redefine what constitutes simple, routine work. Testers have to understand that pressure and respond by lobbying to be allowed to carry out that redefinition themselves. Testing has to be defined by those who understand and respect it so that the thoughtful, reflective, non-routine elements are recognised. Testing must be defined in such a way that it can handle complex problems, and not just simple, ordered problems (see Cynefin).

That takes us back to the segmentation of knowledge workers described by Brown, Lauder and Ashton in The Global Auction (see my previous post “Permission to think“). The workforce is increasingly segmented into developers (those responsible for corporate development, not software developers!), who are given “permission to think”, demonstrators who apply processes, and drones who basically follow a script without being required to engage their brains. If testers have to follow a prescriptive, documentation driven standard like ISO 29119 they are implicitly assigned to the status of drones.

Testers must argue their case so they are represented in the class of developers who are allowed to shape the way the corporation works. The arguments are essentially the same as those that have been deployed against ISO 29119, and can be summed up in the phrase I used at the top: testing is either a thinking, reflective activity, or it is done badly. Testing is an activity that provides crucial information to the corporate elite, the “developers”. As such testers must be given the responsibility to think, or else senior management will be choking off the flow of vital information about applications and products.

That is a tough task, and I’m sceptical about the chances of testers persuading their corporations to buck a powerful trend. I doubt if many will be successful, but perhaps some brave, persuasive and persistent souls will succeed. They deserve respect and support from the rest of the testing profession.

If large corporations won’t admit their approach is damaging to testing then ultimately I fear that their in-house test teams are doomed. They will be sucked into a vicious circle of commoditised testing that will lead to the work being outsourced to cheaper suppliers. If you’re not doing anything worthwhile there is always someone who can do it cheaper. Iain McCowatt wrote a great blog about this.

Where might hope lie?

Perhaps outsourcing might offer some hope for testing after all. A major motive for adopting standards is to facilitate outsourcing. The service that is being outsourced is standard, neatly defined, and open to benchmarking. Suppliers who can demonstrate they comply with standards have a competitive advantage. That is one of the reasons ISO 29119 is so pernicious. Good testing suppliers will have to ignore that market and make it clear that they are not competing to provide cheaper drones, but highly capable, thinking consultants who can provide valuable insights about products and applications.

The more imaginative, context-driven (and smaller?) suppliers can certainly compete effectively in this fashion. After all they are following an approach that is both more efficient and more effective. Their focus is on testing rather than documentation and compliance with an irrelevant standard. However, I suspect that is exactly why many large corporations are suspicious of such an approach. The corporate bureaucrat is reassured by visible documents and compliance with an ISO standard.

A new framework?

Perhaps there is room for an alternative approach. I don’t mean an alternative standard, but a framework that shows how good context driven testing is responsible testing that can keep regulators happy. It could tie together the requirements of regulators, auditors and governance professionals with context driven techniques, perhaps a particular context driven approach. The framework could demonstrate links between governance needs and specific context driven techniques. This has been lurking at the back of my mind for a couple of years, but I haven’t yet committed serious effort to the idea. My reading and thinking around the subject of corporate bureaucracy for this series of blog posts has helped shape my understanding of why such an alternative framework might be needed, and why it might work.

An alternative framework in the form of a set of specific, practical, actionable guidelines would ironically be more consistent with ISO’s definition of a standard than ISO 29119 itself is.

A standard is a document that provides requirements, specifications, guidelines or characteristics that can be used consistently to ensure that materials, products, processes and services are fit for their purpose.

Taking the relevant parts of the definition, the framework would provide guidelines that can be used consistently to ensure that testing services are fit for their purpose.

Could this give corporations the quality of testing they require without having to abandon their worldview? Internal testers might still be defined as drones (with a few, senior testers allowed to be demonstrators). External testers can be treated as consultants and allowed to think.

When discussing ISO 29119, and the approach to testing that it embodies, we should always bear in mind that the standard does not exist to provide better testing. It was developed because it fits a corporate mindset that wants to see as many activities as possible defined as simple and routine. Testers who have a vision of better testing, and a better future for testing, have to understand that mindset and deal with it, rather than just kicking ISO 29119 for being a useless piece of verbiage. The standard really is useless, but perhaps we need a more sophisticated approach than just calling it like it is.