Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 2

This post is the second of two discussing Dave Snowden’s recent Cynefin masterclass at the Test Leadership Congress in New York. I wrote the series with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

In the first I gave an overview of Cynefin and explained why I think it is important, and how it can helpfully shape the way we look at the world and make sense of the problems we face. In this post I will look at some of the issues raised in Dave’s class and discuss their relevance to development and testing.

The dynamics between domains

Understanding that the boundaries between the different domains are fluid and permeable is crucial to understanding Cynefin. A vital lesson is that we don’t start in one domain and stay there; we can and should move between them. Even if we ignore that lesson, reality will drag us from one domain to another. Dave said “all the domains have value – it’s the ability to move between them that is key”.

The Cynefin dynamics are closely tied to the concept of constraints, which are so important to Cynefin that they act as differentiators between the domains. You could say that constraints define the domains.

Constraint is perhaps a slightly misleading word. In Cynefin terms it is not necessarily something that compels or prevents certain behaviour. That does apply to the Obvious domain, where the constraints are fixed and rigid. The constraints in the Complicated domain govern behaviour, and can be agreed by expert consensus. In the Complex domain the constraints enable action, rather than restricting it or compelling it. They are a starting point rather than an end. In Chaos there are no constraints.

Dave Snowden puts it as follows, differentiating rules and heuristics.

“Rules are governing constraints, they set limits to action, they contain all possible instances of action. In contrast heuristics are enabling constraints, they provide measurable guidance which can adapt to the unknowable unknowns.”

If we can change the constraints then we are moving from one domain to another. The most significant dynamic is the cycle between Complex and Complicated.

[Figure: Cynefin core dynamic - Complex to Complicated]

Crucially, we must recognise that if we are attempting something new that involves a significant amount of uncertainty, then we start in the Complex domain, exploring and discovering more about the problem. Once we have a better understanding and have found constraints that allow us to achieve repeatable outcomes, we have moved the problem to the Complicated domain, where we can manage it more easily and exploit our new knowledge. If our testing reveals that the constraints are not producing repeatable results then it’s important to get back into the Complex domain and carry out some more probing experiments.

This is not a one-off move. We have to keep cycling to ensure the solution remains relevant. The cadence, or natural flow, of the cycle will vary depending on the context. Different industries, or sectors, or applications will have different cadences. It could be days, or years, or anything in between. If, or rather when, our constraints fail to produce repeatable results we have to get back into the Complex domain.

This cycle between Complex and Complicated is key for software development in particular. Understanding this dynamic is essential in order to understand how Cynefin might be employed.

Setting up developments

As I said earlier, the parts of a software development project that will provide value are where we are doing something new, and that is where the risk also lies. Any significant and worthwhile development project will start in the Complex domain. The initial challenge is to learn enough to move it to Complicated. Dave explained it as follows in a talk at Agile India in 2015.

“As things are Complex we see patterns, patterns emerge. We stabilise the patterns. As we stabilise them we can actually shift them into the Complicated domain. So the basic principle of Complexity-based intervention is you start off with multiple, parallel, safe-to-fail experiments, which is why Scrum is not a true Complexity technique; it does one thing in a linear way. We call (these experiments) a pre-Scrum technique. You do smaller experiments faster in parallel… So you’re moving from the centre of the Complex domain into the boundary, once you’re in the boundary you use Scrum to move it across the boundary.”

Such a safe-to-fail experiment might be an XP pair programming team being assigned to knock up a small, quick prototype.

So the challenge in starting the move from Complex to Complicated is to come up with the ideas for safe-to-fail pre-Scrum experiments that would allow us to use Scrum effectively.

Dave outlined the criteria that suitable experiments should meet. There should be some way of knowing whether the experiment is succeeding and it must be possible to amplify (i.e. reinforce) signs of success. Similarly, there should be some way of knowing whether it is failing and of dampening, or reducing, the damaging impact of a failing experiment. Failure is not bad. In any useful set of safe-to-fail experiments some must fail if we are to learn anything worthwhile. The final criterion is that the experiment must be coherent. This idea of coherence requires more attention.
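The criteria above can be written down as a simple checklist. This is only a sketch in Python, not anything Dave presented; the class name, field names and labels are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Experiment:
    """A candidate safe-to-fail experiment (all names here are hypothetical)."""
    name: str
    can_detect_success: bool  # is there some way of knowing it is succeeding?
    can_amplify: bool         # can signs of success be reinforced?
    can_detect_failure: bool  # is there some way of knowing it is failing?
    can_dampen: bool          # can the damage from failure be reduced?
    is_coherent: bool         # does it pass the coherence tests?

    def unmet_criteria(self) -> List[str]:
        """Return labels for the criteria this experiment does not yet meet."""
        checks = {
            "detect success": self.can_detect_success,
            "amplify success": self.can_amplify,
            "detect failure": self.can_detect_failure,
            "dampen failure": self.can_dampen,
            "coherence": self.is_coherent,
        }
        return [label for label, ok in checks.items() if not ok]

    def is_safe_to_fail(self) -> bool:
        """An experiment qualifies only if every criterion is met."""
        return not self.unmet_criteria()


prototype = Experiment(
    name="pair-programmed prototype",
    can_detect_success=True,
    can_amplify=True,
    can_detect_failure=True,
    can_dampen=False,  # no way yet to limit the damage if it goes wrong
    is_coherent=True,
)
print(prototype.unmet_criteria())
```

The point of returning the unmet criteria, rather than a bare yes or no, is that the gaps tell you what to fix before the experiment is run.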

Dave Snowden explains the tests for coherence here. He isn’t entirely clear about how rigid these tests should be. Perhaps it’s more useful to regard them as heuristics than fixed rules, though the first two are of particular importance.

  • A coherent experiment, the ideas and assumptions behind it, should be compatible with natural science. That might seem like a rather banal statement, till you consider all the massive IT developments and change programmes that were launched in blissful ignorance of the fact that science could have predicted inevitable failure.
  • There should be some evidence from elsewhere to support the proposal. Replicating past cases is no guarantee of success, far from it, but it is a valid way to try and learn about the problem.
  • The proposal should fit where we are. It has to be consistent to some degree with what we have been doing. A leap into the unknown attempting something that is utterly unfamiliar is unlikely to gain any traction.
  • Can the proposal pass a series of “ritual dissent” challenges? These are a formalised way of identifying flaws and refining possible experiments.
  • Does the experiment reflect an unmet, unarticulated need that has been revealed by sense-making, by attempts to make sense of the problem?

The two latter criteria refer explicitly to Cynefin techniques. The final one, identifying unmet needs, assumes the use of Cognitive Edge’s SenseMaker. Remember Fred Brooks’ blunt statement about requirements? Clients do not know what they want. They cannot articulate their needs if they are asked directly. They cannot envisage what is possible. Dave Snowden takes that point further. If users can articulate their needs then you’re dealing with a commoditized product and the solution is unlikely to have great value. Real value lies in meeting needs that users are unaware of and that they cannot articulate. This has always been so, but in days of yore we could often get away with ignoring that problem. Most applications were in-house developments that either automated back-office functions or were built around business rules and clerical processes that served as an effective proxy for true requirements. The inadequacies of the old structured methods and traditional requirements gathering could be masked.

With the arrival of web development, and then especially with mobile technology this gulf between user needs and the ability of developers to grasp them became a problem that could be ignored only through wilful blindness, admittedly a trait that has never been in short supply in corporate life. The problem has been exacerbated by our historic willingness to confuse rigour with a heavily documented, top-down approach to software development. Sense-making entails capturing large numbers of user reports in order to discern patterns that can be exploited. This appears messy, random and unstructured to anyone immured in traditional ways of development. It might appear to lack rigour, but such an approach is in accord with messy, unpredictable reality. That means it offers a more rigorous and effective way of deriving requirements than we can get by pretending that every development belongs in the Obvious domain. A simple lesson I’ve had to learn and relearn over the years is that rigour and structure are not the same as heavy documentation, prescriptive methods and a linear, top-down approach to problem solving.

This all raises big questions for testers. How do we respond? How do we get involved in testing requirements that have been derived this way and indeed the resulting applications? Any response to those questions should take account of another theme that really struck me from Dave’s day in New York. That was the need for resilience.

Resilience

The crucial feature of complex adaptive systems is their unpredictability. Applications operating in such a space will inevitably be subject to problems and threats that we would never have predicted. Even where we can confidently predict the type of threat, the magnitude will remain uncertain. Failure is inevitable. What matters is how the application responds.

The need for resilience, with its linked themes of tolerance, diversity and redundancy, was a recurring message in Dave’s class. Resilience is not the same as robustness. The example that Dave gave was that a seawall is robust but a salt marsh is resilient. A seawall is a barrier to large waves and storms. It protects the harbour behind, but if it fails it does so catastrophically. A salt marsh protects inland areas by acting as a buffer, absorbing storm waves rather than repelling them. It might deteriorate over time but it won’t fail suddenly and disastrously.

An increasing challenge for testers will be to look for information about how systems fail, and test for resilience rather than robustness. Tolerance for failure becomes more important than a vain attempt to prevent failure. This tolerance often requires greater redundancy. Stripping out redundancy and maximizing the efficiency of systems has a downside, as I’ve discovered in my career. Greater efficiency can make applications brittle and inflexible. When problems hit they hit hard and recovery can be difficult.
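To make the seawall and salt marsh distinction concrete for software, here is a minimal Python sketch. The function names and the data sources are hypothetical, invented purely for illustration. The first function has a single barrier and fails outright when that barrier fails; the second uses redundant sources and degrades gradually.

```python
from typing import Callable, List


def robust_lookup(primary: Callable[[], str]) -> str:
    """Seawall style: one barrier. If it fails, the failure hits the caller in full."""
    return primary()


def resilient_lookup(sources: List[Callable[[], str]], default: str) -> str:
    """Salt marsh style: redundant sources absorb failures one after another."""
    for source in sources:
        try:
            return source()
        except Exception:
            # Absorb this failure and fall through to the next source.
            # A real system would also record the failure for later analysis.
            continue
    # Every source failed: degrade to a default rather than crash.
    return default


def failing_source() -> str:
    raise ConnectionError("primary unavailable")


def cached_source() -> str:
    return "cached value"


print(resilient_lookup([failing_source, cached_source], default="no data"))
```

A tester probing for resilience would be interested in exactly the paths the second function adds: what happens when the first source fails, when all of them fail, and whether the degraded answer is still useful to the user.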

[Image: “It could be worse. Not sure how, but it could be.”]

The six years I spent working as an IT auditor had a huge impact on my thinking. I learned that things would go wrong, that systems would fail, and that they’d do so in ways I couldn’t have envisaged. There is nothing like a spell working as an auditor to imbue one with a gloomy sense of realism about the possibility of perfection, or even adequacy. I ended up like the gloomy old pessimist Eeyore in Winnie the Pooh. When I returned to development work a friend once commented that she could always spot one of my designs. Like Eeyore I couldn’t be certain exactly how things would go wrong, I just knew they would and my experience had taught me where to be wary. I was destined to end up as a tester.

Liz Keogh, in this talk on Safe-to-Fail, makes a similar point.

“Testers are really, really good at spotting failure scenarios… they are awesomely imaginative at calamity… Devs are problem solvers. They spot patterns. Testers spot holes in patterns… I have a theory that other people who are in critical positions, like compliance and governance people are also really good at this”.

Testers should have the creativity to imagine how things might go wrong. In a Complex domain, working with applications that have been developed using Cynefin, this insight and imagination, the ability to spot potential holes, will be extremely valuable. Testers have to seize that opportunity to remain relevant.

There is an upside to redundancy. If there are different ways of achieving the same ends then that diversity will offer more scope for innovation, for users to learn about the application and how it could be adapted and exploited to do more than the developers had imagined. Again, this is an opportunity for testers. Stakeholders need to know about the application and what it can do. Telling them that the application complied with a set of requirements that might have been of dubious relevance and accuracy just doesn’t cut it.

Conclusion

Conclusion is probably the wrong word. Dave Snowden’s class opened my mind to a wide range of new ideas and avenues to explore. This was just the starting point. These two essays can’t go very far in telling you about Cynefin and how it might apply to software testing. All I can realistically do is make people curious to go and learn more for themselves, to explore in more depth. That is what I will be doing, and as a starter I will be in London at the end of June for the London Tester Gathering. I will be at the workshop “An Introduction to Complexity and Cynefin for Software Testers” being run by Martin Hynie and Ben Kelly, where I hope to discuss Cynefin with fellow testers and explorers.

If you are going to the CAST conference in Nashville in August you will have the chance to hear Dave Snowden giving a keynote speech. He really is worth hearing.

Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 1

This is part one of a two post series on Cynefin and software testing. I wrote it with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

Introduction

On May 2nd I attended Dave Snowden’s masterclass in New York, “A leader’s framework for decision making: managing complex projects using Cynefin”, at the Test Leadership Congress. For several years I have been following Dave’s work and I was keen to hear him speak in person. Dave is a gifted communicator, but he moves through his material fast, very fast. In a full day class he threw out a huge range of information, insights and arguments. I was writing frantically throughout, capturing key ideas and phrases I could research in detail later.

It was an extremely valuable day. All of it was relevant to software development, and therefore indirectly to testing. However, it would require a small book to do justice to Dave’s ideas. I will restrict myself to two posts in which I will concentrate on a few key themes that struck me as being particularly important to the testing community.

Our worldview matters

We need to understand how the world works or we will fail to understand the problems we face. We won’t recognise what success might look like, nor will we be able to anticipate unacceptable failure till we are beaten over the head, and we will select the wrong techniques to address problems.

“It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.”

Dave used a slide with this quote from Mark Twain. It’s an important point. Software development and testing has been plagued over the years by unquestioned assumptions and beliefs that we were paid well to take for granted, without asking awkward questions, but which just ain’t so. And they’ve got us into endless trouble.

A persistent, damaging feature of software development over the years has been the illusion that it is a neater, more orderly process than it really is. We craved certainty, fondly imagining that if we just put a bit more effort and expertise into the upfront analysis and requirements then good, experienced professionals could predictably develop high quality applications. It hardly ever panned out that way, and the cruel twist was that the people who finally managed to crank out something workable picked up the blame for the lack of perfection.

Fred Brooks made the point superbly in his classic paper, “No Silver Bullet”.

“The truth is, the client does not know what he wants. The client usually does not know what questions must be answered, and he has almost never thought of the problem in the detail necessary for specification. … in planning any software-design activity, it is necessary to allow for an extensive iteration between the client and the designer as part of the system definition.

…… it is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product.”

So iteration is required, but that doesn’t mean simply taking a linear process and repeating it. Understanding and applying Cynefin does not mean tackling problems in familiar ways but with a new vocabulary. It means thinking about the world in a different way, drawing on lessons from complexity science, cognitive neuroscience and biological anthropology.

Cynefin and ISO 29119

Cynefin is not based on successful individual cases, or on ideology, or on wishful thinking. Methods that are rooted in successful cases are suspect because of survivorship bias (how many failed projects did the same thing?), and because people do not remember clearly and accurately what they did after the event; they reinterpret their actions depending on the outcome. Cynefin is rooted in science and the way things are, the way systems behave, and the way that people behave. Developing software is an activity carried out by humans, for humans, mostly in social organisations. If we follow methods that are not rooted in reality and in science, and that don’t allow for the way people behave, then we will fail.

Dave Snowden often uses the philosophical phrase “a priori”, usually in the sense of saying that something is wrong a priori. A priori knowledge is based on theoretical deduction, or on mathematics, or the logic of the language in which the proposition is stated. We can say that certain things are true or false a priori, without having to refer to experience. Knowledge based on experience is a posteriori.

The distinction is important in the debate over the software testing standard ISO 29119. The ISO standards lobby has not attempted to defend 29119 on either a priori or a posteriori grounds. The standard has its roots in linear, document-driven development methods that were conspicuously unsuccessful. ISO were unable to cite any evidence or experience to justify their approach.

Defenders of the standard, and some neutrals, have argued that critics must examine the detailed content of the standard, which is extremely expensive to purchase, in order to provide meaningful criticism. However, this defence is misconceived because the standard itself is misconceived. The standard’s stated purpose “is to define an internationally agreed set of standards for software testing that can be used by any organization when performing any form of software testing”. If ISO believes that a linear, prescriptive standard like ISO 29119 will apply to “any form of software testing” we can refer to Cynefin and say that they are wrong; we can say so confidently, knowing that our stance is backed by reputable science and theory. ISO is attempting to introduce a practice that might, at best and only sometimes, be appropriate for the Obvious domain into the Complicated and Complex domains, where it is wildly unsuitable and damaging. ISO is wrong a priori.

What is Cynefin?

The Wikipedia article is worth checking out, not least because Dave Snowden keeps an eye on it. This short video presented by Dave is also helpful.

The Cynefin Framework might look like a quadrant, but it isn’t. It is a collection of five domains that are distinct and clearly defined in principle, but which blur into one another in practice.

In addition to the four domains that look like the cells of a quadrant there is a fifth, in the middle, called Disorder, and this one is crucial to an understanding of the framework and its significance.

Cynefin is not a categorisation model, as would be implied if it were a simple matrix. It is not a matter of dropping data into the framework then cracking on with the work. Cynefin is a framework that is designed to help us make sense of what confronts us, to give us a better understanding of our situation and the approaches that we should take.

The first domain is Obvious, in which there are clear and predictable causes and effects. The second is Complicated, which also has definite causes and effects, but where the connections are not so obvious; expert knowledge and judgement is required.

The third is Complex, where there is no clear cause and effect. We might be able to discern it with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. Dave Snowden and Mary Boone used a key phrase in their Harvard Business Review article about Cynefin.

“Hindsight does not lead to foresight because the external conditions and systems constantly change.”

The fourth domain is Chaotic. Here urgent action, rather than reflective analysis, is required. The participants must act, sense feedback and respond. Complex situations might be suited to safe probing, which can teach us more about the problem, but such probing is a luxury in the Chaotic domain.

The appropriate responses in all four of these domains are different. In Obvious, the categories are clearly defined; one simply chooses the right one, and that provides the right route to follow. Best practices are appropriate here.

In the Complicated domain there is no single, right category to choose. There could be several valid options, but an expert can select a good route. There are various good practices, but the idea of a single best practice is misconceived.

In the Complex domain it is essential to probe the problem and learn by trial and error. The practices we might follow will emerge from that learning. In Chaos, as I mentioned, we simply have to start with action, firefighting to stop the situation getting worse. It is helpful to remember that, instead of the everyday definition, chaos in Cynefin terms refers to the concept in physics. Here chaos refers to a system that is so dynamic that minor variations in initial conditions lead to outcomes so dramatically divergent that the system is unpredictable. In some circumstances it makes sense to make a deliberate and temporary move into Chaos to learn new practice. That would require removing constraints and the connections that impose some sort of order.
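The physics sense of chaos, where tiny differences in starting conditions produce wildly different outcomes, is easy to see in a few lines of Python. The sketch below uses the logistic map, a standard textbook example of a chaotic system; it is not something taken from Dave’s class, and the function name and starting values are arbitrary choices for illustration.

```python
from typing import List


def logistic_trajectory(x0: float, steps: int, r: float = 4.0) -> List[float]:
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion.
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)

# The gap between the two runs at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (1, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

Early on the two runs are indistinguishable. Within a few dozen steps the gap is as large as the values themselves, so knowing the starting point to ten decimal places tells you nothing useful about where the system ends up.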

The fifth domain is that of Disorder, in the middle of the diagram. This is the default position in a sense. It’s where we find ourselves when we don’t know which domain we should really be in. It’s therefore the normal starting point. The great danger is that we don’t choose the appropriate domain, but simply opt for the one that fits our instincts or our training, or that is aligned with the organisation’s traditions and culture, regardless of the reality.

The only stable domains are Obvious, Complicated and Complex. Chaotic and Disorder are transitional. You don’t (can’t) stay there. Chaotic is transitional because constraints will kick in very quickly, almost as a reflex. Disorder is transitional because you are actually in one of the other domains, but you just don’t know it.
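The five domains described above can be summarised as a small lookup structure. This is a sketch only; the class and field names are illustrative assumptions, and the descriptions are condensed from the text of this post. Note that it is a summary for reference, not a tool for dropping problems into boxes, which would miss the point of Cynefin as a sense-making framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Domain(Enum):
    OBVIOUS = "Obvious"
    COMPLICATED = "Complicated"
    COMPLEX = "Complex"
    CHAOTIC = "Chaotic"
    DISORDER = "Disorder"


@dataclass(frozen=True)
class DomainProfile:
    cause_and_effect: str  # how cause and effect appear in this domain
    response: str          # the approach suggested in the text
    stable: bool           # False for the transitional domains


PROFILES: Dict[Domain, DomainProfile] = {
    Domain.OBVIOUS: DomainProfile(
        "clear and predictable", "choose the right category; best practice", True),
    Domain.COMPLICATED: DomainProfile(
        "definite, but needs expert judgement", "expert selects from good practices", True),
    Domain.COMPLEX: DomainProfile(
        "visible only with hindsight", "probe, learn by trial and error; practice emerges", True),
    Domain.CHAOTIC: DomainProfile(
        "no time for analysis", "act first, sense feedback, respond", False),
    Domain.DISORDER: DomainProfile(
        "unknown: we don't know which domain we are in", "work out the appropriate domain", False),
}


def stable_domains() -> List[str]:
    """Names of the domains you can remain in, in definition order."""
    return [d.value for d in Domain if PROFILES[d].stable]


print(stable_domains())
```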

The different domains have blurred edges. In any context there might be elements that fit into different domains if they are looked at independently. That isn’t a flaw with Cynefin. It merely reflects reality. As I said, Cynefin is not a neat categorisation model. It is intended to help us make sense of what we face. If reality is messy and blurred then there’s no point trying to force it into a straitjacket.

Many projects will have elements that are Obvious, that deal with a problem that is well understood, that we have dealt with before and whose solution is familiar and predictable. However, these are not the parts of a project that should shape the approach we take. The parts where the potential value, and the risk, lie are where we are dealing with something we have not done before. Liz Keogh has given many talks and written some very good blogs and articles about applying Cynefin to software development. Check out her work. This video is a good starter.

The boundaries between the domains are therefore fuzzy, but there is one boundary that is fundamentally different from the others: the border between Obvious and Chaotic. This is not really a boundary at all. It is more of a cliff. If you move from Obvious to Chaotic you don’t glide smoothly into a subtly changing landscape. You fall off the cliff.

Within the Obvious domain the area approaching the cliff is the complacent zone. Here, we think we are working in a neat, ordered environment and “we believe our own myths” as Snowden puts it in the video above. The reality is quite different and we are caught totally unaware when we hit a crisis and plunge off the cliff into chaos.

That was a quick skim through Cynefin. However, you shouldn’t think of it as being a static framework. If you are going to apply it usefully you have to understand the dynamics of the framework, and I will return to that in part two.

Why ISO 29119 is a flawed quality standard

This article originally appeared in the Fall 2015 edition of Better Software magazine.

In August 2014, I gave a talk attacking ISO 29119 at the Association for Software Testing’s conference in New York. That gave me the reputation for being opposed to standards in general, and testing standards in particular. I do approve of standards, and I believe it’s possible that we might have a worthwhile standard for testing. However, it won’t be the fundamentally flawed ISO 29119.

Technical standards that make life easier for companies and consumers are a great idea. The benefit of standards is that they offer protection to vulnerable consumers or help practitioners behave well and achieve better outcomes. The trouble is that even if ISO 29119 aspires to do these things, it doesn’t.

Principles, standards, and rules

The International Organization for Standardization (ISO) defines a standard as “a document that provides requirements, specifications, guidelines or characteristics that can be used consistently to ensure that materials, products, processes and services are fit for their purpose.”

It might be possible to derive a useful software standard that fits this definition, but only if it focuses on guidelines, rather than requirements, specifications, or characteristics. According to ISO’s definition, a standard doesn’t have to be all those things. A testing standard that is instead framed as high-level guidelines would be consistent with the widespread view among regulatory theorists that standards are conceptually like high-level principles. Rules, in contrast, are detailed and specific (see Frederick Schauer’s “The Convergence of Rules and Standards”). One of ISO 29119’s fundamental problems is that it is pitched at a level consistent with rules, which will undoubtedly tempt people to treat its contents as fixed rules.

Principles focus on outcomes rather than detailed processes or specific rules. This is how many professional bodies have defined standards. They often use the words principles and standards interchangeably. Others favor a more rules-based approach. If you adopt a detailed, rules-based approach, there is a danger of painting yourself into a corner; you have to try to specify exactly what is compliant and noncompliant. This creates huge opportunities for people to game the system, demonstrating creative compliance as they observe the letter of the law while trashing underlying quality principles (see John Braithwaite’s “Rules and Principles: A Theory of Legal Certainty”). Whether one follows a principles-based or a rules-based approach, regulators, lawyers, auditors, and investigators are likely to assume standards define what is acceptable.

As a result, there is a real danger that ISO 29119 could be viewed as the default set of rules for responsible software testing. People without direct experience in development or testing look for some form of reassurance about what constitutes responsible practice. They are likely to take ISO 29119 at face value as a definitive testing standard. The investigation into the HealthCare.gov website problems showed what can happen.

In its March 2015 report on the website’s problems, the US Government Accountability Office checked the HealthCare.gov project for compliance with the IEEE 829 test documentation standard. The agency didn’t know anything about testing. They just wanted a benchmark. IEEE 829 was last revised in 2008; it said that the content of standards more than five years old “do not wholly reflect the present state of the art”. Few testers would disagree that IEEE 829 is now hopelessly out of date.

[Figure: IEEE 829’s obsolescence threshold, “when a document is more than five years old”]

The obsolescence threshold for ISO 29119 has increased from five to ten years, presumably reflecting the lengthy process of creating and updating such cumbersome documents rather than the realities of testing. We surely don’t want regulators checking testing for compliance against a detailed, outdated standard they don’t understand.

Scary lessons from the social sciences

If we step away from ISO 29119, and from software development, we can learn some thought-provoking lessons from the social sciences.

Prescriptive standards don’t recognize how people apply knowledge in demanding jobs like testing. Scientist Michael Polanyi and sociologist Harry Collins have offered valuable insights into tacit knowledge, which is knowledge we possess and use but cannot articulate. Polanyi first introduced the concept, and Collins developed the idea, arguing that much valuable knowledge is cultural and will vary between different contexts and countries. Defining a detailed process as a standard for all testing excludes vital knowledge; people will respond by concentrating on the means, not the ends.

Donald Schön, a noted expert on how professionals learn and work, offered a related argument with “reflection in action” (see Willemien Visser’s article). Schön argued that creative professionals, such as software designers or architects, have an iterative approach to developing ideas; much of their knowledge is understood without being expressed. In other words, they can’t turn all their knowledge into an explicit, written process. Instead, to gain access to what they know, they have to perform the creative act so that they can learn, reflect on what they’ve learned, and then apply this new knowledge. Following a detailed, prescriptive process stifles learning and innovation. This applies to all software development, both agile and traditional methods.

In 1914, Thorstein Veblen identified the problem of trained incapacity. People who are trained in specific skills can lack the ability to adapt. Their response worked in the past, so they apply it regardless thereafter.

[Figure: Young woman or old woman? Means or ends? We can focus on only one at a time.]

Kenneth Burke built upon Veblen’s work, arguing that trained incapacity means one’s abilities become blindnesses. People can focus on the means or the ends, not both; their specific training makes them focus on the means. They don’t even see what they’re missing. As Burke put it, “a way of seeing is also a way of not seeing; a focus upon object A involves a neglect of object B”. This leads to goal displacement, and the dangers for software testing are obvious.

The problem of goal displacement was recognized before software development was even in its infancy. When humans specialize in organizations, they have a predictable tendency to see their particular skill as a hammer and every problem as a nail. Worse, they see their role as hitting the nail rather than building a product. Give test managers a detailed standard, and they’ll start to see the job as following the standard, not testing.

In the 1990s, British academic David Wastell studied software development shops that used structured methods, the dominant development technique at the time. Wastell found that developers used these highly detailed and prescriptive methods in exactly the same way that infants use teddy bears and security blankets: to give them a sense of comfort and help them deal with stress. In other words, a developer’s mindset betrayed that the method wasn’t a way to build better software but rather a defense mechanism to alleviate stress and anxiety.

Wastell could find no empirical evidence, either from his own research at these companies or from a survey of the findings of other experts, that structured methods worked. In fact, the resulting systems were no better than the old ones, and they took much more time and money to develop. Managers became hooked on the technique (the standard) while losing sight of the true goal. Wastell concluded the following:

Methodology becomes a fetish, a procedure used with pathological rigidity for its own sake, not as a means to an end. Used in this way, methodology provides a relief against anxiety; it insulates the practitioner from the risks and uncertainties of real engagement with people and problems.

Developers were delivering poorer results but defining that as the professional standard. Techniques that help managers cope with stress and anxiety but give an illusory, reassuring sense of control harm the end product. Developers and testers cope by focusing on technique, mastery of tools, or compliance with standards. In doing so they can feel that they are doing a good job, so long as they don’t think about whether they are really working toward the true ends of the organization or the needs of the customer.

Standards must be fit for their purpose

Is all this relevant to ISO 29119? We’re still trying to do a difficult, stressful job, and in my experience, people will cling to prescriptive processes and standards that give the illusion of being in control. Standards have credibility and huge influence simply from their status as standards. If we must have standards, they should be relevant, credible, and framed in a way that is helpful to practitioners. Crucially, they must not mislead stakeholders and regulators who don’t understand testing but who wield great influence and power.

The level of detail in ISO 29119 is a real concern. Any testing standard should be in the style favored by organizations like the Institute of Internal Auditors (IIA), whose principles-based professional standards cover the entire range of internal auditing but are only one-tenth as long as the three completed parts of ISO 29119. The IIA’s standards are light on detail but far more demanding in the outcomes required.

Standards must be clear about the purpose they serve if we are to ensure testing is fit for its purpose, to hark back to ISO’s definition of a standard. In my opinion, this is where ISO 29119 falls down. The standard does not clarify the purpose of testing, only the mechanism—and that mechanism focuses on documentation, not true testing. It is this lack of purpose, the why, that leads to teams concentrating on standards compliance rather than delivering valuable information to stakeholders. This is a costly mistake. Standards should be clear about the outcomes and leave the means to the judgment of practitioners.

A good example of this problem is ISO 29119’s test completion report, which is defined simply as a summary of the testing that was performed. The standard offers examples for traditional and agile projects. Both focus on the format, not the substance of the report. The examples give some metrics without context or explanation and provide no information or insight that would help stakeholders understand the product and the risk and make better decisions. Testers could comply with the standard without doing anything useful. In contrast, the IIA’s standards say audit reports must be “accurate, objective, clear, concise, constructive, complete, and timely.” Each of these criteria is defined briefly in a way that makes the standard far more demanding and useful than ISO 29119, in far less space.

It’s no good saying that ISO 29119 can be used sensibly and doesn’t have to be abused. People are fallible and will misuse the standard. If we deny that fallibility, we deny the experience of software development, testing, and, indeed, human nature. As Jerry Weinberg said (in “The Secrets of Consulting”), “no matter how it looks at first, it’s always a people problem”. Any prescriptive standard that focuses on compliance with highly detailed processes is doomed. Maybe you can buck the system, but you can’t buck human nature.

David Graeber’s “The Utopia of Rules: On Technology, Stupidity and the Secret Joys of Bureaucracy”

When I gave my talk at CAST 2014 in New York, “Standards – promoting quality or restricting competition?”, I was concentrating on the economic aspects of standards. They are often valuable, but they can be damaging and restrict competition if they are misused. A few months later I bought “The Utopia of Rules: On Technology, Stupidity, and the Secret Joys of Bureaucracy” by David Graeber, Professor of Anthropology at the London School of Economics. I was familiar with Graeber as a challenging and insightful writer. I drew on his work when I wrote “Testing: valuable or bullshit?”. The Utopia of Rules also inspired the blog article I wrote recently, “Frozen in time – grammar and testing standards”, in which I discussed the similarity between grammar textbooks and standards, which both codify old usages and practices that no longer match the modern world.

What I hadn’t expected from The Utopia of Rules was how strongly it would support the arguments I made at CAST.

Certification and credentialism

Graeber makes the same argument I deployed against certification. It is being used increasingly to enrich special interests without benefiting society. On page 23 Graeber writes:

Almost every endeavor that used to be considered an art (best learned through doing) now requires formal professional training and a certificate of completion… In some cases, these new training requirements can only be described as outright scams, as when lenders, and those prepared to set up the training programs, jointly lobby the government to insist that, say, all pharmacists be henceforth required to pass some additional qualifying examination, forcing thousands already practicing the profession into night school, which these pharmacists know many will only be able to afford with the help of high-interest student loans. By doing this, lenders are in effect legislating themselves a cut of most pharmacists’ subsequent incomes.

To be clear, my stance on ISTQB training is that it educates testers in a legitimate, though very limited, vision of testing. My objection is to any marketing of the qualification as a certification of testing ability, rather than confirmation that the tester has passed an exam associated with a particular training course. I object even more strongly to any argument that possession of the certificate should be a requirement for employment, or for contracting out testing services. It is reasonable to talk of scams when the ability of good testers to earn a living is damaged.

What is the point of it all?

Graeber has interesting insights into how bureaucrats can be vague about the values of the bureaucracy: why does the organisation exist? Bureaucrats focus on efficient execution of rational processes, but what is the point of it all? Often the means become the ends: efficiency is an end in itself.

I didn’t argue that point at CAST, but I have done so many times in other talks and articles (e.g. “Teddy bear methods”). If people are doing a difficult, stressful job and you give them prescriptive methods, processes or standards then they will focus on ticking their way down the list. The end towards which they are working becomes compliance with the process, rather than helping the organisation reach its goal. They see their job as producing the outputs from the process, rather than the outcomes the stakeholders want. I gave a talk in London in June 2015 to the British Computer Society’s Special Interest Group in Software Testing in which I argued that testing lacks guiding principles (PDF, opens in a new tab) and that ISO 29119 in particular does not offer clear guidance about the purpose of testing.

In a related argument Graeber makes a point that will be familiar to those who have criticised the misuse of testing metrics.

…from inside the system, the algorithms and mathematical formulae by which the world comes to be assessed become, ultimately, not just measures of value, but the source of value itself.

Rent extraction

The most controversial part of my CAST talk was my argument that the pressure to adopt testing standards was entirely consistent with rent seeking in economic theory. Rent seeking, or rent extraction, is what people do when they exploit failings in the market, or rig the market for their own benefit by lobbying for regulation that happens to benefit them. Instead of creating wealth, they take it from other people in a way that is legal, but which is detrimental to the economy, and society, as a whole.

This argument riled some people who took it as a personal attack on their integrity. I’m not going to dwell on that point. I meant no personal slur. Rent seeking is just a feature of modern economies. Saying so is merely being realistic. David Graeber argued the point even more strongly.

The process of financialization has meant that an ever-increasing proportion of corporate profits come in the form of rent extraction of one sort or another. Since this is ultimately little more than legalized extortion, it is accompanied by ever-increasing accumulation of rules and regulations… At the same time, some of the profits from rent extraction are recycled to select portions of the professional classes, or to create new cadres of paper-pushing corporate bureaucrats. This helps a phenomenon I have written about elsewhere: the continual growth, in recent decades, of apparently meaningless, make-work, “bullshit jobs” — strategic vision coordinators, human resources consultants, legal analysts, and the like — despite the fact that even those who hold such positions are half the time secretly convinced they contribute nothing to the enterprise.

In 2014 I wrote about “bullshit jobs”, prompted partly by one of Graeber’s articles. It’s an important point. It is vital that testers define their job so that it offers real value, and they are not merely bullshit functionaries of the corporate bureaucracy.

Utopian bureaucracies

I have believed for a long time that adopting highly prescriptive methods or standards for software development and testing places unfair pressure on people, who are set up to fail. Graeber makes exactly the same point.

Bureaucracies public and private appear — for whatever historical reasons — to be organized in such a way as to guarantee that a significant proportion of actors will not be able to perform their tasks as expected. It’s in this sense that I’ve said one can fairly say that bureaucracies are utopian forms of organization. After all, is this not what we always say of utopians: that they have a naïve faith in the perfectibility of human nature and refuse to deal with humans as they actually are? Which is, are we not also told, what leads them to set impossible standards and then blame the individuals for not living up to them? But in fact all bureaucracies do this, insofar as they set demands they insist are reasonable, and then, on discovering that they are not reasonable (since a significant number of people will always be unable to perform as expected), conclude that the problem is not with the demands themselves but with the individual inadequacy of each particular human being who fails to live up to them.

Testing standards such as ISO 29119, and its predecessor IEEE 829, don’t reflect what developers and testers do, or rather should be doing. They are at odds with the way people think and work in organisations. These standards attempt to represent a highly complex, sometimes chaotic, process in a defined, repeatable model. The end product is usually of dubious quality, late and over budget. Any review of the development will find constant deviations from the standard. The suppliers, and defenders, of the standard can then breathe a sigh of relief. The sacred standard was not followed. It was the team’s fault. If only they’d done it by the book! The possibility that the developers’ and testers’ apparent sins were the only reason anything was produced at all is never considered. This is a dreadful way to treat people, but in many organisations it has been normal for several decades.

Loss of communication

All of the previous arguments by Graeber were entirely consistent with my own thoughts about how corporate bureaucracies operate. It was fascinating to see an anthropologist’s perspective, but it didn’t teach me anything that was really new about how testers work in corporations. However, later in the book Graeber developed two arguments that gave me new insights.

Understanding what is happening in a complex, social situation needs effective two-way communication. This requires effort, “interpretive labor”. The greater the degree of compulsion, and the greater the bureaucratic regime of rules and forms, the less need there is for such two-way communication. Those who can simply issue orders that must be obeyed don’t have to take the trouble to understand the complexities of the situation they’re managing.

…within relations of domination, it is generally the subordinates who are effectively relegated the work of understanding how the social relations in question really work. … It’s those who do not have the power to hire and fire who are left with the work of figuring out what actually did go wrong so as to make sure it doesn’t happen again.

This ties in with the previous argument about utopian bureaucracies. If you impose an inappropriate standard then poor results will be attributed to the inevitable failure to comply. There is no need for senior managers to understand more, and no need to listen to the complaints, the “excuses”, of the people who do understand what is happening. Interestingly, Graeber’s argument about interpretive labor is consistent with regulatory theory. Good regulation of complex situations requires ongoing communication between the regulator and the regulated. I explained this in the talk on testing principles I mentioned above (slides 38 and 39).

Fear of play

My second new insight from Graeber arrived when he discussed the nature of play and how it relates to bureaucracies. Anthropologists try to maintain a distinction between games and play, a distinction that is easier to maintain in English than in languages like French and German, which use the same word for both. A game has boundaries, set rules and a predetermined conclusion. Play is more free-form and creative. Novelties and surprising results emerge from the act of playing. It is a random, unpredictable and potentially destructive activity. Graeber finishes his discussion of play and games with a striking observation.

What ultimately lies behind the appeal of bureaucracy is fear of play.

Put simply, and rather simplistically, Graeber means that we use bureaucracy to escape the terror of chaotic reality, to bring a semblance (an illusion?) of control to the uncontrollable.

This gave me a tantalising new insight into the reasons people build bureaucratic regimes in organisations. It sent me off into a whole new field of reading on the anthropology of games and play. This has fascinating implications for the debate about standards and testing. We shy away from play, but it is through play that we learn. I don’t have time now to do the topic justice, and it’s much too big and important a subject to be tacked on to the end of this article, but I will return to it. It is yet another example of the way anthropology can help us understand what we are doing as testers. As a starting point I can heartily recommend David Graeber’s book, “The Utopia of Rules”.

Frozen in time – grammar and testing standards

This recent tweet by Tyler Hayes caught my eye. “If you build software you’re an anthropologist whether you like it or not.”

It’s an interesting point, and it’s relevant on more than one level. By and large software is developed by people and for people. That is a statement of the obvious, but developers and testers have generally been reluctant to take on board the full implications. This isn’t a simple point about usability. The software we build is shaped by many assumptions about the users, and how they live and work. In turn, the software can reinforce existing structures and practices. Testers should think about these issues if they’re to provide useful findings to the people who matter. You can’t learn everything you need to know from a requirements specification. This takes us deep into anthropological territory.

What is anthropology?

Social anthropology is defined by University College London as follows.

Social Anthropology is the comparative study of the ways in which people live in different social and cultural settings across the globe. Societies vary enormously in how they organise themselves, the cultural practices in which they engage, as well as their religious, political and economic arrangements.

We build software in a social, economic and cultural context that is shaped by myriad factors, which aren’t necessarily conducive to good software, or a happy experience for the developers and testers, never mind the users. I’ve touched on this before in “Teddy Bear Methods”.

There is much that we can learn from anthropology, and not just to help us understand what we see when we look out at the users and the wider world. I’ve long thought that the software development and testing community would make a fascinating subject for anthropologists.

Bureaucracy, grammar and deference to authority

I recently read “The Utopia of Rules – On Technology, Stupidity, and the Secret Joys of Bureaucracy” by the anthropologist David Graeber. Graeber has many fascinating insights and arguments about how organisations work, and why people are drawn to bureaucracy. One of his arguments is that regulation is imposed and formalised to try to remove arbitrary, random behaviour in organisations. That’s a huge simplification, but there’s not room here to do Graeber’s argument justice. One passage in particular caught my eye.

People do not invent languages by writing grammars, they write grammars — at least, the first grammars to be written for any given language — by observing the tacit, largely unconscious, rules that people seem to be applying when they speak. Yet once a book exists, and especially once it is employed in schoolrooms, people feel that the rules are not just descriptions of how people do talk, but prescriptions for how they should talk.

It’s easy to observe this phenomenon in places where grammars were only written recently. In many places in the world, the first grammars and dictionaries were created by Christian missionaries in the nineteenth or even twentieth century, intent on translating the Bible and other sacred texts into what had been unwritten languages. For instance, the first grammar for Malagasy, the language spoken in Madagascar, was written in the 1810s and ’20s. Of course, language is changing all the time, so the Malagasy spoken language — even its grammar — is in many ways quite different than it was two hundred years ago. However, since everyone learns the grammar in school, if you point this out, people will automatically say that speakers nowadays are simply making mistakes, not following the rules correctly. It never seems to occur to anyone — until you point it out — that had the missionaries came and written their books two hundred years later, current usages would be considered the only correct ones, and anyone speaking as they had two hundred years ago would themselves be assumed to be in error.

In fact, I found this attitude made it extremely difficult to learn how to speak colloquial Malagasy. Even when I hired native speakers, say, students at the university, to give me lessons, they would teach me how to speak nineteenth-century Malagasy as it was taught in school. As my proficiency improved, I began noticing that the way they talked to each other was nothing like the way they were teaching me to speak. But when I asked them about grammatical forms they used that weren’t in the books, they’d just shrug them off, and say, “Oh, that’s just slang, don’t say that.”

…The Malagasy attitudes towards rules of grammar clearly have… everything to do with a distaste for arbitrariness itself — a distaste which leads to an unthinking acceptance of authority in its most formal, institutional form.

Searching for the “correct” way to develop software

Graeber’s phrase “distaste for arbitrariness itself” reminded me of the history of software development. In the 1960s and 70s academics and theorists agonised over the nature of development, trying to discover and articulate what it should be. Their approach was fundamentally mistaken. There are dreadful ways, and there are better ways, to develop software but there is no natural, correct way that results in perfect software. The researchers assumed that there was and went hunting for it. Instead of seeking understanding they carried their assumptions about what the answer might be into their studies and went looking for confirmation.

They were trying to understand how the organisational machine worked and looked for mechanical processes. I use the word “machine” carefully, not as a casual metaphor. There really was an assumption that organisations were, in effect, machines. They were regarded as first order cybernetic entities whose behaviour would not vary depending on whether they were being observed. To a former auditor like myself this is a ludicrous assumption. The act of auditing an organisation changes the way that people behave. Even the knowledge that an audit may occur will shape behaviour, and not necessarily for the better (see my article “Cynefin, testing and auditing”). You cannot do the job well without understanding that. Second order cybernetics does recognise this crucial problem and treats observers as participants in the system.

So linear, sequential development made sense. The different phases passing outputs along the production line fitted their conception of the organisation as a machine. Iterative, incremental development looked messy and immature; it was just wrong as far as the researchers were concerned. Feeling one’s way to a solution seemed random, unsystematic – arbitrary.

Development is a difficult and complex job; people will tend to follow methods that make the job feel easier. If managers are struggling with the complexities of managing large projects they are more likely to choose linear, sequential methods that make the job of management easier, or at least less stressful. So when researchers saw development being carried out that way they were observing human behaviour, not a machine operating.

Doubts about this approach were quashed by pointing out that if organisations weren’t quite the neat machine that they should be this would be solved by the rapid advance in the use of computers. This argument looks suspiciously circular because the conclusion that in future organisations would be fully machine-like rests on the unproven premise that software development is a mechanical process which is not subject to human variability when performed properly.

Eliminating “arbitrariness” and ignoring the human element

This might all have been no more than an interesting academic sideline, but it fed back into software development. By the 1970s, when these studies into the nature of development were being carried out, organisations were moving towards increasingly formalised development methods. There was increasing pressure to adopt such methods. Not only were they attractive to managers, the use of more formal methods provided a competitive advantage. ISO certification and CMMI accreditation were increasingly seen as a way to demonstrate that organisations produced high quality software. The evidence may have been weak, but it seemed a plausible claim. These initiatives required formal processes. The sellers of formal methods were happy to look for and cite any intellectual justification for their products. So formal linear methods were underpinned by academic work that assumed that formal linear methods were correct. This was the way that responsible, professional software development was performed. ISO standards were built on this assumption.

If you are trying to define the nature of development you must acknowledge that it is a human activity, carried out by and for humans. These studies about the nature of development were essentially anthropological exercises, but the researchers assumed they were observing and taking apart a machine.

As with the missionaries who were codifying grammar, the point in time when these researchers were working shaped the result. If they had carried out their studies earlier in the history of software development they might have struggled to find credible examples of formalised, linear development. In the 1950s software development was an esoteric activity in which the developers could call the shots. Twenty years later it was part of the corporate bureaucracy and iterative, incremental development was sidelined. If the studies had been carried out a few decades further on then it would have been impossible to ignore Agile.

As it transpired, formal methods, CMM/CMMI and the first ISO standards concerning development and testing were all creatures of that era when organisations and their activities were seriously regarded as mechanical. Like the early Malagasy grammar books they codified and fossilised a particular, flawed approach at a particular time for an activity that was changing rapidly. ISO 29119 is merely an updated version of that dated approach to testing. It is rooted in a yearning for bureaucratic certainty, a reluctance to accept that ultimately good testing is dependent not on documentation, but on that most irrational, variable and unpredictable of creatures – the human who is working in a culture shaped by humans. Anthropology has much to teach us.

Further reading

That is the end of the essay, but there is a vast amount of material you could read about attempts to understand and define the nature of software development and of organisations. Here is a small selection.

Brian Fitzgerald has written some very interesting articles about the history of development. I recommend in particular “The systems development dilemma: whether to adopt formalised systems development methodologies or not?” (PDF, opens in new tab).

Agneta Olerup wrote this rather heavyweight study of what she calls the Langeforsian approach to information systems design. Börje Langefors was a highly influential advocate of the mechanical, scientific approach to software development. Langefors’ Wikipedia entry describes him as “one of those who made systems development a science”.

This paper gives a good, readable introduction to first and second order cybernetics (PDF, opens in new tab), including a useful warning about the distinction between models and the entities that they attempt to represent.

All our knowledge of systems is mediated by our simplified representations—or models—of them, which necessarily ignore those aspects of the system which are irrelevant to the purposes for which the model is constructed. Thus the properties of the systems themselves must be distinguished from those of their models, which depend on us as their creators. An engineer working with a mechanical system, on the other hand, almost always know its internal structure and behavior to a high degree of accuracy, and therefore tends to de-emphasize the system/model distinction, acting as if the model is the system.

Moreover, such an engineer, scientist, or “first-order” cyberneticist, will study a system as if it were a passive, objectively given “thing”, that can be freely observed, manipulated, and taken apart. A second-order cyberneticist working with an organism or social system, on the other hand, recognizes that system as an agent in its own right, interacting with another agent, the observer.

Finally, I recommend a fascinating article in the IEEE’s Computer magazine by Craig Larman and Victor Basili, “Iterative and incremental development: a brief history” (PDF, opens in new tab). Larman and Basili argue that iterative and incremental development is not a modern practice, but has been carried out since the 1950s, though they do acknowledge that it was subordinate to the linear Waterfall in the 1970s and 80s. There is a particularly interesting contribution from Gerald Weinberg, a personal communication to the authors, in which he describes how he and his colleagues developed software in the 1950s. The techniques they followed were “indistinguishable from XP”.

Audit and Agile (part 2)

This is the second part of my article about the training day I attended on October 8th, organised by the Scottish Chapter of ISACA and presented by Christopher Wright.

In the first part I set the scene and explained why good auditors have no problem in principle with agile development. However, it does pose problems in practice, especially for inexperienced auditors, who can find agile terrifyingly vague.

In this second part I’ll talk about how auditors should get involved in agile projects and about testing. The emphasis was very much on Scrum, but the points apply in general to all agile development.

The importance of working together constructively

Christopher emphasised that auditors should be proactive. They should get involved in developments as early as possible and encourage developers to speak to them. Developers are naturally suspicious of auditors. They think auditors “want us to stop having fun”. These are messages I’ve been harping on about ever since I started in audit.

Developers, and testers, make assumptions about what auditors will do, and what they will expect. These assumptions can shape behaviour, for the worse, if they are not discussed with the auditors. That can waste a huge amount of time.

Christopher developed an argument that I have also often made. Auditors can see a bigger picture than developers. They will often have wider experience of what can go wrong, and what controls should be in place to protect the company. Auditors can be a useful source of misuse stories. They can even usefully embed themselves in an agile development team writing stories and tests that should help make the application more robust.

Auditors have to go native to a certain extent, accepting agile on its terms and adapting to the culture. Christopher advised the audience to conform to the developers’ dress code; “lose the tie” and remove any unnecessary barriers between the auditors and the developers. The final tip was one that will resonate with context driven testers. “Focus on people and product – not paperwork”.

Testing

Discussion of testing comprised just one, relatively small, part of the day. Obviously I would have preferred more time. However, the general guidance throughout the day about working with agile provided a good guide for auditing agile testing. Auditors who have absorbed these general lessons should be able to handle an audit of agile testing.

Christopher did have a couple of specific criticisms of agile testing. He thinks the standard is generally fairly poor, though he did not offer any comparisons with more traditional testing. He also expects to see testing that is repeatable, and wants to see more automated testing where possible. Christopher observed that too few projects develop repeatable, automated tests for regression testing. He is probably right on that point. I’m not sure that this is really just an agile problem.

Traditional projects were often planned and costed in a way that gave the test manager little incentive to make an investment for the future by automating tests. The difference under an agile regime is that a failure to invest in appropriate automation is likely to create problems for the current project rather than leave them lurking for the future support team.

In addition to his comments on automation Christopher’s key points were that auditors should look for evidence of appropriate planning and preparation, and evidence of the test results. There might not be a single, standard, documented agile development method, but each organisation should be clear and consistent about how it does agile.

Christopher did use the word “scripts” a lot, but he made it clear that auditors should not expect an agile test script to be as detailed and prescriptive as a traditional script; it shouldn’t go down to the level of specifying the values to go into every field. Together with the results the script should allow testing to be recreated. The auditor should be able to see exactly what was planned, what was tested and what was found.

Conclusion

The day was interesting and very worthwhile. It was reassuring to see auditors being encouraged to engage with agile in an open minded and constructive manner. It was also encouraging to see auditors responding positively to the message, even if the reality of dealing with agile is rather frightening for many auditors. Good auditors are accustomed to the challenge of being scared by the prospect of difficult jobs. It is a yardstick of good auditors that they cope with that challenge.

I don’t have a great deal of sympathy with auditors who shy away from auditing agile projects because it’s too difficult, or with those who bring inappropriate prejudices or assumptions to the audit. Auditing is like testing; it isn’t meant to be easy, it’s meant to be valuable.

Internal auditors should not be aliens who beam down to a project to beat up the developers and testers, then shoot off to their next assignment leaving bruised and hurt victims. I’m afraid that is how some auditors have worked, but good auditors are broadly on the same side as developers and testers. They are trying to achieve the same ends, but they have a different perspective. They should have wider knowledge of the context in which the project is working, and they should have a bleaker view of reality and of human nature. They should know what can go wrong, how that can happen, and what might be effective and efficient ways of preventing that.

Following the happy path is a foreign concept to good, experienced auditors. Their path is a narrow one. They strive to warn of the unseen dangers lurking all around while also providing constructive input, all the time maintaining sufficient independence so that they can comment dispassionately and constructively on what they see. As I’ve said, it’s not easy.

Auditors and testers should resist any attempts to redefine their difficult jobs to try and make them appear easier. Such attempts require a refusal to deal with reality, and a pretence, a delusion, that we can do something worthwhile if we refuse to engage with complex and messy reality.

Testing and auditing are both jobs that it is possible to fake, going through the motions in a plausible manner, while producing nothing of value. That approach is easier in the short run, but it is deeply short-sighted and irresponsible. It trashes the credibility and reputation of practitioners, it short-changes people who expect to receive valuable information, and it leaves both testers and auditors open to being replaced by semi-skilled competition. If you’re doing a lousy job and focusing on cost, there is always someone who can do it cheaper.

Audit and Agile (part 1)

Training in stuff I know

Last Thursday, October 8th, I went to a training day organised by the Scottish Chapter of ISACA, the international body representing IT governance, control, security and audit professionals. The event was entitled “Audit and Control of Agile Projects”. It was presented by Christopher Wright, who has over thirty years’ experience in IT audit and risk management.

This is a topic about which I am already very well informed but I was keen to go along for a mixture of reasons. When discussing auditing and the expectations of auditors we are not dealing with absolutes. There is no hard and fast right answer. We can’t say “this interpretation is correct and that one is wrong”. Well, we could, but that ignores the possibility that others might disagree, and they might be auditors who are conducting audits on a basis that we believe to be flawed.

It is therefore important to think deeply about, and understand, the approach that we believe to be appropriate, but also to consider alternative opinions, who holds them and why. Consultants offering advice in this area have to know the job, but they must also be able to advise clients about other approaches. We have to stay in touch with what other people are saying and doing, regardless of whether we agree with them.

I was keen to hear what Christopher Wright had to say, and also to talk to other attendees, who work for a wide range of employers. Happily the event was free to members, which made the decision to attend even easier!

As it transpired I didn’t hear anything from Christopher that was really new to me, but that was reassuring. His message was very much aligned with the advice I’ve been pushing over the last few years to clients, in my writing and in training tutorials. What was also reassuring was that the attendees were receptive to Christopher’s message and there was no sign of lingering attachment to older, traditional ways of working.

I don’t want to cover everything Christopher said. I will just focus on a few points that I thought were key. I will split them into two posts. I will try to make it clear when I am offering my own opinion rather than reporting Christopher’s views.

Agile is potentially an appropriate response to the problems of software development

Firstly, and crucially, agile can be entirely consistent with good governance. The “can be” is an important qualification. You can do agile well or badly. Agile is not a cop-out by organisations that couldn’t make traditional methods work. In many situations it is the most appropriate response. Christopher ran through a quick discussion of the factors involved in deciding whether to go with an agile or a traditional, waterfall approach. His views were orthodox, current software development thinking rather than a grudging auditor’s acceptance of what has been happening in development circles.

Put simply, Christopher argued that a waterfall approach is fine when the problem is stable and the solution is well understood and predictable. If we know at the outset, with justified confidence, what we are going to do then there’s no problem with waterfall, indeed we might as well use it because it is simpler to manage. We are dealing with a well defined problem and we are not going to be swamped by a succession of changes late in the project. I would argue that that is usually not the case when we are developing new software intended to provide a new solution to a problem.

If the problem is not trivial then it is unlikely that we can understand it fully at the start of the project, and we can only build our understanding once the project is well underway. Agile is appropriate in these circumstances. We need to learn what is needed through a process of iteration, experimentation and discovery.

We had a brief discussion about predictable solutions. I would happily have spent much longer mulling over the idea of predictability. My view is that software development has been plagued throughout its history by a yearning for unattainable predictability. That has been damaging because we have pretended that we could predict solutions and project outcomes when the reality has been that we didn’t know, and couldn’t have known at the time. The paradox is that if we try to define a predictable outcome too early that pushes back the point when we truly can predict with confidence what is required and possible. Mary Poppendieck made the point very well back in 2003 with an excellent article “Lean Development and the Predictability Paradox”.

Agile is scary for auditors

Christopher argued some important points about why many traditional auditors are suspicious of agile, and I was very much in agreement with him. There are many agile methods, and none of them are rigidly defined or standardised. Auditors can’t march into a project with a checklist and say “you’ve not produced a burndown chart in the form prescribed by… ”.

The auditors have to base their work on the Agile Manifesto, but they need to read and understand it carefully. It is not a matter of choosing either working software or comprehensive documentation, but valuing the former over the latter. Auditors have to satisfy themselves that an appropriate balance is being struck, and this requires judgment and experience. The Agile Manifesto can guide them to ask useful questions, but it can’t provide them with answers.

Crucially, auditors have to ask open questions when they are auditing an agile project. This is one of my hobby horses, and I have often written and spoken about it. Auditors must have highly developed questioning skills. They need the soft skills that will allow them to draw out the information they need from auditees. They must not rely on checklists that encourage them to ask simplistic questions that invite simple, binary answers; yes or no.

Christopher told a sadly plausible story of an audit team that reviewed an agile project and produced a report listing 50 detailed problems. The auditors did not have experience with agile and had conducted the audit using a Prince2 checklist that was designed for waterfall projects.

This might seem a ludicrously extreme example, but I’ve seen similar appallingly unprofessional and incompetent behaviour by auditors. I believe that when this happens the best outcome is that the auditors have wasted everyone’s time, failed to do anything useful and trashed their credibility.

What is far more damaging, in my opinion, is a situation where the auditors have real power, regardless of their competence, and auditees start to tailor their work to fit the prejudices and assumptions of the auditors. Managers start to do things they know are wrong, wasteful and damaging because they fear criticism from auditors. Clients become infuriated because the supplier staff are ignoring their needs and focusing on appeasing the auditors.

In this scenario the auditors are taking commercial decisions, but are not accountable for the consequences. They are probably not even aware that this is what they are doing. They lack the experience, knowledge and insight to understand the damage they are doing.

I therefore agree with Christopher that auditing IT projects is no job for the inexperienced. Closed questions are dangerous, and should be used only to confirm information that has already been elicited. The auditor should not ask “do you have a detailed stage plan?”, but should instead ask “can you show me how you plan the work?”. The former question might simply produce a “no”. The latter question will allow the auditor to see, and assess, how the planning is being performed. Auditees are naturally suspicious of auditors and many people will offer as little information as they can. Allowing them to answer with a curt yes or no helps nobody.

Of course, the problem for inexperienced auditors is that if they ask open-ended questions they have to understand and interpret the answers, then vary their follow-up depending on what they are told. That can be difficult. Well, tough. Auditing isn’t meant to be easy. Like testing it should offer valuable information, and redefining the job to make it easy but pointless is deeply unprofessional.

And in part two…

I am splitting this article into two parts and will post the second one as soon as possible. You could easily write a book about this topic, as indeed Christopher has already done. It’s called “Agile Governance and Audit”.

In part two I cover how auditors can work constructively with development teams, and also discuss testing. I wanted to set the scene in part one before moving on to testing. I don’t think it’s worth treating testing in isolation, and it’s useful to understand first where modern auditors are coming from. The way they should approach testing follows on naturally from the way that they engage with agile in general.