Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 2

This post is the second of two discussing Dave Snowden’s recent Cynefin masterclass at the Test Leadership Congress in New York. I wrote the series with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

In the first I gave an overview of Cynefin and explained why I think it is important, and how it can helpfully shape the way we look at the world and make sense of the problems we face. In this post I will look at some of the issues raised in Dave’s class and discuss their relevance to development and testing.

The dynamics between domains

Understanding that the boundaries between the different domains are fluid and permeable is crucial to understanding Cynefin. A vital lesson is that we don’t start in one domain and stay there; we can and should move between them. Even if we ignore that lesson, reality will drag us from one domain to another. Dave said "all the domains have value – it's the ability to move between them that is key".

The Cynefin dynamics are closely tied to the concept of constraints, which are so important to Cynefin that they act as differentiators between the domains. You could say that constraints define the domains.

Constraint is perhaps a slightly misleading word. In Cynefin terms it is not necessarily something that compels or prevents certain behaviour. That does apply to the Obvious domain, where the constraints are fixed and rigid. The constraints in the Complicated domain govern behaviour, and can be agreed by expert consensus. In the Complex domain the constraints enable action, rather than restricting it or compelling it. They are a starting point rather than an end. In Chaos there are no constraints.

Dave Snowden puts it as follows, differentiating rules and heuristics.

“Rules are governing constraints, they set limits to action, they contain all possible instances of action. In contrast heuristics are enabling constraints, they provide measurable guidance which can adapt to the unknowable unknowns.”

If we can change the constraints then we are moving from one domain to another. The most significant dynamic is the cycle between Complex and Complicated.

[Image: Cynefin core dynamic - Complex to Complicated]

Crucially, we must recognise that if we are attempting something new, something that involves a significant amount of uncertainty, then we start in the Complex domain, exploring and discovering more about the problem. Once we have a better understanding and have found constraints that allow us to achieve repeatable outcomes, we have moved the problem to the Complicated domain, where we can manage it more easily and exploit our new knowledge. If our testing reveals that the constraints are not producing repeatable results then it’s important to get back into the Complex domain and carry out some more probing experiments.

This is not a one-off move. We have to keep cycling to ensure the solution remains relevant. The cadence, or natural flow, of the cycle will vary depending on the context. Different industries, or sectors, or applications will have different cadences. It could be days, or years, or anything in between. If, or rather when, our constraints fail to produce repeatable results, we have to get back into the Complex domain.

This cycle between Complex and Complicated is key for software development in particular. Understanding this dynamic is essential in order to understand how Cynefin might be employed.
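To make the shape of that cycle concrete, here is a deliberately toy sketch in Python. Nothing in it comes from Dave’s material; the function names and probabilities are invented, and the only point is the control flow: probe in parallel while the problem is Complex, stabilise a pattern that shows signs of success, exploit it in the Complicated domain, and drop back to Complex as soon as the results stop being repeatable.

```python
import random

def run_probes(n=3):
    """Run several parallel safe-to-fail probes (hypothetical stand-in).
    Each returns a candidate constraint and whether it showed signs of success."""
    return [("candidate-%d" % i, random.random() > 0.4) for i in range(n)]

def exploit(constraint, runs=5):
    """Apply the stabilised constraint repeatedly; in real life this is delivery
    plus testing. Here we just simulate whether the outcomes stayed repeatable."""
    return all(random.random() > 0.1 for _ in range(runs))

def manage(cycles=3):
    for cycle in range(cycles):
        # Complex: probe, sense, respond - amplify successes, dampen failures.
        successes = [name for name, ok in run_probes() if ok]
        if not successes:
            print(f"cycle {cycle}: all probes failed, so we probe again")
            continue
        constraint = successes[0]            # stabilise an emerging pattern
        # Complicated: exploit the constraint until repeatability breaks down.
        if exploit(constraint):
            print(f"cycle {cycle}: {constraint} still gives repeatable results")
        else:
            print(f"cycle {cycle}: {constraint} failed - back to the Complex domain")

if __name__ == "__main__":
    manage()
```

The names and probabilities are arbitrary; the value is in seeing that returning to the Complex domain is part of the normal flow, not a failure state.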

Setting up developments

As I said earlier the parts of a software development project that will provide value are where we are doing something new, and that is where the risk also lies. Any significant and worthwhile development project will start in the Complex domain. The initial challenge is to learn enough to move it to Complicated. Dave explained it as follows in a talk at Agile India in 2015.

“As things are Complex we see patterns, patterns emerge. We stabilise the patterns. As we stabilise them we can actually shift them into the Complicated domain. So the basic principle of Complexity-based intervention is you start off with multiple, parallel, safe-to-fail experiments, which is why Scrum is not a true Complexity technique; it does one thing in a linear way. We call (these experiments) a pre-Scrum technique. You do smaller experiments faster in parallel… So you’re moving from the centre of the Complex domain into the boundary, once you’re in the boundary you use Scrum to move it across the boundary.”

Such a safe-to-fail experiment might be an XP pair programming team being assigned to knock up a small, quick prototype.

So the challenge in starting the move from Complex to Complicated is to come up with the ideas for safe-to-fail pre-Scrum experiments that would allow us to use Scrum effectively.

Dave outlined the criteria that suitable experiments should meet. There should be some way of knowing whether the experiment is succeeding and it must be possible to amplify (i.e. reinforce) signs of success. Similarly, there should be some way of knowing whether it is failing and of dampening, or reducing, the damaging impact of a failing experiment. Failure is not bad. In any useful set of safe-to-fail experiments some must fail if we are to learn anything worthwhile. The final criterion is that the experiment must be coherent. This idea of coherence requires more attention.

Dave Snowden explains the tests for coherence here. He isn’t entirely clear about how rigid these tests should be. Perhaps it’s more useful to regard them as heuristics than fixed rules, though the first two are of particular importance.

  • A coherent experiment, the ideas and assumptions behind it, should be compatible with natural science. That might seem like a rather banal statement, till you consider all the massive IT developments and change programmes that were launched in blissful ignorance of the fact that science could have predicted inevitable failure.
  • There should be some evidence from elsewhere to support the proposal. Replicating past cases is no guarantee of success, far from it, but it is a valid way to try and learn about the problem.
  • The proposal should fit where we are. It has to be consistent to some degree with what we have been doing. A leap into the unknown attempting something that is utterly unfamiliar is unlikely to gain any traction.
  • Can the proposal pass a series of “ritual dissent” challenges? These are a formalised way of identifying flaws and refining possible experiments.
  • Does the experiment reflect an unmet, unarticulated need that has been revealed by sense-making, by attempts to make sense of the problem?

The two latter criteria refer explicitly to Cynefin techniques. The final one, identifying unmet needs, assumes the use of Cognitive Edge’s SenseMaker. Remember Fred Brooks’ blunt statement about requirements? Clients do not know what they want. They cannot articulate their needs if they are asked directly. They cannot envisage what is possible. Dave Snowden takes that point further. If users can articulate their needs then you’re dealing with a commoditized product and the solution is unlikely to have great value. Real value lies in meeting needs that users are unaware of and that they cannot articulate. This has always been so, but in days of yore we could often get away with ignoring that problem. Most applications were in-house developments that either automated back-office functions or were built around business rules and clerical processes that served as an effective proxy for true requirements. The inadequacies of the old structured methods and traditional requirements gathering could be masked.

With the arrival of web development, and then especially with mobile technology this gulf between user needs and the ability of developers to grasp them became a problem that could be ignored only through wilful blindness, admittedly a trait that has never been in short supply in corporate life. The problem has been exacerbated by our historic willingness to confuse rigour with a heavily documented, top-down approach to software development. Sense-making entails capturing large numbers of user reports in order to discern patterns that can be exploited. This appears messy, random and unstructured to anyone immured in traditional ways of development. It might appear to lack rigour, but such an approach is in accord with messy, unpredictable reality. That means it offers a more rigorous and effective way of deriving requirements than we can get by pretending that every development belongs in the Obvious domain. A simple lesson I’ve had to learn and relearn over the years is that rigour and structure are not the same as heavy documentation, prescriptive methods and a linear, top-down approach to problem solving.

This all raises big questions for testers. How do we respond? How do we get involved in testing requirements that have been derived this way and indeed the resulting applications? Any response to those questions should take account of another theme that really struck me from Dave’s day in New York. That was the need for resilience.

Resilience

The crucial feature of complex adaptive systems is their unpredictability. Applications operating in such a space will inevitably be subject to problems and threats that we would never have predicted. Even where we can confidently predict the type of threat the magnitude will remain uncertain. Failure is inevitable. What matters is how the application responds.

The need for resilience, with its linked themes of tolerance, diversity and redundancy, was a recurring message in Dave’s class. Resilience is not the same as robustness. The example that Dave gave was that a seawall is robust but a salt marsh is resilient. A seawall is a barrier to large waves and storms. It protects the harbour behind, but if it fails it does so catastrophically. A salt marsh protects inland areas by acting as a buffer, absorbing storm waves rather than repelling them. It might deteriorate over time but it won’t fail suddenly and disastrously.

An increasing challenge for testers will be to look for information about how systems fail, and test for resilience rather than robustness. Tolerance for failure becomes more important than a vain attempt to prevent failure. This tolerance often requires greater redundancy. Stripping out redundancy and maximizing the efficiency of systems has a downside, as I’ve discovered in my career. Greater efficiency can make applications brittle and inflexible. When problems hit they hit hard and recovery can be difficult.
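As a hedged illustration of the difference this makes to testing, here is a small Python sketch. The service and its fallback are invented for the example; the idea is that a resilience-minded test deliberately injects a dependency failure and checks that the system degrades gracefully (the salt marsh), rather than only asserting that the happy path never fails (the seawall).

```python
class RecommendationService:
    """Toy service: tries a live dependency, falls back to cached results."""
    def __init__(self, live_source, cache):
        self.live_source = live_source
        self.cache = cache

    def recommendations(self, user):
        try:
            return self.live_source(user)        # may fail outright
        except Exception:
            return self.cache.get(user, [])      # absorb the failure, degrade gently

def test_degrades_gracefully_when_dependency_fails():
    def broken_dependency(user):
        raise ConnectionError("live recommendation engine is down")

    service = RecommendationService(broken_dependency, cache={"ann": ["book-42"]})

    # Resilience check: the failure is absorbed, not propagated to the user.
    assert service.recommendations("ann") == ["book-42"]
    assert service.recommendations("unknown-user") == []

if __name__ == "__main__":
    test_degrades_gracefully_when_dependency_fails()
    print("resilience check passed")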

[Image: "it could be worse - not sure how, but it could be"]

The six years I spent working as an IT auditor had a huge impact on my thinking. I learned that things would go wrong, that systems would fail, and that they’d do so in ways I couldn’t have envisaged. There is nothing like a spell working as an auditor to imbue one with a gloomy sense of realism about the possibility of perfection, or even adequacy. I ended up like the gloomy old pessimist Eeyore in Winnie the Pooh. When I returned to development work a friend once commented that she could always spot one of my designs. Like Eeyore I couldn’t be certain exactly how things would go wrong, I just knew they would and my experience had taught me where to be wary. I was destined to end up as a tester.

Liz Keogh, in this talk on Safe-to-Fail, makes a similar point.

“Testers are really, really good at spotting failure scenarios… they are awesomely imaginative at calamity… Devs are problem solvers. They spot patterns. Testers spot holes in patterns… I have a theory that other people who are in critical positions, like compliance and governance people are also really good at this”.

Testers should have the creativity to imagine how things might go wrong. In a Complex domain, working with applications that have been developed using Cynefin, this insight and imagination, the ability to spot potential holes, will be extremely valuable. Testers have to seize that opportunity to remain relevant.

There is an upside to redundancy. If there are different ways of achieving the same ends then that diversity will offer more scope for innovation, for users to learn about the application and how it could be adapted and exploited to do more than the developers had imagined. Again, this is an opportunity for testers. Stakeholders need to know about the application and what it can do. Telling them that the application complied with a set of requirements that might have been of dubious relevance and accuracy just doesn’t cut it.

Conclusion

Conclusion is probably the wrong word. Dave Snowden’s class opened my mind to a wide range of new ideas and avenues to explore. This was just the starting point. These two essays can’t go very far in telling you about Cynefin and how it might apply to software testing. All I can realistically do is make people curious to go and learn more for themselves, to explore in more depth. That is what I will be doing, and as a starter I will be in London at the end of June for the London Tester Gathering. I will be at the workshop “An Introduction to Complexity and Cynefin for Software Testers” being run by Martin Hynie and Ben Kelly, where I hope to discuss Cynefin with fellow testers and explorers.

If you are going to the CAST conference in Nashville in August you will have the chance to hear Dave Snowden giving a keynote speech. He really is worth hearing.


Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 1

This is part one of a two post series on Cynefin and software testing. I wrote it with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

Introduction

On May 2nd I attended Dave Snowden’s masterclass in New York, “A leader’s framework for decision making: managing complex projects using Cynefin”, at the Test Leadership Congress. For several years I have been following Dave’s work and I was keen to hear him speak in person. Dave is a gifted communicator, but he moves through his material fast, very fast. In a full day class he threw out a huge range of information, insights and arguments. I was writing frantically throughout, capturing key ideas and phrases I could research in detail later.

It was an extremely valuable day. All of it was relevant to software development, and therefore indirectly to testing. However, it would require a small book to do justice to Dave’s ideas. I will restrict myself to two posts in which I will concentrate on a few key themes that struck me as being particularly important to the testing community.

Our worldview matters

We need to understand how the world works or we will fail to understand the problems we face. We won’t recognise what success might look like, nor will we be able to anticipate unacceptable failure till we are beaten over the head, and we will select the wrong techniques to address problems.

"It ain't what you don't know that gets you into trouble - it's what you know for sure that just ain't so."

Dave used a slide with this quote from Mark Twain. It’s an important point. Software development and testing has been plagued over the years by unquestioned assumptions and beliefs that we were paid well to take for granted, without asking awkward questions, but which just ain’t so. And they’ve got us into endless trouble.

A persistent damaging feature of software development over the years has been the illusion that it is a neater, more orderly process than it really is. We craved certainty, fondly imagining that if we just put a bit more effort and expertise into the upfront analysis and requirements then good, experienced professionals could predictably develop high quality applications. It hardly ever panned out that way, and the cruel twist was that the people who finally managed to crank out something workable picked up the blame for the lack of perfection.

Fred Brooks made the point superbly in his classic paper, “No Silver Bullet”.

“The truth is, the client does not know what he wants. The client usually does not know what questions must be answered, and he has almost never thought of the problem in the detail necessary for specification. … in planning any software-design activity, it is necessary to allow for an extensive iteration between the client and the designer as part of the system definition.

…… it is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product.”

So iteration is required, but that doesn’t mean simply taking a linear process and repeating it. Understanding and applying Cynefin does not mean tackling problems in familiar ways but with a new vocabulary. It means thinking about the world in a different way, drawing on lessons from complexity science, cognitive neuroscience and biological anthropology.

Cynefin and ISO 29119

Cynefin is not based on successful individual cases, or on ideology, or on wishful thinking. Methods that are rooted in successful cases are suspect because of the survivorship bias (how many failed projects did the same thing?), and because people do not remember clearly and accurately what they did after the event; they reinterpret their actions dependent on the outcome. Cynefin is rooted in science and the way things are, the way systems behave, and the way that people behave. Developing software is an activity carried out by humans, for humans, mostly in social organisations. If we follow methods that are not rooted in reality, in science and which don’t allow for the way people behave then we will fail.

Dave Snowden often uses the philosophical phrase “a priori”, usually in the sense of saying that something is wrong a priori. A priori knowledge is based on theoretical deduction, or on mathematics, or the logic of the language in which the proposition is stated. We can say that certain things are true or false a priori, without having to refer to experience. Knowledge based on experience is a posteriori.

The distinction is important in the debate over the software testing standard ISO 29119. The ISO standards lobby has not attempted to defend 29119 on either a priori or a posteriori grounds. The standard has its roots in linear, document-driven development methods that were conspicuously unsuccessful. ISO were unable to cite any evidence or experience to justify their approach.

Defenders of the standard, and some neutrals, have argued that critics must examine the detailed content of the standard, which is extremely expensive to purchase, in order to provide meaningful criticism. However, this defence is misconceived because the standard itself is misconceived. The standard’s stated purpose is “to define an internationally agreed set of standards for software testing that can be used by any organization when performing any form of software testing”. If ISO believes that a linear, prescriptive standard like ISO 29119 will apply to “any form of software testing” then we can refer to Cynefin and say that they are wrong; we can say so confidently knowing that our stance is backed by reputable science and theory. ISO is attempting to introduce a practice that might, sometimes at best, be appropriate for the Obvious domain into the Complicated and Complex domains where it is wildly unsuitable and damaging. ISO is wrong a priori.

What is Cynefin?

The Wikipedia article is worth checking out, not least because Dave Snowden keeps an eye on it. This short video presented by Dave is also helpful.

The Cynefin Framework might look like a quadrant, but it isn’t. It is a collection of five domains that are distinct and clearly defined in principle, but which blur into one another in practice.

In addition to the four domains that look like the cells of a quadrant there is a fifth, in the middle, called Disorder, and this one is crucial to an understanding of the framework and its significance.

Cynefin is not a categorisation model, as would be implied if it were a simple matrix. It is not a matter of dropping data into the framework then cracking on with the work. Cynefin is a framework that is designed to help us make sense of what confronts us, to give us a better understanding of our situation and the approaches that we should take.

The first domain is Obvious, in which there are clear and predictable causes and effects. The second is Complicated, which also has definite causes and effects, but where the connections are not so obvious; expert knowledge and judgement is required.

The third is Complex, where there is no clear cause and effect. We might be able to discern it with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. Dave Snowden and Mary Boone used a key phrase in their Harvard Business Review article about Cynefin.

”Hindsight does not lead to foresight because the external conditions and systems constantly change.”

The fourth domain is Chaotic. Here, urgent action, rather than reflective analysis, is required. The participants must act, sense feedback and respond. Complex situations might be suited to safe probing, which can teach us more about the problem, but such probing is a luxury in the Chaotic domain.

The appropriate responses in all four of these domains are different. In Obvious, the categories are clearly defined, one simply chooses the right one, and that provides the right route to follow. Best practices are appropriate here.

In the Complicated domain there is no single, right category to choose. There could be several valid options, but an expert can select a good route. There are various good practices, but the idea of a single best practice is misconceived.

In the Complex domain it is essential to probe the problem and learn by trial and error. The practices we might follow will emerge from that learning. In Chaos, as I mentioned, we simply have to start with action, firefighting to stop the situation getting worse. It is helpful to remember that, instead of the everyday definition, chaos in Cynefin terms refers to the concept in physics. Here chaos refers to a system that is so dynamic that minor variations in initial conditions lead to outcomes so dramatically divergent that the system is unpredictable. In some circumstances it makes sense to make a deliberate and temporary move into Chaos to learn new practice. That would require removing constraints and the connections that impose some sort of order.
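The physics sense of chaos is easy to demonstrate. This short Python sketch, my own illustration rather than anything from Dave’s class, iterates the well-known logistic map from two starting values that differ by one part in a billion; after a few dozen steps the two trajectories are completely uncorrelated, even though the rule is simple and fully deterministic.

```python
def logistic_map(x, r=4.0, steps=40):
    """Iterate x -> r * x * (1 - x), a standard example of deterministic chaos."""
    values = [x]
    for _ in range(steps):
        x = r * x * (1 - x)
        values.append(x)
    return values

a = logistic_map(0.200000000)
b = logistic_map(0.200000001)   # initial condition differs by 1e-9

for step in (0, 10, 20, 30, 40):
    print(f"step {step:2d}: {a[step]:.6f} vs {b[step]:.6f}")
# After a few dozen steps the two runs bear no resemblance to each other,
# which is why prediction in such systems is futile.
```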

The fifth domain is that of Disorder, in the middle of the diagram. This is the default position in a sense. It’s where we find ourselves when we don’t know which domain we should really be in. It’s therefore the normal starting point. The great danger is that we don’t choose the appropriate domain, but simply opt for the one that fits our instincts or our training, or that is aligned with the organisation’s traditions and culture, regardless of the reality.

The only stable domains are Obvious, Complicated and Complex. Chaotic and Disorder are transitional. You don’t (can’t) stay there. Chaotic is transitional because constraints will kick in very quickly, almost as a reflex. Disorder is transitional because you are actually in one of the other domains, but you just don’t know it.

The different domains have blurred edges. In any context there might be elements that fit into different domains if they are looked at independently. That isn’t a flaw with Cynefin. It merely reflects reality. As I said, Cynefin is not a neat categorisation model. It is intended to help us make sense of what we face. If reality is messy and blurred then there’s no point trying to force it into a straitjacket.

Many projects will have elements that are Obvious, that deal with a problem that is well understood, that we have dealt with before and whose solution is familiar and predictable. However, these are not the parts of a project that should shape the approach we take. The parts where the potential value, and the risk, lie are where we are dealing with something we have not done before. Liz Keogh has given many talks and written some very good blogs and articles about applying Cynefin to software development. Check out her work. This video is a good starter.

The boundaries between the domains are therefore fuzzy, but there is one boundary that is fundamentally different from the others; the border between Obvious and Chaotic. This is not really a boundary at all. It is more of a cliff. If you move from Obvious to Chaotic you don’t glide smoothly into a subtly changing landscape. You fall off the cliff.

Within the Obvious domain the area approaching the cliff is the complacent zone. Here, we think we are working in a neat, ordered environment and “we believe our own myths” as Snowden puts it in the video above. The reality is quite different and we are caught totally unaware when we hit a crisis and plunge off the cliff into chaos.

That was a quick skim through Cynefin. However, you shouldn’t think of it as being a static framework. If you are going to apply it usefully you have to understand the dynamics of the framework, and I will return to that in part two.

Auditors and testing – a rant justified by experience

A couple of weeks ago I was drawn into a discussion on Twitter about auditors and testing. At the time I was on holiday in a delightfully faraway part of Galloway, in south west Scotland.

One of the attractions of the cottage where we were staying was that it lacked a mobile (cell) phone signal, never mind internet access. Only when we happened to be in a pub or restaurant could I sneak onto wifi discreetly, without incurring a disapproving look from my wife.

Having worked as both an IT auditor and a tester, and having both strong opinions and an argumentative nature, I had plenty to say on the subject. That had to wait till I returned (via New York, but that’s another subject) when I unleashed a rant on Twitter. Here is that thread, in a more readable format. It might be a rant, but it is based on extensive experience.

Auditors looking for items they can check that MUST be called test cases? That’s a big, flashing, warning sign they have a lousy conceptual grasp of auditing. It’s true, but missing the point, to say that’s old fashioned. It’s like saying the problem with ISO 29119 is it’s old fashioned.

The crucial point is it’s bad, unprofessional auditing. The company that taught me to audit was promoting good auditing 30 years ago. If anyone had remained ignorant of the transformation in software development in the last 30 years you’d call them idiots, not old-fashioned.

A test case is just a name for a receptacle. It’s a bucket of ideas. Who cares about the bucket? Ideas and evidence really matter to auditors, who live and die by evidence; they expect compelling evidence that the auditees have been thinking about what they are doing. A lack of useful evidence showing what testing has been performed, or a lack of thought about how to test should be certain ways to attract criticism from auditors. The IT auditors’ governance model COBIT5 mentions “test cases” once (in passing). It mentions “ideas” 32 times & “evidence” 16 times.

COBIT5 isn’t just about testing of course. Its principles apply across the whole range of IT, and testing is no exception. Auditors should expect testers to have:

  • a clear vision or strategy of how testing should be performed in their organisation,
  • a clear (but not necessarily detailed) plan for testing the product,
  • relevant, contemporary evidence that justifies and leads inescapably to the conclusions, lessons and insights that the testers derived and reported from their testing.

That’s what auditors should expect. Some (or many?) organisations are locked into a pattern of low quality and low value auditing. They define auditing as brainless compliance checking that is performed by low quality staff who don’t understand what they’re doing. Their work is worthless. As a result audit is held in low esteem in the organisation. Smart people don’t want to work there. Therefore audit must be defined in such a way that low quality staff are able to carry it out.

This is inexcusable. At best it is negligence. Maintaining that model of auditing requires willful ignorance of what the audit profession stipulates. It is damaging and contributes towards the creation of a dysfunctional culture. Nevertheless it is cheap and ensures there are no good auditors who might pose uncomfortable, challenging questions to senior managers.

However, this doesn’t mean there are never times when auditors do need to see test cases. If a contract has been stupidly written so that test cases must be produced and visible then there’s no wriggle room. It’s just the same (and just as stupid) as if the contract says testers must wear pink shirts. It might be stupid but it is a contractual deliverable; auditors will want to see proof of compliance. As Griffin Jones pointed out on seeing my tweet, “often (the contract) is stupidly written – thus the need to get involved with the contracting organization. The problem is bigger than test or SW dev”.

I fully agree with Griffin. Testers should get involved in contractual discussions that will influence their work, in order to anticipate and head off unhelpful contractual terms.

I would add that testers should ask to see the original contract. Contractual terms are sometimes misinterpreted as they are passed through the organisation to the testers. It might be possible to produce the required evidence by smarter means.

Apart from such tiresome contractual requirements, demanding to see “test cases” is a classic case of confusing form and content. It’s unprofessional. That’s not just my opinion; it’s not novel or radical. It’s simply orthodox, professional opinion. Anyone who says otherwise is clueless or bullshitting. Either way they must be resisted. Clueless bullshitters can enjoy good, lucrative careers, but do huge damage. I’ve no respect for them.

The US Food and Drug Administration’s “General Principles of Software Validation” do pose a problem. They date back to 1997, updated in 2002. They are creakily old. They mention test cases many times, but they were written when it was assumed that testing meant writing test cases. The term seems to be used as jargon for tests. If testing satisfies FDA criteria then there’s no obvious reason why you can’t just call planned tests “test cases”.

There’s no requirement to produce test scripts as well as test cases, but expected results with objective pass/fail criteria are required. That doesn’t, and mustn’t, mean testers should be looking only for the expected results. The underlying principle is that compliance should follow the “least burdensome” approach and the FDA do say that they are open to considering alternative approaches to comply with the requirements in a way that is less burdensome.
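As a minimal sketch of what “an expected result with an objective pass/fail criterion” might look like in code, while still leaving room for the tester to record whatever else they noticed, consider the following. The class and field names are my own invention for illustration, not FDA terminology.

```python
from dataclasses import dataclass, field

@dataclass
class PlannedTest:
    """One planned test ("test case" in the FDA's language) with an objective check."""
    description: str
    expected: float
    tolerance: float
    observations: list = field(default_factory=list)   # anything else the tester noticed

    def evaluate(self, actual):
        # Objective pass/fail criterion: actual result within tolerance of expected.
        passed = abs(actual - self.expected) <= self.tolerance
        return "PASS" if passed else "FAIL"

dose_check = PlannedTest(
    description="Calculated infusion rate for a 70 kg patient",
    expected=3.5, tolerance=0.01)

print(dose_check.evaluate(3.505))      # PASS - within the objective criterion
dose_check.observations.append("UI rounded the displayed rate to one decimal place")
```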

Further, the FDA does not have a problem with Agile development (PDF, opens in new tab), and they also approve of exploratory testing, as explained by James Bach.

Frozen in time – grammar and testing standards

This recent tweet by Tyler Hayes caught my eye. “If you build software you’re an anthropologist whether you like it or not.”

It’s an interesting point, and it’s relevant on more than one level. By and large software is developed by people and for people. That is a statement of the obvious, but developers and testers have generally been reluctant to take on board the full implications. This isn’t a simple point about usability. The software we build is shaped by many assumptions about the users, and how they live and work. In turn, the software can reinforce existing structures and practices. Testers should think about these issues if they’re to provide useful findings to the people who matter. You can’t learn everything you need to know from a requirements specification. This takes us deep into anthropological territory.

What is anthropology?

Social anthropology is defined by University College London as follows.

Social Anthropology is the comparative study of the ways in which people live in different social and cultural settings across the globe. Societies vary enormously in how they organise themselves, the cultural practices in which they engage, as well as their religious, political and economic arrangements.

We build software in a social, economic and cultural context that is shaped by myriad factors, which aren’t necessarily conducive to good software, or a happy experience for the developers and testers, never mind the users. I’ve touched on this before in “Teddy Bear Methods“.

There is much that we can learn from anthropology, and not just to help us understand what we see when we look out at the users and the wider world. I’ve long thought that the software development and testing community would make a fascinating subject for anthropologists.

Bureaucracy, grammar and deference to authority

I recently read “The Utopia of Rules – On Technology, Stupidity, and the Secret Joys of Bureaucracy” by the anthropologist David Graeber.
Graeber has many fascinating insights and arguments about how organisations work, and why people are drawn to bureaucracy. One of his arguments is that regulation is imposed and formalised to try and remove arbitrary, random behaviour in organisations. That’s a huge simplification, but there’s not room here to do Graeber’s argument justice. One passage in particular caught my eye.

People do not invent languages by writing grammars, they write grammars — at least, the first grammars to be written for any given language — by observing the tacit, largely unconscious, rules that people seem to be applying when they speak. Yet once a book exists, and especially once it is employed in schoolrooms, people feel that the rules are not just descriptions of how people do talk, but prescriptions for how they should talk.

It’s easy to observe this phenomenon in places where grammars were only written recently. In many places in the world, the first grammars and dictionaries were created by Christian missionaries in the nineteenth or even twentieth century, intent on translating the Bible and other sacred texts into what had been unwritten languages. For instance, the first grammar for Malagasy, the language spoken in Madagascar, was written in the 1810s and ’20s. Of course, language is changing all the time, so the Malagasy spoken language — even its grammar — is in many ways quite different than it was two hundred years ago. However, since everyone learns the grammar in school, if you point this out, people will automatically say that speakers nowadays are simply making mistakes, not following the rules correctly. It never seems to occur to anyone — until you point it out — that had the missionaries came and written their books two hundred years later, current usages would be considered the only correct ones, and anyone speaking as they had two hundred years ago would themselves be assumed to be in error.

In fact, I found this attitude made it extremely difficult to learn how to speak colloquial Malagasy. Even when I hired native speakers, say, students at the university, to give me lessons, they would teach me how to speak nineteenth-century Malagasy as it was taught in school. As my proficiency improved, I began noticing that the way they talked to each other was nothing like the way they were teaching me to speak. But when I asked them about grammatical forms they used that weren’t in the books, they’d just shrug them off, and say, “Oh, that’s just slang, don’t say that.”

…The Malagasy attitudes towards rules of grammar clearly have… everything to do with a distaste for arbitrariness itself — a distaste which leads to an unthinking acceptance of authority in its most formal, institutional form.

Searching for the “correct” way to develop software

Graeber’s phrase “distaste for arbitrariness itself” reminded me of the history of software development. In the 1960s and 70s academics and theorists agonised over the nature of development, trying to discover and articulate what it should be. Their approach was fundamentally mistaken. There are dreadful ways, and there are better ways to develop software but there is no natural, correct way that results in perfect software. The researchers assumed that there was and went hunting for it. Instead of seeking understanding they carried their assumptions about what the answer might be into their studies and went looking for confirmation.

They were trying to understand how the organisational machine worked and looked for mechanical processes. I use the word “machine” carefully, not as a casual metaphor. There really was an assumption that organisations were, in effect, machines. They were regarded as first order cybernetic entities whose behaviour would not vary depending on whether they were being observed. To a former auditor like myself this is a ludicrous assumption. The act of auditing an organisation changes the way that people behave. Even the knowledge that an audit may occur will shape behaviour, and not necessarily for the better (see my article “Cynefin, testing and auditing“). You cannot do the job well without understanding that. Second order cybernetics does recognise this crucial problem and treats observers as participants in the system.

So linear, sequential development made sense. The different phases passing outputs along the production line fitted their conception of the organisation as a machine. Iterative, incremental development looked messy and immature; it was just wrong as far as the researchers were concerned. Feeling one’s way to a solution seemed random, unsystematic – arbitrary.

Development is a difficult and complex job; people will tend to follow methods that make the job feel easier. If managers are struggling with the complexities of managing large projects they are more likely to choose linear, sequential methods that make the job of management easier, or at least less stressful. So when researchers saw development being carried out that way they were observing human behaviour, not a machine operating.

Doubts about this approach were quashed by pointing out that if organisations weren’t quite the neat machine that they should be this would be solved by the rapid advance in the use of computers. This argument looks suspiciously circular because the conclusion that in future organisations would be fully machine-like rests on the unproven premise that software development is a mechanical process which is not subject to human variability when performed properly.

Eliminating “arbitrariness” and ignoring the human element

This might all have been no more than an interesting academic sideline, but it fed back into software development. By the 1970s, when these studies into the nature of development were being carried out, organisations were moving towards increasingly formalised development methods. There was increasing pressure to adopt such methods. Not only were they attractive to managers, the use of more formal methods provided a competitive advantage. ISO certification and CMMI accreditation were increasingly seen as a way to demonstrate that organisations produced high quality software. The evidence may have been weak, but it seemed a plausible claim. These initiatives required formal processes. The sellers of formal methods were happy to look for and cite any intellectual justification for their products. So formal linear methods were underpinned by academic work that assumed that formal linear methods were correct. This was the way that responsible, professional software development was performed. ISO standards were built on this assumption.

If you are trying to define the nature of development you must acknowledge that it is a human activity, carried out by and for humans. These studies about the nature of development were essentially anthropological exercises, but the researchers assumed they were observing and taking apart a machine.

As with the missionaries who were codifying grammar, the point in time when these researchers were working shaped the result. If they had carried out their studies earlier in the history of software development they might have struggled to find credible examples of formalised, linear development. In the 1950s software development was an esoteric activity in which the developers could call the shots. Twenty years later it was part of the corporate bureaucracy and iterative, incremental development was sidelined. If the studies had been carried out a few decades further on it would have been impossible to ignore Agile.

As it transpired, formal methods, CMM/CMMI and the first ISO standards concerning development and testing were all creatures of that era when organisations and their activities were seriously regarded as mechanical. Like the early Malagasy grammar books they codified and fossilised a particular, flawed approach at a particular time for an activity that was changing rapidly. ISO 29119 is merely an updated version of that dated approach to testing. It is rooted in a yearning for bureaucratic certainty, a reluctance to accept that ultimately good testing is dependent not on documentation, but on that most irrational, variable and unpredictable of creatures – the human who is working in a culture shaped by humans. Anthropology has much to teach us.

Further reading

That is the end of the essay, but there is a vast amount of material you could read about attempts to understand and define the nature of software development and of organisations. Here is a small selection.

Brian Fitzgerald has written some very interesting articles about the history of development. I recommend in particular “The systems development dilemma: whether to adopt formalised systems development methodologies or not?” (PDF, opens in new tab).

Agneta Olerup wrote this rather heavyweight study of what she calls the Langeforsian approach to information systems design. Börje Langefors was a highly influential advocate of the mechanical, scientific approach to software development. Langefors’ Wikipedia entry describes him as “one of those who made systems development a science”.

This paper gives a good, readable introduction to first and second order cybernetics (PDF, opens in new tab), including a useful warning about the distinction between models and the entities that they attempt to represent.

All our knowledge of systems is mediated by our simplified representations—or models—of them, which necessarily ignore those aspects of the system which are irrelevant to the purposes for which the model is constructed. Thus the properties of the systems themselves must be distinguished from those of their models, which depend on us as their creators. An engineer working with a mechanical system, on the other hand, almost always know its internal structure and behavior to a high degree of accuracy, and therefore tends to de-emphasize the system/model distinction, acting as if the model is the system.

Moreover, such an engineer, scientist, or “first-order” cyberneticist, will study a system as if it were a passive, objectively given “thing”, that can be freely observed, manipulated, and taken apart. A second-order cyberneticist working with an organism or social system, on the other hand, recognizes that system as an agent in its own right, interacting with another agent, the observer.

Finally, I recommend a fascinating article in the IEEE’s Computer magazine by Craig Larman and Victor Basili, “Iterative and incremental development: a brief history” (PDF, opens in new tab). Larman and Basili argue that iterative and incremental development is not a modern practice, but has been carried out since the 1950s, though they do acknowledge that it was subordinate to the linear Waterfall in the 1970s and 80s. There is a particularly interesting contribution from Gerald Weinberg, a personal communication to the authors, in which he describes how he and his colleagues developed software in the 1950s. The techniques they followed were “indistinguishable from XP”.

Standards – a charming illusion of action

The other day I posted an article I’d written that appeared on the uTest blog a few weeks ago. It was a follow up to an article I wrote last year about ISO 29119. Pmhut (the Project Management Hut website) provided an interesting comment.

”…are you sure that the ISO standards will be really enforced on testing – notably if they don’t really work? After all, lawyers want to get paid and clients want their projects done (regardless of how big the clients are).”

Well, as I answered, whether or not ISO 29119 works is, in a sense, irrelevant. Whether or not it is adopted and enforced will not depend on its value or efficacy. ISO 29119 might go against the grain of good software development and testing, but it is very much aligned with a hugely pervasive trend in bureaucratic, corporate life.

I pointed the commenter to an article I wrote on “Teddy Bear Methods”. People cling to methods not because they work, but because they gain comfort from doing so. That is the only way they can deal with difficult, stressful jobs in messy and complex environments. I could also have pointed to this article “Why do we think we’re different?”, in which I talk about goal displacement, our tendency to focus on what we can manage while losing sight of what we’re supposed to be managing.

A lesson from Afghanistan

I was mulling over this when I started to read a fascinating looking book I was given at Christmas; “Heirs to Forgotten Kingdoms” by Gerard Russell, a deep specialist in the Middle East and a fluent Arabic and Farsi speaker.

The book is about minority religions in the Middle East. Russell is a former diplomat in the British Foreign Office. The foreword was by Rory Stewart, the British Conservative MP. Stewart was writing about his lack of surprise that Russell, a man deeply immersed in the culture of the region, had left the diplomatic service, then added:

”Foreign services and policy makers now want ‘management competency’ – slick and articulate plans, not nuance, deep knowledge, and complexity.”

That sentence resonated with me, and reminded me of a blistering passage from Stewart’s great book “The Places in Between”, his account of walking through the mountains of Afghanistan in early 2002 in the immediate aftermath of the expulsion of the Taliban and the NATO intervention.

Rory Stewart is a fascinating character, far removed from the modern identikit politician. The book is almost entirely a dispassionate account of his adventures and the people whom he met and who provided him with hospitality. Towards the end he lets rip, giving his brutally honest and well-informed perspective of the inadequacies of the western, bureaucratic, managerial approach to building a democratic state where none had previously existed.

It’s worth quoting at some length.

“I now had half a dozen friends working in embassies, thinktanks, international development agencies, the UN and the Afghan government, controlling projects worth millions of dollars. A year before they had been in Kosovo or East Timor and in a year’s time they would have been moved to Iraq or Washington or New York.

Their objective was (to quote the United Nations Assistance Mission for Afghanistan) ‘The creation of a centralised, broad-based, multi-ethnic government committed to democracy, human rights and the rule of law’. They worked twelve- or fourteen- hour days, drafting documents for heavily-funded initiatives on ‘democratisation’, ‘enhancing capacity’, ‘gender’, ‘sustainable development,’ ‘skills training’ or ‘protection issues’. They were mostly in their late twenties or early thirties, with at least two degrees – often in international law, economics or development. They came from middle class backgrounds in Western countries and in the evenings they dined with each other and swapped anecdotes about corruption in the Government and the incompetence of the United Nations. They rarely drove their 4WDs outside Kabul because they were forbidden to do so by their security advisers. There were people who were experienced and well informed about conditions in rural areas of Afghanistan. But such people were barely fifty individuals out of many thousands. Most of the policy makers knew next to nothing about the villages where 90% of the population of Afghanistan lived…

Their policy makers did not have the time, structures or resources for a serious study of an alien culture. They justified their lack of knowledge and experience by focusing on poverty and implying that dramatic cultural differences did not exist. They acted as though villagers were interested in all the priorities of international organisations, even when they were mutually contradictory…

Critics have accused this new breed of administrators of neo-colonialism. But in fact their approach is not that of a nineteenth-century colonial officer. Colonial administrations may have been racist and exploitative but they did at least work seriously at the business of understanding the people they were governing. They recruited people prepared to spend their entire careers in dangerous provinces of a single alien nation. They invested in teaching administrators and military officers the local language…

Post-conflict experts have got the prestige without the effort or stigma of imperialism. Their implicit denial of the difference between cultures is the new mass brand of international intervention. Their policy fails but no one notices. There are no credible monitoring bodies and there is no one to take formal responsibility. Individual officers are never in any one place and rarely in any one organisation long enough to be adequately assessed. The colonial enterprise could be judged by the security or revenue it delivered, but neo-colonialists have no such performance criteria. In fact their very uselessness benefits them. By avoiding any serious action or judgement they, unlike their colonial predecessors, are able to escape accusations of racism, exploitation and oppression.

Perhaps it is because no one requires more than a charming illusion of action in the developing world. If the policy makers know little about the Afghans, the public knows even less, and few care about policy failure when the effects are felt only in Afghanistan.”

Stewart’s experience and insight, backed up by the recent history of Afghanistan, allow him to present an irrefutable case. Yet, in the eyes of pretty much everyone who matters he is wrong. Governments and the military are prepared to ignore the evidence and place their trust in irrelevant and failed techniques rather than confront the awful truth; they don’t know what they’re doing and they can’t know the answers.

Vast sums of money, and millions of lives are at stake. Yet very smart and experienced people will cling on to things that don’t work, and will repeat their mistakes in the future. Stewart, meanwhile, is very unlikely to be allowed anywhere near the levers of power in the United Kingdom. Being right isn’t necessarily a great career move.

Deep knowledge, nuance and complexity

I’m conscious that I’m mixing up quite different subjects here. Software development and testing are very different activities from state building. However, both are complex and difficult. Governments fail repeatedly at something as important and high-profile as constructing new, democratic states, and do so without feeling the need to reconsider their approach. If that can happen in the glare of publicity is it likely that corporations will refrain from adopting and enforcing standards just because they don’t work? Whether or not they work barely matters. Such approaches fit the mindset and culture of many organisations, especially large bureaucracies, and once adopted it is very difficult to persuade them to abandon them.

Any approach to testing that is based on standardisation is doomed to fail unless you define success in a way that is consistent with the flawed assumptions of the standardisation. What’s the answer? Not adopting standards that don’t work is an obvious start, but that doesn’t take you very far. You’ve got to acknowledge those things that Stewart referred to in his foreword to Gerard Russell’s book; answers aren’t easy, they require deep knowledge, an understanding of nuance and an acceptance of complexity.

A video worth watching

Finally, I’d strongly recommend this video of Rory Stewart being interviewed by Harry Kreisler of the University of California about his experiences and the problems I’ve been discussing. I’ve marked the parts I found most interesting.

34 minutes; Stewart is asked about applying abstract ideas in practice.

40:20; Stewart talks about a modernist approach of applying measurement, metrics and standardisation in contexts where they are irrelevant.

47:05; Harry Kreisler and then Stewart talk about participants failing to spot the obvious, that their efforts are futile.

49:33; Stewart says his Harvard students regarded him as a colourful contrarian. They believed that all Afghanistan needed was a new plan and new resources.

Service Virtualization interview about usability

This interview with Service Virtualization appeared in January 2015. Initially when George Lawton approached me I wasn’t enthusiastic. I didn’t think I would have much to say. However, the questions set me thinking, and I felt they were relevant to my experience so I was happy to take part. It gave me something to do while I was waiting to fly back from EuroSTAR in Dublin!

How does usability relate to the notion of the purpose of a software project?

When I started in IT over 30 years ago I never heard the word usability. It was “user friendliness”, but that was just a nice thing to have. It was nice if your manager was friendly, but that was incidental to whether he was actually good at the job. Likewise, user friendliness was incidental. If everything else was ok then you could worry about that, but no-one was going to spend time or money, or sacrifice any functionality just to make the application user friendly. And what did “user friendly” mean anyway? “Who knows? Who cares? We’ve got serious work to do. Forget about that touchy feely stuff.”

The purpose of software development was to save money by automating clerical routines. Any online part of the system was a mildly anomalous relic of the past. It was just a way of getting the data into the system so the real work could be done. Ok, that’s an over-simplification, but I think there’s enough truth in it to illustrate why developers just didn’t much care about the users and their experience. Development moved on from that to changing the business, rather than merely changing the business’s bureaucracy, but it took a long time for these attitudes to shift.

The internet revolution turned everything upside down. Users are no longer employees who have to put up with whatever they’re given. They are more likely to be customers. They are ruthless and rightly so. Is your website confusing? Too slow to load? Your customers have gone to your rivals before you’ve even got anywhere near their credit card number.

The lesson that’s been getting hammered into the heads of software engineers over the last decade or so is that usability isn’t an extra. I hate the way that we traditionally called it a “non-functional requirement”, or one of the “quality criteria”. Usability is so important and integral to every product that telling developers that they’ve got to remember it is like telling drivers they’ve got to remember to use the steering wheel and the brakes. If they’re not doing these things as a matter of course they shouldn’t be allowed out in public. Usability has to be designed in from the very start. It can’t be considered separately.

What are the main problems in specifying for and designing for software usability?

Well, who’s using the application? Where are they? What is the platform? What else are they doing? Why are they using the application? Do they have an alternative to using your application, and if so, how do you keep them with yours? All these things can affect decisions you take that are going to have a massive impact on usability.

It’s payback time for software engineering. In the olden days it would have been easy to answer these questions, but we didn’t care. Now we have to care, and it’s all got horribly difficult.

These questions require serious research plus the experience and nous to make sound judgements with imperfect evidence.

In what ways do organisations lose track of the usability across the software development lifecycle?

I’ve already hinted at a major reason. Treating usability as a non-functional requirement or quality criterion is the wrong approach. That segregates the issue. It’s treated as being like the other quality criteria, the “…ities” like security, maintainability, portability, reliability. It creates the delusion that the core function is of primary importance and the other criteria can be tackled separately, even bolted on afterwards.

Lewis & Rieman came out with a great phrase fully 20 years ago to describe that mindset. They called it the peanut butter theory of usability. You built the application, and then at the end you smeared a nice interface over the top, like a layer of peanut butter (PDF, opens in new tab).

“Usability is seen as a spread that can be smeared over any design, however dreadful, with good results if the spread is thick enough. If the underlying functionality is confusing, then spread a graphical user interface on it. … If the user interface still has some problems, smear some manuals over it. If the manuals are still deficient, smear on some training which you force users to take.”

Of course they were talking specifically about the idea that usability was a matter of getting the interface right, and that it could be developed separately from the main application. However, this was an incredibly damaging fallacy amongst usability specialists in the 80s and 90s. Experts like Hartson & Hix, Edmonds, and Green put a huge effort into trying to justify the idea. Perhaps the arrival of Object Oriented technology contributed towards the confusion. A low level of coupling, so that different parts of the system are independent of each other, is a good thing. I wonder if that lured usability professionals into believing what they wanted to believe, that they could be independent from the grubby developers.

Usability professionals tried to persuade themselves that they could operate a separate development lifecycle that would liberate them from the constraints and compromises that would be inevitable if they were fully integrated into development projects. The fallacy was flawed conceptually and architecturally. However, it was also a politically disastrous approach. The usability people made themselves even less visible, and were ignored at a time when they really needed to be getting more involved at the heart of the development process.

As I’ve explained, the developers were only too happy to ignore the usability people. They were following methods and lifecycles that couldn’t easily accommodate usability.

How can organisations incorporate the idea of usability engineering into the software development and testing process?

There aren’t any right answers, certainly none that will guarantee success. However, there are plenty of wrong answers. Historically in software development we’ve kidded ourselves thinking that the next fad, whether Structured Methods, Agile, CMMi or whatever, will transform us into rigorous, respected professionals who can craft high quality applications. Now some (like Structured Methods) suck, while others (like Agile) are far more positive, but the uncomfortable truth is that it’s all hard and the most important thing is our attitude. We have to acknowledge that development is inherently very difficult. Providing good UX is even harder and it’s not going to happen organically as a by-product of some over-arching transformation of the way we develop. We have to consciously work at it.

Whatever the answer is for any particular organisation it has to incorporate UX at the very heart of the process, from the start. Iteration and prototyping are both crucial. One of the few fundamental truths of development is that users can’t know what they want and like till they’ve seen what is possible and what might be provided.

Even before the first build there should have been some attempt to understand the users and how they might be using the proposed product. There should be walkthroughs of the proposed design. It’s important to get UX professionals involved, if at all possible. I think developers have advanced to the point that they are less likely to get it horribly wrong, but actually getting it right, and delivering good UX is asking too much. For that I think you need the professionals.

I do think that Agile is much better suited to producing good UX than traditional methods, but there are still dangers. A big one is that many Agile developers are understandably sceptical about anything that smells of Big Up-Front Analysis and Design. It’s possible to strike a balance and learn about your users and their needs without committing to detailed functional requirements and design.

How can usability relate to the notion of testable hypothesis that can lead to better software?

Usability and testability go together naturally. They’re also consistent with good development practice. I’ve worked on, or closely observed, many applications where the design had been fixed and the build had been completed before anyone realised that there were serious usability problems, or that it would be extremely difficult to detect and isolate defects, or that there would be serious performance issues arising from the architectural choices that had been made.

We need to learn from work that’s been done with complexity theory and organisation theory. Developing software is mostly a complex activity, in the sense that there are rarely predictable causes and effects. Good outcomes emerge from trialling possible solutions. These possibilities aren’t just guesswork. They’re based on experience, skill, knowledge of the users. But that initial knowledge can’t tell you the solution, because trying different options changes your understanding of the problem. Indeed it changes the problem. The trials give you more knowledge about what will work. So you have to create further opportunities that will allow you to exploit that knowledge. It’s a delusion that you can get it right first time just by running through a sequential process. It would help if people thought of good software as being grown rather than built.