Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 2

This post is the second of two discussing Dave Snowden’s recent Cynefin masterclass at the Test Leadership Congress in New York. I wrote the series with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

In the first I gave an overview of Cynefin and explained why I think it is important, and how it can helpfully shape the way we look at the world and make sense of the problems we face. In this post I will look at some of the issues raised in Dave’s class and discuss their relevance to development and testing.

The dynamics between domains

Understanding that the boundaries between the different domains are fluid and permeable is crucial to understanding Cynefin. A vital lesson is that we don’t start in one domain and stay there; we can and should move between them. Even if we ignore that lesson reality will drag us from one domain to another. Dave said “all the domains have value – it’s the ability to move between them that is key”.

The Cynefin dynamics are closely tied to the concept of constraints, which are so important to Cynefin that they act as differentiators between the domains. You could say that constraints define the domains.

Constraint is perhaps a slightly misleading word. In Cynefin terms it is not necessarily something that compels or prevents certain behaviour. That does apply to the Obvious domain, where the constraints are fixed and rigid. The constraints in the Complicated domain govern behaviour, and can be agreed by expert consensus. In the Complex domain the constraints enable action, rather than restricting it or compelling it. They are a starting point rather than an end. In Chaos there are no constraints.

Dave Snowden puts it as follows, differentiating rules and heuristics.

“Rules are governing constraints, they set limits to action, they contain all possible instances of action. In contrast heuristics are enabling constraints, they provide measurable guidance which can adapt to the unknowable unknowns.”

If we can change the constraints then we are moving from one domain to another. The most significant dynamic is the cycle between Complex and Complicated.

Cynefin core dynamic - Complex to Complicated

Crucially, we must recognise that if we are attempting something new that involves a significant amount of uncertainty, then we start in the Complex domain, exploring and discovering more about the problem. Once we have a better understanding and have found constraints that allow us to achieve repeatable outcomes, we have moved the problem to the Complicated domain, where we can manage it more easily and exploit our new knowledge. If our testing reveals that the constraints are not producing repeatable results then it’s important to get back into the Complex domain and carry out some more probing experiments.

This is not a one-off move. We have to keep cycling to ensure the solution remains relevant. The cadence, or natural flow, of the cycle will vary depending on the context. Different industries, or sectors, or applications will have different cadences. It could be days, or years, or anything in between. If, or rather when, our constraints fail to produce repeatable results we have to get back into the Complex domain.

This cycle between Complex and Complicated is key for software development in particular. Understanding this dynamic is essential in order to understand how Cynefin might be employed.

Setting up developments

As I said earlier the parts of a software development project that will provide value are where we are doing something new, and that is where the risk also lies. Any significant and worthwhile development project will start in the Complex domain. The initial challenge is to learn enough to move it to Complicated. Dave explained it as follows in a talk at Agile India in 2015.

“As things are Complex we see patterns, patterns emerge. We stabilise the patterns. As we stabilise them we can actually shift them into the Complicated domain. So the basic principle of Complexity-based intervention is you start off with multiple, parallel, safe-to-fail experiments, which is why Scrum is not a true Complexity technique; it does one thing in a linear way. We call (these experiments) a pre-Scrum technique. You do smaller experiments faster in parallel… So you’re moving from the centre of the Complex domain into the boundary, once you’re in the boundary you use Scrum to move it across the boundary.”

Such a safe-to-fail experiment might be an XP pair programming team being assigned to knock up a small, quick prototype.

So the challenge in starting the move from Complex to Complicated is to come up with the ideas for safe-to-fail pre-Scrum experiments that would allow us to use Scrum effectively.

Dave outlined the criteria that suitable experiments should meet. There should be some way of knowing whether the experiment is succeeding and it must be possible to amplify (i.e. reinforce) signs of success. Similarly, there should be some way of knowing whether it is failing and of dampening, or reducing, the damaging impact of a failing experiment. Failure is not bad. In any useful set of safe-to-fail experiments some must fail if we are to learn anything worthwhile. The final criterion is that the experiment must be coherent. This idea of coherence requires more attention.

Dave Snowden explains the tests for coherence here. He isn’t entirely clear about how rigid these tests should be. Perhaps it’s more useful to regard them as heuristics than fixed rules, though the first two are of particular importance.

  • A coherent experiment, the ideas and assumptions behind it, should be compatible with natural science. That might seem like a rather banal statement, till you consider all the massive IT developments and change programmes that were launched in blissful ignorance of the fact that science could have predicted inevitable failure.
  • There should be some evidence from elsewhere to support the proposal. Replicating past cases is no guarantee of success, far from it, but it is a valid way to try and learn about the problem.
  • The proposal should fit where we are. It has to be consistent to some degree with what we have been doing. A leap into the unknown attempting something that is utterly unfamiliar is unlikely to gain any traction.
  • Can the proposal pass a series of “ritual dissent” challenges? These are a formalised way of identifying flaws and refining possible experiments.
  • Does the experiment reflect an unmet, unarticulated need that has been revealed by sense-making, by attempts to make sense of the problem?

The two latter criteria refer explicitly to Cynefin techniques. The final one, identifying unmet needs, assumes the use of Cognitive Edge’s SenseMaker. Remember Fred Brooks’ blunt statement about requirements? Clients do not know what they want. They cannot articulate their needs if they are asked directly. They cannot envisage what is possible. Dave Snowden takes that point further. If users can articulate their needs then you’re dealing with a commoditized product and the solution is unlikely to have great value. Real value lies in meeting needs that users are unaware of and that they cannot articulate. This has always been so, but in days of yore we could often get away with ignoring that problem. Most applications were in-house developments that either automated back-office functions or were built around business rules and clerical processes that served as an effective proxy for true requirements. The inadequacies of the old structured methods and traditional requirements gathering could be masked.

With the arrival of web development, and then especially with mobile technology, this gulf between user needs and the ability of developers to grasp them became a problem that could be ignored only through wilful blindness, admittedly a trait that has never been in short supply in corporate life. The problem has been exacerbated by our historic willingness to confuse rigour with a heavily documented, top-down approach to software development. Sense-making entails capturing large numbers of user reports in order to discern patterns that can be exploited. This appears messy, random and unstructured to anyone immured in traditional ways of development. It might appear to lack rigour, but such an approach is in accord with messy, unpredictable reality. That means it offers a more rigorous and effective way of deriving requirements than we can get by pretending that every development belongs in the Obvious domain. A simple lesson I’ve had to learn and relearn over the years is that rigour and structure are not the same as heavy documentation, prescriptive methods and a linear, top-down approach to problem solving.

This all raises big questions for testers. How do we respond? How do we get involved in testing requirements that have been derived this way and indeed the resulting applications? Any response to those questions should take account of another theme that really struck me from Dave’s day in New York. That was the need for resilience.

Resilience

The crucial feature of complex adaptive systems is their unpredictability. Applications operating in such a space will inevitably be subject to problems and threats that we would never have predicted. Even where we can confidently predict the type of threat the magnitude will remain uncertain. Failure is inevitable. What matters is how the application responds.

The need for resilience, with its linked themes of tolerance, diversity and redundancy, was a recurring message in Dave’s class. Resilience is not the same as robustness. The example that Dave gave was that a seawall is robust but a salt marsh is resilient. A seawall is a barrier to large waves and storms. It protects the harbour behind, but if it fails it does so catastrophically. A salt marsh protects inland areas by acting as a buffer, absorbing storm waves rather than repelling them. It might deteriorate over time but it won’t fail suddenly and disastrously.

An increasing challenge for testers will be to look for information about how systems fail, and test for resilience rather than robustness. Tolerance for failure becomes more important than a vain attempt to prevent failure. This tolerance often requires greater redundancy. Stripping out redundancy and maximizing the efficiency of systems has a downside, as I’ve discovered in my career. Greater efficiency can make applications brittle and inflexible. When problems hit they hit hard and recovery can be difficult.

it could be worse - not sure how, but it could be

The six years I spent working as an IT auditor had a huge impact on my thinking. I learned that things would go wrong, that systems would fail, and that they’d do so in ways I couldn’t have envisaged. There is nothing like a spell working as an auditor to imbue one with a gloomy sense of realism about the possibility of perfection, or even adequacy. I ended up like the gloomy old pessimist Eeyore in Winnie the Pooh. When I returned to development work a friend once commented that she could always spot one of my designs. Like Eeyore I couldn’t be certain exactly how things would go wrong, I just knew they would and my experience had taught me where to be wary. I was destined to end up as a tester.

Liz Keogh, in this talk on Safe-to-Fail, makes a similar point.

“Testers are really, really good at spotting failure scenarios… they are awesomely imaginative at calamity… Devs are problem solvers. They spot patterns. Testers spot holes in patterns… I have a theory that other people who are in critical positions, like compliance and governance people are also really good at this”.

Testers should have the creativity to imagine how things might go wrong. In the Complex domain, working with applications that have been developed using Cynefin, this insight and imagination, the ability to spot potential holes, will be extremely valuable. Testers have to seize that opportunity to remain relevant.

There is an upside to redundancy. If there are different ways of achieving the same ends then that diversity will offer more scope for innovation, for users to learn about the application and how it could be adapted and exploited to do more than the developers had imagined. Again, this is an opportunity for testers. Stakeholders need to know about the application and what it can do. Telling them that the application complied with a set of requirements that might have been of dubious relevance and accuracy just doesn’t cut it.

Conclusion

Conclusion is probably the wrong word. Dave Snowden’s class opened my mind to a wide range of new ideas and avenues to explore. This was just the starting point. These two essays can’t go very far in telling you about Cynefin and how it might apply to software testing. All I can realistically do is make people curious to go and learn more for themselves, to explore in more depth. That is what I will be doing, and as a starter I will be in London at the end of June for the London Tester Gathering. I will be at the workshop “An Introduction to Complexity and Cynefin for Software Testers” being run by Martin Hynie and Ben Kelly, where I hope to discuss Cynefin with fellow testers and explorers.

If you are going to the CAST conference in Nashville in August you will have the chance to hear Dave Snowden giving a keynote speech. He really is worth hearing.

Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 1

This is part one of a two post series on Cynefin and software testing. I wrote it with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

Introduction

On May 2nd I attended Dave Snowden’s masterclass in New York, “A leader’s framework for decision making: managing complex projects using Cynefin”, at the Test Leadership Congress. For several years I have been following Dave’s work and I was keen to hear him speak in person. Dave is a gifted communicator, but he moves through his material fast, very fast. In a full day class he threw out a huge range of information, insights and arguments. I was writing frantically throughout, capturing key ideas and phrases I could research in detail later.

It was an extremely valuable day. All of it was relevant to software development, and therefore indirectly to testing. However, it would require a small book to do justice to Dave’s ideas. I will restrict myself to two posts in which I will concentrate on a few key themes that struck me as being particularly important to the testing community.

Our worldview matters

We need to understand how the world works or we will fail to understand the problems we face. We won’t recognise what success might look like, nor will we be able to anticipate unacceptable failure till we are beaten over the head, and we will select the wrong techniques to address problems.

“It ain’t what you don’t know that gets you into trouble - it’s what you know for sure that just ain’t so.”

Dave used a slide with this quote from Mark Twain. It’s an important point. Software development and testing has been plagued over the years by unquestioned assumptions and beliefs that we were paid well to take for granted, without asking awkward questions, but which just ain’t so. And they’ve got us into endless trouble.

A persistent damaging feature of software development over the years has been the illusion that it is a neater, more orderly process than it really is. We craved certainty, fondly imagining that if we just put a bit more effort and expertise into the upfront analysis and requirements then good, experienced professionals could predictably develop high quality applications. It hardly ever panned out that way, and the cruel twist was that the people who finally managed to crank out something workable picked up the blame for the lack of perfection.

Fred Brooks made the point superbly in his classic paper, “No Silver Bullet”.

“The truth is, the client does not know what he wants. The client usually does not know what questions must be answered, and he has almost never thought of the problem in the detail necessary for specification. … in planning any software-design activity, it is necessary to allow for an extensive iteration between the client and the designer as part of the system definition.

…… it is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product.”

So iteration is required, but that doesn’t mean simply taking a linear process and repeating it. Understanding and applying Cynefin does not mean tackling problems in familiar ways but with a new vocabulary. It means thinking about the world in a different way, drawing on lessons from complexity science, cognitive neuroscience and biological anthropology.

Cynefin and ISO 29119

Cynefin is not based on successful individual cases, or on ideology, or on wishful thinking. Methods that are rooted in successful cases are suspect because of survivorship bias (how many failed projects did the same thing?), and because people do not remember clearly and accurately what they did after the event; they reinterpret their actions dependent on the outcome. Cynefin is rooted in science and the way things are, the way systems behave, and the way that people behave. Developing software is an activity carried out by humans, for humans, mostly in social organisations. If we follow methods that are not rooted in reality, in science, and which don’t allow for the way people behave, then we will fail.

Dave Snowden often uses the philosophical phrase “a priori”, usually in the sense of saying that something is wrong a priori. A priori knowledge is based on theoretical deduction, or on mathematics, or the logic of the language in which the proposition is stated. We can say that certain things are true or false a priori, without having to refer to experience. Knowledge based on experience is a posteriori.

The distinction is important in the debate over the software testing standard ISO 29119. The ISO standards lobby has not attempted to defend 29119 on either a priori or on a posteriori grounds. The standard has its roots in linear, document driven development methods that were conspicuously unsuccessful. ISO were unable to cite any evidence or experience to justify their approach.

Defenders of the standard, and some neutrals, have argued that critics must examine the detailed content of the standard, which is extremely expensive to purchase, in order to provide meaningful criticism. However, this defence is misconceived because the standard itself is misconceived. The standard’s stated purpose is “to define an internationally agreed set of standards for software testing that can be used by any organization when performing any form of software testing”. If ISO believes that a linear, prescriptive standard like ISO 29119 will apply to “any form of software testing” we can refer to Cynefin and say that they are wrong; we can say so confidently, knowing that our stance is backed by reputable science and theory. ISO is attempting to introduce a practice that might, sometimes at best, be appropriate for the Obvious domain into the Complicated and Complex domains, where it is wildly unsuitable and damaging. ISO is wrong a priori.

What is Cynefin?

The Wikipedia article is worth checking out, not least because Dave Snowden keeps an eye on it. This short video presented by Dave is also helpful.

The Cynefin Framework might look like a quadrant, but it isn’t. It is a collection of five domains that are distinct and clearly defined in principle, but which blur into one another in practice.

In addition to the four domains that look like the cells of a quadrant there is a fifth, in the middle, called Disorder, and this one is crucial to an understanding of the framework and its significance.

Cynefin is not a categorisation model, as would be implied if it were a simple matrix. It is not a matter of dropping data into the framework then cracking on with the work. Cynefin is a framework that is designed to help us make sense of what confronts us, to give us a better understanding of our situation and the approaches that we should take.

The first domain is Obvious, in which there are clear and predictable causes and effects. The second is Complicated, which also has definite causes and effects, but where the connections are not so obvious; expert knowledge and judgement is required.

The third is Complex, where there is no clear cause and effect. We might be able to discern it with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. Dave Snowden and Mary Boone used a key phrase in their Harvard Business Review article about Cynefin.

“Hindsight does not lead to foresight because the external conditions and systems constantly change.”

The fourth domain is Chaotic. Here, urgent action, rather than reflective analysis, is required. The participants must act, sense feedback and respond. Complex situations might be suited to safe probing, which can teach us more about the problem, but such probing is a luxury in the Chaotic domain.

The appropriate responses in all four of these domains are different. In Obvious, the categories are clearly defined, one simply chooses the right one, and that provides the right route to follow. Best practices are appropriate here.

In the Complicated domain there is no single, right category to choose. There could be several valid options, but an expert can select a good route. There are various good practices, but the idea of a single best practice is misconceived.

In the Complex domain it is essential to probe the problem and learn by trial and error. The practices we might follow will emerge from that learning. In Chaos, as I mentioned, we simply have to start with action, firefighting to stop the situation getting worse. It is helpful to remember that, instead of the everyday definition, chaos in Cynefin terms refers to the concept in physics. Here chaos refers to a system that is so dynamic that minor variations in initial conditions lead to outcomes so dramatically divergent that the system is unpredictable. In some circumstances it makes sense to make a deliberate and temporary move into Chaos to learn new practice. That would require removing constraints and the connections that impose some sort of order.
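As a rough illustration of that physics sense of chaos (my own sketch, not an example from Dave’s class), the logistic map shows how two trajectories that start almost identically can diverge completely within a few dozen steps.

    # Minimal, illustrative sketch of sensitive dependence on initial conditions,
    # using the logistic map x -> r * x * (1 - x) with r = 4 (a chaotic regime).
    def logistic_trajectory(x0, r=4.0, steps=30):
        xs = [x0]
        for _ in range(steps):
            xs.append(r * xs[-1] * (1 - xs[-1]))
        return xs

    a = logistic_trajectory(0.20000)
    b = logistic_trajectory(0.20001)  # a tiny perturbation of the starting point

    # After 30 iterations the two runs bear no resemblance to each other,
    # even though they started a hair's breadth apart.
    print(a[-1], b[-1])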

The fifth domain is that of Disorder, in the middle of the diagram. This is the default position in a sense. It’s where we find ourselves when we don’t know which domain we should really be in. It’s therefore the normal starting point. The great danger is that we don’t choose the appropriate domain, but simply opt for the one that fits our instincts or our training, or that is aligned with the organisation’s traditions and culture, regardless of the reality.

The only stable domains are Obvious, Complicated and Complex. Chaotic and Disorder are transitional. You don’t (can’t) stay there. Chaotic is transitional because constraints will kick in very quickly, almost as a reflex. Disorder is transitional because you are actually in one of the other domains, but you just don’t know it.

The different domains have blurred edges. In any context there might be elements that fit into different domains if they are looked at independently. That isn’t a flaw with Cynefin. It merely reflects reality. As I said, Cynefin is not a neat categorisation model. It is intended to help us make sense of what we face. If reality is messy and blurred then there’s no point trying to force it into a straitjacket.

Many projects will have elements that are Obvious, that deal with a problem that is well understood, that we have dealt with before and whose solution is familiar and predictable. However, these are not the parts of a project that should shape the approach we take. The parts where the potential value, and the risk, lie are where we are dealing with something we have not done before. Liz Keogh has given many talks and written some very good blogs and articles about applying Cynefin to software development. Check out her work. This video is a good starter.

The boundaries between the domains are therefore fuzzy, but there is one boundary that is fundamentally different from the others; the border between Obvious and Chaotic. This is not really a boundary at all. It is more of a cliff. If you move from Obvious to Chaotic you don’t glide smoothly into a subtly changing landscape. You fall off the cliff.

Within the Obvious domain the area approaching the cliff is the complacent zone. Here, we think we are working in a neat, ordered environment and “we believe our own myths” as Snowden puts it in the video above. The reality is quite different and we are caught totally unaware when we hit a crisis and plunge off the cliff into chaos.

That was a quick skim through Cynefin. However, you shouldn’t think of it as being a static framework. If you are going to apply it usefully you have to understand the dynamics of the framework, and I will return to that in part two.

A single source of truth?

Lately in a chatroom for the International Society for Software Testing there has been some discussion about the idea of a “single source of truth”. I’m familiar with this in the sense of database design. Every piece of data is stored once and the design precludes the possibility of inconsistency, of alternative versions of the same data. That makes sense in this narrow context, but the discussion revealed that the phrase is now being used in a different sense. A single source of truth has been used to describe an oracle of oracles, an ultimate specification on which total reliance can be placed. The implications worry me, especially for financial systems, which is my background.

I’m not comfortable with a single source of truth, especially when it applies to things like bank balances, profit and loss figures, or indeed any non-trivial result of calculations. What might make more sense is to talk of a single statement of truth, and that statement could, and should, have multiple sources so the statement is transparent and can be validated. However, I still wouldn’t want to talk about truth in financial statements. For an insurance premium there are various different measures, which have different uses to different people at different times. When people start talking about a single, true, premium figure they are closing off their minds to reality and trying to redefine it to suit their limited vision.

All of these competing measures could be regarded as true in the right context, but there are other measures which are less defensible and which an expert would consider wrong, or misleading, in any context (e.g. lumping Insurance Premium Tax into the premium figure). That’s all quite aside from the question of whether these measures are accurate on their own terms.
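To make that concrete, here is a deliberately simplified sketch (my own, with made-up component names and rates, not any real insurer’s figures) of how one set of underlying numbers yields several defensible premium measures, each derived transparently from the same sources, alongside the IPT-inclusive figure that an expert would flag as misleading.

    # Illustrative only: hypothetical components of a premium and the different
    # "premium" figures that different stakeholders might ask for. The names and
    # rates here are assumptions for this sketch, not real figures.
    gross_premium = 1000.00      # charged to the policyholder, before tax
    commission_rate = 0.15       # broker commission as a share of gross (assumed)
    ipt_rate = 0.12              # Insurance Premium Tax rate (assumed)

    net_of_commission = gross_premium * (1 - commission_rate)  # underwriter's view
    premium_plus_ipt = gross_premium * (1 + ipt_rate)          # gross with tax lumped in

    # Each figure is traceable back to the same sources, so each can be validated.
    # The first two are "true" in the right context; the third is the kind of
    # measure described above as misleading in any context.
    print(gross_premium, net_of_commission, premium_plus_ipt)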

A “single source of truth” reminds me of arguments I’d have with application designers. Sometimes the problem would be that they wanted to eliminate any redundancy in the design. That could make reconciliation and error detection much harder because the opportunities to spot errors would be reduced. If a calculation was wrong it might stay wrong because no-one would know. A different source of friction was the age old problem of analysts and designers determined to stick rigidly to the requirements without questioning them, or even really thinking about the implications. I suspect I was regarded as a pedantic nuisance, creating problems in places the designers were determined no problems could ever exist – or ever be visible.

Accounting for truth

Conventional financial accounting is based on double entry book-keeping, which requires every transaction to be entered twice, in different places so that the accounts as a whole remain in balance. There may be a single, definitive statement of profit, but that is distilled from multiple sources, with an intricate web of balances and documented, supporting assumptions. The whole thing is therefore verifiable, or auditable. But it’s not truth. It’s more a matter of saying “given these assumptions this set of transactions produces the following profit figure”. Vary the assumptions and you have a different and perhaps equally valid figure – so it’s not truth.
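As a minimal sketch of what that balancing discipline buys you (a toy example of my own, not real accounting software), every transaction is posted to two accounts, so a one-sided keying error makes the whole ledger visibly fail to balance.

    # Toy double-entry ledger: every posting is a debit in one account and a
    # credit in another, so the ledger as a whole must always sum to zero.
    from collections import defaultdict

    ledger = defaultdict(float)

    def post(debit_account, credit_account, amount):
        ledger[debit_account] += amount
        ledger[credit_account] -= amount

    post("bank", "sales", 6.0)           # sell the finished product for £6
    post("cost_of_sales", "bank", 5.0)   # the inputs cost £5

    # The verifiability comes from this check: if only one side of a transaction
    # is entered, the totals no longer sum to zero and the error is visible.
    assert abs(sum(ledger.values())) < 1e-9, "ledger out of balance"

    profit = -ledger["sales"] - ledger["cost_of_sales"]  # £1, given these assumptions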

For many years academic accountants, e.g. Christopher Napier, have been doing fascinating work that strays over into philosophy. What is this reality that we are trying to understand? That’s ontology. What can we know about it, and what reliance can we put on that knowledge when we try to report it? That’s epistemology. Why are we doing it? That’s teleology.

The most interesting subject I ever studied in accountancy at university was the problem of inflation accounting. £6-£5=£1 might be a crude profit calculation for an item whose inputs cost you £5 and which you sold for £6. But what if the £5 was a cost incurred 11 months ago? You then buy replacement inputs, which now cost £7, but you’d still only be able to sell the finished product for £6. What does it mean to say you made a profit of £1? Who does that help? Couldn’t you also argue that you made a loss of £1?
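A small worked version of the same point, using the figures above: the answer depends entirely on which cost you measure the sale against.

    # Worked example of the inflation accounting problem described above,
    # using the illustrative figures from the text.
    sale_price = 6.0        # what the finished product sells for
    historical_cost = 5.0   # what the inputs cost 11 months ago
    replacement_cost = 7.0  # what the same inputs cost today

    historical_profit = sale_price - historical_cost    # +£1: "we made a profit"
    replacement_result = sale_price - replacement_cost  # -£1: "we made a loss"

    # Both calculations use the same transactions; they simply answer different
    # questions, which is why neither figure is "the" truth.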

What does it mean to add money together when the different elements were captured at dates when the purchasing power equivalent of that money was different? You’re adding apples and oranges. The value of money is dependent on what it can buy. Setting aside short term speculation that is what dictates currency exchange rates. £1 is more valuable than €1 because it buys more. It is meaningless to add £1 + €1 and get 2. An individual currency has different values over time, so is it any more meaningful to add different monetary figures without considering what their value was at the time the data was captured?

The academics pointed out all the problems inflation caused and came up with possible, complicated solutions. However, the profession eventually decided it was all just too difficult and pretty much gave up, except for an international standard for accounting in countries experiencing hyper-inflation (defined as greater than 100% over three years, i.e. a persisting annual rate of at least 26%). As at the end of 2014 the qualifying countries are Belarus, Venezuela, Sudan, Iran and Syria (which has rather more to worry about than financial accounting). For the rest of the world, if you want to add 5 apples and 6 oranges, that’s fine. You’ve now got 11 pieces of fruit. Stop worrying and just do the job.
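For what it is worth, the 26% figure follows directly from the definition of 100% cumulative inflation over three years.

    # 100% cumulative inflation over three years at a steady annual rate r means
    # (1 + r) ** 3 = 2, so r = 2 ** (1/3) - 1.
    annual_rate = 2 ** (1 / 3) - 1
    print(f"{annual_rate:.1%}")  # roughly 26%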

I’m the treasurer for a church, and I’m often asked how much money we’ve got. I never bother going to the online bank statement, because I know that what people really want to know is how much money is available. So I use the church accounts, which factor in the income and payments that haven’t been cleared, and the money we’re due imminently, and the outgoings to which we’re already committed. These different figures all mesh together and provide a figure that we find useful, but which is different from the bank’s view of our balance. Our own accounts never rely on a single source of truth. There are multiple reconciliation checks to try and flag up errors. The hope is that inputting an incorrect amount will generate a visible error. We’re not reporting truth. All we can say is, so far as we know this is as useful and honest a statement of our finances as we can produce for our purposes, for the Church of Scotland, the Office of the Scottish Charity Regulator and the other stakeholders.
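Something like the following sketch captures the idea (simplified, with made-up figures rather than the actual church accounts): the useful figure is assembled from several sources, and a reconciliation check flags a keying error rather than silently absorbing it.

    # Illustrative sketch only: the figures are invented, not the real accounts.
    bank_statement_balance = 4200.00  # the bank's view of "how much money we've got"

    uncleared_income = 350.00         # income banked or due imminently, not yet cleared
    uncleared_payments = 120.00       # payments made but not yet presented
    committed_outgoings = 900.00      # spending we are already committed to

    available_funds = (bank_statement_balance
                       + uncleared_income
                       - uncleared_payments
                       - committed_outgoings)

    # A simple reconciliation: the same figure should be reachable from the running
    # total kept in the accounts themselves. If the two disagree, something has
    # been keyed incorrectly and needs to be investigated.
    accounts_running_total = 3530.00
    assert abs(available_funds - accounts_running_total) < 0.005, "figures do not reconcile"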

It’s messy and complex – deal with it

What’s it all got to do with testing? If your vision of testing is checking whether the apparent functionality is consistent with the specification as represented in the test script then this sort of messy complexity is a tedious distraction. It’s so much easier to pretend you can confirm the truth using a test script.

However, testing is (or should be) a difficult and intellectually demanding process of teasing out the implications of the application for the stakeholders. If you accept that, then you are far more likely to do something valuable if you stop thinking about any single source of truth. You should be thinking instead about possible sources of insight to help you shed light on the various “truths” that the various stakeholders are seeking. Understanding these different needs, and all the nuances that arise from them is essential for testers.

Assuming that there is a single truth that we can attest to with a simple, binary yes/no answer reduces testing to the level of the accountants who have tried to treat accountancy as a simple arithmetical exercise. Five oranges and six apples add up to eleven pieces of fruit; and so do eleven grapes, and eleven melons. So what? That is a useless and misleading piece of information, like the unqualified statement that the product is sound because we found what the script told us to look for. Testers, accountants and auditors all pick up good money because they are required to provide valuable information to people who need it. They should be expected to deal with messy, complex reality. They should not be allowed to get away with trying to redefine reality so it’s easier to handle.

Service Virtualization interview about usability

This interview with Service Virtualization appeared in January 2015. Initially when George Lawton approached me I wasn’t enthusiastic. I didn’t think I would have much to say. However, the questions set me thinking, and I felt they were relevant to my experience so I was happy to take part. It gave me something to do while I was waiting to fly back from EuroSTAR in Dublin!

How does usability relate to the notion of the purpose of a software project?

When I started in IT over 30 years ago I never heard the word usability. It was “user friendliness”, but that was just a nice thing to have. It was nice if your manager was friendly, but that was incidental to whether he was actually good at the job. Likewise, user friendliness was incidental. If everything else was ok then you could worry about that, but no-one was going to spend time or money, or sacrifice any functionality just to make the application user friendly. And what did “user friendly” mean anyway? “Who knows? Who cares? We’ve got serious work to do. Forget about that touchy feely stuff.”

The purpose of software development was to save money by automating clerical routines. Any online part of the system was a mildly anomalous relic of the past. It was just a way of getting the data into the system so the real work could be done. Ok, that’s an over-simplification, but I think there’s enough truth in it to illustrate why developers just didn’t much care about the users and their experience. Development moved on from that to changing the business, rather than merely changing the business’s bureaucracy, but it took a long time for these attitudes to shift.

The internet revolution turned everything upside down. Users are no longer employees who have to put up with whatever they’re given. They are more likely to be customers. They are ruthless and rightly so. Is your website confusing? Too slow to load? Your customers have gone to your rivals before you’ve even got anywhere near their credit card number.

The lesson that’s been getting hammered into the heads of software engineers over the last decade or so is that usability isn’t an extra. I hate the way that we traditionally called it a “non-functional requirement”, or one of the “quality criteria”. Usability is so important and integral to every product that telling developers that they’ve got to remember it is like telling drivers they’ve got to remember to use the steering wheel and the brakes. If they’re not doing these things as a matter of course they shouldn’t be allowed out in public. Usability has to be designed in from the very start. It can’t be considered separately.

What are the main problems in specifying for and designing for software usability?

Well, who’s using the application? Where are they? What is the platform? What else are they doing? Why are they using the application? Do they have an alternative to using your application, and if so, how do you keep them with yours? All these things can affect decisions you take that are going to have a massive impact on usability.

It’s payback time for software engineering. In the olden days it would have been easy to answer these questions, but we didn’t care. Now we have to care, and it’s all got horribly difficult.

These questions require serious research plus the experience and nous to make sound judgements with imperfect evidence.

In what ways do organisations lose track of the usability across the software development lifecycle?

I’ve already hinted at a major reason. Treating usability as a non-functional requirement or quality criterion is the wrong approach. That segregates the issue. It’s treated as being like the other quality criteria, the “…ities” like security, maintainability, portability, reliability. It creates the delusion that the core function is of primary importance and the other criteria can be tackled separately, even bolted on afterwards.

Lewis & Rieman came out with a great phrase fully 20 years ago to describe that mindset. They called it the peanut butter theory of usability. You built the application, and then at the end you smeared a nice interface over the top, like a layer of peanut butter (PDF, opens in new tab).

“Usability is seen as a spread that can be smeared over any design, however dreadful, with good results if the spread is thick enough. If the underlying functionality is confusing, then spread a graphical user interface on it. … If the user interface still has some problems, smear some manuals over it. If the manuals are still deficient, smear on some training which you force users to take.”

Of course they were talking specifically about the idea that usability was a matter of getting the interface right, and that it could be developed separately from the main application. However, this was an incredibly damaging fallacy amongst usability specialists in the 80s and 90s. There was a huge effort to try to justify this idea by experts like Hartson & Hix, Edmonds, and Green. Perhaps the arrival of Object Oriented technology contributed towards the confusion. A low level of coupling so that different parts of the system are independent of each other is a good thing. I wonder if that lured usability professionals into believing what they wanted to believe, that they could be independent from the grubby developers.

Usability professionals tried to persuade themselves that they could operate a separate development lifecycle that would liberate them from the constraints and compromises that would be inevitable if they were fully integrated into development projects. The fallacy was flawed conceptually and architecturally. However, it was also a politically disastrous approach. The usability people made themselves even less visible, and were ignored at a time when they really needed to be getting more involved at the heart of the development process.

As I’ve explained, the developers were only too happy to ignore the usability people. They were following methods and lifecycles that couldn’t easily accommodate usability.

How can organisations incorporate the idea of usability engineering into the software development and testing process?

There aren’t any right answers, certainly none that will guarantee success. However, there are plenty of wrong answers. Historically in software development we’ve kidded ourselves into thinking that the next fad, whether Structured Methods, Agile, CMMi or whatever, will transform us into rigorous, respected professionals who can craft high quality applications. Now some (like Structured Methods) suck, while others (like Agile) are far more positive, but the uncomfortable truth is that it’s all hard and the most important thing is our attitude. We have to acknowledge that development is inherently very difficult. Providing good UX is even harder and it’s not going to happen organically as a by-product of some over-arching transformation of the way we develop. We have to consciously work at it.

Whatever the answer is for any particular organisation it has to incorporate UX at the very heart of the process, from the start. Iteration and prototyping are both crucial. One of the few fundamental truths of development is that users can’t know what they want and like till they’ve seen what is possible and what might be provided.

Even before the first build there should have been some attempt to understand the users and how they might be using the proposed product. There should be walkthroughs of the proposed design. It’s important to get UX professionals involved, if at all possible. I think developers have advanced to the point that they are less likely to get it horribly wrong, but actually getting it right, and delivering good UX is asking too much. For that I think you need the professionals.

I do think that Agile is much better suited to producing good UX than traditional methods, but there are still dangers. A big one is that many Agile developers are understandably sceptical about anything that smells of Big Up-Front Analysis and Design. It’s possible to strike a balance and learn about your users and their needs without committing to detailed functional requirements and design.

How can usability relate to the notion of testable hypothesis that can lead to better software?

Usability and testability go together naturally. They’re also consistent with good development practice. I’ve worked on, or closely observed, many applications where the design had been fixed and the build had been completed before anyone realised that there were serious usability problems, or that it would be extremely difficult to detect and isolate defects, or that there would be serious performance issues arising from the architectural choices that had been made.

We need to learn from work that’s been done with complexity theory and organisation theory. Developing software is mostly a complex activity, in the sense that there are rarely predictable causes and effects. Good outcomes emerge from trialling possible solutions. These possibilities aren’t just guesswork. They’re based on experience, skill, knowledge of the users. But that initial knowledge can’t tell you the solution, because trying different options changes your understanding of the problem. Indeed it changes the problem. The trials give you more knowledge about what will work. So you have to create further opportunities that will allow you to exploit that knowledge. It’s a delusion that you can get it right first time just by running through a sequential process. It would help if people thought of good software as being grown rather than built.

Why do you need the report?

Have you ever wondered what the purpose of a report was, whether it was a status report that you had to complete, or a report generated by an application? You may have wondered if there was any real need for the report, and whether anyone would miss it if no-one bothered to produce it.

I have come across countless examples of reports that seemed pointless. What was worse, their existence shaped the job we had to do. The reports did not help people to do the job. They dictated how we worked; production, checking and filing of the reports for future inspection were a fundamental part of the job. In any review of the project, or of our performance, they were key evidence.

My concern, and cynicism, were sharpened by an experience as an auditor when I saw at first hand how a set of reports were defined for a large insurance company. To misquote Otto von Bismarck’s comment on the creation of laws; reports are like sausages, it is best not to see them being made.

The company was developing a new access controls system, to allow managers to assign access rights and privileges to staff who were using the various underwriting, claims and accounts applications. As an auditor I was a stakeholder, helping to shape the requirements and advising on the controls that might be needed and on possible weaknesses that should be avoided.

One day I was approached by the project manager and a user from the department that defined the working practices at the hundred or so branch offices around the UK and Republic of Ireland. “What control reports should the access control system provide?” was their question.

I said that was not my decision. The reports could not be treated as a bolt on addition to the system. They should not be specified by auditors. The application should provide managers with the information they needed to do their jobs, and if it wasn’t feasible to do that in real time, then reports should be run off to help them. It all depended on what managers needed, and that depended on their responsibilities for managing access. The others were unconvinced by my answer.

A few weeks later the request for me to specify a suite of reports was repeated. Again I declined. This time the matter was escalated. The manager of the branch operations department sat in on the meeting. He made it clear that a suite of reports must be defined and coded by the end of the month, ready for the application to go live.

He was incredulous that I, as an auditor, would not specify the reports. His reasoning was that when auditors visited branches they would presumably check to see whether the reports had been signed and filed. I explained that it was the job of his department to define the jobs and responsibilities of the branch managers, and to decide what reports these managers would need in order to fulfill their responsibilities and do their job.

The manager said that was easy; it was the responsibility of the branch managers to look at the reports, take action if necessary, then sign the reports and file them. That was absurd. I tried to explain that this was all back to front. At the risk of stating the obvious, I pointed out that reports were required only if there was a need for them. That need had to be identified so that the right reports could be produced.

I was dismissed as a troublesome timewaster. The project manager was ordered to produce a suite of reports, “whatever you think would be useful”. The resulting reports were simply clones of the reports that came out from an older access control system, designed for a different technical and office environment, with quite different working practices.

The branch managers were then ordered to check them and file them. The branch operations manager had taken decisive action. The deadline was met. Everyone was happy, except of course the poor branch managers who had to wade through useless reports, and the auditors. We were dismayed at the inefficiency and sheer pointlessness of producing reports without any thought about what their purpose was.

That highlighted one of the weaknesses of auditors. People invariably listened to us if we pointed out that something important wasn’t being done. When we said that something pointless was being done there was usually reluctance to stop it.

Anything that people have got used to doing, even if it is wasteful, ineffective and inefficient, acquires its own justification over time. The corporate mindset can be “this is what we do, this is how we do it”. The purpose of the corporate bureaucracy becomes the smooth running of the bureaucracy. Checking reports was a part of a branch manager’s job. It required a mental leap to shift to a position where you have to think whether reports are required, and what useful reporting might comprise. It’s so much easier to snap, “just give us something useful” and move on. That’s decisive management. That’s what’s rewarded. Thinking? Sadly, that can be regarded as a self-indulgent waste of time.

However, few things are more genuinely wasteful of the valuable time of well paid employees than reporting that has no intrinsic value. Reporting that forces us to adapt our work to fit the preconceptions of the report designer gobbles up huge amounts of time and stops us doing work that could be genuinely valuable. The preconceptions that underpin many reports and metrics may once have been justified, and have fitted in with contemporary working practices. However, these preconceptions need to be constantly challenged and re-assessed. Reports and metrics do shape the way we work, and the way we are assessed. So we need to keep asking, “just why do you need the report?”

Perfect requirements, selective inattention and junk categories

A couple of months ago I had an interesting discussion about requirements with Johan Zandhuis, Fiona Charles and Mohinder Khosla on the EuroSTAR conference blog.

Johan said something that I wanted to mull over for a while before responding. As so often happens real life intervened and I got caught up in other things. Finally here is my response.

Johan said;

Perfect requirements don’t exist, you’ll need infinite time and money to reach that. And infinity is more a mathematical thing, I never have that in real life…
But the main point that I intended to put forward is that we should put more effort in understanding each other BEFORE we start coding.

Sure, we can’t have perfect requirements, but the problem is deeper than that.

The idea that you could derive perfect requirements with infinite time and money has interesting implications. It implies that greater resources can produce better requirements. I believe that is true to only a very limited extent; it’s certainly less true than software developers and testers have traditionally chosen to believe.

Yes, more money allows you to hire better people. But giving them more time is only effective if they’re working the right way. Usually we are not working the right way, and that applies especially to “understanding each other before we start”. If we are working the right way then we can forget about the idea that requirements could ever be perfect, and certainly not perfect and detailed.

There are three related questions that come to mind. Do we really understand requirements? Is it conceptually possible to define them precisely up front? Do we even understand what we are doing when we try to come up with a design to meet the requirements?

Do we really understand requirements?

Too often we make some huge, fundamental mistakes when we are defining requirements. We treat requirements as being a shopping list; the users ask for things, we build them. Also, we get requirements mixed up with design solutions; we decide that something is an essential requirement, when it is really only an essential feature of an optional design.

However, it’s not just that we get it wrong. It’s an illusion to think we could ever get it right, up front, even with unlimited resources.

Fred Brooks stated the problem succinctly back in 1986 in his classic essay “No Silver Bullet”.

It is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product.

Do we really understand design?

It’s bad enough that we don’t really understand requirements, but to make matters worse we don’t understand design either. Software engineering has attempted to fashion itself on a rather naïve view of construction engineering, that it is possible to move rationally and meticulously in a linear fashion from a defined problem to an inevitable design solution. Given the problem and the techniques available there is a single, correct solution.

This is a fundamental error. To a large extent design is a question of setting the problem, i.e. defining and understanding the problem, rather than simply solving the problem. Plunging into the design, or the detailed requirements, in the belief that our mission is already clear, stops us learning about the problem.

Design is an iterative process in which we experiment and learn, not just about possible solutions, but about the problem itself.

On the face of it that doesn’t fit comfortably with my earlier statement about the confusion of requirements and design. If the process of defining the problem and eliciting accurate requirements inevitably involves some form of prototype, are the requirements and design not inextricably interwoven? Well, yes, but only in the sense that the detailed requirements are a mixture of possible design solutions and the implications of a higher level goal. They are not true requirements in their own right, abstracted from the possible implementation of a solution.

Users are typically lured into stating requirements in a form that assumes a certain solution, when at that stage they have little understanding of what is possible and necessary. If the goal is abstracted to a higher level then the iterative process of exploring and refining the possible solutions can proceed more safely.

So messiness, uncertainty and experimentation are inevitable features of building software. That is the reality; denying it merely stokes up problems for the future, whether it is a failed project, or an unsatisfactory product.

Selective inattention and junk categories

This could have been a very much longer article, and I had to resist the temptation to plunge into greater detail on the nature of requirements and the way that we think when we design.

If you want to know more about our failure to understand requirements, and the way we confuse them with the design, then I strongly recommend Tom Gilb’s work. There is a vast amount of it available. Simply search for Tom Gilb and requirements.

If you want to delve further into the psychology and sociology of design then Donald Schön is a good starting point. His book “The Reflective Practitioner” helped me clarify my thinking on this subject. Schön’s examples do become lengthily repetitive, but the first 70 pages are an excellent overview of the topic.

If you’re interested in dipping into Schön’s work you could check out this article by Willemien Visser (PDF, opens in a new tab).

In his book Schön argues that the professions have adopted a paradigm of Technical Rationality, in which knowledge was learned and then applied, problems being neatly resolved by the application of existing technical expertise, i.e. by “knowledge in practice”. The following passage (page 69) leapt off the page.

Many practitioners, locked into a view of themselves as technical experts, find nothing in the world of practice to occasion reflection. They have become too skilful at techniques of selective inattention, junk categories, and situational control, techniques which they use to preserve the constancy of their knowledge-in-practice. For them, uncertainty is a threat, its admission is a sign of weakness.

Schön was not writing about software development, but that paragraph is a stinging indictment of the mindset that was once unchallenged in software engineering, and which is still far too prevalent.

Could we possibly get requirements as we traditionally understood them correct up front, even with unlimited resources? Is it a smart idea even to try?

Such ideas fall into the junk category, and it would take a huge amount of selective inattention to persist in believing them!

The quality gap – part 2

In my last blog, the first of two on the theme of “The Quality Gap”, I discussed the harmful conflation of quality with requirements and argued that it was part of a mindset that hindered software development for decades.

In my studies and reading of software development and its history I’ve come to believe that academics and industry gurus misunderstood the nature of software development, regarding it as a more precise and orderly process than it really was, or at least they regarded it as potentially more precise and orderly than it could reasonably be. They saw practitioners managing projects with a structured, orderly process that superficially resembled civil engineering or construction management, and that fitted their perception of what development ought to be.

They missed the point that developers were managing projects that way because it was the simplest way to manage a chaotic and unpredictable process, and not because it was the right way to produce high quality software. The needs of project management were dictating development approaches, not the needs of software development.

The pundits drew the wrong conclusion from observing the uneasy mixture of chaos and rigid management. They decided that the chaos wasn’t the result of developers struggling to cope with the realities of software development and an inappropriate management regime; it was the result of a lack of formal methods and tools, and crucially it was also the consequence of a lack of discipline.

Teach them some discipline!

The answer wasn’t to support developers in coming to grips with the problems of development; it was to crack down on them and call for greater order and formality.

Some of the comments from the time are amusing, and highly revealing. Barry Boehm approvingly quoted a survey in 1976: "the average coder…(is)…generally introverted, sloppy, inflexible, in over his head, and undermanaged".

Even in the 1990s Paul Ward and Ed Yourdon, two of the proponents of structured methods, were berating developers for their sins and moral failings.

Ward – “the wealth of ignorance… the lack of professional discipline among the great unwashed masses of systems developers”.

Yourdon – "the majority of software development organisations operate in a 'Realm of Darkness', blissfully unaware of even the rudimentary concepts of structured analysis and design".

This was pretty rich considering the lack of theoretical and practical underpinning of structured analysis and design, as promoted by Ward and Yourdon. See this part of an article I wrote a few years ago for a fuller explanation. The whole article gives a more exhaustive argument against standards than I’m providing here.

Insulting people is never a great way to influence them, but that hardly mattered. Nobody cared much about what the developers themselves thought. Senior managers were convinced and happily signed the cheques for massive consultancy fees to apply methods built on misconceived conceptual models. These methods reinforced development practices which were damaging, certainly from the perspective of quality (and particularly usability).

Quality attributes

Now we come to the problem of quality attributes. For many years there has been a consensus that a high quality application should deliver the required levels of certain quality attributes, pretty much the same set that Glass listed in the article I referred to in part 1: reliability, modifiability, understandability, efficiency, usability, testability and portability. There is debate over the members of this set, and their relative importance, but there is agreement that these are the attributes of quality.

They are also called “non-functional requirements”. I dislike the name, but it illustrates the problem. The relentless focus of traditional, engineering-obsessed development was on the function, and thus the functional requirements, supposedly in the name of quality. Yet the very attributes that a system needed in order to enjoy high quality were shunted to one side in the development process and barely considered.

I have never seen these quality attributes given the attention they really require. They were often considered only as a lame afterthought and specified in such a way that testing was impossible. They were vague aspirations and lacked precision. Where there were clear criteria and targets they could usually be assessed only after the application had been running for months, by which time the developers would have been long gone. What they did not do, or did not do effectively, was shape the design.

The quality attributes are harder to specify than functional requirements; harder, but not impossible. However, the will to specify clear and measurable quality requirements was sadly lacking. All the attention was directed at the function, a matter of logical relationships, data flows and business rules.

The result was designs that reflected what the application was supposed to do and neglected how it would do it.

This problem was not attributable to incompetent developers and designers who failed to follow the prescribed methods properly. The problem was a consequence of the method, and one of the main reasons was the difficulty of finding the right design.
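To make “harder, but not impossible” concrete, here is a minimal sketch of what a measurable quality requirement might look like. Everything in it is hypothetical and purely illustrative: the endpoint, the sample size and the two-second target for the 95th percentile are my own inventions, not anything from Glass or the structured methods literature. The point is only that a statement like “95% of search responses must return within two seconds under a light probe load” can be checked automatically, in a way that “the system shall be fast” never can.

```python
# Illustrative only: a hypothetical, testable quality requirement --
# "95% of search responses must return within 2 seconds under a light
# probe load" -- expressed as a simple automated check.
import statistics
import time
import urllib.request

TARGET_URL = "http://example.com/search?q=test"  # hypothetical endpoint
SAMPLE_SIZE = 100                                # illustrative probe load
P95_TARGET_SECONDS = 2.0                         # illustrative target

def measure_response_times(url, samples):
    """Time a series of requests and return the elapsed seconds for each."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        urllib.request.urlopen(url).read()
        timings.append(time.perf_counter() - start)
    return timings

def percentile(values, pct):
    """Return the pct-th percentile of the values (nearest-rank method)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1,
                max(0, int(round(pct / 100.0 * len(ordered))) - 1))
    return ordered[index]

if __name__ == "__main__":
    timings = measure_response_times(TARGET_URL, SAMPLE_SIZE)
    p95 = percentile(timings, 95)
    print(f"median {statistics.median(timings):.3f}s, "
          f"95th percentile {p95:.3f}s")
    assert p95 <= P95_TARGET_SECONDS, "quality requirement not met"
```

The precise statistical method matters far less than the fact that the target is explicit and the check is repeatable; it is that precision which was missing from the vague aspirations described above, and which would have allowed the quality attributes to shape the design.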

The design paradox

Traditional development, and structured methods in particular, had a fundamental problem quite apart from the neglect of quality attributes: trying to derive the design from the requirements. Again, that same part of my article on testing and standards explains how these methods matched the mental processes of bad designers and ignored the way that successful designers think.

It’s a paradox of the traditional approach to software development that developers did their designing both too early and too late. They subconsciously fixed on design solutions too early, when they should only have been trying to understand the users’ goals and high-level requirements. The requirements would be captured in a way that assumed and constrained the solution. The analysts and designers would then work their way through detailed requirements to a design that was not exposed to testing until it was too late to change easily, if it was possible to change it at all.

Ignoring reality

So software development, in attempting to be more like a conventional engineering discipline, was adopting the trappings of formal engineering whilst ignoring its own inability to deal with issues that a civil engineer would never dream of neglecting.

If software engineering really was closely aligned to civil engineering it would have focussed relentlessly on practical problems. Civil engineering has to work. It is a pragmatic discipline and cannot afford to ignore practical problems. Software engineering, or rather the sellers of formal methods, could be commercially highly successful by ignoring problems and targeting their sales pitch at senior managers who didn’t understand software development, but wrote the cheques.

Civil engineering has sound scientific and mathematical groundings. The flow from requirements to design is just that, a flow rather than a series of jumps from hidden assumptions to arbitrary solutions.

Implicit requirements (e.g. relating to safety) in civil engineering are quite emphatically as important as those that are documented. They cannot be dismissed just because the users didn’t request them. The nature of the problem engineers are trying to solve must be understood so that the implicit requirements are exposed and addressed.

In civil engineering, designs are not turned into reality before it is certain that they will work.

These discrepancies between software development and civil engineering have been casually ignored by the proponents of the civil engineering paradigm.

So why did the civil engineering paradigm survive so long?

There are two simple reasons for the enduring survival of this deeply flawed worldview. It was comforting, and it has been hugely profitable.

Developers had to adopt formal methods to appear professional and win business. They may not have really believed in their efficacy, but it was reassuring to be able to follow an orderly process. Even the sceptics could see the value of these methods in providing commercial advantage regardless of whether they built better applications.

The situation was summed up well by Brian Fitzgerald, back in 1995.

In fact, while methodologies may contribute little to either the process or product of systems development, they continue to be used in organisations, principally as a “comfort factor” to reassure all participants that “proper” practices are being followed in the face of the stressful complexity associated with system development.

Alternately, they are being used to legitimate the development process, perhaps to win development contracts with government agencies, or to help in the quest for ISO-certification. In this role, methodologies are more a placebo than a panacea, as developers may fall victim to goal displacement, that is, blindly and slavishly following the methodology at the expense of actual systems development. In this mode, the vital insight, sensitivity and flexibility of the developer are replaced by automatic, programmed behaviour.

The particular methodologies about which Fitzgerald was writing may now be obsolete and largely discredited, but the mindset he describes is very much alive. The desire to “legitimate the development process” is still massively influential, and it is that desire that the creators of ISO 29119 are seeking to feed, and to profit from.

However, legitimising the development process, in so far as it means anything, requires only that developers should be able to demonstrate that they are accountable for the resources they use, and that they are acting in a responsible manner, delivering applications as effectively and efficiently as they can given the nature of the task. None of that requires exhaustive, prescriptive standards. Sadly many organisations don’t realise that, and the standards lobby feeds off that ignorance.

The quality equation that Robert Glass described, and which I discussed in the first post of this short series, may be no more than a simple statement of the obvious. Software quality is not simply about complying with the requirements. That should be obvious, but it is a statement that many people refuse to acknowledge. They do not see that there is a gap between what the users expected to get and their perceptions of the application when they get it.

It is that gap which professional testers seek to illuminate. Formal standards are complicit in obscuring the gap. Instead of encouraging testers and developers to understand reality, they encourage a focus on documentation and on what was expected. They reinforce assumptions rather than question them. They confuse the means with the end.

I’ll sign off with a quote that sums up the problem with testing standards. It’s from "Information Systems Development: Methods-in-Action" by Fitzgerald, Russo and Stolterman. The comment came from a developer interviewed during research for their book.

“We may be doing wrong but we’re doing so in the proper and customary manner”.