Has opposition to ISO 29119 really died down?

One of my concerns about the Stop 29119 campaign, ever since it was launched four years ago, was that ISO would try to win the debate by default, by ignoring the opposition. In my CAST 2014 talk, which kicked off the campaign, I talked about ISO’s attempt to define its opponents as being irrelevant. ISO breached its own rules requiring consensus from the profession, and in order to justify doing so they had to maintain a pretence that testers who opposed their efforts were a troublesome, old-fashioned faction that should be ignored.

That’s exactly what has happened. ISO have kept their collective heads down, tried to ride out the storm and emerged to claim that it was all a lot of fuss about nothing; the few malcontents have given up and gone away.

I have just come across a comment in the “talk” section of the Wikipedia article on ISO 29119, arguing for some warning flags on the article to be removed.

“…finally, the objection to this standard was (a) from a small but vocal group and (b) died down – the ballots of member National Bodies were unanimous in favour of publication. Furthermore, the same group objected to IEEE 829 as well.”

The opposition is significantly more than “a small but vocal group”, but I won’t dwell on that point. My concern here is point b. Have the objections died down? Yes, they have in the sense that the opponents of ISO 29119 have been less vocal. There have been fewer talks and articles pointing out the flaws in the principle and the detail of the standard.

However, there has been no change in the beliefs of the opposition. There comes a point when it feels unnecessary, even pointless, to keep repeating the same arguments without the other side engaging. You can’t have a one-sided debate. The Stop 29119 campaigners have other things to do. Frankly, attacking ISO 29119 is a dreary activity compared with most of the alternatives. I would prefer to do something interesting and positive rather than launching another negative attack on a flawed standard. However, needs must.

The argument that “ballots of member National Bodies were unanimous in favour of publication” may be true, but it is a circular point. The opponents of ISO 29119 argued convincingly that software testing is not an activity that lends itself to ISO style standardisation and that ISO failed to gain any consensus outside its own ranks. The fact that ISO are quite happy with that arrangement is hardly a convincing refutation of our argument.

The point about our opposition to the IEEE 829 standard is also true, but it’s irrelevant. Even ISO thought that standard was dated and inadequate for modern purposes. It decided to replace it rather than try to keep updating it. Unfortunately the creators of ISO 29119 repeated the fundamental mistakes that rendered IEEE 829 flawed and unsuitable for good testing.

I was pleased to discover that the author of the Wikipedia comment was on the ISO working group that developed ISO 29119 and wrote a blog defending the standard, or rather dismissing the opposition. It was written four years ago in the immediate aftermath of the launch of Stop 29119. It’s a pity it didn’t receive more attention at the time. The debate was far too one sided and we badly needed contributions from ISO 29119’s supporters. So, in order to provide a small demonstration that opposition to the standard is continuing I shall offer a belated response. I’ll quote Andrew’s arguments, section by section, in dark blue and then respond.

“As a member of the UK Mirror Panel to WG26, which is responsible for the ISO 29119 standard, I am disappointed to read of the objection to the standard led by the International Society for Software Testing, which has resulted in a formal petition to ISO.

I respectfully suggest that their objections would be more effective if they engaged with their respective national bodies, and sought to overcome their objections, constructively.

People who are opposing ISO 29119 claim:

  1. It is costly.
  2. It will be seen as mandatory skill for testers (which may harm individuality and freedom).
  3. It may reduce the ability to experiment and try non-conventional ways.
  4. Once the standard is accepted, testers can be held responsible for project failures (or non-compliance).
  5. Effort will be more on documentation and process rather than testing.
    Let us consider each of these in turn.”

The International Society for Software Testing (ISST) launched the petition against ISO 29119, but this was merely one aspect of the campaign against the standard. Opposition was certainly not confined to ISST. The situation is somewhat confused by the fact that ISST disbanded in 2017. One of the prime reasons was that the “objectives set out by the founders have been met, or are in the capable hands of organisations that we support”. The main organisation referred to here is the larger and more established Association for Software Testing (AST), which can put more resources into the fight. I always felt the main differences between ISST and AST were in style and approach rather than principles and objectives.

The suggestion that the opponents of ISO 29119 should have worked through national ISO bodies is completely unrealistic. ISO’s approach is fundamentally wrong and opponents would have been seen as a wrecking crew preventing any progress. I know of a couple of people who did try to involve themselves in ISO groups and gave up in frustration. The debate within ISO about a standard like 29119 concerns the detail, not the underlying approach. In any case the commitment required to join an ISO working group is massive. Meetings are held all over the world. They take up a lot of time and require huge expenses for travel and accommodation. That completely excludes independent consultants like myself.

“Costly

Opponents object to this standard because it is not freely available.

While this is a fair point, it is no different from every other standard that is in place – and which companies follow, often because it gives them a competitive advantage.

Personally, I would like to see more standards placed freely in the public domain, but I am not in a position to do it!”

The cost of the standard is a minor criticism. As a member of the AST’s Committee on Standards and Professional Practice I am fortunate to have access to the documents comprising the standard. These cost hundreds of dollars and I would baulk at buying them for myself. The full set would cost as much as a family holiday. I know which would be more important!

However, the cost does hamper informed debate about the content, and that was the true concern. The real damage of a poorly conceived standard will be poorer quality testing and that will be far more costly than the initial cost of the documents.

“Mandatory

Opponents claim this standard will be seen as a mandatory skill for testers (which may harm individuality and freedom).

ISO 29119 replaces a number of IEEE and British standards that have been in place for many years. And while those standards are seen to represent best practice, they have not been mandatory.”

I have two big issues with this counter argument. Firstly, the standards that ISO 29119 replaced were emphatically not “seen to represent best practice”. If they were best practice there would have been no need to replace them. They were hopelessly out of date, and IEEE 829 was unhelpful, even damaging, even when it was new.

My second concern is about the way that people respond to a standard. Back in 2009 I wrote this article “Do standards keep testers in the kindergarten?” in Testing Experience magazine arguing against the principle of testing standards, the idea of best practice and the inevitable danger of an unhelpful and dated standard like IEEE 829 being imposed on unwilling testers.

Once you’ve called a set of procedures a standard the argument is over in many organisations; testers are required to use them. It is disingenuous to say that standards are not mandatory. They are sold on the basis that they offer reassurance and that the wise, safe option is to make them compulsory.

I made this argument nine years ago thinking the case against standards had been won. I was dismayed to discover subsequently that ISO was trying to take us back several decades with ISO 29119.

“Experimentation

A formal testing environment should be a place where processes and procedures are in place, and is not one where ‘experiment and non-conventional’ methods are put in place. But having said that, there is nothing within ISO 29199 that prevents other methods being used.”

There may be a problem over the word “experiment” here. Andrew seems to think that testers who talk of experimentation are admitting they don’t know what they’re doing and are making it up as they go along. That would be an unfortunate interpretation. When testers from the Context Driven School refer to experimentation they mean the act of testing itself.

Good testing is a form of exploration and experimentation to find out how the product behaves. Michael Bolton describes that well here. A prescriptive standard that focuses on documentation distracts from, and effectively discourages, such exploring and experimentation. We have argued that at length and convincingly. It would be easier to analyse Andrew’s case if he had provided links to arguments from opponents who had advocated a form of experimentation he disapproves of.

“Accountability

Opponents claim that, once the standard is accepted, testers can be held responsible for project failures (or non-compliance).

As with any process or procedure, all staff are required to ensure compliance with the company manual – and project managers should be managing their projects to ensure that all staff are doing so.

Whether complying with ISO 29119 or any other standard or process, completion of testing and signing off as ‘passed’ carries accountability. This standard does not change that.”

This is a distortion of the opponents’ case. We do believe in accountability, but that has to be meaningful. Accountability must be based on something to which we can reasonably sign up. We strongly oppose attempts to enforce accountability to an irrelevant, poorly conceived and damaging standard. Complying with such a standard is orthogonal to good testing; there is no correlation between the two activities.

At best ISO 29119 would be an irrelevance. In reality it is more likely to be a hugely damaging distraction. If a company imposes a standard that all testers should wear laboratory technicians’ white coats it might look impressively professional, but complying with the standard would tell us nothing about the quality of the testing.

As a former auditor I have strong, well informed, views about accountability. One of ISO 29119’s serious flaws is that it fails to explain why we test. We need such clarity before we can have any meaningful discussion about compliance. I discussed this here, in “Do we want to be ‘compliant’ or valuable?”

The standard defines in great detail the process and the documents for testing, but fails to clarify the purpose of testing, the outcomes that stakeholders expect. To put it bluntly, ISO 29119 is vague about the ends towards which we are working, but tries to be precise about the means of getting there. That is an absurd combination.

ISO 29119 tries to set out rules without principles. Understanding the distinction between rules and principles is fundamental to the process of crafting professional standards that can hold practitioners meaningfully to account. I made this argument in the Fall 2015 edition of Better Software magazine. The article is also available on my blog, “Why ISO 29119 is a flawed quality standard”.

This confusion of rules and principles, means and ends, has led to an obsessive focus on delivering documentation rather than valuable information to stakeholders. That takes us on to Andrew’s next argument.

“Documentation

Opponents claim that effort will be more on documentation and process rather than testing.

I fail to understand this line of reasoning – any formal test regime requires a test specification, test cases and recorded test results. And the evidence produced by those results need argument. None of this is possible without documentation.”

Opponents of ISO 29119 have argued repeatedly and convincingly that a prescriptive standard which concentrates on documentation will inevitably lead to goal displacement; testers will concentrate on the documentation mandated by the standard and lose sight of why they are testing. That was our experience with IEEE 829. ISO 29119 repeats the same mistake.

Andrew’s second paragraph offers no refutation of the opponents’ argument. He apparently believes that we are opposed to documentation per se. That’s a straw man. Andrew justifies ISO 29119’s demand for documentation, which I believe is onerous and inappropriate, by asserting that it serves as evidence. Opponents argue that the standard places far too much emphasis on advance documentation and neglects evidence of what was discovered by the testing.

The statement that any formal test regime requires a test specification and test cases is highly contentious. Auditors would expect to see evidence of planning, but test specifications and test cases are just one way of doing it, the way that ISO 29119 advocates. In any case, advance planning is not evidence that good testing was performed any more than a neat project plan provides evidence that the project ran to time.

As for the results, the section of ISO 29119 covering test completion reports is woefully inadequate. It would be possible to produce a report that complied fully with the standard and offered nothing of value. That sums up the problems with ISO 29119. Testers can comply while doing bad testing. That is in stark contrast to the standards governing more established professions, such as accountants and auditors.

“Conclusion

Someone wise once said:

  1. Argument without Evidence is unfounded.
  2. Evidence without Argument is unexplained.

Having considered the argument put forward, and the evidence to support the case:

  • The evidence is circumstantial with no coherence.
  • The argument is weak, and seems only to support their vested interests.

For a body that represents test engineers, I would have expected better.”

The quote that Andrew uses from “someone wise” actually comes from the field of safety critical systems. There is much that we in conventional software testing can learn from that field. Perhaps the most important lessons are about realism and humility. We must deal with the world as it is, not as we would like it to be. We must accept the limitations of our knowledge and what we can realistically know.

The proponents of ISO 29119 are too confident in their ability to manage in a complex, evolving field using techniques rooted in the 1970s. Their whole approach tempts testers to look for confirmation of what they think they know already, rather than explore the unknown and explain what they cannot know.

Andrew’s verdict on the opposition to ISO 29119 should be turned around and directed at ISO and the standard itself. It was developed and launched in the absence of evidence that it would help testers to do a better job. The standard may have internal consistency, but it is incoherent when confronted with the complexities of the real world.

Testers who are forced to use it have to contort their testing to fit the process. Any good work they do is in spite of the standard and not because of it. It might provide a welcome route map to novice testers, but it offers a dangerous illusion. The standard tells them how to package their work so it appears plausible to those who don’t know any better. It defines testing in a way that makes it appear easier than it really is. But testing is not meant to be easy. It must be valuable. If you want to learn how to provide value to those who pay you then you need to look elsewhere.

Finally, I should acknowledge that some of the work I have cited was not available to Andrew when he wrote his blog in 2014. However, all of the underlying arguments and research that opponents of 29119 have drawn on were available long before then. ISO simply did not want to go looking for them. Our arguments about ISO 29119 being anti-competitive were at the heart of the Stop 29119 campaign. Andrew has not addressed those arguments.

If ISO wants to be taken seriously it must justify ISO 29119’s status as a standard. Principled, evidenced and coherent objections deserve a response. In a debate all sides have a duty to respond to their opponents’ strongest case, rather than evading difficult objections, setting up straw men and selectively choosing the arguments that are easiest to deal with.

ISO must provide evidence and coherent argument that 29119 is effective. Simply labelling a set of prescriptive processes as a standard and expecting the industry to respect it for that reason will not do. That is the sign of a vested interest seeking to impose itself on a whole profession. No, the opposition has not died down; it has not had anything credible to oppose.

Risk mitigation versus optimism – Brexit & Y2K

The continuing Brexit shambles reminds me of a row in the approach to Y2K at the large insurer where I was working for IBM. Should a business critical back office system on which the company accounts depended be replaced or made Y2K compliant? I was brought in to review the problem and report with a recommendation.

One camp insisted that as an insurer they had to manage risk, so Y2K compliance with a more leisurely replacement was the only responsible option. The opposing camp consisted of business managers who had been assigned responsibility for managing new programmes. They would be responsible for a replacement and they insisted they could deliver a new system on time, even though they had no experience of delivering such an application. My investigation showed me they had no grasp of the business or technical complexities, but they firmly believed that waterfall projects could be forced through successfully by charismatic management. All the previous failures were down to “weak management” and “bad luck”. Making the old system compliant would be an insult to their competence.

My report pointed out the relative risks & costs of the options. I sold Y2K compliance to the UK Accountant, sketching out the implications of the various options on a flipchart in a 30 minute chat so I had agreement before I’d even finished the report. The charismatic crew were furious, but silenced. The old system was Y2K compliant in time. The proposed new one could not have been delivered when it was needed. It would have been sunk by problems with upstream dependencies I was aware of but the charismatics refused to acknowledge as being relevant.

If the charismatics’ solution had been chosen the company would have lost the use of a business critical application in late 1999. No contingency arrangements would have been possible and the company would have been unable to produce credible reserves, vital for an insurance company’s accounts. The external auditors would have been unable to pass the accounts. The share price would have collapsed and the company would have been sunk. I’m sure the charismatics would have blamed bad luck, and other people. “It was those dependencies, not us. We were let down”. That was a large, public limited company. If my advice had been rejected the people who wanted the old system to be made Y2K compliant would have brought in the internal auditors, who in turn would have escalated their concern to the board’s audit committee if necessary. If there had still been no action they would have taken the matter to the external auditors.

That’s how things should work in a big corporation. Of course they often don’t and the auditors can lose their nerve, or choose to hope that things will work out well. There is at least a mechanism that can be followed if people decide to perform their job responsibly. With Brexit there is a cavalier unwillingness to think about risk and complexity that is reminiscent of those irresponsibly optimistic managers. We are supposed to trust politicians who can offer us nothing more impressive than “trust me” and “it’s their fault” and who are offering no clear contingency arrangements if their cheery optimism proves unfounded. There is a mechanism to hold them to account. That is the responsibility of Parliament. Will the House of Commons step up to the job? We’ll see.

Do we want to be “compliant” or valuable?

Periodically I am asked to write for a magazine or blog, and more often than not I agree. Two years ago I was asked by an online magazine to write about my opposition to the ISO 29119 testing standard. I agreed, but they didn’t use my article. I’ve just come across it and decided to post it on my blog. A warning! There’s nothing new here – but the arguments are still strong, relevant, and ISO have neither countered them nor attempted to do so. Clearly they hope to win the debate by default, by lying low and hoping that opponents will give up and be forgotten.

In August 2014 I gave a talk in New York at CAST, the conference of the Association for Software Testing. I criticized software testing standards, and the new ISO 29119 in particular.

I thought I would talk for about 40 minutes, then we’d have an interesting discussion and that might be the end of it. Well, that happened, but it was only the start. My talk caught the mood of concern about the standard amongst context driven testers, and so the Stop 29119 campaign kicked off.

Life hasn’t been quite the same since I acquired a reputation for being anti-standards. My opposition to ISO 29119 has defined my public image. I can’t complain, but I’m slightly uncomfortable with that. I’d rather be seen as positive than negative.

I want to make it clear that I do approve of standards; not all standards, but ones that have been well designed for their particular context. Good standards pool collective wisdom and ensure that everyone has the same understanding of what engineering products and activities should do. They make the economy work better by providing information and confidence to consumers, and protecting responsible companies from unscrupulous competitors. Standards also increase professional discipline and responsibility, and this is where the International Organization for Standardization has gone wrong with ISO 29119.

The standard defines in great detail the process and the documents for testing, but fails to clarify the purpose of testing, the outcomes that stakeholders expect. To put it bluntly, ISO 29119 is vague about the ends towards which we are working, but tries to be precise about the means of getting there. That is an absurd combination. Obviously stakeholders hope for good news from testers, but what they really need is the unvarnished truth, however brutal that might be.

Remember, it’s not the job of testers to test quality into the product. It’s our job to shine a light on what is there so that the right people can take decisions about what to do next. The outcome of testing isn’t necessarily a high-quality product; there may be valid reasons for releasing a product that looks buggy to us, or it might even make sense to scrap the development. I once saw an 80 person year project scrapped after testing. It’s not our call. The point is that the outcome of our testing must be the best information we can provide to stakeholders. ISO 29119 makes no mention of that. Instead it focuses in minute detail on the process and documentation.

Strict copyright protection means I can’t share the content of ISO 29119, but I can say that the sample Test Completion Reports in the standard epitomise what is wrong. They summarise the testing process with a collection of metrics that say nothing about the quality of the product. A persistent danger of standards and templates is that people simply copy the examples and fill in templates without thinking deeply enough about what is needed on their project.
It would be simple to comply with the ISO 29119 Test Completion Process, and produce a report that provided no worthwhile information at all.

The Institute of Internal Auditors offers a worthwhile alternative approach with their mandatory performance standards, which in striking contrast to ISO 29119 are available to the public for scrutiny and discussion. The section covering audit reports says nothing about the process of reporting, or what an audit report should look like. But it stipulates brief, clear and very demanding requirements about the quality of the information in the report.

The difference between ISO 29119 and internal audit standards is that you can’t produce a worthless audit report that complies with the standard. The outcome of the audit has to be useful information. Why couldn’t testing standards focus on such a simple outcome? Do testers want to be zombies, blindly complying with a standard and failing to think about what our stakeholders need? Or do we want to offer a valuable service?

Precertification of low risk digital products by FDA

Occasionally I am asked why I use Twitter. “Why do you need to know what people have had for breakfast? Why get involved with all those crazies?” I always answer that it’s easy to avoid the bores and trolls (all the easier if one is a straight, white male I suspect) and Twitter is a fantastic way of keeping in touch with interesting people, ideas and developments.

A good recent example was this short series of tweets from Griffin Jones.
This was the first I’d heard of the pre-certification program proposed by the Food and Drug Administration (FDA), the USA’s federal body regulating food, drugs and medical devices.

Griffin is worried that IT certification providers will rush to sell their services. My first reaction was to agree, but on consideration I’m cautiously more optimistic.

Precertification would be for organisations, not individuals. The certification controversy in software testing relates to certifying individuals through ISTQB. FDA precertification is aimed at organisations, which would need “an existing track record in developing, testing, and maintaining software products demonstrating a culture of quality and organizational excellence measured and tracked by Key Performance Indicators (KPIs) or other similar measures.” That quote is from the notification for the pilot program for the precertification scheme, so it doesn’t necessarily mean the same criteria would apply to the final scheme. However, the FDA’s own track record of highly demanding standards (no, not like ISO 29119) that are applied with pragmatism provides grounds for optimism.

Sellers of CMMi and TMMi consultancy might hope this would give them a boost, but I’ve not heard much about these in recent years. It could be a tough sell for consultancies to push these models at the FDA when it is wanting to adopt more lightweight governance with products that are relatively low risk to consumers.

The FDA action plan that announced the precertification program did contain a word that jumped out at me. The FDA will precertify companies “who demonstrate a culture of quality and organizational excellence based on objective criteria”.

“Objective” might provide an angle for ISO 29119 proponents to exploit. A standard can provide an apparently objective basis for reviewing testing. If you don’t understand testing you can check for compliance with the standard. In a sense that is objective. Checkers are not bringing their own subjective opinions to the exercise. Or are they? The check is based on the assumption that the standard is relevant, and that the exercise is useful. In the absence of any evidence of efficacy, and there is no such evidence for ISO 29119, then using ISO 29119 as the benchmark is a subjective choice. It is used because it makes the job easier; it facilitates checking for compliance, it has nothing to do with good testing.

“Objective” should mean something different, and more constructive, to the FDA. They expect evidence of testing to be sufficient in quality and quantity so that third parties would have to come to the same conclusion if they review it, without interpretation by the testers. Check out Griffin Jones’ talk about evidence on YouTube.


Incidentally, the FDA’s requirements are strikingly similar to the professional standards of the Institute of Internal Auditors (IIA). In order to form an audit opinion auditors must gather sufficient information that is “factual, adequate, and convincing so that a prudent, informed person would reach the same conclusions as the auditor.” The IIA also has an interesting warning in its Global Technology Audit Guide, “Management of IT Auditing”. It warns IT auditors of the pitfalls of auditing against standards or benchmarks that might be dated or useless just because they want something to “audit against”.

So will ISO, or some large consultancies, try to influence the FDA to endorse ISO 29119 on the grounds that it would provide an objective benchmark against which to assess testing? That wouldn’t surprise me at all. What would surprise me is if the FDA bought into it. I like to think they are too smart for that. I am concerned that some day external political pressure might force adoption of ISO 29119. There was a hint of that in the fallout from the problems with the US’s Healthcare.gov website. Politicians who are keen to see action, any action, in a field they don’t understand always worry me. That’s another subject, however, and I hope it stays that way.

Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 2

This post is the second of two discussing Dave Snowden’s recent Cynefin masterclass at the Test Leadership Congress in New York. I wrote the series with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

In the first I gave an overview of Cynefin and explained why I think it is important, and how it can helpfully shape the way we look at the world and make sense of the problems we face. In this post I will look at some of the issues raised in Dave’s class and discuss their relevance to development and testing.

The dynamics between domains

Understanding that the boundaries between the different domains are fluid and permeable is crucial to understanding Cynefin. A vital lesson is that we don’t start in one domain and stay there; we can and should move between them. Even if we ignore that lesson reality will drag us from one domain to another. Dave said “all the domains have value – it’s the ability to move between them that is key”.

The Cynefin dynamics are closely tied to the concept of constraints, which are so important to Cynefin that they act as differentiators between the domains. You could say that constraints define the domains.

Constraint is perhaps a slightly misleading word. In Cynefin terms it is not necessarily something that compels or prevents certain behaviour. That does apply to the Obvious domain, where the constraints are fixed and rigid. The constraints in the Complicated domain govern behaviour, and can be agreed by expert consensus. In the Complex domain the constraints enable action, rather than restricting it or compelling it. They are a starting point rather than an end. In Chaos there are no constraints.

Dave Snowden puts it as follows, differentiating rules and heuristics.

“Rules are governing constraints, they set limits to action, they contain all possible instances of action. In contrast heuristics are enabling constraints, they provide measurable guidance which can adapt to the unknowable unknowns.”

If we can change the constraints then we are moving from one domain to another. The most significant dynamic is the cycle between Complex and Complicated.

Cynefin core dynamic - Complex to Complicated

Crucially, we must recognise that if we are attempting something new that involves a significant amount of uncertainty, then we start in the Complex domain, exploring and discovering more about the problem. Once we have a better understanding and have found constraints that allow us to achieve repeatable outcomes, we have moved the problem to the Complicated domain, where we can manage it more easily and exploit our new knowledge. If our testing reveals that the constraints are not producing repeatable results then it’s important to get back into the Complex domain and carry out some more probing experiments.

This is not a one-off move. We have to keep cycling to ensure the solution remains relevant. The cadence, or natural flow of the cycle, will vary depending on the context. Different industries, or sectors, or applications will have different cadences. It could be days, or years, or anything in between. If, or rather when, our constraints fail to produce repeatable results we have to get back into the Complex domain.

This cycle between Complex and Complicated is key for software development in particular. Understanding this dynamic is essential in order to understand how Cynefin might be employed.

Setting up developments

As I said earlier the parts of a software development project that will provide value are where we are doing something new, and that is where the risk also lies. Any significant and worthwhile development project will start in the Complex domain. The initial challenge is to learn enough to move it to Complicated. Dave explained it as follows in a talk at Agile India in 2015.

“As things are Complex we see patterns, patterns emerge. We stabilise the patterns. As we stabilise them we can actually shift them into the Complicated domain. So the basic principle of Complexity-based intervention is you start off with multiple, parallel, safe-to-fail experiments, which is why Scrum is not a true Complexity technique; it does one thing in a linear way. We call (these experiments) a pre-Scrum technique. You do smaller experiments faster in parallel… So you’re moving from the centre of the Complex domain into the boundary, once you’re in the boundary you use Scrum to move it across the boundary.”

Such a safe-to-fail experiment might be an XP pair programming team being assigned to knock up a small, quick prototype.

So the challenge in starting the move from Complex to Complicated is to come up with the ideas for safe-to-fail pre-Scrum experiments that would allow us to use Scrum effectively.

Dave outlined the criteria that suitable experiments should meet. There should be some way of knowing whether the experiment is succeeding and it must be possible to amplify (i.e. reinforce) signs of success. Similarly, there should be some way of knowing whether it is failing and of dampening, or reducing, the damaging impact of a failing experiment. Failure is not bad. In any useful set of safe-to-fail experiments some must fail if we are to learn anything worthwhile. The final criterion is that the experiment must be coherent. This idea of coherence requires more attention.

Dave Snowden explains the tests for coherence here. He isn’t entirely clear about how rigid these tests should be. Perhaps it’s more useful to regard them as heuristics than fixed rules, though the first two are of particular importance.

  • A coherent experiment, the ideas and assumptions behind it, should be compatible with natural science. That might seem like a rather banal statement, till you consider all the massive IT developments and change programmes that were launched in blissful ignorance of the fact that science could have predicted inevitable failure.
  • There should be some evidence from elsewhere to support the proposal. Replicating past cases is no guarantee of success, far from it, but it is a valid way to try and learn about the problem.
  • The proposal should fit where we are. It has to be consistent to some degree with what we have been doing. A leap into the unknown attempting something that is utterly unfamiliar is unlikely to gain any traction.
  • Can the proposal pass a series of “ritual dissent” challenges? These are a formalised way of identifying flaws and refining possible experiments.
  • Does the experiment reflect an unmet, unarticulated need that has been revealed by sense-making, by attempts to make sense of the problem?

The two latter criteria refer explicitly to Cynefin techniques. The final one, identifying unmet needs, assumes the use of Cognitive Edge’s SenseMaker. Remember Fred Brooks’ blunt statement about requirements? Clients do not know what they want. They cannot articulate their needs if they are asked directly. They cannot envisage what is possible. Dave Snowden takes that point further. If users can articulate their needs then you’re dealing with a commoditized product and the solution is unlikely to have great value. Real value lies in meeting needs that users are unaware of and that they cannot articulate. This has always been so, but in days of yore we could often get away with ignoring that problem. Most applications were in-house developments that either automated back-office functions or were built around business rules and clerical processes that served as an effective proxy for true requirements. The inadequacies of the old structured methods and traditional requirements gathering could be masked.

With the arrival of web development, and then especially with mobile technology this gulf between user needs and the ability of developers to grasp them became a problem that could be ignored only through wilful blindness, admittedly a trait that has never been in short supply in corporate life. The problem has been exacerbated by our historic willingness to confuse rigour with a heavily documented, top-down approach to software development. Sense-making entails capturing large numbers of user reports in order to discern patterns that can be exploited. This appears messy, random and unstructured to anyone immured in traditional ways of development. It might appear to lack rigour, but such an approach is in accord with messy, unpredictable reality. That means it offers a more rigorous and effective way of deriving requirements than we can get by pretending that every development belongs in the Obvious domain. A simple lesson I’ve had to learn and relearn over the years is that rigour and structure are not the same as heavy documentation, prescriptive methods and a linear, top-down approach to problem solving.

This all raises big questions for testers. How do we respond? How do we get involved in testing requirements that have been derived this way and indeed the resulting applications? Any response to those questions should take account of another theme that really struck me from Dave’s day in New York. That was the need for resilience.

Resilience

The crucial feature of complex adaptive systems is their unpredictability. Applications operating in such a space will inevitably be subject to problems and threats that we would never have predicted. Even where we can confidently predict the type of threat the magnitude will remain uncertain. Failure is inevitable. What matters is how the application responds.

The need for resilience, with its linked themes of tolerance, diversity and redundancy, was a recurring message in Dave’s class. Resilience is not the same as robustness. The example that Dave gave was that a seawall is robust but a salt marsh is resilient. A seawall is a barrier to large waves and storms. It protects the harbour behind, but if it fails it does so catastrophically. A salt marsh protects inland areas by acting as a buffer, absorbing storm waves rather than repelling them. It might deteriorate over time but it won’t fail suddenly and disastrously.

An increasing challenge for testers will be to look for information about how systems fail, and test for resilience rather than robustness. Tolerance for failure becomes more important than a vain attempt to prevent failure. This tolerance often requires greater redundancy. Stripping out redundancy and maximizing the efficiency of systems has a downside, as I’ve discovered in my career. Greater efficiency can make applications brittle and inflexible. When problems hit they hit hard and recovery can be difficult.

it could be worse - not sure how, but it could be

The six years I spent working as an IT auditor had a huge impact on my thinking. I learned that things would go wrong, that systems would fail, and that they’d do so in ways I couldn’t have envisaged. There is nothing like a spell working as an auditor to imbue one with a gloomy sense of realism about the possibility of perfection, or even adequacy. I ended up like the gloomy old pessimist Eeyore in Winnie the Pooh. When I returned to development work a friend once commented that she could always spot one of my designs. Like Eeyore I couldn’t be certain exactly how things would go wrong, I just knew they would and my experience had taught me where to be wary. I was destined to end up as a tester.

Liz Keogh, in this talk on Safe-to-Fail, makes a similar point.

“Testers are really, really good at spotting failure scenarios… they are awesomely imaginative at calamity… Devs are problem solvers. They spot patterns. Testers spot holes in patterns… I have a theory that other people who are in critical positions, like compliance and governance people are also really good at this”.

Testers should have the creativity to imagine how things might go wrong. In a Complex domain, working with applications that have been developed using Cynefin, this insight and imagination, the ability to spot potential holes, will be extremely valuable. Testers have to seize that opportunity to remain relevant.

There is an upside to redundancy. If there are different ways of achieving the same ends then that diversity will offer more scope for innovation, for users to learn about the application and how it could be adapted and exploited to do more than the developers had imagined. Again, this is an opportunity for testers. Stakeholders need to know about the application and what it can do. Telling them that the application complied with a set of requirements that might have been of dubious relevance and accuracy just doesn’t cut it.

Conclusion

Conclusion is probably the wrong word. Dave Snowden’s class opened my mind to a wide range of new ideas and avenues to explore. This was just the starting point. These two essays can’t go very far in telling you about Cynefin and how it might apply to software testing. All I can realistically do is make people curious to go and learn more for themselves, to explore in more depth. That is what I will be doing, and as a starter I will be in London at the end of June for the London Tester Gathering. I will be at the workshop “An Introduction to Complexity and Cynefin for Software Testers”, being run by Martin Hynie and Ben Kelly, where I hope to discuss Cynefin with fellow testers and explorers.

If you are going to the CAST conference in Nashville in August you will have the chance to hear Dave Snowden giving a keynote speech. He really is worth hearing.

Dave Snowden’s Cynefin masterclass in New York, 2nd May 2017 – part 1

This is part one of a two post series on Cynefin and software testing. I wrote it with the support of the Committee on Standards and Professional Practices of the Association for Software Testing. The posts originally appeared on the AST site.

Introduction

On May 2nd I attended Dave Snowden’s masterclass in New York, “A leader’s framework for decision making: managing complex projects using Cynefin”, at the Test Leadership Congress. For several years I have been following Dave’s work and I was keen to hear him speak in person. Dave is a gifted communicator, but he moves through his material fast, very fast. In a full day class he threw out a huge range of information, insights and arguments. I was writing frantically throughout, capturing key ideas and phrases I could research in detail later.

It was an extremely valuable day. All of it was relevant to software development, and therefore indirectly to testing. However, it would require a small book to do justice to Dave’s ideas. I will restrict myself to two posts in which I will concentrate on a few key themes that struck me as being particularly important to the testing community.

Our worldview matters

We need to understand how the world works or we will fail to understand the problems we face. We won’t recognise what success might look like, nor will we be able to anticipate unacceptable failure till we are beaten over the head, and we will select the wrong techniques to address problems.

“It ain’t what you don’t know that gets you into trouble - it’s what you know for sure that just ain’t so”

Dave used a slide with this quote from Mark Twain. It’s an important point. Software development and testing has been plagued over the years by unquestioned assumptions and beliefs that we were paid well to take for granted, without asking awkward questions, but which just ain’t so. And they’ve got us into endless trouble.

A persistent, damaging feature of software development over the years has been the illusion that it is a neater, more orderly process than it really is. We craved certainty, fondly imagining that if we just put a bit more effort and expertise into the upfront analysis and requirements then good, experienced professionals could predictably develop high quality applications. It hardly ever panned out that way, and the cruel twist was that the people who finally managed to crank out something workable picked up the blame for the lack of perfection.

Fred Brooks made the point superbly in his classic paper, “No Silver Bullet”.

“The truth is, the client does not know what he wants. The client usually does not know what questions must be answered, and he has almost never thought of the problem in the detail necessary for specification. … in planning any software-design activity, it is necessary to allow for an extensive iteration between the client and the designer as part of the system definition.

…… it is really impossible for a client, even working with a software engineer, to specify completely, precisely, and correctly the exact requirements of a modern software product before trying some versions of the product.”

So iteration is required, but that doesn’t mean simply taking a linear process and repeating it. Understanding and applying Cynefin does not mean tackling problems in familiar ways but with a new vocabulary. It means thinking about the world in a different way, drawing on lessons from complexity science, cognitive neuroscience and biological anthropology.

Cynefin and ISO 29119

Cynefin is not based on successful individual cases, or on ideology, or on wishful thinking. Methods that are rooted in successful cases are suspect because of the survivorship bias (how many failed projects did the same thing?), and because people do not remember clearly and accurately what they did after the event; they reinterpret their actions dependent on the outcome. Cynefin is rooted in science and the way things are, the way systems behave, and the way that people behave. Developing software is an activity carried out by humans, for humans, mostly in social organisations. If we follow methods that are not rooted in reality, in science and which don’t allow for the way people behave then we will fail.

Dave Snowden often uses the philosophical phrase “a priori”, usually in the sense of saying that something is wrong a priori. A priori knowledge is based on theoretical deduction, or on mathematics, or the logic of the language in which the proposition is stated. We can say that certain things are true or false a priori, without having to refer to experience. Knowledge based on experience is a posteriori.

The distinction is important in the debate over the software testing standard ISO 29119. The ISO standards lobby has not attempted to defend 29119 on either a priori or a posteriori grounds. The standard has its roots in linear, document driven development methods that were conspicuously unsuccessful. ISO were unable to cite any evidence or experience to justify their approach.

Defenders of the standard, and some neutrals, have argued that critics must examine the detailed content of the standard, which is extremely expensive to purchase, in order to provide meaningful criticism. However, this defence is misconceived because the standard itself is misconceived. The standard’s stated purpose “is to define an internationally agreed set of standards for software testing that can be used by any organization when performing any form of software testing”. If ISO believes that a linear, prescriptive standard like ISO 29119 will apply to “any form of software testing” we can refer to Cynefin and say that they are wrong; we can say so confidently knowing that our stance is backed by reputable science and theory. ISO is attempting to introduce a practice that might, sometimes at best, be appropriate for the Obvious domain into the Complicated and Complex domains where it is wildly unsuitable and damaging. ISO is wrong a priori.

What is Cynefin?

The Wikipedia article is worth checking out, not least because Dave Snowden keeps an eye on it. This short video presented by Dave is also helpful.

The Cynefin Framework might look like a quadrant, but it isn’t. It is a collection of five domains that are distinct and clearly defined in principle, but which blur into one another in practice.

In addition to the four domains that look like the cells of a quadrant there is a fifth, in the middle, called Disorder, and this one is crucial to an understanding of the framework and its significance.

Cynefin is not a categorisation model, as would be implied if it were a simple matrix. It is not a matter of dropping data into the framework then cracking on with the work. Cynefin is a framework that is designed to help us make sense of what confronts us, to give us a better understanding of our situation and the approaches that we should take.

The first domain is Obvious, in which there are clear and predictable causes and effects. The second is Complicated, which also has definite causes and effects, but where the connections are not so obvious; expert knowledge and judgement are required.

The third is Complex, where there is no clear cause and effect. We might be able to discern it with hindsight, but that knowledge doesn’t allow us to predict what will happen next; the system adapts continually. Dave Snowden and Mary Boone used a key phrase in their Harvard Business Review article about Cynefin.

“Hindsight does not lead to foresight because the external conditions and systems constantly change.”

The fourth domain is Chaotic. Here urgent action, rather than reflective analysis, is required. The participants must act, sense feedback and respond. Complex situations might be suited to safe probing, which can teach us more about the problem, but such probing is a luxury in the Chaotic domain.

The appropriate responses in all four of these domains are different. In Obvious, the categories are clearly defined, one simply chooses the right one, and that provides the right route to follow. Best practices are appropriate here.

In the Complicated domain there is no single, right category to choose. There could be several valid options, but an expert can select a good route. There are various good practices, but the idea of a single best practice is misconceived.

In the Complex domain it is essential to probe the problem and learn by trial and error. The practices we might follow will emerge from that learning. In Chaos, as I mentioned, we simply have to start with action, firefighting to stop the situation getting worse. It is helpful to remember that, instead of the everyday definition, chaos in Cynefin terms refers to the concept in physics. Here chaos refers to a system that is so dynamic that minor variations in initial conditions lead to outcomes so dramatically divergent that the system is unpredictable. In some circumstances it makes sense to make a deliberate and temporary move into Chaos to learn new practice. That would require removing constraints and the connections that impose some sort of order.

The fifth domain is that of Disorder, in the middle of the diagram. This is the default position in a sense. It’s where we find ourselves when we don’t know which domain we should really be in. It’s therefore the normal starting point. The great danger is that we don’t choose the appropriate domain, but simply opt for the one that fits our instincts or our training, or that is aligned with the organisation’s traditions and culture, regardless of the reality.

The only stable domains are Obvious, Complicated and Complex. Chaotic and Disorder are transitional. You don’t (can’t) stay there. Chaotic is transitional because constraints will kick in very quickly, almost as a reflex. Disorder is transitional because you are actually in one of the other domains, but you just don’t know it.
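For readers who find a compact summary helpful, the five domains described above can be laid out as a small lookup table. This is only a rough sketch in Python. The `Domain` class, its field names and the wording of each entry are shorthand invented for illustration and are not part of Cynefin itself. It is a memory aid for the description in this post, not a way of dropping problems into boxes, which is exactly what Cynefin is not meant for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Illustrative summary of one Cynefin domain (field names are hypothetical)."""
    name: str
    cause_and_effect: str
    response: str
    stable: bool  # False means the domain is transitional


# Entries paraphrase the description in this post; the wording is an assumption.
DOMAINS = {
    d.name: d
    for d in [
        Domain("Obvious", "clear and predictable",
               "pick the right category and follow best practice", True),
        Domain("Complicated", "definite, but not obvious without expertise",
               "let an expert choose among several good practices", True),
        Domain("Complex", "discernible only with hindsight",
               "probe, learn by trial and error, let practice emerge", True),
        Domain("Chaotic", "too dynamic to predict",
               "act first, then sense feedback and respond", False),
        Domain("Disorder", "unknown, because the real domain is not yet recognised",
               "work out which of the other domains applies", False),
    ]
}


def transitional_domains():
    """Return the names of the domains you cannot stay in, sorted alphabetically."""
    return sorted(name for name, d in DOMAINS.items() if not d.stable)


if __name__ == "__main__":
    for domain in DOMAINS.values():
        kind = "stable" if domain.stable else "transitional"
        print(f"{domain.name} ({kind}): {domain.response}")
```

Calling `transitional_domains()` picks out Chaotic and Disorder, the two domains that the text above describes as places you cannot stay.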

The different domains have blurred edges. In any context there might be elements that fit into different domains if they are looked at independently. That isn’t a flaw with Cynefin. It merely reflects reality. As I said, Cynefin is not a neat categorisation model. It is intended to help us make sense of what we face. If reality is messy and blurred then there’s no point trying to force it into a straitjacket.

Many projects will have elements that are Obvious, that deal with a problem that is well understood, that we have dealt with before and whose solution is familiar and predictable. However, these are not the parts of a project that should shape the approach we take. The parts where the potential value, and the risk, lie are where we are dealing with something we have not done before. Liz Keogh has given many talks and written some very good blogs and articles about applying Cynefin to software development. Check out her work. This video is a good starter.

The boundaries between the domains are therefore fuzzy, but there is one boundary that is fundamentally different from the others; the border between Obvious and Chaotic. This is not really a boundary at all. It is more of a cliff. If you move from Obvious to Chaotic you don’t glide smoothly into a subtly changing landscape. You fall off the cliff.

Within the Obvious domain the area approaching the cliff is the complacent zone. Here, we think we are working in a neat, ordered environment and “we believe our own myths” as Snowden puts it in the video above. The reality is quite different and we are caught totally unaware when we hit a crisis and plunge off the cliff into chaos.

That was a quick skim through Cynefin. However, you shouldn’t think of it as being a static framework. If you are going to apply it usefully you have to understand the dynamics of the framework, and I will return to that in part two.

Auditors and testing – a rant justified by experience

A couple of weeks ago I was drawn into a discussion on Twitter about auditors and testing. At the time I was on holiday in a delightfully faraway part of Galloway, in south west Scotland.

One of the attractions of the cottage where we were staying was that it lacked a mobile (cell) phone signal, never mind internet access. Only when we happened to be in a pub or restaurant could I sneak onto wifi discreetly, without incurring a disapproving look from my wife.

Having worked as both an IT auditor and a tester, and having both strong opinions and an argumentative nature, I had plenty to say on the subject. That had to wait till I returned (via New York, but that’s another subject) when I unleashed a rant on Twitter. Here is that thread, in a more readable format. It might be a rant, but it is based on extensive experience.

Auditors looking for items they can check that MUST be called test cases? That’s a big, flashing, warning sign they have a lousy conceptual grasp of auditing. It’s true, but missing the point, to say that’s old-fashioned. It’s like saying the problem with ISO 29119 is it’s old-fashioned.

The crucial point is it’s bad, unprofessional auditing. The company that taught me to audit was promoting good auditing 30 years ago. If anyone had remained ignorant of the transformation in software development in the last 30 years you’d call them idiots, not old-fashioned.

A test case is just a name for a receptacle. It’s a bucket of ideas. Who cares about the bucket? Ideas and evidence really matter to auditors, who live and die by evidence; they expect compelling evidence that the auditees have been thinking about what they are doing. A lack of useful evidence showing what testing has been performed, or a lack of thought about how to test should be certain ways to attract criticism from auditors. The IT auditors’ governance model COBIT5 mentions “test cases” once (in passing). It mentions “ideas” 32 times & “evidence” 16 times.

COBIT5 isn’t just about testing of course. Its principles apply across the whole range of IT and testing is no exception. Auditors should expect testers to have:

  • a clear vision or strategy of how testing should be performed in their organisation,
  • a clear (but not necessarily detailed) plan for testing the product,
  • relevant, contemporary evidence that justifies and leads inescapably to the conclusions, lessons and insights that the testers derived and reported from their testing.

That’s what auditors should expect. Some (or many?) organisations are locked into a pattern of low quality and low value auditing. They define auditing as brainless compliance checking that is performed by low quality staff who don’t understand what they’re doing. Their work is worthless. As a result audit is held in low esteem in the organisation. Smart people don’t want to work there. Therefore audit must be defined in such a way that low quality staff are able to carry it out.

This is inexcusable. At best it is negligence. Maintaining that model of auditing requires wilful ignorance of what the audit profession stipulates. It is damaging and contributes towards the creation of a dysfunctional culture. Nevertheless it is cheap and ensures there are no good auditors who might pose uncomfortable, challenging questions to senior managers.

However, this doesn’t mean there are never times when auditors do need to see test cases. If a contract has been stupidly written so that test cases must be produced and visible then there’s no wriggle room. It’s just the same (and just as stupid) as if the contract says testers must wear pink shirts. It might be stupid but it is a contractual deliverable; auditors will want to see proof of compliance. As Griffin Jones pointed out on seeing my tweet, “often (the contract) is stupidly written – thus the need to get involved with the contracting organization. The problem is bigger than test or SW dev”.

I fully agree with Griffin. Testers should get involved in contractual discussions that will influence their work, in order to anticipate and head off unhelpful contractual terms.

I would add that testers should ask to see the original contract. Contractual terms are sometimes misinterpreted as they are passed through the organisation to the testers. It might be possible to produce the required evidence by smarter means.

Apart from such tiresome contractual requirements, demanding to see “test cases” is a classic case of confusing form and content. It’s unprofessional. That’s not just my opinion; it’s not novel or radical. It’s simply orthodox, professional opinion. Anyone who says otherwise is clueless or bullshitting. Either way they must be resisted. Clueless bullshitters can enjoy good, lucrative careers, but do huge damage. I’ve no respect for them.

The US Food and Drug Administration’s “General Principles of Software Validation” do pose a problem. They date back to 1997, updated in 2002. They are creakily old. They mention test cases many times, but they were written when it was assumed that testing meant writing test cases. The term seems to be used as jargon for tests. If testing satisfies FDA criteria then there’s no obvious reason why you can’t just call planned tests “test cases”.

There’s no requirement to produce test scripts as well as test cases, but expected results with objective pass/fail criteria are required. That doesn’t, and mustn’t, mean testers should be looking only for the expected results. The underlying principle is that compliance should follow the “least burdensome” approach and the FDA do say that they are open to considering alternative approaches to comply with the requirements in a way that is less burdensome.

Further, the FDA does not have a problem with Agile development, and they also do approve of exploratory testing, as explained by James Bach.