Do standards keep testers in the kindergarten? (2009)

This article appeared in the December 2009 edition of Testing Experience magazine, which is no longer published. I’m moving the article onto my blog from my website, which will be decommissioned soon.

Normally when I re-post old articles I provide a warning about them being dated. This one was written in November 2009 but I think that its arguments are still valid. It is only dated in the sense that it doesn’t mention ISO 29119, the current ISO software testing standard, which was released in 2013. This article shows why I was dismayed when ISO 29119 arrived on the scene. I thought that prescriptive testing standards, such as IEEE 829, had had their day. They had failed and we had moved on.

The references in the article were all structured for a paper magazine. There are no hyperlinks, and I have not tried to recreate them or check whether they still work.

The article

Discussion of standards usually starts from the premise that they are intrinsically a good thing, and the debate then moves on to consider what form they should take and how detailed they should be.

Too often sceptics are marginalised. The presumption is that standards are good and beneficial. Those who are opposed to them appear suspect, even unprofessional.

Although the content of standards for software development and testing can be valuable, especially within individual organisations, I do not believe that they should be regarded as generic “standards” for the whole profession. Turning useful guidelines into standards suggests that they should be mandatory.

My particular concern is that the IEEE 829 “Standard for Software and System Test Documentation”, and the many document templates derived from it, encourage a safety first approach to documentation, with testers documenting plans and scripts in slavish detail.

They do so not because the project genuinely requires it, but because they have been encouraged to equate documentation with quality, and they fear that they will look unprofessional and irresponsible in a subsequent review or audit. I think these fears are ungrounded and I will explain why.

A sensible debate about the value of standards must start with a look at what standards are, and the benefits that they bring in general, and specifically to testing.

Often discussion becomes confused because justification for applying standards in one context is transferred to a quite different context without any acknowledgement that the standards and the justification may no longer be relevant in the new context.

Standards can be internal to a particular organisation or they can be external standards attempting to introduce consistency across an industry, country or throughout the world.

I’m not going to discuss legal requirements enforcing minimum standards of safety, such as Health and Safety legislation, or the requirements of the US Food & Drug Administration. That’s the law, and it’s not negotiable.

The justification for technical and product standards is clear. Technical standards introduce consistency, common protocols and terminology. They allow people, services and technology to be connected. Product standards protect consumers and make it easier for them to distinguish cheap, poor quality goods from more expensive but better quality competition.

Standards therefore bring information and mobility to the market and thus have huge economic benefits.

It is difficult to see where standards for software development or testing fit into this. To a limited extent they are technical standards, but only so far as they define the terminology, and that is a somewhat incidental role.

They appear superficially similar to product standards, but software development is not a manufacturing process, and buyers of applications are not in the same position as consumers choosing between rival, inter-changeable products.

Are software development standards more like the standards issued by professional bodies? Again, there’s a superficial resemblance. However, standards such as Generally Accepted Accounting Principles (Generally Accepted Accounting Practice in the UK) are backed up by company law and have a force no-one could dream of applying to software development.

Similarly, standards of professional practice and competence in the professions are strictly enforced and failure to meet these standards is punished.

Where does that leave software development standards? I do believe that they are valuable, but not as standards.

Susan Land gave a good definition and justification for standards in the context of software engineering in her book “Jumpstart CMM-CMMI Software Process Improvements – using IEEE software engineering standards”. [1]

“Standards are consensus-based documents that codify best practice. Consensus-based standards have seven essential attributes that aid in process engineering. They;

  1. Represent the collected experience of others who have been down the same road.
  2. Tell in detail what it means to perform a certain activity.
  3. Help to assure that two parties attach the same meaning to an engineering activity.
  4. Can be attached to or referenced by contracts.
  5. Improve the product.
  6. Protect the business and the buyer.
  7. Increase professional discipline.” (List sequence re-ordered from original).

The first four justifications are for standards in a descriptive form, to aid communication. Standards of this type would have a broader remit than the technical standards I referred to, and they would be guidelines rather than prescriptive. These justifications are not controversial, although the fourth has interesting implications that I will return to later.

The last three justifications hint at compulsion. These are valid justifications, but they are for standards in a prescriptive form and I believe that these justifications should be heavily qualified in the context of testing.

I believe that where testing standards have value they should be advisory, and that the word “standard” is unhelpful. “Standards” implies that they should be mandatory, or that they should at least be considered a level of best practice to which all practitioners should aspire.

Is the idea of “best practice” useful?

I don’t believe that software development standards, specifically the IEEE series, should be mandatory, or that they can be considered best practice. Their value is as guidelines, which would be a far more accurate and constructive term for them.

I do believe that there is a role for mandatory standards in software development. The time-wasting shambles that is created if people don’t follow file naming conventions is just one example. Secure coding standards that tell programmers about security flaws that they must not introduce into their programs are also a good example of standards that should be mandatory.
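To make the second example concrete, here is a minimal, purely hypothetical sketch of the kind of rule a mandatory secure coding standard might contain, shown as a non-compliant and a compliant version of the same database lookup. The rule and the code are my own illustration, not taken from any particular standard.

```python
# Hypothetical rule: "SQL statements must never be built by concatenating
# user input; parameterised queries must be used instead."
import sqlite3

def find_user_unsafe(conn, name):
    # Violates the rule: the user-supplied value is spliced into the SQL text,
    # so input such as "x' OR '1'='1" changes the meaning of the query.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = '" + name + "'"
    ).fetchall()

def find_user_safe(conn, name):
    # Complies with the rule: the value is passed separately as a parameter,
    # so it can never be interpreted as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    malicious = "nobody' OR '1'='1"
    print(find_user_unsafe(conn, malicious))  # returns every row in the table
    print(find_user_safe(conn, malicious))    # returns an empty list
```

A rule like this is easy to state, easy to check in a review, and sensible to make mandatory at a local level; none of that requires an industry-wide “standard”.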

However, these are local, site-specific standards. They are about consistency, security and good housekeeping, rather than attempting to define an over-arching vision of “best practice”.

Testing standards should be treated as guidelines, practices that experienced practitioners would regard as generally sound and which should be understood and regarded as the default approach by inexperienced staff.

Making these practices mandatory “standards”, as if they were akin to technical or product standards and the best approach in any situation, will never ensure that experienced staff do a better job, and will often ensure they do a worse job than if they’d been allowed to use their own judgement.

Testing consultant Ben Simo has clear views on the notion of best practice. He told me:

“‘Best’ only has meaning in context. And even in a narrow context, what we think is best now may not really be the best.

In practice, ‘best practice’ often seems to be either something that once worked somewhere else, or a technical process required to make a computer system do a task. I like for words to mean something. If it isn’t really best, let’s not call it best.

In my experience, things called best practices are falsifiable as not being best, or even good, in common contexts. I like guidelines that help people do their work. The word ‘guideline’ doesn’t imply a command. Guidelines can help set some parameters around what and how to do work and still give the worker the freedom to deviate from the guidelines when it makes sense.

Rather than tie people’s hands and minds with standards and best practices, I like to use guidelines that help people think and communicate lessons learned – allowing the more experienced to share some of their wisdom with the novices.”

Such views cannot be dismissed as the musings of maverick testers who can’t abide the discipline and order that professional software development and testing require.

Ben is the President of the Association for Software Testing. His comments will be supported by many testers who see how they match their own experience. Also, there has been some interesting academic work that justifies such scepticism about standards. Interestingly, it has not come from orthodox IT academics.

Lloyd Roden drew on the work of the Dreyfus brothers as he presented a powerful argument against the idea of “best practice” at Starwest 2009 and the TestNet Najaarsevent. Hubert Dreyfus is a philosopher and psychologist and Stuart Dreyfus works in the fields of industrial engineering and artificial intelligence.

In 1980 they wrote an influential paper that described how people pass through five levels of competence as they move from novice to expert status, and analysed how rules and guidelines helped them along the way. The five levels of the Dreyfus Model of Skills Acquisition can be summarised as follows.

  1. Novices require rules that can be applied in narrowly defined situations, free of the wider context.
  2. Advanced beginners can work with guidelines that are less rigid than the rules that novices require.
  3. Competent practitioners understand the plan and goals, and can evaluate alternative ways to reach the goal.
  4. Proficient practitioners have sufficient experience to foresee the likely result of different approaches and can predict what is likely to be the best outcome.
  5. Experts can intuitively see the best approach. Their vast experience and skill mean that rules and guidelines have no practical value.

For novices the context of the problem presents potentially confusing complications. Rules provide clarity. For experts, understanding the context is crucial and rules are at best an irrelevant hindrance.

Roden argued that we should challenge any references to “best practices”. We should talk about good practices instead, and know when and when not to apply them. He argued that imposing “best practice” on experienced professionals stifles creativity, frustrates the best people and can prompt them to leave.

However, the problem is not simply a matter of “rules for beginners, no rules for experts”. Rules can have unintended consequences, even for beginners.

Chris Atherton, a senior lecturer in psychology at the University of Central Lancashire, made an interesting point in a general, anecdotal discussion about the ways in which learners relate to rules.

“The trouble with rules is that people cling to them for reassurance, and what was originally intended as a guideline quickly becomes a noose.

The issue of rules being constrictive or restrictive to experienced professionals is a really interesting one, because I also see it at the opposite end of the scale, among beginners.

Obviously the key difference is that beginners do need some kind of structural scaffold or support; but I think we often fail to acknowledge that the nature of that early support can seriously constrain the possibilities apparent to a beginner, and restrict their later development.”

The issue of whether rules can hinder the development of beginners has significant implications for the way our profession structures its processes. Looking back at work I did at the turn of the decade improving testing processes for an organisation that was aiming for CMMI level 3, I worry about the effect it had.

Independent professional testing was a novelty for this client and the testers were very inexperienced. We did the job to the best of our ability at the time, and our processes were certainly considered best practice by my employers and the client.

The trouble is that people can learn, change and grow faster than strict processes adapt. A year later and I’d have done it better. Two years later, it would have been different and better, and so on.

Meanwhile, the testers would have been gaining in experience and confidence, but the processes I left behind were set in tablets of stone.

As Ben Simo put it: “if an organisation is at a level less than the intent of level 5, CMM seems to often lock in ignorance that existed when the process was created”.

CMMI has its merits but also has dangers. Continuous process improvement is at its heart, but these are incremental advances and refinements in response to analysis of metrics.

Step changes or significant changes in response to a new problem don’t fit comfortably with that approach. Beginners advance from the first stage of the Dreyfus Model, but the context they come to know and accept is one of rigid processes and rules.

Rules, mandatory standards and inflexible processes can hinder the development of beginners. Rigid standards don’t promote quality. They can have the opposite effect if they keep testers in the kindergarten.

IEEE 829 and the motivation behind documentation

One could argue that standards do not have to be mandatory. Software developers are pragmatic, and understand when standards should be mandatory and when they should be discretionary. That is true, but the problem is that the word “standards” strongly implies compulsion. That is the interpretation that most outsiders would place on the word.

People do act on the assumption that the standard should be mandatory, and then regard non-compliance as a failure, deviation or problem. These people include accountants and lawyers, and perhaps most significantly, auditors.

My particular concern is the effect of the IEEE 829 testing documentation standard. I wonder if much more than 1% of testers have ever seen a copy of the standard. However, much of its content is very familiar, and its influence is pervasive.

IEEE 829 is a good document with much valuable material in it. It has excellent templates, which provide great examples of how to meticulously document a project.

Or at least they’re great examples of meticulous documentation if that is the right approach for the project. That of course is the question that has to be asked. What is the right approach? Too often the existence of a detailed documentation standard is taken as sufficient justification for detailed documentation.

I’m going to run through two objections to detailed documentation. They are related, but one refers to design and the other to testing. It could be argued that both have their roots in psychology as much as IT.

I believe that the fixation of many projects on documentation, and the highly dubious assumption that quality and planning are synonymous with detailed documentation, have their roots in the structured methods that dominated software development for so long.

These methods were built on the assumption that software development was an engineering discipline, rather than a creative process, and that greater quality and certainty in the development process could be achieved only through engineering style rigour and structure.

Paul Ward, one of the leading developers of structured methods, wrote a series of articles [2] on the history of structured methods, in which he admitted that they were neither based on empirical research nor subjected to peer review.

Two other proponents of structured methods, Larry Constantine and Ed Yourdon, admitted that the early investigations were no more than informal “noon-hour” critiques [3].

Fitzgerald, Russo and Stolterman gave a brief history of structured methods in their book “Information Systems Development – Methods in Action” [4] and concluded that “the authors relied on intuition rather than real-world experience that the techniques would work”.

One of the main problem areas for structured methods was the leap from the requirements to the design. Fitzgerald et al wrote that “the creation of hierarchical structure charts from data flow diagrams is poorly defined, thus causing the design to be loosely coupled to the results of the analysis. Coad & Yourdon [5] label this shift as a ‘Grand Canyon’ due to its fundamental discontinuity.”

The solution to this discontinuity, according to the advocates of structured methods, was an avalanche of documentation to help analysts to crawl carefully from the current physical system, through the current logical system to a future logical system and finally a future physical system.

Not surprisingly, given the massive documentation overhead, and developers’ propensity to pragmatically tailor and trim formal methods, this full process was seldom followed. What was actually done was more informal, intuitive, and opaque to outsiders.

An interesting strand of research was pursued by Human Computer Interface academics such as Curtis, Iscoe and Krasner [6], and Robbins, Hilbert and Redmiles [7].

They attempted to identify the mental processes followed by successful software designers when building designs. Their conclusion was that they did so using a high-speed, iterative process; repeatedly building, proving and refining mental simulations of how the system might work.

Unsuccessful designers couldn’t conceive working simulations, and fixed on designs whose effectiveness they couldn’t test till they’d been built.

Curtis et al wrote:

“Exceptional designers were extremely familiar with the application domain. Their crucial contribution was their ability to map between the behavior required of the application system and the computational structures that implemented this behavior.

In particular, they envisioned how the design would generate the system behavior customers expected, even under exceptional circumstances.”

Robbins et al stressed the importance of iteration:

“The cognitive theory of reflection-in-action observes that designers of complex systems do not conceive a design fully-formed. Instead, they must construct a partial design, evaluate, reflect on, and revise it, until they are ready to extend it further.”

The eminent US software pioneer Robert Glass discussed these studies in his book “Software Conflict 2.0” [8] and observed that:

“people who are not very good at design … tend to build representations of a design rather than models; they are then unable to perform simulation runs; and the result is they invent and are stuck with inadequate design solutions.”

These studies fatally undermine the argument that linear and documentation driven processes are necessary for a quality product and that more flexible, light-weight documentation approaches are irresponsible.

Flexibility and intuition are vital to developers. Heavyweight documentation can waste time and suffocate staff if used when there is no need.

Ironically, it was the heavyweight approach that was founded on guesswork and intuition, and the lightweight approach that has sound conceptual underpinnings.

The lessons of the HCI academics have obvious implications for exploratory testing, which again is rooted in psychology as much as in IT. In particular, the finding by Curtis et al that “exceptional designers were extremely familiar with the application domain” takes us to the heart of exploratory testing.

What matters is not extensive documentation of test plans and scripts, but deep knowledge of the application. These need not be mutually exclusive, but on high-pressure, time-constrained projects it can be hard to do both.

Itkonen, Mäntylä and Lassenius conducted a fascinating experiment at the University of Helsinki in 2007 in which they tried to compare the effectiveness of exploratory testing and test case based testing. [9]

Their findings were that test case testing was no more effective in finding defects. The defects were a mixture of native defects in the application and defects seeded by the researchers. Defects were categorised according to the ease with which they could be found. Defects were also assigned to one of eight defect types (performance, usability etc.).

Exploratory testing scored better for defects at all four levels of “ease of detection”, and in 6 out of the 8 defect type categories. The differences were not considered statistically significant, but it is interesting that exploratory testing had the slight edge given that conventional wisdom for many years was that heavily documented scripting was essential for effective testing.

However, the really significant finding, which the researchers surprisingly did not make great play of, was that the exploratory testing results were achieved with 18% of the effort of the test case testing.

The exploratory testing required 1.5 hours per tester, and the test case testing required an average of 8.5 hours (7 hours preparation and 1.5 hours testing).
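That 18% figure is simply the ratio of the two per-tester effort totals (my own arithmetic, not a calculation quoted from the paper): 1.5 hours ÷ 8.5 hours ≈ 0.176, or roughly 18%.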

It is possible to criticise the methods of the researchers, particularly their use of students taking a course in software testing, rather than professionals experienced in applying the techniques they were using.

However, exploratory testing has often been presumed to be suitable only for experienced testers, with scripted, test case based testing being more appropriate for the less experienced.

The methods followed by the Helsinki researchers might have been expected to bias the results in favour of test case testing. Therefore, the finding that exploratory testing is at least as effective as test case testing with a fraction of the effort should make proponents of heavily documented test planning pause to reconsider whether it is always appropriate.

Documentation per se does not produce quality. Quality is not necessarily dependent on documentation. Sometimes they can be in conflict.

Firstly, the emphasis on producing the documentation can be a major distraction for test managers. Most of their effort goes into producing, refining and updating plans that often bear little relation to reality.

Meanwhile the team are working hard firming up detailed test cases based on an imperfect and possibly outdated understanding of the application. While the application is undergoing the early stages of testing, with consequent fixes and changes, detailed test plans for the later stages are being built on shifting sand.

You may think that is being too cynical and negative, and that testers will be able to produce useful test cases based on a correct understanding of the system as it is supposed to be delivered to the testing stage in question. However, even if that is so, the Helsinki study shows that this is not a necessary condition for effective testing.

Further, if similar results can be achieved with less than 20% of the effort, how much more could be achieved if the testers were freed from the documentation drudgery in order to carry out more imaginative and proactive testing during the earlier stages of development?

Susan Land’s fourth justification for standards (see start of article) has interesting implications.

Standards “can be attached to or referenced by contracts”. That is certainly true. However, the danger of detailed templates in the form of a standard is that organisations tailor their development practices to the templates rather than the other way round.

If the lawyers fasten onto the standard and write its content into the contract then documentation can become an end and not just a means to an end.

Documentation becomes a “deliverable”. The dreaded phrase “work product” is used, as if the documentation output is a product of similar value to the software.

In truth, the documentation sometimes becomes more valuable to the supplier than the software itself, if the payments are staged under the terms of the contract and are dependent on the production of satisfactory documentation.

I have seen triumphant announcements of “success” following approval of “work products” with the consequent release of payment to the supplier when I have known the underlying project to be in a state of chaos.

Formal, traditional methods attempt to represent a highly complex, even chaotic, process in a defined, repeatable model. These methods often bear only vague similarities to what developers have to do to craft applications.

The end product is usually poor quality, late and over budget. Any review of the development will find constant deviations from the mandated method.

The suppliers, and defenders, of the method can then breathe a sigh of relief. The sacred method was not followed. It was the team’s fault. If only they’d done it by the book! The possibility that the developers’ and testers’ apparent sins were the only reason anything was produced at all is never considered.

What about the auditors?

Adopting standards like IEEE 829 without sufficient thought causes real problems. If the standard doesn’t reflect what really has to be done to bring the project to a successful conclusion then mandated tasks or documents may be ignored or skimped on, with the result that a subsequent review or audit reports on a failure to comply.

An alternative danger is that testers do comply when there is no need, and put too much effort into the wrong things. Often testers arrive late on the project. Sometimes the emphasis is on catching up with plans and documentation that are of dubious value, and are not an effective use of the limited resources and time.

However, if the contract requires it, or if there is a fear of the consequences of an audit, then it could be rational to assign valuable staff to unproductive tasks.

Sadly, auditors are often portrayed as corporate bogey-men. It is assumed that they will conduct audits by following ticklists, with simplistic questions that require yes/no answers: “Have you done x to y, yes or no?”

If the auditees start answering “No, but …” they would be cut off with “So, it’s no”.

I have seen that style of auditing. It is unprofessional and organisations that tolerate it have deeper problems than unskilled, poorly trained auditors. It is senior management that creates the environment in which the ticklist approach thrives. However, I don’t believe it is common. Unfortunately people often assume that this style of auditing is the norm.

IT audit is an interesting example of a job that looks extremely easy at first sight, but is actually very difficult when you get into it.

It is very easy for an inexperienced auditor to do what appears to be a decent job. At least it looks competent to everyone except experienced auditors and those who really understand the area under review.

If auditors are to add value they have to be able to use their judgement, and that has to be based on their own skills and experience as well as formal standards.

They have to be able to analyse a situation and evaluate whether the risks have been identified and whether the controls are appropriate to the level of risk.

It is very difficult to find the right line and you need good experienced auditors to do that. I believe that ideally IT auditors should come from an IT background so that they do understand what is going on; poachers turned gamekeepers if you like.

Too often testers assume that they know what auditors expect, and they do not speak directly to the auditors or check exactly what professional auditing consists of.

They assume that auditors expect to see detailed documentation of every stage, without consideration of whether it truly adds value, promotes quality or helps to manage the risk.

Professional auditors take a constructive and pragmatic approach and can help testers. I want to help testers understand that. When I worked as an IT auditor I used to find it frustrating to discover that people had wasted time on unnecessary and unhelpful actions on the assumption that “the auditors require it”.

Kanwal Mookhey, an IT auditor and founder of NII consulting, wrote an interesting article for the Internal Auditor magazine of May 2008 [10] about auditing IT project management.

He described the checking that auditors should carry out at each stage of a project. He made no mention of any need to see documentation of detailed test plans and scripts, whereas he did emphasise the need for early testing.

Kanwal told me:

“I would agree that auditors are – or should be – more inclined to see comprehensive testing, rather than comprehensive test documentation.

Documentation of test results is another matter of course. As an auditor, I would be more keen to know that a broad-based testing manual exists, and that for the system in question, key risks and controls identified during the design phase have been tested for. The test results would provide a higher degree of assurance than exhaustive test plans.”

One of the most significant developments in the field of IT governance in the last few decades has been the US 2002 Sarbanes-Oxley Act, which imposed new standards of reporting, auditing and control for US companies. It has had massive worldwide influence because it applies to the foreign subsidiaries of US companies, and foreign companies that are listed on the US stock exchanges.

The act attracted considerable criticism for the additional overheads it imposed on companies, duplicating existing controls and imposing new ones of dubious value.

Unfortunately, the response to Sarbanes-Oxley verged on the hysterical, with companies, and unfortunately some auditors, reading more into the legislation than a calmer reading could justify. The assumption was that every process and activity should be tied down and documented in great detail.

However, not even Sarbanes-Oxley, supposedly the sacred text of extreme documentation, requires detailed documentation of test plans or scripts. That may be how some people misinterpret the act. It is neither mandated by the act nor recommended in the guidance documents issued by the Institute of Internal Auditors [11] and the Information Systems Audit & Control Association [12].

If anyone tries to justify extensive documentation by telling you that “the auditors will expect it”, call their bluff. Go and speak to the auditors. Explain that what you are doing is planned, responsible and will have sufficient documentation of the test results.

Documentation is never required “for the auditors”. If it is required it is because it is needed to manage the project, or it is a requirement of the project that has to be justified like any other requirement. That is certainly true of safety critical applications, or applications related to pharmaceutical development and manufacture. It is not true in all cases.

IEEE 829 and other standards do have value, but in my opinion their value is not as standards! They do contain some good advice and the fruits of vast experience. However, they should be guidelines to help the inexperienced, and memory joggers for the more experienced.

I hope this article has made people think about whether mandatory standards are appropriate for software development and testing, and whether detailed documentation in the style of IEEE 829 is always needed. I hope that I have provided some arguments and evidence that will help testers persuade others of the need to give testers the freedom to leave the kindergarten and grow as professionals.

References

[1] Land, S. (2005). “Jumpstart CMM-CMMI Software Process Improvements – using IEEE software engineering standards”, Wiley.

[2a] Ward, P. (1991). “The evolution of structured analysis: Part 1 – the early years”. American Programmer, vol 4, issue 11, 1991. pp4-16.

[2b] Ward, P. (1992). “The evolution of structured analysis: Part 2 – maturity and its problems”. American Programmer, vol 5, issue 4, 1992. pp18-29.

[2c] Ward, P. (1992). “The evolution of structured analysis: Part 3 – spin offs, mergers and acquisitions”. American Programmer, vol 5, issue 9, 1992. pp41-53.

[3] Yourdon, E., Constantine, L. (1977) “Structured Design”. Yourdon Press, New York.

[4] Fitzgerald B., Russo N., Stolterman, E. (2002). “Information Systems Development – Methods in Action”, McGraw Hill.

[5] Coad, P., Yourdon, E. (1991). “Object-Oriented Analysis”, 2nd edition. Yourdon Press.

[6] Curtis, B., Iscoe, N., Krasner, H. (1988). “A field study of the software design process for large systems”. Communications of the ACM, Volume 31, Issue 11 (November 1988), pp1268-1287.

[7] Robbins, J., Hilbert, D., Redmiles, D. (1998). “Extending Design Environments to Software Architecture Design”. Automated Software Engineering, Vol. 5, No. 3, July 1998, pp261-290.

[8] Glass, R. (2006). “Software Conflict 2.0: The Art and Science of Software Engineering” Developer Dot Star Books.

[9a] Itkonen, J., Mäntylä, M., Lassenius C., (2007). “Defect detection efficiency – test case based vs exploratory testing”. First International Symposium on Empirical Software Engineering and Measurement. (Payment required).

[9b] Itkonen, J. (2008). “Do test cases really matter? An experiment comparing test case based and exploratory testing”.

[10] Mookhey, K. (2008). “Auditing IT Project Management”. Internal Auditor, May 2008, the Institute of Internal Auditors.

[11] The Institute of Internal Auditors (2008). “Sarbanes-Oxley Section 404: A Guide for Management by Internal Controls Practitioners”.

[12] Information Systems Audit and Control Association (2006). “IT Control Objectives for Sarbanes-Oxley 2nd Edition”.

Has opposition to ISO 29119 really died down?

One of my concerns about the Stop 29119 campaign, ever since it was launched four years ago, was that ISO would try to win the debate by default, by ignoring the opposition. In my CAST 2014 talk, which kicked off the campaign, I talked about ISO’s attempt to define its opponents as being irrelevant. ISO breached its own rules requiring consensus from the profession, and in order to justify doing so they had to maintain a pretence that testers who opposed their efforts were a troublesome, old-fashioned faction that should be ignored.

That’s exactly what has happened. ISO have kept their collective heads down, tried to ride out the storm and emerged to claim that it was all a lot of fuss about nothing; the few malcontents have given up and gone away.

I have just come across a comment in the “talk” section of the Wikipedia article on ISO 29119, arguing for some warning flags on the article to be removed.

“…finally, the objection to this standard was (a) from a small but vocal group and (b) died down – the ballots of member National Bodies were unanimous in favour of publication. Furthermore, the same group objected to IEEE 829 as well.”

The opposition is significantly more than “a small but vocal group”, but I won’t dwell on that point. My concern here is point b. Have the objections died down? Yes, they have in the sense that the opponents of ISO 29119 have been less vocal. There have been fewer talks and articles pointing out the flaws in the principle and the detail of the standard.

However, there has been no change in the beliefs of the opposition. There comes a point when it feels unnecessary, even pointless, to keep repeating the same arguments without the other side engaging. You can’t have a one-sided debate. The Stop 29119 campaigners have other things to do. Frankly, attacking ISO 29119 is a dreary activity compared with most of the alternatives. I would prefer to do something interesting and positive rather than launching another negative attack on a flawed standard. However, needs must.

The argument that “ballots of member National Bodies were unanimous in favour of publication” may be true, but it is a circular point. The opponents of ISO 29119 argued convincingly that software testing is not an activity that lends itself to ISO style standardisation and that ISO failed to gain any consensus outside its own ranks. The fact that ISO are quite happy with that arrangement is hardly a convincing refutation of our argument.

The point about our opposition to the IEEE 829 standard is also true, but it’s irrelevant. Even ISO thought that standard was dated and inadequate for modern purposes. It decided to replace it rather than try to keep updating it. Unfortunately the creators of ISO 29119 repeated the fundamental mistakes that rendered IEEE 829 flawed and unsuitable for good testing.

I was pleased to discover that the author of the Wikipedia comment was on the ISO working group that developed ISO 29119 and wrote a blog defending the standard, or rather dismissing the opposition. It was written four years ago in the immediate aftermath of the launch of Stop 29119. It’s a pity it didn’t receive more attention at the time. The debate was far too one sided and we badly needed contributions from ISO 29119’s supporters. So, in order to provide a small demonstration that opposition to the standard is continuing I shall offer a belated response. I’ll quote Andrew’s arguments, section by section, in dark blue and then respond.

“As a member of the UK Mirror Panel to WG26, which is responsible for the ISO 29119 standard, I am disappointed to read of the objection to the standard led by the International Society for Software Testing, which has resulted in a formal petition to ISO.

I respectfully suggest that their objections would be more effective if they engaged with their respective national bodies, and sought to overcome their objections, constructively.

People who are opposing ISO 29119 claim:

  1. It is costly.
  2. It will be seen as mandatory skill for testers (which may harm individuality and freedom).
  3. It may reduce the ability to experiment and try non-conventional ways.
  4. Once the standard is accepted, testers can be held responsible for project failures (or non-compliance).
  5. Effort will be more on documentation and process rather than testing.

Let us consider each of these in turn.”

The International Society for Software Testing (ISST) launched the petition against ISO 29119, but this was merely one aspect of the campaign against the standard. Opposition was certainly not confined to ISST. The situation is somewhat confused by the fact that ISST disbanded in 2017. One of the prime reasons was that the “objectives set out by the founders have been met, or are in the capable hands of organisations that we support”. The main organisation referred to here is the larger and more established Association for Software Testing (AST), which can put more resources into the fight. I always felt the main differences between ISST and AST were in style and approach rather than principles and objectives.

The suggestion that the opponents of ISO 29119 should have worked through national ISO bodies is completely unrealistic. ISO’s approach is fundamentally wrong and opponents would have been seen as a wrecking crew preventing any progress. I know of a couple of people who did try to involve themselves in ISO groups and gave up in frustration. The debate within ISO about a standard like 29119 concerns the detail, not the underlying approach. In any case the commitment required to join an ISO working group is massive. Meetings are held all over the world. They take up a lot of time and require huge expenses for travel and accommodation. That completely excludes independent consultants like myself.

“Costly

Opponents object to this standard because it is not freely available.

While this is a fair point, it is no different from every other standard that is in place – and which companies follow, often because it gives them a competitive advantage.

Personally, I would like to see more standards placed freely in the public domain, but I am not in a position to do it!”

The cost of the standard is a minor criticism. As a member of the AST’s Committee on Standards and Professional Practice I am fortunate to have access to the documents comprising the standard. These cost hundreds of dollars and I would baulk at buying them for myself. The full set would cost as much as a family holiday. I know which would be more important!

However, the cost does hamper informed debate about the content, and that was the true concern. The real damage of a poorly conceived standard will be poorer quality testing and that will be far more costly than the initial cost of the documents.

“Mandatory

Opponents claim this standard will be seen as a mandatory skill for testers (which may harm individuality and freedom).

ISO 29119 replaces a number of IEEE and British standards that have been in place for many years. And while those standards are seen to represent best practice, they have not been mandatory.”

I have two big issues with this counter-argument. Firstly, the standards that ISO 29119 replaced were emphatically not “seen to represent best practice”. If they were best practice there would have been no need to replace them. They were hopelessly out of date, and IEEE 829 was unhelpful, even damaging, even when it was new.

My second concern is about the way that people respond to a standard. Back in 2009 I wrote this article “Do standards keep testers in the kindergarten?” in Testing Experience magazine arguing against the principle of testing standards, the idea of best practice and the inevitable danger of an unhelpful and dated standard like IEEE 829 being imposed on unwilling testers.

Once you’ve called a set of procedures a standard the argument is over in many organisations; testers are required to use them. It is disingenuous to say that standards are not mandatory. They are sold on the basis that they offer reassurance and that the wise, safe option is to make them compulsory.

I made this argument nine years ago thinking the case against standards had been won. I was dismayed to discover subsequently that ISO was trying to take us back several decades with ISO 29119.

“Experimentation

A formal testing environment should be a place where processes and procedures are in place, and is not one where ‘experiment and non-conventional’ methods are put in place. But having said that, there is nothing within ISO 29199 that prevents other methods being used.”

There may be a problem over the word “experiment” here. Andrew seems to think that testers who talk of experimentation are admitting they don’t know what they’re doing and are making it up as they go along. That would be an unfortunate interpretation. When testers from the Context Driven School refer to experimentation they mean the act of testing itself.

Good testing is a form of exploration and experimentation to find out how the product behaves. Michael Bolton describes that well here. A prescriptive standard that focuses on documentation distracts from, and effectively discourages, such exploring and experimentation. We have argued that at length and convincingly. It would be easier to analyse Andrew’s case if he had provided links to arguments from opponents who had advocated a form of experimentation he disapproves of.

“Accountability

Opponents claim that, once the standard is accepted, testers can be held responsible for project failures (or non-compliance).

As with any process or procedure, all staff are required to ensure compliance with the company manual – and project managers should be managing their projects to ensure that all staff are doing so.

Whether complying with ISO 29119 or any other standard or process, completion of testing and signing off as ‘passed’ carries accountability. This standard does not change that.”

This is a distortion of the opponents’ case. We do believe in accountability, but that has to be meaningful. Accountability must be based on something to which we can reasonably sign up. We strongly oppose attempts to enforce accountability to an irrelevant, poorly conceived and damaging standard. Complying with such a standard is orthogonal to good testing; there is no correlation between the two activities.

At best ISO 29119 would be an irrelevance. In reality it is more likely to be a hugely damaging distraction. If a company imposes a standard that all testers should wear laboratory technicians’ white coats it might look impressively professional, but complying with the standard would tell us nothing about the quality of the testing.

As a former auditor I have strong, well informed, views about accountability. One of ISO 29119’s serious flaws is that it fails to explain why we test. We need such clarity before we can have any meaningful discussion about compliance. I discussed this here, in “Do we want to be ‘compliant’ or valuable?”

The standard defines in great detail the process and the documents for testing, but fails to clarify the purpose of testing, the outcomes that stakeholders expect. To put it bluntly, ISO 29119 is vague about the ends towards which we are working, but tries to be precise about the means of getting there. That is an absurd combination.

ISO 29119 tries to set out rules without principles. Understanding the distinction between rules and principles is fundamental to the process of crafting professional standards that can hold practitioners meaningfully to account. I made this argument in the Fall 2015 edition of Better Software magazine. The article is also available on my blog, “Why ISO 29119 is a flawed quality standard”.

This confusion of rules and principles, means and ends, has led to an obsessive focus on delivering documentation rather than valuable information to stakeholders. That takes us on to Andrew’s next argument.

“Documentation

Opponents claim that effort will be more on documentation and process rather than testing.

I fail to understand this line of reasoning – any formal test regime requires a test specification, test cases and recorded test results. And the evidence produced by those results need argument. None of this is possible without documentation.”

Opponents of ISO 29119 have argued repeatedly and convincingly that a prescriptive standard which concentrates on documentation will inevitably lead to goal displacement; testers will concentrate on the documentation mandated by the standard and lose sight of why they are testing. That was our experience with IEEE 829. ISO 29119 repeats the same mistake.

Andrew’s second paragraph offers no refutation of the opponents’ argument. He apparently believes that we are opposed to documentation per se. That’s a straw man. Andrew justifies ISO 29119’s demand for documentation, which I believe is onerous and inappropriate, by asserting that it serves as evidence. Opponents argue that the standard places far too much emphasis on advance documentation and neglects evidence of what was discovered by the testing.

The statement that any formal test regime requires a test specification and test cases is highly contentious. Auditors would expect to see evidence of planning, but test specifications and test cases are just one way of doing it, the way that ISO 29119 advocates. In any case, advance planning is not evidence that good testing was performed any more than a neat project plan provides evidence that the project ran to time.

As for the results, the section of ISO 29119 covering test completion reports is woefully inadequate. It would be possible to produce a report that complied fully with the standard and offered nothing of value. That sums up the problems with ISO 29119. Testers can comply while doing bad testing. That is in stark contrast to the standards governing more established professions, such as accountants and auditors.

“Conclusion

Someone wise once said:

  1. Argument without Evidence is unfounded.
  2. Evidence without Argument is unexplained.

Having considered the argument put forward, and the evidence to support the case:

  • The evidence is circumstantial with no coherence.
  • The argument is weak, and seems only to support their vested interests.

For a body that represents test engineers, I would have expected better.”

The quote that Andrew uses from “someone wise” actually comes from the field of safety critical systems. There is much that we in conventional software testing can learn from that field. Perhaps the most important lessons are about realism and humility. We must deal with the world as it is, not as we would like it to be. We must accept the limitations of our knowledge and what we can realistically know.

The proponents of ISO 29119 are too confident in their ability to manage in a complex, evolving field using techniques rooted in the 1970s. Their whole approach tempts testers to look for confirmation of what they think they know already, rather than explore the unknown and explain what they cannot know.

Andrew’s verdict on the opposition to ISO 29119 should be turned around and directed at ISO and the standard itself. It was developed and launched in the absence of evidence that it would help testers to do a better job. The standard may have internal consistency, but it is incoherent when confronted with the complexities of the real world.

Testers who are forced to use it have to contort their testing to fit the process. Any good work they do is in spite of the standard and not because of it. It might provide a welcome route map to novice testers, but it offers a dangerous illusion. The standard tells them how to package their work so it appears plausible to those who don’t know any better. It defines testing in a way that makes it appear easier than it really is. But testing is not meant to be easy. It must be valuable. If you want to learn how to provide value to those who pay you then you need to look elsewhere.

Finally, I should acknowledge that some of the work I have cited was not available to Andrew when he wrote his blog in 2014. However, all of the underlying arguments and research that opponents of 29119 have drawn on were available long before then. ISO simply did not want to go looking for them. Our arguments about ISO 29119 being anti-competitive were at the heart of the Stop 29119 campaign. Andrew has not addressed those arguments.

If ISO wants to be taken seriously it must justify ISO 29119’s status as a standard. Principled, evidenced and coherent objections deserve a response. In a debate all sides have a duty to respond to their opponents’ strongest case, rather than evading difficult objections, setting up straw men and selectively choosing the arguments that are easiest to deal with.

ISO must provide evidence, and a coherent argument, that 29119 is effective. Simply labelling a set of prescriptive processes as a standard and expecting the industry to respect it for that reason will not do. That is the sign of a vested interest seeking to impose itself on a whole profession. No, the opposition has not died down; it has simply had nothing credible to engage with.