I believe that prescriptive standards are inappropriate for software testing. This paper states my objections in principle, i.e. it explains why I believe that the decision to develop ISO 29119 was fundamentally misguided. These objections are valid without any consideration of the detailed content of the standard, which testers cannot view for themselves unless they or their employers buy a set. Members of the Context Driven School of testing (CDT) were making these arguments before ISO 29119 was issued, or even conceived.
This is not a conventional article in essay style. It is more in the nature of a work of reference, providing sources, links and a basis for further work. I expect to update it periodically. The original version arose from work that I did for the Association for Software Testing’s (AST) Committee on Standards and Professional Practices.
My objections in principle fall into four categories, based on regulatory theory and practice, the nature of software development, the social sciences, and the need for fair competition. These objections are based on academic and practical evidence. A full explanation of my objections could run to book length. I will therefore restrict myself to a brief statement of each objection, with supporting links.
The AST’s mission and the principles of CDT have informed and guided my analysis so I shall start by quoting them.
The AST’s mission, as stated on its website, is as follows.
“The Association for Software Testing is dedicated to advancing the understanding of the science and practice of software testing according to Context-Driven principles.
The Association for Software Testing (AST) is an international non-profit professional association with members in over 50 countries. AST is dedicated and strives to build a testing community that views the role of testing as skilled, relevant, and essential to the production of faster, better, and less expensive software products. We value a scientific approach to developing and evaluating techniques, processes, and tools. We believe that a self-aware, self-critical attitude is essential to understanding and assessing the impact of new ideas on the practice of testing.”
The seven basic principles of Context-Driven Testing (CDT) are as follows.
- “The value of any practice depends on its context.
- There are good practices in context, but there are no best practices.
- People, working together, are the most important part of any project’s context.
- Projects unfold over time in ways that are often not predictable.
- The product is a solution. If the problem isn’t solved, the product doesn’t work.
- Good software testing is a challenging intellectual process.
- Only through judgment and skill, exercised cooperatively throughout the entire project, are we able to do the right things at the right times to effectively test our products.”
1 – Objections based on regulatory theory and practice
1a – Principles and rules
There has been much discussion in recent years about the relative merits of principles and rules in regulating and influencing behavior. For the purposes of this paper I will define principles as non-specific statements of what is expected, and rules as detailed and specific statements of what must be done.
I believe that for a complex, context-dependent and cognitively demanding activity such as software testing it is unhelpful to present a binary choice between principles and rules. It is far more effective to combine a set of general principles applying to all testers with specific rules that are relevant to particular organizations and settings. See Braithwaite’s “Rules and principles; a theory of legal certainty”. For these rules and principles to work effectively there should be constant discussion, or regulatory conversations, between regulators and those being regulated about the meaning and application of the principles.
The situation is confused because standards in legal discussions are usually assumed to be synonymous with principles. A standard is usually conceived as a clear statement of what must be achieved by an activity, rather than how it should be performed in detail. Standards for software testing have traditionally been pitched at the detailed level of rules, e.g. IEEE 829.
A standard that would be appropriate for software testing would therefore be brief, clear and non-specific.
A suitable example of such a standard would be the International Standards for the Professional Practice of Internal Auditing.
1b – Regulators
Following on from the last point I believe that the specific requirements of industry regulators are of paramount importance for testers in those industries. Any testing standards should be framed as principles in such a way that they are consistent with those requirements and not attempt to provide competing, detailed rules.
As an example, the US Food and Drug Administration offers clear advice about what is required, focusing on the need for testing to provide high quality evidence that is capable of standing up in court. The FDA advice does not specify exactly how this should be done, but does allow companies to adopt the “least burdensome approach”. See “General Principles of Software Validation; Final Guidance for Industry and FDA Staff”, 2002.
1c – Accountability
A frequent justification of the need for software testing standards is that testers should be accountable to stakeholders: testers must demonstrate that they are providing valuable information, effectively and efficiently.
I agree that accountability is important, but I do not believe that prescriptive standards meet that need. In evidence I point to the requirements of the audit profession.
1c1 – IIA standards
The standards of the global Institute of Internal Auditors offer no support for generic, prescriptive testing standards. The Global Technology Audit Guide, “Management of IT Audit”, 1st edition, 2006 describes the Snowflake Theory of IT Audit:
“Every IT environment is unique and represents a unique set of risks. The differences make it increasingly difficult to take a generic or checklist approach to auditing.”
Sadly the 2nd edition, released in 2013, removed the Snowflake Theory when the guidance was made more concise, but the IIA’s approach is unchanged. Generic baselines that auditors can use as a checklist are not helpful. This passage in the Audit Guide, from the section “Frameworks and Standards”, survived. It says:
“One challenge auditors face when executing IT audit work is knowing what to audit against. Most organizations have not fully developed IT control baselines for all applications and technologies. The rapid evolution of technology could likely render any baselines useless after a short period of time.”
Although the IIA Global Technology Audit Guides refer to ISO standards “for consideration” as baselines against which to audit, they make no mention of software testing standards. They emphasize the need to understand the different mix of standards, methods and tools that each organization uses, and state that auditors should not expect to see any set of “best practice” implemented wholesale without customization (Global Technology Audit Guide, GTAG 12: Auditing IT Projects).
1c2 – Information Systems Audit and Control Association
COBIT 5 is ISACA’s model for IT governance. It stresses the need for clear organization-specific testing standards, and for careful documentation of test plans and results. However, there is no mention of organizations incorporating external standards. “Best practices” are to be used as a “reference when improving and tailoring”. Organizations should “employ code inspection, test-driven development practices, automated testing, continuous integration, walk-throughs and testing of applications as appropriate.” COBIT 5 has countless references to various external standards, but none to testing standards.
COBIT 5 is important because it is widely accepted as an effective means of complying with the requirements of Sarbanes-Oxley.
1c3 – End of binary opinions
The audit profession has traditionally offered binary opinions on whether accounts were true and fair, or internal controls were present or lacking. Increasingly the profession requires more nuanced reporting about the risks that matter, the risks that keep stakeholders up at night. This requires testers to offer more valuable information about the quality of products than merely saying how many test cases were run and passed. Testers have to tell a valuable story rather than rely on filling in generic, standards-based templates.
1c4 – Lifebelt for inexperienced investigators
Prescriptive standards act as a lifebelt for investigators and auditors who lack the experience and confidence to make their own judgments. They search for something they can use to give them reassuring answers about how a job should be done. The US Government Accountability Office, in its March 2015 report on the problems with the Healthcare.gov website, checked the project for compliance with the IEEE 829 test documentation standard. The last revision of that standard was made in 2008, and it said that the content of standards more than five years old “do not wholly reflect the present state of the art”. In fact IEEE 829 is hopelessly out of date. It is now quite irrelevant to modern testing.
Standards have credibility and huge influence simply from their status as standards. If testers are to have standards it is essential that they are relevant, credible and framed in a way that does not mislead investigators.
2 – Objections based on the nature of software development
2a – Experience with IEEE 829
2a1 – Documentation obsession
IEEE 829 was for many years the pre-eminent testing standard. It had a huge influence on testing. Many organizations based their testing processes and documents on this standard and its templates.
The result was a widespread fixation with excessively large, uninformative documents, which became the focus for testing teams’ activities, rather than acquiring knowledge about the product under test.
2a2 – Traditional methods
IEEE 829 was aligned conceptually with traditional methods, especially Structured Methods, which were based on the assumption that linear and documentation driven processes are necessary for a quality product and that more flexible, light-weight documentation approaches were irresponsible. However, Structured Methods were based on intuition rather than evidence, and studies showed that a document driven approach did not match the way people think when creating software. There is a lack of evidence that the detailed and document driven approach associated with IEEE 829 was ever an effective, generic approach to testing.
2b – Complexity
Prescriptive processes are ill-equipped to handle complex activities like software development. In software development it is impossible to state with certainty what the end product should look like at an early stage. This undermines the rationale for a heavy investment in advance testing documentation.
2c – Cynefin
The Cynefin Framework is a valuable way to help us to make sense of the differing responses that different situations require. Software development and testing are inherently complex, and therefore “best practice” and prescriptive standards are inappropriate. A flexible, iterative approach is required.
3 – Objections based on the social sciences
3a – The nature of expertise
Prescriptive standards do not take account of how people acquire skills and apply their knowledge and experience in cognitively demanding jobs such as testing.
3a1 – Tacit knowledge
Michael Polanyi and Harry Collins have offered valuable arguments about how we apply knowledge. Polanyi introduced the term “tacit knowledge” to describe the knowledge we have and use but cannot articulate; Collins built on his work. Foisting prescriptive standards onto skilled people is counterproductive and encourages them to concentrate on the means rather than the ends.
3a2 – Schön’s reflection-in-action
Donald Schön argues that creative professionals, such as software designers or architects, have an iterative approach to developing ideas. Much of their knowledge is tacit. They can’t write down all of their knowledge as a neatly defined process. To gain access to what they know they have to perform the creative act so they can learn, reflect on what they’ve learned and then apply the new knowledge. This is inconsistent with following a prescriptive process.
3b – Goal displacement
Losing sight of the true goals and focusing on methods, processes and documentation is a constant danger of prescriptive standards. Not only does this reflect my own experience; a wealth of academic studies backs it up.
3b1 – Trained incapacity
A full century ago, in 1914, Thorstein Veblen identified the problem of trained incapacity. People who are trained in specific skills can lack the ability to adapt. Their response has worked in the past, and they apply it regardless thereafter. They focus on responding in the way they have been trained, and cannot see that the circumstances require a different response. Their training has rendered them incapable of doing the job effectively unless it fits their mental framework.
3b2 – Ways of seeing
In 1935 Kenneth Burke built on Veblen’s work, arguing that trained incapacity meant that one’s abilities become blindnesses. People can focus on the means or the ends, not both, and their specific training in prescriptive methods or processes leads them to focus on the means. They do not even see what they are missing.
3b3 – Conformity to the rules
Robert Merton made the point more explicitly in 1957.
“Adherence to the rules… becomes an end in itself… Formalism, even ritualism, ensues with an unchallenged insistence upon punctilious adherence to formalized procedures. This may be exaggerated to the point where primary concern with conformity to the rules interferes with the achievement of the purposes of the organization”.
So the problem had been recognized before software development was even in its infancy.
3c – Defense against anxiety
3c1 – Social defenses
Isabel Menzies Lyth provided a different slant in the 1950s using her experience in psychoanalysis. Her specialism was analyzing the social dynamics of organizations.
She argued that the main factor shaping an organization’s structure and processes was the need for people to cope with stress and anxiety. “Social defenses” were therefore built to help people cope. The defenses identified by Menzies Lyth included rigid processes that removed discretion and the need for decision making, hierarchical staffing structures, increased specialization, and people being managed as fungible (i.e. readily interchangeable) units, rather than skilled professionals.
3c2 – Transitional objects
Donald Winnicott’s contribution was the idea of the transitional object. This is something that helps infants to cope with loosening the bonds with their mother. Babies don’t distinguish between themselves and their mother. Objects like security blankets and teddy bears give them something comforting to cling onto while they come to terms with the beginnings of independence in a big, scary world.
David Wastell linked the work of Menzies Lyth and Winnicott. He found that developers used Structured Methods as transitional objects, i.e. as a defense mechanism to alleviate the stress and anxiety of a difficult job.
Wastell could see no evidence that Structured Methods worked. The evidence was that the resulting systems were no better than the old ones, took much longer to develop and were more expensive. Managers became hooked on technique and lost sight of the true goal.
“Methodology becomes a fetish, a procedure used with pathological rigidity for its own sake, not as a means to an end. Used in this way, methodology provides a relief against anxiety; it insulates the practitioner from the risks and uncertainties of real engagement with people and problems.”
3d – Loss of communication
Effective two-way communication requires effort, “interpretive labor”. The anthropologist David Graeber argues that the greater the degree of force or compulsion, and the greater the bureaucratic regime of rules and forms, the less communication there is. Those who issue the orders don’t need to, and therefore don’t take the trouble to, understand the complexities of the situation they’re managing. This problem works against regulatory conversations (see 1a).
4 – Objections based on the need for competition in testing services
4a – ISO brand advantage
ISO has the reputation for being the global leader in standardization. Any standard that it issues has a huge advantage over alternative methods, simply because of the ISO brand. It is therefore vital that any testing standard is both credible and widely accepted on its own merits. My view, based on the evidence I have set out, is that a detailed, prescriptive standard could not meet those tests.
4b – Market for lemons
Buyers of testing services are often ill informed about the quality of the service that they buy. Economists recognize that where there is asymmetric information purchasers are vulnerable and the market is distorted. Naive buyers at used car auctions cannot distinguish between good cars and lemons. This puts the sellers of lemons at an advantage. They can get a higher price than their product is worth; sellers of better products find it difficult to reach the price they want and are likely to leave the market, which becomes dominated by poor products.
For the reasons I have outlined, any prescriptive testing standard will help sellers of poor testing services to sell plausible “lemon” services. That will make it harder for sellers of high quality testing services to gain a worthwhile price; prices will drift downwards as testing is commoditized, sold on price rather than quality.
4c – Compulsion through contracts
Any ISO standard is likely to be referenced in contracts by lawyers and managers. This will introduce compulsion. This was the profession’s experience with IEEE 829. Even when IEEE 829 was not directly mandated it had huge influence on many organizations’ test strategies.
If standards are detailed, prescriptive and mandatory then that will reduce the flexibility that testers need for context driven testing. This would not fit the principles driven style of regulation that I outlined as being desirable in 1a. It would also lead to poorer communication, as described in 3d.