Posted by: James Christie | March 12, 2015

Standards – a charming illusion of action

The other day I posted an article I’d written that appeared on the uTest blog a few weeks ago. It was a follow up to an article I wrote last year about ISO 29119. Pmhut (the Project Management Hut website) provided an interesting comment.

“…are you sure that the ISO standards will be really enforced on testing – notably if they don’t really work? After all, lawyers want to get paid and clients want their projects done (regardless of how big the clients are).”

Well, as I answered, whether or not ISO 29119 works is, in a sense, irrelevant. Whether or not it is adopted and enforced will not depend on its value or efficacy. ISO 29119 might go against the grain of good software development and testing, but it is very much aligned with a hugely pervasive trend in bureaucratic, corporate life.

I pointed the commenter to an article I wrote on “Teddy Bear Methods”. People cling to methods not because they work, but because they gain comfort from doing so. That is the only way they can deal with difficult, stressful jobs in messy and complex environments. I could also have pointed to this article “Why do we think we’re different?”, in which I talk about goal displacement, our tendency to focus on what we can manage while losing sight of what we’re supposed to be managing.

A lesson from Afghanistan

I was mulling over this when I started to read a fascinating-looking book I was given at Christmas: “Heirs to Forgotten Kingdoms” by Gerard Russell, a deep specialist in the Middle East and a fluent Arabic and Farsi speaker.

The book is about minority religions in the Middle East. Russell is a former diplomat in the British Foreign Office. The foreword was by Rory Stewart, the British Conservative MP. Stewart was writing about his lack of surprise that Russell, a man deeply immersed in the culture of the region, had left the diplomatic service, then added;

“Foreign services and policy makers now want ‘management competency’ – slick and articulate plans, not nuance, deep knowledge, and complexity.”

That sentence resonated with me, and reminded me of a blistering passage from Stewart’s great book “The Places in Between”, his account of walking through the mountains of Afghanistan in early 2002 in the immediate aftermath of the expulsion of the Taliban and the NATO intervention.

Rory Stewart is a fascinating character, far removed from the modern identikit politician. The book is almost entirely a dispassionate account of his adventures and the people whom he met and who provided him with hospitality. Towards the end he lets rip, giving his brutally honest and well-informed perspective of the inadequacies of the western, bureaucratic, managerial approach to building a democratic state where none had previously existed.

It’s worth quoting at some length.

“I now had half a dozen friends working in embassies, thinktanks, international development agencies, the UN and the Afghan government, controlling projects worth millions of dollars. A year before they had been in Kosovo or East Timor and in a year’s time they would have been moved to Iraq or Washington or New York.

Their objective was (to quote the United Nations Assistance Mission for Afghanistan) ‘The creation of a centralised, broad-based, multi-ethnic government committed to democracy, human rights and the rule of law’. They worked twelve- or fourteen- hour days, drafting documents for heavily-funded initiatives on ‘democratisation’, ‘enhancing capacity’, ‘gender’, ‘sustainable development,’ ‘skills training’ or ‘protection issues’. They were mostly in their late twenties or early thirties, with at least two degrees – often in international law, economics or development. They came from middle class backgrounds in Western countries and in the evenings they dined with each other and swapped anecdotes about corruption in the Government and the incompetence of the United Nations. They rarely drove their 4WDs outside Kabul because they were forbidden to do so by their security advisers. There were people who were experienced and well informed about conditions in rural areas of Afghanistan. But such people were barely fifty individuals out of many thousands. Most of the policy makers knew next to nothing about the villages where 90% of the population of Afghanistan lived…

Their policy makers did not have the time, structures or resources for a serious study of an alien culture. They justified their lack of knowledge and experience by focusing on poverty and implying that dramatic cultural differences did not exist. They acted as though villagers were interested in all the priorities of international organisations, even when they were mutually contradictory…

Critics have accused this new breed of administrators of neo-colonialism. But in fact their approach is not that of a nineteenth-century colonial officer. Colonial administrations may have been racist and exploitative but they did at least work seriously at the business of understanding the people they were governing. They recruited people prepared to spend their entire careers in dangerous provinces of a single alien nation. They invested in teaching administrators and military officers the local language…

Post-conflict experts have got the prestige without the effort or stigma of imperialism. Their implicit denial of the difference between cultures is the new mass brand of international intervention. Their policy fails but no one notices. There are no credible monitoring bodies and there is no one to take formal responsibility. Individual officers are never in any one place and rarely in any one organisation long enough to be adequately assessed. The colonial enterprise could be judged by the security or revenue it delivered, but neo-colonialists have no such performance criteria. In fact their very uselessness benefits them. By avoiding any serious action or judgement they, unlike their colonial predecessors, are able to escape accusations of racism, exploitation and oppression.

Perhaps it is because no one requires more than a charming illusion of action in the developing world. If the policy makers know little about the Afghans, the public knows even less, and few care about policy failure when the effects are felt only in Afghanistan.”

Stewart’s experience and insight, backed up by the recent history of Afghanistan, allow him to present an irrefutable case. Yet, in the eyes of pretty much everyone who matters he is wrong. Governments and the military are prepared to ignore the evidence and place their trust in irrelevant and failed techniques rather than confront the awful truth; they don’t know what they’re doing and they can’t know the answers.

Vast sums of money, and millions of lives are at stake. Yet very smart and experienced people will cling on to things that don’t work, and will repeat their mistakes in the future. Stewart, meanwhile, is very unlikely to be allowed anywhere near the levers of power in the United Kingdom. Being right isn’t necessarily a great career move.

Deep knowledge, nuance and complexity

I’m conscious that I’m mixing up quite different subjects here. Software development and testing are very different activities from state building. However, both are complex and difficult. Governments fail repeatedly at something as important and high-profile as constructing new, democratic states, and do so without feeling the need to reconsider their approach. If that can happen in the glare of publicity, is it likely that corporations will refrain from adopting and enforcing standards just because they don’t work? Whether or not they work barely matters. Such approaches fit the mindset and culture of many organisations, especially large bureaucracies, and once adopted it is very difficult to persuade them to abandon them.

Any approach to testing that is based on standardisation is doomed to fail unless you define success in a way that is consistent with the flawed assumptions of the standardisation. What’s the answer? Not adopting standards that don’t work is an obvious start, but that doesn’t take you very far. You’ve got to acknowledge those things that Stewart referred to in his foreword to Gerard Russell’s book; answers aren’t easy, they require deep knowledge, an understanding of nuance and an acceptance of complexity.

A video worth watching

Finally, I’d strongly recommend this video of Rory Stewart being interviewed by Harry Kreisler of the University of California about his experiences and the problems I’ve been discussing. I’ve marked the parts I found most interesting.

34 minutes; Stewart is asked about applying abstract ideas in practice.

40:20; Stewart talks about a modernist approach of applying measurement, metrics and standardisation in contexts where they are irrelevant.

47:05; Harry Kreisler and then Stewart talk about participants failing to spot the obvious, that their efforts are futile.

49:33; Stewart says that his Harvard students regarded him as a colourful contrarian, believing that all Afghanistan needed was a new plan and new resources.

Posted by: James Christie | March 10, 2015

ISO 29119: Why is the Debate One-Sided?

This article originally appeared on the uTest blog on February 23rd 2015.

In August last year the Stop 29119 campaign and petition kicked off at the CAST conference in New York.

In September I wrote on the uTest blog about why the new ISO/IEEE 29119 software testing standards are a danger to good testing and the whole testing profession.

I was amazed at the commotion that Stop 29119 caused. It was the biggest talking point in testing in 2014. Six months on it’s time to look back. What has actually happened?

The remarkable answer is – very little. The Stop 29119 campaigners haven’t given up. There has been a steady stream of blogs and articles. However, there has been no real debate; the discussion has been almost entirely one sided.

There has been only one response from ISO. In September Dr Stuart Reid, the convenor of the working group that produced the standard, issued a statement attempting to rebut the arguments of Stop 29119. That was it. ISO then retreated into its bunker and ignored invitations to debate.

Dr Reid’s response was interesting, both in its content and the way it engaged with the arguments of Stop 29119. The Stop 29119 petition was initiated by the board of the International Society for Software Testing. ISST’s website had a link to the petition, and a long list of blogs and articles from highly credible testing experts criticising ISO 29119. It is a basic rule of debate that one always tackles an opponent’s strongest points. However, Dr Reid ignored these authoritative arguments and responded to a series of points that he quoted from the comments on the petition site.

To be more accurate, Dr Reid paraphrased a selection of the comments and criticisms from elsewhere, framing them in a way that made it easier to refute them. Some of these points were no more than strawmen.

For example, Cem Kaner argued that the IEEE adopts a “software engineering standards process that I see as a closed vehicle that serves the interests of a relatively small portion of the software engineering community… The imposition of a standard that imposes practices and views on a community that would not otherwise agree to them, is a political power play”.

Dr Reid presented such arguments as “no-one outside the Working Group is allowed to participate” and “the standards ‘movement’ is politicized and driven by big business to the exclusion of others”.

These arguments were then dismissed by stating that anyone can join the Working Group, which consists of people from all parts of the industry. Dr Reid also emphasized that “consensus” applies only to those within the ISO process, failing to address the criticism that this excludes those who believe, with compelling evidence, that ISO-style standardization is inappropriate for testing.

These criticisms had been made forcefully for many years, in articles and at conferences, yet Dr Reid blithely presented the strawman that “no-one knew about the standards and the Working Group worked in isolation”. He then effortlessly demolished the argument that no-one was making.

What of the content? There were concerns about how ISO 29119 deals with Agile and Exploratory Testing. For example, Rikard Edgren offered a critique arguing that the standards tried but failed to deal with Agile. Similarly, Huib Schoots argued that a close reading of the standards revealed that the writers didn’t understand exploratory testing at all.

These are serious arguments that defenders of the standard must deal with if they are to appear credible. What was the ISO response?

Dr Reid reduced such concerns to bland and inaccurate statements that “the standards represent an old-fashioned view and do not address testing on agile projects” and “the Testing Standards do not allow exploratory testing to be used”. Again these were strawmen that he could dismiss easily.

I could go on to highlight in detail other flaws in the ISO response; the failure to address the criticism that the standards weren’t based on research or experience that demonstrates the validity of that approach; the failure to answer the concern that the standards will lead to compulsion by the back door; the failure to address the charge from the founders of Context Driven Testing that the standards are the antithesis of CDT; the evasion of the documented links between certification and standards.

In the case of research Dr Reid told us of the distinctly underwhelming claims from a Finnish PhD thesis (PDF, opens in a new tab) that the standards represent “a feasible process model for a practical organisation with some limitations”. These limitations are pretty serious; “too detailed” and “the standard model is top heavy”. It’s interesting to note that the PhD study was produced before ISO 29119 part 3 was issued; the study does not mention part 3 in the references. The study can therefore offer no support for the heavyweight documentation approach that ISO 29119 embodies.

So instead of standards based on credible research we see a search for any research offering even lukewarm support for standards that have already been developed. That is not the way to advance knowledge and practice.

These are all huge concerns, and the testing community has received no satisfactory answers. As I said, we should always confront our opponent’s strongest arguments in a debate. In this case I’ve run through the only arguments that ISO have presented. Is it any wonder that the Stop 29119 campaigners don’t believe we have been given any credible answers at all?

What will ISO do? Does it wish to avoid public discussion in the hope that the ISO brand and the magic word “standards” will help them embed the standards in the profession? That might have worked in the past. Now, in the era of social media and blogging there is no hiding place. Anyone searching for information about ISO 29119 will have no difficulty finding persuasive arguments against it. They will not find equally strong arguments in favour of the standards. That seems to be ISO’s choice.

Posted by: James Christie | February 9, 2015

“A novel-long standard” A1QA interview

I was asked to take part in this interview about ISO 29119 by Elizabeth Soroka of A1QA. The interview ran in January 2015 under the headline “A novel-long standard; interview with James Christie”.

James, you’ve tried so many IT fields. Can you explain why you switched into auditing?

I worked for a big insurance company. They had just re-organized their Audit department. One of the guys who worked there knew me and thought I’d be well suited to audit. I was a developer who had moved on into a mixture of programming and systems analysis. However, I had studied accountancy at university and spent a couple of years working in accountancy and insurance investment, so I had a wider business perspective than most devs. I think that was a major reason for me being approached.

I turned down the opportunity because I was enjoying my job and I wanted to finish the project I was responsible for. The Audit department kept in touch with me and I gradually realised that it would be a much more interesting role than I’d thought. A couple of years later another opportunity came up at a time when I was doing less interesting work so I jumped at the chance. It was a great decision. I learned a huge amount about how IT fitted into the business.

As a person with an audit background, do you think standards improve software testing or block it?

They don’t improve testing. I don’t think there’s any evidence to support that assertion. The most that ISO 29119 defenders have come up with is the claim that you can do good testing using the standard. That’s arguable, but even if it is true it is a very weak defence for making something a standard. It’s basically saying that ISO 29119 isn’t necessarily harmful.

I wouldn’t have said that ISO 29119 blocks testing. It’s a distraction from testing because it focuses attention on the documentation, rather than the real testing. An auditor should expect three things: a clear idea of how testing will be performed, evidence that explains what testing was actually done, and an explanation of the significance of the results.

ISO 29119, and the previous testing standard IEEE 829, emphasize heavy advance documentation and deal pitifully with the final reports. Auditors should expect an over-arching test strategy saying “this is our approach to testing in this organization”. They should also expect an explanation of how that strategy will be interpreted for the project in question.

Detailed test case specifications shouldn’t impress auditors any more than detailed project plans would convince anyone that the project was successful. ISO 29119 says that “test cases shall be recorded in the test case specification” and “the test case specification shall be approved by the stakeholders”.

That means that if testers are to be compliant with the standard they have to document their planned testing in detail, then get the documents approved by many people who can’t be expected to understand all that detail. Trying to comply with the standard will create a mountain of unnecessary paper. As I said, it’s a distraction from the real work.

You started the campaign “STOP 29119”. Tell us a few words about the standard?

I don’t claim that I started the campaign. The people who deserve most credit for that are probably Karen Johnson and Iain McCowatt, who responded so energetically to my talk at CAST 2014 in New York.

ISO 29119 is an ambitious attempt, in ISO’s words “to define an internationally-agreed set of standards for software testing that can be used by any organization when performing any form of software testing.”

The full standard will consist of five documents; glossary, processes, documentation, techniques and finally key-word driven testing. So far the first three documents have been issued, i.e. the glossary, processes and documentation. The fourth document, test techniques, is due to be issued any time now. The fifth, on key-word driven testing should come out in 2015.

The campaign has called on ISO to withdraw the standard. However, I would happily settle for damaging its credibility as a standard for “any organization when performing any form of software testing”. That aim is more than just being ambitious. It stretches credulity.

Testing standards are beneficial for testing (I hope you agree): they implement some new practices and can school the untutored. Still, what is wrong with the 29119 standard?

The content of ISO 29119 is very old-fashioned. It is based on a world view from the 1970s and 1980s that confused rigour and professionalism with massive documentation. It really is the last place to go to look for new ideas. Newcomers to testing should be encouraged to look elsewhere for ideas about how to perform good testing.

Testing standards can be beneficial in a particular organization. They may even be beneficial in industries that have specific needs, such as medical devices and drugs, and financial services. However, they have to be very carefully written and they must maintain a clear distinction between true standards and overly prescriptive guidance. ISO 29119 fails to make the distinction. It is far too detailed and prescriptive.

The three documents that have been issued so far add up to 89,000 words over 270 pages. That’s as long as many novels. In fact it’s as long as George Orwell’s “Animal Farm” plus Erich Maria Remarque’s “All Quiet on the Western Front” combined. It’s almost exactly the same length as Orwell’s “1984” and Jane Austen’s “Persuasion”.

That is ridiculously long for a standard. The Institute of Internal Auditors’ “International Standards for the Professional Practice of Internal Auditing” runs to only 26 pages and 8,000 words. The IIA’s standards are high level statements of principle, covering all types of auditing. More detailed guidance about how to perform audits in particular fields is published separately. That guidance doesn’t amount to a series of “you shall do x, y & z”. It offers auditors advice on potential problems, and gives useful tips to guide the inexperienced. The difference between standards and guidance is crucial, and ISO blurs that distinction.

The defenders of ISO 29119 argue that tailored compliance is possible; testers don’t have to follow the full standard. There are two problems with that. The first is that tailored compliance requires agreement from all of the stakeholders for all of the tasks that won’t be performed, and the documents that won’t be produced. There are hundreds of mandatory tasks and documents, so even tailored compliance imposes a huge bureaucratic overhead. The second problem is that tailored compliance will look irresponsible. The marketing of the standard appeals to fear. Stuart Reid has put it explicitly.

“Imagine something goes noticeably wrong. How easy will you find it to explain that your testing doesn’t comply with international testing standards? So, can you afford not to use them?”

Anyone who is motivated by that to introduce ISO 29119 is likely to believe that full compliance must be safer and more responsible than tailored compliance. The old IEEE 829 test documentation standard also permitted tailored compliance. That wasn’t the way it worked out in practice. Organizations which followed the standard didn’t tailor their compliance and produced far too much wasteful documentation. ISO should have thought more carefully about how they would promote the standard and what the effects might be of their appeal to fear.

And in the end, what are the results of your campaign?

It’s hard to say what the results are. No-one seriously expected that ISO would roll over and withdraw the standard. I did think that ISO would make a serious attempt to defend it, and to engage with the arguments of the Stop 29119 campaigners. That hasn’t happened. The result has been that when people search for information about ISO 29119 they can’t fail to find articles by Stop 29119 campaigners. They will find nothing to refute them. I think that damages ISO’s credibility. ISO is now caught in a bind. It can ignore the opposition, and therefore concede the field to its opponents. Or it can try to engage in debate and reveal the lack of credible foundations of the standard.

I think the campaign has been successful in demonstrating that the standard lacks credibility in a very important part of the testing profession and therefore lacks the consensus that a standard should enjoy. I hope that if the campaign keeps going then it will prevent many organizations from forcing the standard onto their testers and thus forcing them to do less effective and efficient testing. Sometimes it feels like Stop 29119 is very negative, but if we can persuade people not to adopt the standard then I think that makes a positive contribution towards more testers doing better testing.

Posted by: James Christie | January 22, 2015

Service Virtualization interview about usability

I was asked to take part in this interview by George Lawton of Service Virtualization. Initially I wasn’t enthusiastic because I didn’t think I would have much to say. However, the questions set me thinking, and I felt they were relevant to my experience so I was happy to take part. It gave me something to do while I was waiting to fly back from EuroSTAR in Dublin!

How does usability relate to the notion of the purpose of a software project?

When I started in IT over 30 years ago I never heard the word usability. It was “user friendliness”, but that was just a nice thing to have. It was nice if your manager was friendly, but that was incidental to whether he was actually good at the job. Likewise, user friendliness was incidental. If everything else was ok then you could worry about that, but no-one was going to spend time or money, or sacrifice any functionality just to make the application user friendly. And what did “user friendly” mean anyway? “Who knows? Who cares? We’ve got serious work to do. Forget about that touchy feely stuff.”

The purpose of software development was to save money by automating clerical routines. Any online part of the system was a mildly anomalous relic of the past. It was just a way of getting the data into the system so the real work could be done. Ok, that’s an over-simplification, but I think there’s enough truth in it to illustrate why developers just didn’t much care about the users and their experience. Development moved on from that to changing the business, rather than merely changing the business’s bureaucracy, but it took a long time for these attitudes to shift.

The internet revolution turned everything upside down. Users are no longer employees who have to put up with whatever they’re given. They are more likely to be customers. They are ruthless and rightly so. Is your website confusing? Too slow to load? Your customers have gone to your rivals before you’ve even got anywhere near their credit card number.

The lesson that’s been getting hammered into the heads of software engineers over the last decade or so is that usability isn’t an extra. I hate the way that we traditionally called it a “non-functional requirement”, or one of the “quality criteria”. Usability is so important and integral to every product that telling developers that they’ve got to remember it is like telling drivers they’ve got to remember to use the steering wheel and the brakes. If they’re not doing these things as a matter of course they shouldn’t be allowed out in public. Usability has to be designed in from the very start. It can’t be considered separately.

What are the main problems in specifying for and designing for software usability?

Well, who’s using the application? Where are they? What is the platform? What else are they doing? Why are they using the application? Do they have an alternative to using your application, and if so, how do you keep them with yours? All these things can affect decisions you take that are going to have a massive impact on usability.

It’s payback time for software engineering. In the olden days it would have been easy to answer these questions, but we didn’t care. Now we have to care, and it’s all got horribly difficult.

These questions require serious research plus the experience and nous to make sound judgements with imperfect evidence.

In what ways do organisations lose track of the usability across the software development lifecycle?

I’ve already hinted at a major reason. Treating usability as a non-functional requirement or quality criterion is the wrong approach. That segregates the issue. It’s treated as being like the other quality criteria, the “…ities” like security, maintainability, portability, reliability. It creates the delusion that the core function is of primary importance and the other criteria can be tackled separately, even bolted on afterwards.

Lewis & Rieman came out with a great phrase fully 20 years ago to describe that mindset. They called it the peanut butter theory of usability. You built the application, and then at the end you smeared a nice interface over the top, like a layer of peanut butter (PDF, opens in new tab).

“Usability is seen as a spread that can be smeared over any design, however dreadful, with good results if the spread is thick enough. If the underlying functionality is confusing, then spread a graphical user interface on it. … If the user interface still has some problems, smear some manuals over it. If the manuals are still deficient, smear on some training which you force users to take.”

Of course they were talking specifically about the idea that usability was a matter of getting the interface right, and that it could be developed separately from the main application. However, this was an incredibly damaging fallacy amongst usability specialists in the 80s and 90s. There was a huge effort to try to justify this idea by experts like Hartson & Hix, Edmonds, and Green. Perhaps the arrival of Object Oriented technology contributed towards the confusion. A low level of coupling so that different parts of the system are independent of each other is a good thing. I wonder if that lured usability professionals into believing what they wanted to believe, that they could be independent from the grubby developers.

Usability professionals tried to persuade themselves that they could operate a separate development lifecycle that would liberate them from the constraints and compromises that would be inevitable if they were fully integrated into development projects. The fallacy was flawed conceptually and architecturally. However, it was also a politically disastrous approach. The usability people made themselves even less visible, and were ignored at a time when they really needed to be getting more involved at the heart of the development process.

As I’ve explained, the developers were only too happy to ignore the usability people. They were following methods and lifecycles that couldn’t easily accommodate usability.

How can organisations incorporate the idea of usability engineering into the software development and testing process?

There aren’t any right answers, certainly none that will guarantee success. However, there are plenty of wrong answers. Historically in software development we’ve kidded ourselves into thinking that the next fad, whether Structured Methods, Agile, CMMi or whatever, will transform us into rigorous, respected professionals who can craft high quality applications. Now some (like Structured Methods) suck, while others (like Agile) are far more positive, but the uncomfortable truth is that it’s all hard and the most important thing is our attitude. We have to acknowledge that development is inherently very difficult. Providing good UX is even harder and it’s not going to happen organically as a by-product of some over-arching transformation of the way we develop. We have to consciously work at it.

Whatever the answer is for any particular organisation it has to incorporate UX at the very heart of the process, from the start. Iteration and prototyping are both crucial. One of the few fundamental truths of development is that users can’t know what they want and like till they’ve seen what is possible and what might be provided.

Even before the first build there should have been some attempt to understand the users and how they might be using the proposed product. There should be walkthroughs of the proposed design. It’s important to get UX professionals involved, if at all possible. I think developers have advanced to the point that they are less likely to get it horribly wrong, but actually getting it right, and delivering good UX is asking too much. For that I think you need the professionals.

I do think that Agile is much better suited to producing good UX than traditional methods, but there are still dangers. A big one is that many Agile developers are understandably sceptical about anything that smells of Big Up-Front Analysis and Design. It’s possible to strike a balance and learn about your users and their needs without committing to detailed functional requirements and design.

How can usability relate to the notion of testable hypotheses that can lead to better software?

Usability and testability go together naturally. They’re also consistent with good development practice. I’ve worked on, or closely observed, many applications where the design had been fixed and the build had been completed before anyone realised that there were serious usability problems, or that it would be extremely difficult to detect and isolate defects, or that there would be serious performance issues arising from the architectural choices that had been made.

We need to learn from work that’s been done with complexity theory and organisation theory. Developing software is mostly a complex activity, in the sense that there are rarely predictable causes and effects. Good outcomes emerge from trialling possible solutions. These possibilities aren’t just guesswork. They’re based on experience, skill, knowledge of the users. But that initial knowledge can’t tell you the solution, because trying different options changes your understanding of the problem. Indeed it changes the problem. The trials give you more knowledge about what will work. So you have to create further opportunities that will allow you to exploit that knowledge. It’s a delusion that you can get it right first time just by running through a sequential process. It would help if people thought of good software as being grown rather than built.

Posted by: James Christie | January 19, 2015

“Fix on failure” – a failure to understand failure

Wikipedia is a source that should always be treated with extreme scepticism and the article on the “Year 2000 problem” is a good example. It is now being widely quoted on the subject, even though it contains some assertions that are either clearly wrong, or implausible, and lacking any supporting evidence.

Since I wrote about ”Y2K – why I know it was a real problem” last week I’ve been doing more reading around the subject. I’ve been struck by how often I’ve come across arguments, or rather assertions, that a “fix on failure” response would have been the best response. Those who argue that Y2K was a big scare and a scam usually offer a rewording of this gem from the Wikipedia article.

”Others have claimed that there were no, or very few, critical problems to begin with, and that correcting the few minor mistakes as they occurred, the “fix on failure” approach, would have been the most efficient and cost-effective way to solve the problem.”

There is nothing to back up these remarkable claims, but Wikipedia now seems to be regarded as an authoritative source on Y2K.

I want to talk about the infantile assertion that “fix on failure” was the right approach. Infantile? Yes, I use the word carefully. It ignores big practical problems that would have been obvious to anyone with experience of developing and supporting large, complicated applications. Perhaps worse, it betrays a dangerously naive understanding of “failure”, a misunderstanding that it shares with powerful people in software testing nowadays. Ok, I’m talking about the standards lobby there.

”Fix on failure” – deliberate negligence

Firstly, “fix on failure” doesn’t allow for the seriousness of the failure. As Larry Burkett wrote;

“It is the same mindset that believes it is better to put an ambulance at the bottom of a cliff rather than a guardrail at the top”.

“Fix on failure” could have been justified only if the problems were few and minor. That is a contentious assumption that has to be justified. However, the only justification on offer is that those problems which occurred would have been suitable for “fix on failure”. It is a circular argument lacking evidence or credibility, and crucially ignores all the serious problems that were prevented.

Once one acknowledges that there were a huge number of problems to be fixed one has to deal with the practical consequences of “fix on failure”. That approach does not allow for the difficulty of managing masses of simultaneous failures. These failures might not have been individually serious, but the accumulation might have been crippling. It would have been impossible to fix them all within acceptable timescales. There would have been insufficient staff to do the work in time.

Release and configuration management would have posed massive problems. If anyone tells you Y2K was a scam ask them how they would have handled configuration and release management when many interfacing applications were experiencing simultaneous problems. If they don’t know what you are talking about then they don’t know what they are talking about.

Of course not all Y2K problems would have occurred on 1st January 2000. Financial applications in particular would have been affected at various points in 1999 and even earlier. That doesn’t affect my point, however. There might have been a range of critical dates across the whole economy, but for any individual organisation there would have been relatively few, each of which would have brought a massive, urgent workload.

Attempting to treat Y2K problems as if they were run of the mill, “business as usual” problems, as advocated by sceptics, betrays appalling ignorance of how a big IT shop works. They are staffed and prepared to cope with a relatively modest level of errors and enhancements in their applications. The developers who support applications aren’t readily inter-changeable. They’re not fungible burger flippers. Supporting a big complicated application requires extensive experience with that application. Staff have to be rotated in and out carefully and piecemeal so that a core of deep experience remains.

IT installations couldn’t have coped with Y2K problems in the normal course of events any more than garages could cope if all cars started to have problems. The Ford workshops would be overwhelmed when the Fords started breaking down, the Toyota dealers would seize up when the Toyotas suffered.

The idea that “fix on failure” was a generally feasible and responsible approach simply doesn’t withstand scrutiny. Code that wasn’t Y2K-compliant could be spotted at a glance. It was then possible to predict the type of error that might arise, if not always the exact consequences. Why on earth would anyone wait to see if one could detect obscure, but potentially serious distortions? Why would anyone wait to let unfortunate citizens suffer or angry customers complain?

The Y2K sceptics argue that organisations took expensive pre-emptive action because they were scared of being sued. Well, yes, that’s true, and it was responsible. The sceptics were advocating a policy of conscious, deliberate negligence. The legal consequences would quite rightly have been appalling. “Fix on failure” was never a serious contribution to the debate.

”Fix on failure” – a childlike view of failure

The practical objections to a “fix on failure” strategy were all hugely significant. However, I have a deeper, fundamental objection. “Fix on failure” is a wholly misguided notion for anything but simple applications. It is based on a childlike, binary view of failure. We are supposed to believe an application is either right or wrong; it is working or it is broken; that if there is a Y2K problem then the application obligingly falls over. Really? That is not my experience.

With complicated financial applications an honest and constructive answer to the question “is the application correct?” would be some variant on “what do you mean by correct?”, or “I don’t know. It depends”. It might be possible to say the application is definitely not correct if it is producing obvious garbage. But the real difficulty is distinguishing between the seriously inaccurate, but plausible, and the acceptably accurate. Discussion of accuracy requires understanding of critical assumptions, acceptable margins of error, confidence levels, the nature and availability of oracles, and the business context of the application.

I’ve never seen any discussion of Y2K by one of the “sceptical” conspiracy theorists that showed any awareness of these factors. There is just the naïve assumption that a “failed” application is like a patient in a doctor’s surgery, saying “I’m sick, and here are my symptoms”.

Complicated applications have to be nursed and constantly monitored to detect whether some new, extraneous factor, or long hidden bug, is skewing the figures. A failing application might appear to be working as normal, but it would be gradually introducing distortions.

Testing highly complicated applications is not a simple, binary exercise of determining “pass or fail”. Testing has to be a process of learning about the application and offering an informed opinion about what it is, and what it does. That is very different from checking it against our preconceptions, which might have been seriously flawed. Determining accuracy is more a matter of judgement than inspection.

Throughout my career I have seen failures and problems of all types, with many different causes. However, if there is a single common underlying theme then the best candidate would be the illusion that development is like manufacturing, with a predictable end product that can be checked. The whole development and testing process is then distorted to try and fit the illusion.

The advocates of Y2K “fix on failure” had much in common with the ISO 29119 standards lobby. Both shared that “manufacturing” mindset, that unwillingness to recognise the complexity of development, and the difficulty of performing good, effective testing. Both looked for certainty and simplicity where it was not available.

Good testers know that an application is not necessarily “correct” just because it has passed the checks on the test script. Likewise failure is not an absolute concept. Ignoring these truths is ignoring reality, trying to redefine it so we can adopt practices that seem more efficient and effective. I suspect the mantra that “fix on failure would have been more effective and efficient” has its roots with economists, like the Australian John Quiggin, who wanted to assume complexity away. See this poor paper (PDF, opens in a new tab).

Doing the wrong thing is never effective. Negligence is rarely efficient. Reality is uncomfortable. We have to understand that and know what we are talking about before coming up with simplistic, snake-oil solutions that assume simplicity where the reality is complexity.

Posted by: James Christie | January 12, 2015

Y2K – why I know it was a real problem

It’s confession time. I was a Y2K test manager for IBM. As far as some people are concerned that means I was party to a huge scam that allowed IT companies to make billions out of spooking poor deluded politicians and the public at large. However, my role in Y2K means I know what I am talking about, so when I saw some recent comment that it was all nothing more than hype I felt the need to set down my first hand experience. At the time, and in the immediate aftermath of Y2K, we were constrained by client confidentiality from explaining what we did, but 15 years on I feel comfortable about speaking out.

Was there a huge amount of hype? Unquestionably.

Was money wasted? Certainly, but show me huge IT programmes where that hasn’t happened.

Would it have been better to do nothing and adopt a “fix on failure” approach? No, emphatically not as a general rule and I will explain why.

There has been a remarkable lack of studies of Y2K and the effectiveness of the actions that were taken to mitigate the problem. The field has been left to those who saw few serious incidents and concluded that this must mean there could have been no serious problem to start with.

The logic runs as follows. Action was taken in an attempt to turn outcome X into outcome Y. The outcome was Y. Therefore X would not have happened anyway and the action was pointless. The fallacy is so obvious it hardly needs pointing out. If action was pointless then the critics have to demonstrate why the action that was taken had no impact and why outcome Y would have happened regardless. In all the years since 2000 I have seen only unsubstantiated assertion and reference to those countries, industries and sectors where Y2K was not going to be a significant problem anyway. The critics always ignore the sectors where there would have been massive damage.

An academic’s flawed perspective

This quote from Anthony Finkelstein, professor of software systems engineering at University College London, on the BBC website, is typical of the critics’ reasoning.

”The reaction to what happened was that of a tiger repellent salesman in Golders Green High Street,” says Finkelstein. “No-one who bought my tiger repellent has been hurt. Had it not been for my foresight, they would have.”

The analogy is presumably flippant and it is entirely fatuous. There were no tigers roaming the streets of suburban London. There were very significant problems with computer systems. Professor Finkelstein also used the analogy back in 2000 (PDF, opens in new tab).

In that paper he made a point that revealed he had little understanding of how dates were being processed in commercial systems.

”In the period leading up to January 1st those who had made dire predictions of catastrophe proved amazingly unwilling to adjust their views in the face of what was actually happening. A good example of this was September 9th 1999 (9/9/99). On this date data marked “never to expire” (realised as expiry 9999) would be deleted bringing major problems. This was supposed to be a pre-shock that would prepare the way for the disaster of January 1st. Nothing happened. Now, if you regarded the problem as a serious threat in the first place, this should surely have acted as a spur to some serious rethinking. It did not.”

I have never seen a date stored in the way Finkelstein describes, 9th September 1999 being held as 9999. If that were done there would be no way to distinguish 1st December 2014 from 11th February 2014. Both would be 1122014. Dates are held either in the form 090999, with leading zeroes so the dates can be interpreted correctly, or with days, months and years in separate sub-fields for simpler processing. Programmers who flooded date fields with the integer 9 would have created 99/99/99, which could obviously not be interpreted as 9th September 1999.
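The ambiguity is easy to demonstrate. Here is a quick sketch of the two packing schemes (Python purely for illustration; the systems in question were written in COBOL, and the function names are my own):

```python
def pack_unpadded(day, month, year):
    # naive concatenation without fixed widths: ambiguous
    return f"{day}{month}{year}"

def pack_ddmmyy(day, month, year2):
    # fixed-width DDMMYY with leading zeroes, e.g. 9 Sep 1999 -> "090999"
    return f"{day:02d}{month:02d}{year2:02d}"

# 1 December 2014 and 11 February 2014 collide without padding...
assert pack_unpadded(1, 12, 2014) == pack_unpadded(11, 2, 2014) == "1122014"
# ...but fixed-width packing keeps them distinct
assert pack_ddmmyy(1, 12, 14) != pack_ddmmyy(11, 2, 14)
assert pack_ddmmyy(9, 9, 99) == "090999"
```

The fixed widths are the whole point: without them the field cannot be parsed back into day, month and year at all.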

Anyway, the main language of affected applications was Cobol, and the convention was for programmers to move “high values”, i.e. the highest possible value the compiler could handle, into the field rather than nines. “High values” doesn’t translate into any date. Why doesn’t Finkelstein know this sort of basic thing if he’s setting himself up as a Y2K expert? I never heard any concern about 9/9/99 at the time, and it certainly never featured in our planning or work. It is a straw man, quite irrelevant to the main issue.

In the same paper from 2000 Finkelstein made another claim that revealed his lack of understanding of what had actually been happening.

”September 9th 1999 is only an example. Similar signs should have been evident on January 1st 1999, the beginning of the financial year 99-00, December 1st, and so on. Indeed assuming, as was frequently stated, poor progress had been made on Y2K compliance programmes we would have anticipated that such early problems would be common and severe. I see no reason to suppose that problems should not have been more frequent (or at any rate as frequent) in the period leading up to December 31st 1999 than afterwards given that transactions started in 1999 may complete in 2000, while after January 1st new transactions start and finish in the new millennium.”

Finkelstein is entirely correct that the problem would not have suddenly manifested itself in January 2000, but he writes as if this is an insight the practitioners lacked at the front line. At General Accident the first critical date that we had to hit was the middle of October 1998, when renewal invitations for the first annual insurance contracts extending past December 1999 would be issued. At various points over the next 18 months until the spring of 2000 all the other applications would hit their trigger dates. Everything of significance had been fixed, tested and re-implemented by September 1999.

We knew that timetable because it was our job to know it. We were in trouble not because time was running out till 31/12/1999, but because we had little time before 15/10/1998. We made sure we did the right work at the right time so that all of the business critical applications were fixed in time. Finkelstein seems unaware of what was happening. A massed army of technical staff were dealing with a succession of large waves sweeping towards them over a long period, rather than a single tsunami at the millennium.

Academics like Finkelstein have a deep understanding of the technology and how it can, and should be used, but this is a different matter from knowing how it is being applied by practitioners acting under extreme pressure in messy and complex environments. These practitioners aren’t doing a bad job because of difficult conditions, lack of knowledge and insufficient expertise. They are usually doing a good job, despite those difficult conditions, drawing on vast experience and deep technical knowledge.

Comments such as those of Professor Finkelstein betray a lack of respect for practitioners, as if the only worthwhile knowledge is that possessed by academics.

What I did in the great Y2K “scare”

Let me tell you why I was recruited as a Y2K test manager by IBM. I had worked as a computer auditor for General Accident. A vital aspect of that role had been to understand how all the different business critical applications fitted together, so that we could provide an overview to the business. We could advise on the implications and risks of amending applications, or building new ones to interface with the existing applications.

A primary source - my report explaining the problem with a business critical application

Shortly before General Accident’s Y2K programme kicked off I was transferred to IBM under an outsourcing deal. General Accident wanted a review performed of a vital back office insurance claims system. The review had to establish whether the application should be replaced before Y2K, or converted. Senior management asked IBM to have me perform the review because I was considered the person with the deepest understanding of the business and technical issues. The review was extremely urgent, but it was delayed by a month till I had finished my previous project.

I explained in the review exactly why the system was business critical and how it was vital to the company’s reserving, and therefore the production of the company accounts. I explained how the processing was all date dependent, and showed how and when it would fail. If the system was unavailable then the accountants and premium setters would be flying blind, and the external auditors would be unable to sign off the company accounts. The risks involved in trying to replace the application in the available time were unacceptable. The best option was therefore to make the application Y2K compliant. This advice was accepted.

As soon as I’d completed the review IBM moved me into a test management position on Y2K, precisely because I had all the business and technical experience to understand how everything fitted together, and what the implications of Y2K would be. The first thing I did was to write a suite of SAS programs that crawled through the production code libraries, job schedules and job control language libraries to track the relationship between programs, data and schedules. For the first time we had a good understanding of the inventory, and which assets depended on each other. Although I was nominally only the test manager I drew up the conversion strategy and timetable for all the applications within my remit, based on my accumulated experience and the new knowledge we’d derived from the inventory.
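The SAS programs themselves are long gone, but the core idea, scanning every asset for references and then inverting the relationships so you can answer “what depends on this?”, can be sketched in a few lines. This is Python and the asset names are invented, purely to illustrate the shape of the inventory:

```python
def build_inventory(references):
    """Build forward and inverted dependency maps from (asset, uses) pairs.

    In the real exercise the pairs came from crawling code libraries,
    job schedules and JCL; here they are just supplied as a list.
    """
    deps, used_by = {}, {}
    for asset, uses in references:
        deps.setdefault(asset, set()).add(uses)
        used_by.setdefault(uses, set()).add(asset)
    return deps, used_by

deps, used_by = build_inventory([
    ("NIGHTLY_JOB", "CLAIMS_PGM"),    # schedule runs a program
    ("CLAIMS_PGM", "POLICY_FILE"),    # program reads a dataset
    ("RENEWALS_PGM", "POLICY_FILE"),
])
# used_by["POLICY_FILE"] == {"CLAIMS_PGM", "RENEWALS_PGM"}
```

The inverted map is what makes impact analysis possible: change the dataset and you immediately know every program and schedule that is exposed.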

An insurance company’s processing is heavily date dependent. Premiums are earned on a daily basis, with the appropriate proportion being refunded if a policy is cancelled mid-term. Claims are paid only if the appropriate cover is in place on the date that the incident occurred. Income and expenditure might be paid on a certain date, but then spread over many years. If the date processing doesn’t work then the company can’t take in money, or pay it out. It cannot survive. The processing is so complex that individual errors in production often require lengthy investigation and fixing, and then careful testing. The notion that a “fix on failure” response to Y2K would have worked is risible.
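To illustrate just the daily-earning rule mentioned above, here is a simplified sketch; real reserving, refund and claims calculations involve far more rules than this, and the function names are mine:

```python
from datetime import date

def earned_premium(premium, start, end, as_of):
    """Premium earned pro-rata by day between policy start and end."""
    term_days = (end - start).days
    earned_days = min(max((as_of - start).days, 0), term_days)
    return round(premium * earned_days / term_days, 2)

def cancellation_refund(premium, start, end, cancel_date):
    # refund the unearned proportion on mid-term cancellation
    return round(premium - earned_premium(premium, start, end, cancel_date), 2)

# a 365-day policy costing 730.00, cancelled exactly half way through
p_start, p_end = date(1999, 1, 1), date(2000, 1, 1)
refund = cancellation_refund(730.00, p_start, p_end, date(1999, 7, 2))  # 366.00
```

Every one of those calculations hinges on date arithmetic, which is exactly why a two-digit year broke them.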

We fixed the applications, taking a careful, triaged risk-based approach. The most date sensitive programs within the most critical applications received the most attention. Some applications were triaged out of sight. For these, “fix on failure” was appropriate.

We tested the converted applications in simulated runs across the end of 1999, in 2000 and again in 2004. These simulations exposed many more problems not just with our code, but also with all the utility and housekeeping routines and tools. In these test runs we overrode the mainframe system date within the test runs.
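Overriding the system date is the mainframe-era version of a technique that still applies today: make “today” an injectable dependency rather than reading the clock directly, so tests can simulate running in the future. A minimal sketch (Python, illustrative only, not how our mainframe tooling actually worked):

```python
from datetime import date

class Clock:
    """Injectable clock so tests can simulate running at a future date."""
    def __init__(self, fixed_today=None):
        self._fixed = fixed_today

    def today(self):
        return self._fixed or date.today()

def policy_has_expired(expiry, clock):
    return clock.today() > expiry

# simulate a run "in" the year 2000 without touching the real system date
y2k_clock = Clock(fixed_today=date(2000, 1, 4))
expired = policy_has_expired(date(1999, 12, 31), y2k_clock)  # True under the simulated date
```

Code that calls the real clock directly cannot be tested this way, which is one reason the final LPAR exercise, where the whole environment ran on future dates, still found problems.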

In the final stage of testing we went a step further. We booted up a mainframe LPAR (logical partition) to run with the future dates. I managed this exercise. We had a corner of the office with a sign saying “you are now entering 2000”, and everything was done with future dates. This exercise flagged up further problems with code that we had been confident would run smoothly.

Y2K was a fascinating time in my career because I was at a point that I now recognise as a sweet spot. I was still sufficiently technically skilled to do anything that my team members could do, even being called on to fix overnight production problems. However, I was sufficiently confident, experienced and senior to be able to give presentations to the most senior managers explaining problems and what the appropriate solutions would be.

December 19th 1999, Mary, her brother Malcolm & I in the snow. Not panicking about Y2K.

For these reasons I know what I’m talking about when I write that Y2K was a huge problem that had to be tackled. The UK’s financial sector would have suffered a massive blow if we had not fixed the problem. I can’t say how widespread the damage might have been, but I do know it would have been appalling.

My personal millennium experience

What was on my mind on 31st December 1999

When I finished with Y2K in September 1999, at the end of the future mainframe exercise, at the end of a hugely pressurised 30 months, I negotiated seven weeks leave and took off to Peru. IBM could be a great employer at times! My job was done, and I knew that General Accident, or CGU as it had evolved into by then, would be okay. There would inevitably be a few glitches, but then there always are in IT. I was so relaxed about Y2K that on my return from Peru it was the least of my concerns. There was much more interesting stuff going on in my life.

I got engaged in December 1999, and on 31st December Mary and I bought our engagement and wedding rings. That night we were at a wonderful party with our friends, and after midnight we were on Perth’s North Inch to watch the most spectacular fireworks display I’ve ever seen. 1st January 2000? It was a great day that I’ll always remember happily. It was far from being a disaster, and that was thanks to people like me.

PS – I have written a follow up article explaining why “fix on failure” was based on an infantile view of software failure.

Posted by: James Christie | December 30, 2014

2014 in review – WordPress’s report on my blog

This is the standard WordPress annual report for my blog in 2014.

Here's an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 13,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 5 sold-out performances for that many people to see it.

Click here to see the complete report.

Posted by: James Christie | December 5, 2014

Interview about Stop 29119 with Service Virtualization

This is an email interview I gave to Jeff Bounds of Service Virtualization about the Stop 29119 campaign in October 2014. It appeared in two parts, ”James Christie explains resistance to ISO standards” and “ISO 29119 is damaging, so ignore it, advises James Christie”.

The full interview in the original format follows. Jeff’s questions are in red.

What is ISO 29119?

And why is it important to the software testing field?

ISO 29119 is described by the International Organization for Standardization (ISO) as “an internationally agreed set of standards for software testing that can be used within any software development life cycle or organization”.

When ISO are promoting a standard that is intended to cover everything that testers do then that is a big deal for all testers. We cannot afford to ignore it.

What’s wrong with ISO 29119?

Why do you oppose ISO29119?

I think the question is framed the wrong way round. A standard requires consensus and it has to be relevant. It is up to the promoters of the standard to justify it. They’ve not made any serious, credible attempt to do so. Their interpretation of agreement and consensus is restricted to insiders, to those already in the working group developing the standard. Those testers who don’t believe that formal, generic standards are the way ahead have been ignored.

Even before I found out about ISO 29119 I was opposed to it in principle. Standards in general are a good thing, but software testing is an intellectual activity that doesn’t lend itself to standardization.

There is a wide range of evidence from psychology, sociology and management studies to back up the argument that document driven standards like ISO 29119 are counter-productive. They just don’t fit with the way that people think and work in organizations. On the other hand the defenders of standards have never bothered to refute these arguments, or even address them. They simply assert, without evidence, that standardization is good for testing. Typically they make spurious arguments that because standards are a good thing in many contexts then they must be a good thing for testing. It’s logical nonsense.

This is all without considering the detailed content of the standard. It is dated, excessively bureaucratic, prescriptive and badly written.

Sure, the standard does say that it is possible to apply parts of the standard selectively and claim “tailored conformance”. However, the standard requires agreement with stakeholders for each departure from the standard. For any significant project that means documented agreement with many people on all sorts of detailed points.

Dr Stuart Reid has claimed that he wants to see companies and governments mandating the use of ISO 29119 in contracts. Lawyers and procurement managers don’t understand testing, as Dr Reid concedes. He sees that as being a case for providing them with a standard they can mandate.

My perspective is that such people, precisely because they don’t understand testing, will require full compliance. In their eyes, full compliance will seem responsible and professional while tailored compliance will look like cutting corners. That’s the way that people react. It’s no good shrugging that off by saying people don’t have to act that way.

All the evidence supports the opponents because they know how people behave. There is no evidence to support the standards lobby.

Isn’t standardisation good?

An argument in favor of ISO 29119 is that it would bring standardization to a software testing process that historically has seen people using a variety of techniques and methods, rather than one set way of doing things. What’s wrong with that?

Everything. Testing has to fit the problem. It seems crazy to think that everyone should be expected to do the same things. Again, why should that be the case? If everyone is doing the same then most people will be doing the wrong thing.

Are opponents trying to save their jobs?

Some proponents of ISO 29119 could also argue that opponents of the standard are simply trying to save their jobs, when automation and simulation represent a better, faster and cheaper way of doing testing. What are your thoughts about that?

Even if it were the case that opponents were simply concerned about their jobs it would still be a compelling argument against ISO 29119. As the ISO working group has conceded, many opponents are more expert than the average tester. Why should they have to change the way they operate, for the worse, or pass up opportunities for work?

I could actually earn more money by collaborating with the standard and cleaning up the mess it will create. There will be a good market for test consultants to do that. However, I am not interested in that sort of work.

Anyway, opponents are unhappy about the standard, not automation and simulation, which are extremely important and valuable at the right time. The standard isn’t based on the assumption of an automated approach and the test process paper (Part 2) doesn’t even mention simulation. The discussion about automation is a quite separate matter from the debate about ISO 29119.

Can ISO 29119 provide a baseline?

Dr. Stuart Reid recently argued that ISO 29119 would, among other things, help define good practices in testing, along with providing a baseline to compare different test design techniques. What are your thoughts about that? (His full argument is here).

I don’t think the standard deals with good testing practices. It advocates what it sees as good practices in test management, specifically documentation. It is really a documentation standard rather than a testing standard. It is a classic case of confusing the process with the real work. The difference is crucial. It is like the difference between the map and the territory. The map is a guide to the territory, but it is not the real thing.

Dr Reid hasn’t argued the point about a baseline. He has merely asserted it without evidence or explanation. I’m afraid that is typical of ISO’s approach. Even if it is so, I don’t think testers should have to tie themselves in knots for the benefit of others.

What is the alternative?

If you believe ISO 29119 isn’t the solution, then what is the best standard for software testing, and why?

As I’ve said above I don’t think a generic standard is appropriate for testing. A good alternative to doing wasteful and damaging things is to ignore them. There are many sound alternatives to ISO 29119. I don’t think it is up to the opponents of the standard to justify these. They are being applied and they work. Where is the evidence that ISO 29119 works?

Posted by: James Christie | November 17, 2014

Too smart for checklists, and a consultants’ war?

My post last week “Why do we think we’re different” (about goal displacement, trained incapacity and functional stupidity) attracted an interesting comment that deserved a considered response. In the end that merited a new post, rather than a reply in the comments section. Here is the comment, followed by my response.

I don’t disagree with your basic premise nor your conclusion. Your arguments are logical and elegant, but I wonder if people hear the anti-ISO 29119 group as saying “Don’t bother having process,” and “I’m too smart to need a checklist” (ala Gawande’s “Checklist Manifesto”).

Worse yet, the people arguing against the standard the loudest are not the common testers but those already heavily active in the community of testers. I described it as a “consultant’s war” because those who seem the most active in the community are often consultants.

As a people problem, it seems both sides have a great deal of apathy outside of perhaps 1/100th of the testing community. How do we change that? The best argument in the world will have no effect if most people are apathetic. This is the problem I struggle with, and not just around the standards but around testing as a career. I would love to see any insight you have around the problem, assuming you see it as a problem.


JCD raises some interesting points and I am grateful for that.

I do need to keep thinking about the danger that people hear the Stop29119 campaigners as saying “don’t bother having process”. That is not the message I want to get across. I would phrase it as “beware of the dangers of prescriptive processes”, i.e. processes that spell out in detail each step. These may be required in some contexts, but not in software development or testing, where their value has been hugely overstated.

However, processes are required. The tricky problem is finding the sweet spot between providing sufficient guidance and ensuring appropriate standardisation on the one hand, and, on the other hand, going into excessive detail and preventing practitioners from using their judgement and initiative in a complex setting.

Similarly, checklists do have their place. They are important, but of limited value. They take you only so far. I certainly don’t decry them, but I do deplore excessive dependence on them at the expense, again, of judgement and initiative. I don’t disagree with the crux of Gawande’s book, though I do have reservations about the emphasis, or maybe it’s just with the way the book has been promoted. I’m also doubtful about his definition of complexity.

I agree with what Gawande is saying here in The Checklist Manifesto.

It is common to misconceive how checklists function in complex lines of work. They are not comprehensive how-to guides, whether for building a skyscraper or getting a plane out of trouble. They are quick and simple tools aimed to buttress the skills of expert professionals. And by remaining swift and usable and resolutely modest, they are saving thousands upon thousands of lives.

I would quibble about building skyscrapers being a complex activity. I would call it highly complicated. I’d prefer to stick to the Cynefin definition of complexity, which is reserved for situations where there is no obvious causality; cause and effect are obvious only in hindsight. Nevertheless, I do like Gawande’s phrase that checklists are “quick and simple tools to buttress the skills of expert professionals”.

You’re right about this being a consultants’ war, but I don’t see how it could be anything else. It’s hard to get new ideas through to people who are doing the real work, grafting away day to day. Some testers are certainly apathetic, but the bigger issues are time and priorities. Campaigning against ISO 29119 is important, but it’s unlikely to be urgent for most people.

When I was a permanent test manager I didn’t have the time to lift my head and take part in these debates. Working as a contract test manager isn’t much better, though at least it’s possible, in theory, to control time between contracts.

I think that places some responsibility on those who can campaign to do so. If enough people do it for long enough then those testers who are less publicly visible, and lay people, are more likely to stumble across arguments that will make them question their assumptions. The lack of public defence of the standard from ISO has meant that anyone searching for information about ISO 29119 is now very likely to come across serious arguments against the standard. That might tilt the balance against general acceptance of the standard as The Standard. A few more people might join the campaign. Some organisations that might have adopted the standard may think better of it. It could all have a cumulative effect.

I’m working on the idea of a counter to ISO 29119. It wouldn’t be an attempted rival to ISO 29119. Nor would it be an anti-standard. Many organisations will adopt the standard because they want to seem responsible. That would be doing the wrong thing for the right reason.

What I am thinking of doing is documenting the links between the requirements of the Institute of Internal Auditors, and the Information Systems Audit & Control Association (and maybe other regulatory sources) with a credible alternative to the traditional, document heavy approach, showing how it is possible to be entirely responsible and accountable without going down the ISO 29119 dead end.

It’s a matter of explaining that there are viable and valid choices, something that is in danger of being hidden from public view by the arrival of an ISO standard intended to cover the whole of software testing. Checklists could well play a role in this.

I smiled at the suggestion that opponents of ISO 29119 might believe they’re too smart to need a checklist. I can see why some people might think that. However, I, like most ISO 29119 opponents I suspect, am acutely aware of the limits of my knowledge and competence. Excessive reliance on checklists and standards creates the illusion, the delusion, that testers are more competent, professional and effective than they really are. I prefer a spot of realistic humility. Buttressing skill is something I can appreciate; supplanting skill is another matter altogether, and ISO 29119 goes too far in that direction.

Posted by: James Christie | November 14, 2014

Why do we think we’re different?

The longer my career lasts the more aware I am of the importance of Gerald Weinberg’s Second Law of Consulting (from his book “The Secrets of Consulting”), “No matter what the problem is, it’s always a people problem.”

The first glimmer of light that illuminated this truth was when I came across the term “goal displacement” and reflected on how many times I had seen it in action. People are given goals that aren’t quite aligned with what their work should deliver. They focus on the goals, not the real work. This isn’t just an incidental feature of working life, however. It is deeply engrained in our psychological make-up. There is a long history of academic work to explain this phenomenon.

Focal and subsidiary awareness

I’ll start with Michael Polanyi. In his book “Personal Knowledge”, Polanyi makes a distinction between focal and subsidiary awareness. Focal awareness is what we consciously think about. Subsidiary awareness is like tacit knowledge. We don’t think about the mechanics of holding a hammer to drive in a nail. We think about what we are trying to achieve. If we try to focus on the mechanics of holding the hammer correctly, and consciously aim for the nail, then we are far more likely to make a painful mess of things. Focal and subsidiary awareness are therefore, in a sense, mutually exclusive. As Polanyi puts it:

“If a pianist shifts his attention from the piece he is playing to the observation of what he is doing with his fingers while he is playing it, he gets confused and may have to stop. This happens generally if we switch our focal attention to particulars of which we had previously been aware only in their subsidiary role.

Our attention can hold only one focus at a time… it would be hence contradictory to be both subsidiarily and focally aware of the same particulars at the same time.”

Does this apply in organisational life too, as well as to musicians and carpenters performing skilled physical activities? I think it does. We have often focused too closely on the process of software development and of testing and lost sight of the end we are trying to reach. Formal processes, prescriptive methods and standards encourage exactly that sort of misplaced focus.

Thorstein Veblen and trained incapacity

This problem of misplaced focus has long been observed by organisational psychologists and sociologists. A full century ago, in 1914, Thorstein Veblen identified the problem of trained incapacity.

People who are trained in specific skills tend to lose the ability to adapt. Their response has worked in the past, and they apply it regardless thereafter. They focus on responding in the way they have been trained, and cannot see that the circumstances require a different response. Their training has rendered them incapable of doing the job effectively unless it fits their mental framework. This is “trained incapacity”. They have been trained to be useless. The phenomenon applies to all workers, but especially to managers.

However, the problem that Veblen identified was worse than that. Highly specialised training and education meant that people were increasingly becoming expert in narrower fields and their areas of ignorance were increasing. When they entered the active workforce their jobs required wider skills and knowledge than their education had given them, but they were unable to contribute effectively to those other areas. They focussed on what they knew. Veblen was especially concerned about business school graduates.

“[These schools’] specialization on commerce is like other specializations in that it draws off attention and interest from other lines than those in which the specialization falls, thereby widening the candidate’s field of ignorance while it intensifies his effectiveness within his specialty. The effect, as touches the community’s interest in the matter, should be an enhancement of the candidate’s proficiency in all the futile ways and means of salesmanship and “conspiracy in restraint of trade” together with a heightened incapacity and ignorance bearing on such work as is of material use.”

A way of not seeing

In 1935 Kenneth Burke built on Veblen’s work, arguing that trained incapacity was;

“that state of affairs whereby one’s very abilities can function as blindnesses.”

People can focus on the means or the ends, not both, and their specific training in prescriptive methods or processes leads them to focus on the means. They do not even see what they are missing.

“A way of seeing is also a way of not seeing – a focus on object ‘A’ involves a neglect of object ‘B’.”

Robert Merton made the point more explicitly in 1957 when he introduced the concept of goal displacement.

“Adherence to the rules… becomes an end in itself… Formalism, even ritualism, ensues with an unchallenged insistence upon punctilious adherence to formalised procedures. This may be exaggerated to the point where primary concern with conformity to the rules interferes with the achievement of the purposes of the organization.”

Why do we think we are different?

So the problem had been recognised before software development was even in its infancy. How did it come to be such a pervasive problem in our profession? What possible reason could there be for thinking that we are different in software developing and testing? Why would we think we are immune from these problems?

I explored some of the reasons earlier this year in Teddy Bear Methods. Software development is difficult and stressful. It is tempting to seek refuge in neat, ordered structures. In that article I talked about social defences, transitional objects and how slavishly following prescriptive processes and methods can become a fetish.

Functional stupidity

However, there is an over-arching explanation; functional stupidity. This was identified by Alvesson and Spicer in a fascinating paper in the Journal of Management Studies in 2012, ”A Stupidity-Based Theory of Organizations” (PDF, opens in new tab).

The concept is rather more nuanced than the headline grabbing name suggests. It is no glib piece of cod psychology; it is soundly rooted in organisational psychology and sociology and in management theory.

Organisations can function more smoothly if employees suspend their critical thinking faculties. It can actually be beneficial if they do not question the validity of management directives, if they don’t think about whether the actions they have to take are justified, and if they don’t waste cognitive effort thinking about whether their work is aligned with the objectives of the organisation.

In large organisations the goal towards which many employees are working is effectively the smooth running of the bureaucracy. Functional stupidity does help things run smoothly. It can be beneficial for compliant employees too. The people who thrive are those who play the game by the rules and don’t question whether the “game” is actually aligned with the objectives of the organisation.

However, where functional stupidity is beneficial it is in organisations operating in a fast moving, relatively well understood environment. In these cases fast and efficient action and reaction may be more important than reflective analysis, though it still carries serious dangers.

On the other hand, if the environment is less well understood and there is a need to reflect and learn, then functional stupidity can be disastrous. Apart from a failure to learn from, or even detect, mistakes, functional stupidity can commit the organisation to damaging initiatives, while corroding employee morale and effectiveness. Even if the organisation as a whole might be suited to functional stupidity, there are roles where it is entirely inappropriate.

Software testing is exactly such a role. Testers must question, analyse, reflect and learn. These are all activities that functional stupidity discourages.

Management fads, lack of evidence and ISO 29119

Alvesson and Spicer refer to a further, damaging effect of functional stupidity that has particular relevance to the debate about ISO 29119. They argue that managers are prone to getting caught up in enthusiasm for unproven initiatives.

”Most managerial practices are adopted on the basis of faulty reasoning, accepted wisdom, and complete lack of evidence.

…organizations will often adopt new practices with few robust reasons beyond the fact that they make the company ‘look good’ or that ‘others are doing it’… Refraining from asking for justification beyond managerial edict, tradition or fashion, is a key aspect of functional stupidity.”

Does ISO 29119 fall into this category? Dr Stuart Reid, convener of the ISO 29119 Working Group, is a surprising source of compelling evidence to support the claim. He has conceded that there is no evidence of the standard’s efficacy and that the people who buy testing services do not understand what they are buying (see the slides from his presentation at ExpoQA14 in Madrid in May, with my added emphasis).

Yet he hopes that they will nevertheless write contracts that mandate the use of the standard (PDF, opens in new tab).

This standard will impose on testers working practices that are only loosely aligned with the real objective of testing. It will provide fertile breeding grounds for goal displacement. Will functional stupidity ease the way for ISO 29119? I fear the worst.

I asked why we think we are different in software development and testing. The question is poorly framed. It’s not that we think we are different. The problem is that we, as a global testing community, are not thinking enough. Far too many of us are simply going with the flow. Thousands have unthinkingly adopted functional stupidity as a career move. ISO 29119? That will do nicely.

“No matter what the problem is, it’s always a people problem.”

Any organisational initiative, or new methodology, or new standard that ignores that rule will not work. The lessons have been there for decades. We only have to look for them.
