Posted by: James Christie | February 9, 2015

“A novel-long standard” A1QA interview

I was asked to take part in this interview about ISO 29119 by Elizabeth Soroka of A1QA. The interview ran in January 2015 under the headline “A novel-long standard; interview with James Christie”.

James, you’ve tried so many IT fields. Can you explain why you switched into auditing?

I worked for a big insurance company. They had just re-organized their Audit department. One of the guys who worked there knew me and thought I’d be well suited to audit. I was a developer who had moved on into a mixture of programming and systems analysis. However, I had studied accountancy at university and spent a couple of years working in accountancy and insurance investment, so I had a wider business perspective than most devs. I think that was a major reason for me being approached.

I turned down the opportunity because I was enjoying my job and I wanted to finish the project I was responsible for. The Audit department kept in touch with me and I gradually realised that it would be a much more interesting role than I’d thought. A couple of years later another opportunity came up at a time when I was doing less interesting work so I jumped at the chance. It was a great decision. I learned a huge amount about how IT fitted into the business.

As a person with an audit background, do you think standards improve software testing or block it?

They don’t improve testing. I don’t think there’s any evidence to support that assertion. The most that ISO 29119 defenders have come up with is the claim that you can do good testing using the standard. That’s arguable, but even if it is true it is a very weak defence for making something a standard. It’s basically saying that ISO 29119 isn’t necessarily harmful.

I wouldn’t have said that ISO 29119 blocks testing. It’s a distraction from testing because it focuses attention on the documentation, rather than the real testing. An auditor should expect three things: a clear idea of how testing will be performed, evidence that explains what testing was done, and an explanation of the significance of the results.

ISO 29119, and the previous testing standard IEEE 829, emphasize heavy advance documentation and deal pitifully with the final reports. Auditors should expect an over-arching test strategy saying “this is our approach to testing in this organization”. They should also expect an explanation of how that strategy will be interpreted for the project in question.

Detailed test case specifications shouldn’t impress auditors any more than detailed project plans would convince anyone that the project was successful. ISO 29119 says that “test cases shall be recorded in the test case specification” and “the test case specification shall be approved by the stakeholders”.

That means that if testers are to be compliant with the standard they have to document their planned testing in detail, then get the documents approved by many people who can’t be expected to understand all that detail. Trying to comply with the standard will create a mountain of unnecessary paper. As I said, it’s a distraction from the real work.

You started the campaign “STOP 29119”. Tell us a few words about the standard?

I don’t claim that I started the campaign. The people who deserve most credit for that are probably Karen Johnson and Iain McCowatt, who responded so energetically to my talk at CAST 2014 in New York.

ISO 29119 is an ambitious attempt, in ISO’s words “to define an internationally-agreed set of standards for software testing that can be used by any organization when performing any form of software testing.”

The full standard will consist of five documents: glossary, processes, documentation, techniques and, finally, key-word driven testing. So far the first three documents have been issued, i.e. the glossary, processes and documentation. The fourth document, test techniques, is due to be issued any time now. The fifth, on key-word driven testing, should come out in 2015.

The campaign has called on ISO to withdraw the standard. However, I would happily settle for damaging its credibility as a standard for “any organization when performing any form of software testing”. That aim is more than just being ambitious. It stretches credulity.

Testing standards are beneficial for testing (I hope you agree): they implement some new practices and can school the untutored. Still, what is wrong with the 29119 standard?

The content of ISO 29119 is very old-fashioned. It is based on a world view from the 1970s and 1980s that confused rigour and professionalism with massive documentation. It really is the last place to go to look for new ideas. Newcomers to testing should be encouraged to look elsewhere for ideas about how to perform good testing.

Testing standards can be beneficial in a particular organization. They may even be beneficial in industries that have specific needs, such as medical devices and drugs, and financial services. However, they have to be very carefully written and they must maintain a clear distinction between true standards and overly prescriptive guidance. ISO 29119 fails to make the distinction. It is far too detailed and prescriptive.

The three documents that have been issued so far add up to 89,000 words over 270 pages. That’s as long as many novels. In fact it’s as long as George Orwell’s “Animal Farm” plus Erich Maria Remarque’s “All Quiet on the Western Front” combined. It’s almost exactly the same length as Orwell’s “1984” and Jane Austen’s “Persuasion”.

That is ridiculously long for a standard. The Institute of Internal Auditors’ “International Standards for the Professional Practice of Internal Auditing” runs to only 26 pages and 8,000 words. The IIA’s standards are high level statements of principle, covering all types of auditing. More detailed guidance about how to perform audits in particular fields is published separately. That guidance doesn’t amount to a series of “you shall do x, y & z”. It offers auditors advice on potential problems, and gives useful tips to guide the inexperienced. The difference between standards and guidance is crucial, and ISO blurs that distinction.

The defenders of ISO 29119 argue that tailored compliance is possible; testers don’t have to follow the full standard. There are two problems with that. Tailored compliance requires agreement from all of the stakeholders for all of the tasks that won’t be performed, and documents that won’t be produced. There are hundreds of mandatory tasks and documents, so even tailored compliance imposes a huge bureaucratic overhead. The second problem is that tailored compliance will look irresponsible. The marketing of the standard appeals to fear. Stuart Reid has put it explicitly.

“Imagine something goes noticeably wrong. How easy will you find it to explain that your testing doesn’t comply with international testing standards? So, can you afford not to use them?”

Anyone who is motivated by that to introduce ISO 29119 is likely to believe that full compliance must be safer and more responsible than tailored compliance. The old IEEE 829 test documentation standard also permitted tailored compliance. That wasn’t the way it worked out in practice. Organizations which followed the standard didn’t tailor their compliance and produced far too much wasteful documentation. ISO should have thought more carefully about how they would promote the standard and what the effects might be of their appeal to fear.

And in the end, what are the results of your campaign?

It’s hard to say what the results are. No-one seriously expected that ISO would roll over and withdraw the standard. I did think that ISO would make a serious attempt to defend it, and to engage with the arguments of the Stop 29119 campaigners. That hasn’t happened. The result has been that when people search for information about ISO 29119 they can’t fail to find articles by Stop 29119 campaigners. They will find nothing to refute them. I think that damages ISO’s credibility. ISO is now caught in a bind. It can ignore the opposition, and therefore concede the field to its opponents. Or it can try to engage in debate and reveal the lack of credible foundations of the standard.

I think the campaign has been successful in demonstrating that the standard lacks credibility in a very important part of the testing profession and therefore lacks the consensus that a standard should enjoy. I hope that if the campaign keeps going then it will prevent many organizations from forcing the standard onto their testers and thus forcing them to do less effective and efficient testing. Sometimes it feels like Stop 29119 is very negative, but if we can persuade people not to adopt the standard then I think that makes a positive contribution towards more testers doing better testing.

Posted by: James Christie | January 22, 2015

Service Virtualization interview about usability

I was asked to take part in this interview by George Lawton of Service Virtualization. Initially I wasn’t enthusiastic because I didn’t think I would have much to say. However, the questions set me thinking, and I felt they were relevant to my experience so I was happy to take part. It gave me something to do while I was waiting to fly back from EuroSTAR in Dublin!

How does usability relate to the notion of the purpose of a software project?

When I started in IT over 30 years ago I never heard the word usability. It was “user friendliness”, but that was just a nice thing to have. It was nice if your manager was friendly, but that was incidental to whether he was actually good at the job. Likewise, user friendliness was incidental. If everything else was ok then you could worry about that, but no-one was going to spend time or money, or sacrifice any functionality just to make the application user friendly. And what did “user friendly” mean anyway? “Who knows? Who cares? We’ve got serious work to do. Forget about that touchy feely stuff.”

The purpose of software development was to save money by automating clerical routines. Any online part of the system was a mildly anomalous relic of the past. It was just a way of getting the data into the system so the real work could be done. Ok, that’s an over-simplification, but I think there’s enough truth in it to illustrate why developers just didn’t much care about the users and their experience. Development moved on from that to changing the business, rather than merely changing the business’s bureaucracy, but it took a long time for these attitudes to shift.

The internet revolution turned everything upside down. Users are no longer employees who have to put up with whatever they’re given. They are more likely to be customers. They are ruthless and rightly so. Is your website confusing? Too slow to load? Your customers have gone to your rivals before you’ve even got anywhere near their credit card number.

The lesson that’s been getting hammered into the heads of software engineers over the last decade or so is that usability isn’t an extra. I hate the way that we traditionally called it a “non-functional requirement”, or one of the “quality criteria”. Usability is so important and integral to every product that telling developers that they’ve got to remember it is like telling drivers they’ve got to remember to use the steering wheel and the brakes. If they’re not doing these things as a matter of course they shouldn’t be allowed out in public. Usability has to be designed in from the very start. It can’t be considered separately.

What are the main problems in specifying for and designing for software usability?

Well, who’s using the application? Where are they? What is the platform? What else are they doing? Why are they using the application? Do they have an alternative to using your application, and if so, how do you keep them with yours? All these things can affect decisions you take that are going to have a massive impact on usability.

It’s payback time for software engineering. In the olden days it would have been easy to answer these questions, but we didn’t care. Now we have to care, and it’s all got horribly difficult.

These questions require serious research plus the experience and nous to make sound judgements with imperfect evidence.

In what ways do organisations lose track of the usability across the software development lifecycle?

I’ve already hinted at a major reason. Treating usability as a non-functional requirement or quality criterion is the wrong approach. That segregates the issue. It’s treated as being like the other quality criteria, the “…ities” like security, maintainability, portability, reliability. It creates the delusion that the core function is of primary importance and the other criteria can be tackled separately, even bolted on afterwards.

Lewis & Rieman came out with a great phrase fully 20 years ago to describe that mindset. They called it the peanut butter theory of usability. You built the application, and then at the end you smeared a nice interface over the top, like a layer of peanut butter (PDF, opens in new tab).

“Usability is seen as a spread that can be smeared over any design, however dreadful, with good results if the spread is thick enough. If the underlying functionality is confusing, then spread a graphical user interface on it. … If the user interface still has some problems, smear some manuals over it. If the manuals are still deficient, smear on some training which you force users to take.”

Of course they were talking specifically about the idea that usability was a matter of getting the interface right, and that it could be developed separately from the main application. However, this was an incredibly damaging fallacy amongst usability specialists in the 80s and 90s. There was a huge effort to try to justify this idea by experts like Hartson & Hix, Edmonds, and Green. Perhaps the arrival of Object Oriented technology contributed towards the confusion. A low level of coupling so that different parts of the system are independent of each other is a good thing. I wonder if that lured usability professionals into believing what they wanted to believe, that they could be independent from the grubby developers.

Usability professionals tried to persuade themselves that they could operate a separate development lifecycle that would liberate them from the constraints and compromises that would be inevitable if they were fully integrated into development projects. The fallacy was flawed conceptually and architecturally. However, it was also a politically disastrous approach. The usability people made themselves even less visible, and were ignored at a time when they really needed to be getting more involved at the heart of the development process.

As I’ve explained, the developers were only too happy to ignore the usability people. They were following methods and lifecycles that couldn’t easily accommodate usability.

How can organisations incorporate the idea of usability engineering into the software development and testing process?

There aren’t any right answers, certainly none that will guarantee success. However, there are plenty of wrong answers. Historically in software development we’ve kidded ourselves into thinking that the next fad, whether Structured Methods, Agile, CMMi or whatever, will transform us into rigorous, respected professionals who can craft high quality applications. Now some (like Structured Methods) suck, while others (like Agile) are far more positive, but the uncomfortable truth is that it’s all hard and the most important thing is our attitude. We have to acknowledge that development is inherently very difficult. Providing good UX is even harder and it’s not going to happen organically as a by-product of some over-arching transformation of the way we develop. We have to consciously work at it.

Whatever the answer is for any particular organisation it has to incorporate UX at the very heart of the process, from the start. Iteration and prototyping are both crucial. One of the few fundamental truths of development is that users can’t know what they want and like till they’ve seen what is possible and what might be provided.

Even before the first build there should have been some attempt to understand the users and how they might be using the proposed product. There should be walkthroughs of the proposed design. It’s important to get UX professionals involved, if at all possible. I think developers have advanced to the point that they are less likely to get it horribly wrong, but actually getting it right, and delivering good UX is asking too much. For that I think you need the professionals.

I do think that Agile is much better suited to producing good UX than traditional methods, but there are still dangers. A big one is that many Agile developers are understandably sceptical about anything that smells of Big Up-Front Analysis and Design. It’s possible to strike a balance and learn about your users and their needs without committing to detailed functional requirements and design.

How can usability relate to the notion of testable hypothesis that can lead to better software?

Usability and testability go together naturally. They’re also consistent with good development practice. I’ve worked on, or closely observed, many applications where the design had been fixed and the build had been completed before anyone realised that there were serious usability problems, or that it would be extremely difficult to detect and isolate defects, or that there would be serious performance issues arising from the architectural choices that had been made.

We need to learn from work that’s been done with complexity theory and organisation theory. Developing software is mostly a complex activity, in the sense that there are rarely predictable causes and effects. Good outcomes emerge from trialling possible solutions. These possibilities aren’t just guesswork. They’re based on experience, skill, knowledge of the users. But that initial knowledge can’t tell you the solution, because trying different options changes your understanding of the problem. Indeed it changes the problem. The trials give you more knowledge about what will work. So you have to create further opportunities that will allow you to exploit that knowledge. It’s a delusion that you can get it right first time just by running through a sequential process. It would help if people thought of good software as being grown rather than built.

Posted by: James Christie | January 19, 2015

“Fix on failure” – a failure to understand failure

Wikipedia is a source that should always be treated with extreme scepticism and the article on the “Year 2000 problem” is a good example. It is now being widely quoted on the subject, even though it contains some assertions that are either clearly wrong, or implausible, and lacking any supporting evidence.

Since I wrote about ”Y2K – why I know it was a real problem” last week I’ve been doing more reading around the subject. I’ve been struck by how often I’ve come across arguments, or rather assertions, that a “fix on failure” response would have been the best response. Those who argue that Y2K was a big scare and a scam usually offer a rewording of this gem from the Wikipedia article.

”Others have claimed that there were no, or very few, critical problems to begin with, and that correcting the few minor mistakes as they occurred, the “fix on failure” approach, would have been the most efficient and cost-effective way to solve the problem.”

There is nothing to back up these remarkable claims, but Wikipedia now seems to be regarded as an authoritative source on Y2K.

I want to talk about the infantile assertion that “fix on failure” was the right approach. Infantile? Yes, I use the word carefully. It ignores big practical problems that would have been obvious to anyone with experience of developing and supporting large, complicated applications. Perhaps worse, it betrays a dangerously naive understanding of “failure”, a misunderstanding that it shares with powerful people in software testing nowadays. Ok, I’m talking about the standards lobby there.

”Fix on failure” – deliberate negligence

Firstly, “fix on failure” doesn’t allow for the seriousness of the failure. As Larry Burkett wrote:

“It is the same mindset that believes it is better to put an ambulance at the bottom of a cliff rather than a guardrail at the top”.

“Fix on failure” could have been justified only if the problems were few and minor. That is a contentious assumption that has to be justified. However, the only justification on offer is that those problems which occurred would have been suitable for “fix on failure”. It is a circular argument lacking evidence or credibility, and crucially ignores all the serious problems that were prevented.

Once one acknowledges that there were a huge number of problems to be fixed one has to deal with the practical consequences of “fix on failure”. That approach does not allow for the difficulty of managing masses of simultaneous failures. These failures might not have been individually serious, but the accumulation might have been crippling. It would have been impossible to fix them all within acceptable timescales. There would have been insufficient staff to do the work in time.

Release and configuration management would have posed massive problems. If anyone tells you Y2K was a scam ask them how they would have handled configuration and release management when many interfacing applications were experiencing simultaneous problems. If they don’t know what you are talking about then they don’t know what they are talking about.

Of course not all Y2K problems would have occurred on 1st January 2000. Financial applications in particular would have been affected at various points in 1999 and even earlier. That doesn’t affect my point, however. There might have been a range of critical dates across the whole economy, but for any individual organisation there would have been relatively few, each of which would have brought a massive, urgent workload.

Attempting to treat Y2K problems as if they were run of the mill, “business as usual” problems, as advocated by sceptics, betrays appalling ignorance of how a big IT shop works. They are staffed and prepared to cope with a relatively modest level of errors and enhancements in their applications. The developers who support applications aren’t readily inter-changeable. They’re not fungible burger flippers. Supporting a big complicated application requires extensive experience with that application. Staff have to be rotated in and out carefully and piecemeal so that a core of deep experience remains.

IT installations couldn’t have coped with Y2K problems in the normal course of events any more than garages could cope if all cars started to have problems. The Ford workshops would be overwhelmed when the Fords started breaking down, the Toyota dealers would seize up when the Toyotas suffered.

The idea that “fix on failure” was a generally feasible and responsible approach simply doesn’t withstand scrutiny. Code that wasn’t Y2K-compliant could be spotted at a glance. It was then possible to predict the type of error that might arise, if not always the exact consequences. Why on earth would anyone wait to see if one could detect obscure, but potentially serious distortions? Why would anyone wait to let unfortunate citizens suffer or angry customers complain?

The Y2K sceptics argue that organisations took expensive pre-emptive action because they were scared of being sued. Well, yes, that’s true, and it was responsible. The sceptics were advocating a policy of conscious, deliberate negligence. The legal consequences would quite rightly have been appalling. “Fix on failure” was never a serious contribution to the debate.

”Fix on failure” – a childlike view of failure

The practical objections to a “fix on failure” strategy were all hugely significant. However, I have a deeper, fundamental objection. “Fix on failure” is a wholly misguided notion for anything but simple applications. It is based on a childlike, binary view of failure. We are supposed to believe an application is either right or wrong; it is working or it is broken; that if there is a Y2K problem then the application obligingly falls over. Really? That is not my experience.

With complicated financial applications an honest and constructive answer to the question “is the application correct?” would be some variant on “what do you mean by correct?”, or “I don’t know. It depends”. It might be possible to say the application is definitely not correct if it is producing obvious garbage. But the real difficulty is distinguishing between the seriously inaccurate, but plausible, and the acceptably accurate. Discussion of accuracy requires understanding of critical assumptions, acceptable margins of error, confidence levels, the nature and availability of oracles, and the business context of the application.

I’ve never seen any discussion of Y2K by one of the “sceptical” conspiracy theorists that showed any awareness of these factors. There is just the naïve assumption that a “failed” application is like a patient in a doctor’s surgery, saying “I’m sick, and here are my symptoms”.

Complicated applications have to be nursed and constantly monitored to detect whether some new, extraneous factor, or long hidden bug, is skewing the figures. A failing application might appear to be working as normal, but it would be gradually introducing distortions.

Testing highly complicated applications is not a simple, binary exercise of determining “pass or fail”. Testing has to be a process of learning about the application and offering an informed opinion about what it is, and what it does. That is very different from checking it against our preconceptions, which might have been seriously flawed. Determining accuracy is more a matter of judgement than inspection.

Throughout my career I have seen failures and problems of all types, with many different causes. However, if there is a single common underlying theme then the best candidate would be the illusion that development is like manufacturing, with a predictable end product that can be checked. The whole development and testing process is then distorted to try and fit the illusion.

The advocates of Y2K “fix on failure” had much in common with the ISO 29119 standards lobby. Both shared that “manufacturing” mindset, that unwillingness to recognise the complexity of development, and the difficulty of performing good, effective testing. Both looked for certainty and simplicity where it was not available.

Good testers know that an application is not necessarily “correct” just because it has passed the checks on the test script. Likewise failure is not an absolute concept. Ignoring these truths is ignoring reality, trying to redefine it so we can adopt practices that seem more efficient and effective. I suspect the mantra that “fix on failure would have been more effective and efficient” has its roots with economists, like the Australian Quiggin, who wanted to assume complexity away. See this poor paper (PDF, opens in a new tab).

Doing the wrong thing is never effective. Negligence is rarely efficient. Reality is uncomfortable. We have to understand that and know what we are talking about before coming up with simplistic, snake-oil solutions that assume simplicity where the reality is complexity.

Posted by: James Christie | January 12, 2015

Y2K – why I know it was a real problem

It’s confession time. I was a Y2K test manager for IBM. As far as some people are concerned that means I was party to a huge scam that allowed IT companies to make billions out of spooking poor deluded politicians and the public at large. However, my role in Y2K means I know what I am talking about, so when I saw some recent comment that it was all nothing more than hype I felt the need to set down my first-hand experience. At the time, and in the immediate aftermath of Y2K, we were constrained by client confidentiality from explaining what we did, but 15 years on I feel comfortable about speaking out.

Was there a huge amount of hype? Unquestionably.

Was money wasted? Certainly, but show me huge IT programmes where that hasn’t happened.

Would it have been better to do nothing and adopt a “fix on failure” approach? No, emphatically not as a general rule and I will explain why.

There has been a remarkable lack of studies of Y2K and the effectiveness of the actions that were taken to mitigate the problem. The field has been left to those who saw few serious incidents and concluded that this must mean there could have been no serious problem to start with.

The logic runs as follows. Action was taken in an attempt to turn outcome X into outcome Y. The outcome was Y. Therefore X would not have happened anyway and the action was pointless. The fallacy is so obvious it hardly needs pointing out. If action was pointless then the critics have to demonstrate why the action that was taken had no impact and why outcome Y would have happened regardless. In all the years since 2000 I have seen only unsubstantiated assertion and reference to those countries, industries and sectors where Y2K was not going to be a significant problem anyway. The critics always ignore the sectors where there would have been massive damage.

An academic’s flawed perspective

This quote from Anthony Finkelstein, professor of software systems engineering at University College London, on the BBC website, is typical of the critics’ reasoning.

“The reaction to what happened was that of a tiger repellent salesman in Golders Green High Street,” says Finkelstein. “No-one who bought my tiger repellent has been hurt. Had it not been for my foresight, they would have.”

The analogy is presumably flippant and it is entirely fatuous. There were no tigers roaming the streets of suburban London. There were very significant problems with computer systems. Professor Finkelstein also used the analogy back in 2000 (PDF, opens in new tab).

In that paper he made a point that revealed he had little understanding of how dates were being processed in commercial systems.

”In the period leading up to January 1st those who had made dire predictions of catastrophe proved amazingly unwilling to adjust their views in the face of what was actually happening. A good example of this was September 9th 1999 (9/9/99). On this date data marked “never to expire” (realised as expiry 9999) would be deleted bringing major problems. This was supposed to be a pre-shock that would prepare the way for the disaster of January 1st. Nothing happened. Now, if you regarded the problem as a serious threat in the first place, this should surely have acted as a spur to some serious rethinking. It did not.”

I have never seen a date stored in the way Finkelstein describes, 9th September 1999 being held as 9999. If that were done there would be no way to distinguish 1st December 2014 from 11th February 2014. Both would be 1122014. Dates are held either in the form 090999, with leading zeroes so the dates can be interpreted correctly, or with days, months and years in separate sub-fields for simpler processing. Programmers who flooded date fields with the integer 9 would have created 99/99/99, which could obviously not be interpreted as 9th September 1999.

Anyway, the main language of affected applications was Cobol, and the convention was for programmers to move “high values”, i.e. the highest possible value the compiler could handle, into the field rather than nines. “High values” doesn’t translate into any date. Why doesn’t Finkelstein know this sort of basic thing if he’s setting himself up as a Y2K expert? I never heard any concern about 9/9/99 at the time, and it certainly never featured in our planning or work. It is a straw man, quite irrelevant to the main issue.
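To make the point concrete, here is a minimal sketch in Python rather than the COBOL of the affected applications; the function names are purely illustrative. It shows why fixed-width fields with leading zeroes stay unambiguous, why unpadded dates collapse into the same string, and why a field flooded with nines reads as 99/99/99 rather than 9th September 1999.

```python
# Minimal sketch (Python, not the original COBOL); names are illustrative only.

def packed_ddmmyy(day: int, month: int, year2: int) -> str:
    """Pack a date into a six-character DDMMYY field with leading zeroes."""
    return f"{day:02d}{month:02d}{year2:02d}"

def unpadded(day: int, month: int, year: int) -> str:
    """Concatenate a date without leading zeroes - the ambiguous form."""
    return f"{day}{month}{year}"

print(packed_ddmmyy(9, 9, 99))   # '090999' - unambiguously 9th September 1999
print(unpadded(1, 12, 2014))     # '1122014'
print(unpadded(11, 2, 2014))     # '1122014' - indistinguishable from the line above
print("9" * 6)                   # '999999'  - a nines-flooded field is 99/99/99, not 9/9/99
```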

In the same paper from 2000 Finkelstein made another claim that revealed his lack of understanding of what had actually been happening.

September 9th 1999 is only an example. Similar signs should have been evident on January 1st 1999, the beginning of the financial year 99-00, December 1st, and so on. Indeed assuming, as was frequently stated, poor progress had been made on Y2K compliance programmes we would have anticipated that such early problems would be common and severe. I see no reason to suppose that problems should not have been more frequent (or at any rate as frequent) in the period leading up to December 31st 1999 than afterwards given that transactions started in 1999 may complete in 2000, while after January 1st new transactions start and finish in the new millennium.

Finkelstein is entirely correct that the problem would not have suddenly manifested itself in January 2000, but he writes as if this is an insight the practitioners lacked at the front line. At General Accident the first critical date that we had to hit was the middle of October 1998, when renewal invitations for the first annual insurance contracts extending past December 1999 would be issued. At various points over the next 18 months until the spring of 2000 all the other applications would hit their trigger dates. Everything of significance had been fixed, tested and re-implemented by September 1999.

We knew that timetable because it was our job to know it. We were in trouble not because time was running out till 31/12/1999, but because we had little time before 15/10/1998. We made sure we did the right work at the right time so that all of the business critical applications were fixed in time. Finkelstein seems unaware of what was happening. A massed army of technical staff were dealing with a succession of large waves sweeping towards them over a long period, rather than a single tsunami at the millennium.

Academics like Finkelstein have a deep understanding of the technology and how it can, and should be used, but this is a different matter from knowing how it is being applied by practitioners acting under extreme pressure in messy and complex environments. These practitioners aren’t doing a bad job because of difficult conditions, lack of knowledge and insufficient expertise. They are usually doing a good job, despite those difficult conditions, drawing on vast experience and deep technical knowledge.

Comments such as those of Professor Finkelstein betray a lack of respect for practitioners, as if the only worthwhile knowledge is that possessed by academics.

What I did in the great Y2K “scare”

Let me tell you why I was recruited as a Y2K test manager by IBM. I had worked as a computer auditor for General Accident. A vital aspect of that role had been to understand how all the different business critical applications fitted together, so that we could provide an overview to the business. We could advise on the implications and risks of amending applications, or building new ones to interface with the existing applications.

A primary source – my report explaining the problem with a business critical application

Shortly before General Accident’s Y2K programme kicked off I was transferred to IBM under an outsourcing deal. General Accident wanted a review performed of a vital back office insurance claims system. The review had to establish whether the application should be replaced before Y2K, or converted. Senior management asked IBM to have me perform the review because I was considered the person with the deepest understanding of the business and technical issues. The review was extremely urgent, but it was delayed by a month till I had finished my previous project.

I explained in the review exactly why the system was business critical and how it was vital to the company’s reserving, and therefore the production of the company accounts. I explained how the processing was all date dependent, and showed how and when it would fail. If the system was unavailable then the accountants and premium setters would be flying blind, and the external auditors would be unable to sign off the company accounts. The risks involved in trying to replace the application in the available time were unacceptable. The best option was therefore to make the application Y2K compliant. This advice was accepted.

As soon as I’d completed the review IBM moved me into a test management position on Y2K, precisely because I had all the business and technical experience to understand how everything fitted together, and what the implications of Y2K would be. The first thing I did was to write a suite of SAS programs that crawled through the production code libraries, job schedules and job control language libraries to track the relationship between programs, data and schedules. For the first time we had a good understanding of the inventory, and which assets depended on each other. Although I was nominally only the test manager I drew up the conversion strategy and timetable for all the applications within my remit, based on my accumulated experience and the new knowledge we’d derived from the inventory.
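The original suite was written in SAS against mainframe source, schedule and JCL libraries, so the following is only a rough, hypothetical illustration in Python of the idea: scan each job for the programs it executes and the datasets it references, and build up a picture of what depends on what.

```python
# Hypothetical sketch only - the real inventory was built with SAS against
# mainframe libraries. Scan exported JCL members for the programs they execute
# (EXEC PGM=...) and the datasets they reference (DSN=...), and record the links.
import re
from collections import defaultdict
from pathlib import Path

EXEC_PGM = re.compile(r"EXEC\s+PGM=([A-Z0-9$#@]+)")
DD_DSN = re.compile(r"DSN=([A-Z0-9$#@.]+)")

def build_inventory(jcl_dir: str) -> dict:
    """Map each job to the programs it runs and the datasets it touches."""
    inventory = defaultdict(lambda: {"programs": set(), "datasets": set()})
    for member in Path(jcl_dir).glob("*.jcl"):  # the file extension is an assumption
        text = member.read_text(errors="ignore").upper()
        inventory[member.stem]["programs"].update(EXEC_PGM.findall(text))
        inventory[member.stem]["datasets"].update(DD_DSN.findall(text))
    return inventory

# From such a map you can ask which jobs run a given program, or which jobs
# share a dataset - the "what depends on what" view described above.
```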

An insurance company’s processing is heavily date dependent. Premiums are earned on a daily basis, with the appropriate proportion being refunded if a policy is cancelled mid-term. Claims are paid only if the appropriate cover is in place on the date that the incident occurred. Income and expenditure might be paid on a certain date, but then spread over many years. If the date processing doesn’t work then the company can’t take in money, or pay it out. It cannot survive. The processing is so complex that individual errors in production often require lengthy investigation and fixing, and then careful testing. The notion that a “fix on failure” response to Y2K would have worked is risible.
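As a tiny, hypothetical sketch of the kind of date arithmetic involved (in Python, and nothing like the real General Accident code), pro-rata earning, mid-term refunds and cover checks all hinge on date comparisons:

```python
# Hypothetical illustration of the date arithmetic described above; real policy
# systems are far more involved than this.
from datetime import date

def earned_premium(premium: float, start: date, end: date, as_at: date) -> float:
    """Earn premium in proportion to the days of cover elapsed at 'as_at'."""
    total_days = (end - start).days
    elapsed = min(max((as_at - start).days, 0), total_days)
    return premium * elapsed / total_days

def cancellation_refund(premium: float, start: date, end: date, cancelled_on: date) -> float:
    """Refund the unearned proportion when a policy is cancelled mid-term."""
    return premium - earned_premium(premium, start, end, cancelled_on)

def claim_covered(start: date, end: date, incident: date) -> bool:
    """Pay a claim only if cover was in force on the date of the incident."""
    return start <= incident < end

# Every one of these calculations goes wrong if a two-digit year makes 2000
# sort before 1999 - which is why the date handling could not simply be left to fail.
```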

We fixed the applications, taking a careful, triaged risk-based approach. The most date sensitive programs within the most critical applications received the most attention. Some applications were triaged out of sight. For these, “fix on failure” was appropriate.

We tested the converted applications in simulated runs across the end of 1999, in 2000 and again in 2004. These simulations exposed many more problems not just with our code, but also with all the utility and housekeeping routines and tools. In these test runs we overrode the mainframe system date.
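A loose modern analogy, not the mainframe mechanism we actually used: if “today” is injected rather than read from the system clock, the same logic can be exercised as though it were running at the end of 1999, in 2000 or in 2004.

```python
# Loose analogy of running with an overridden system date: inject 'today'
# so date-dependent logic can be exercised under simulated future dates.
from datetime import date

def policy_in_force(start: date, end: date, today: date) -> bool:
    """Illustrative check, exercised under whatever 'today' the test supplies."""
    return start <= today < end

# Simulated runs in the spirit of the 1999/2000/2004 tests described above.
print(policy_in_force(date(1999, 6, 1), date(2000, 6, 1), today=date(1999, 12, 31)))  # True
print(policy_in_force(date(1999, 6, 1), date(2000, 6, 1), today=date(2000, 1, 1)))    # True
print(policy_in_force(date(1999, 6, 1), date(2000, 6, 1), today=date(2004, 2, 29)))   # False - cover expired
```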

In the final stage of testing we went a step further. We booted up a mainframe LPAR (logical partition) to run with the future dates. I managed this exercise. We had a corner of the office with a sign saying “you are now entering 2000”, and everything was done with future dates. This exercise flagged up further problems with code that we had been confident would run smoothly.

Y2K was a fascinating time in my career because I was at a point that I now recognise as a sweet spot. I was still sufficiently technically skilled to do anything that my team members could do, even being called on to fix overnight production problems. However, I was sufficiently confident, experienced and senior to be able to give presentations to the most senior managers explaining problems and what the appropriate solutions would be.

December 19th 1999, Mary, her brother Malcolm & I in the snow. Not panicking much about Y2K.

For these reasons I know what I’m talking about when I write that Y2K was a huge problem that had to be tackled. The UK’s financial sector would have suffered a massive blow if we had not fixed the problem. I can’t say how widespread the damage might have been, but I do know it would have been appalling.

My personal millennium experience

What was on my mind on 31st December 1999

When I finished with Y2K in September 1999, at the end of the future mainframe exercise, at the end of a hugely pressurised 30 months, I negotiated seven weeks leave and took off to Peru. IBM could be a great employer at times! My job was done, and I knew that General Accident, or CGU as it had evolved into by then, would be okay. There would inevitably be a few glitches, but then there always are in IT. I was so relaxed about Y2K that on my return from Peru it was the least of my concerns. There was much more interesting stuff going on in my life.

I got engaged in December 1999, and on 31st December Mary and I bought our engagement and wedding rings. That night we were at a wonderful party with our friends, and after midnight we were on Perth’s North Inch to watch the most spectacular fireworks display I’ve ever seen. 1st January 2000? It was a great day that I’ll always remember happily. It was far from being a disaster, and that was thanks to people like me.

PS – I have written a follow up article explaining why “fix on failure” was based on an infantile view of software failure.

Posted by: James Christie | December 30, 2014

2014 in review – WordPress’s report on my blog

This is the standard WordPress annual report for my blog in 2014.

Here's an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 13,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 5 sold-out performances for that many people to see it.

Click here to see the complete report.

Posted by: James Christie | December 5, 2014

Interview about Stop 29119 with Service Virtualization

This is an email interview I gave to Jeff Bounds of Service Virtualization about the Stop 29119 campaign in October 2014. It appeared in two parts, ”James Christie explains resistance to ISO standards” and “ISO 29119 is damaging, so ignore it, advises James Christie”.

The full interview in the original format follows. Jeff’s questions are in red.

What is ISO 29119?

And why is it important to the software testing field?

ISO 29119 is described by the International Organization for Standardization (ISO) as “an internationally agreed set of standards for software testing that can be used within any software development life cycle or organization”.

When ISO are promoting a standard that is intended to cover everything that testers do then that is a big deal for all testers. We cannot afford to ignore it.

What’s wrong with ISO 29119?

Why do you oppose ISO29119?

I think the question is framed the wrong way round. A standard requires consensus and it has to be relevant. It is up to the promoters of the standard to justify it. They’ve not made any serious, credible attempt to do so. Their interpretation of agreement and consensus is restricted to insiders, to those already in the working group developing the standard. Those testers who don’t believe that formal, generic standards are the way ahead have been ignored.

Even before I found out about ISO 29119 I was opposed to it in principle. Standards in general are a good thing, but software testing is an intellectual activity that doesn’t lend itself to standardization.

There is a wide range of evidence from psychology, sociology and management studies to back up the argument that document driven standards like ISO 29119 are counter-productive. They just don’t fit with the way that people think and work in organizations. On the other hand the defenders of standards have never bothered to refute these arguments, or even address them. They simply assert, without evidence, that standardization is good for testing. Typically they make spurious arguments that because standards are a good thing in many contexts then they must be a good thing for testing. It’s logical nonsense.

This is all without considering the detailed content of the standard. It is dated, excessively bureaucratic, prescriptive and badly written.

Sure, the standard does say that it is possible to apply parts of the standard selectively and claim “tailored conformance”. However, the standard requires agreement with stakeholders for each departure from the standard. For any significant project that means documented agreement with many people on all sorts of detailed points.

Dr Stuart Reid has claimed that he wants to see companies and governments mandating the use of ISO 29119 in contracts. Lawyers and procurement managers don’t understand testing, as Dr Reid concedes. He sees that as being a case for providing them with a standard they can mandate.

My perspective is that such people, precisely because they don’t understand testing, will require full compliance. In their eyes, full compliance will seem responsible and professional while tailored compliance will look like cutting corners. That’s the way that people react. It’s no good shrugging that off by saying people don’t have to act that way.

All the evidence supports the opponents because they know how people behave. There is no evidence to support the standards lobby.

Isn’t standardisation good?

An argument in favor of ISO 29119 is that it would bring standardization to a software testing process that historically has seen people using a variety of techniques and methods, rather than one set way of doing things. What’s wrong with that?

Everything. Testing has to fit the problem. It seems crazy to think that everyone should be expected to do the same things. Again, why should that be the case? If everyone is doing the same then most people will be doing the wrong thing.

Are opponents trying to save their jobs?

Some proponents of ISO 29119 could also argue that opponents of the standard are simply trying to save their jobs, when automation and simulation represent a better, faster and cheaper way of doing testing. What are your thoughts about that?

Even if it were the case that opponents were simply concerned about their jobs it would still be a compelling argument against ISO 29119. As the ISO working group has conceded many opponents are more expert than the average tester. Why should they have to change the way they operate, for the worse, or pass up opportunities for work?

I could actually earn more money by collaborating with the standard and cleaning up the mess it will create. There will be a good market for test consultants to do that. However, I am not interested in that sort of work.

Anyway, opponents are unhappy about the standard, not automation and simulation, which are extremely important and valuable at the right time. The standard isn’t based on the assumption of an automated approach and the test process paper (Part 2) doesn’t even mention simulation. The discussion about automation is a quite separate matter from the debate about ISO 29119.

Can ISO 29119 provide a baseline?

Dr. Stuart Reid recently argued that ISO 29119 would, among other things, help define good practices in testing, along with providing a baseline to compare different test design techniques. What are your thoughts about that? (His full argument is here).

I don’t think the standard deals with good testing practices. It advocates what it sees as good practices in test management, specifically documentation. It is really a documentation standard rather than a testing standard. It is a classic case of confusing the process with the real work. The difference is crucial. It is like the difference between the map and the territory. The map is a guide to the territory, but it is not the real thing.

Dr Reid hasn’t argued the point about a baseline. He has merely asserted it without evidence or explanation. I’m afraid that is typical of ISO’s approach. Even if it is so, I don’t think testers should have to tie themselves in knots for the benefit of others.

What is the alternative?

If you believe ISO 29119 isn’t the solution, then what is the best standard for software testing, and why?

As I’ve said above I don’t think a generic standard is appropriate for testing. A good alternative to doing wasteful and damaging things is to ignore them. There are many sound alternatives to ISO 29119. I don’t think it is up to the opponents of the standard to justify these. They are being applied and they work. Where is the evidence that ISO 29119 works?

Posted by: James Christie | November 17, 2014

Too smart for checklists, and a consultants’ war?

My post last week “Why do we think we’re different?” (about goal displacement, trained incapacity and functional stupidity) attracted an interesting comment that deserved a considered response. In the end that merited a new post, rather than a reply in the comments section. Here is the comment, followed by my response.

I don’t disagree with your basic premise nor your conclusion. Your arguments are logical and elegant, but I wonder if people hear the anti-ISO 29119 group as saying “Don’t bother having process,” and “I’m too smart to need a checklist” (ala Gawande’s “Checklist Manifesto”).

Worse yet, the people arguing against the standard the loudest are not the common tester but those already heavily active in the community of testers. I described it as a “consultant’s war” because the people who seem most active in the community are often consultants.

As a people problem, it seems both sides have a great deal of apathy outside of perhaps 1/100th of the testing community. How do we change that? The best argument in the world will have no effect if most people are apathetic. This is the problem I struggle with and not just around the standards but around testing as a career. I would love to see any insight you have around the problem, assuming you see it as a problem.

– JCD

JCD raises some interesting points and I am grateful for that.

I do need to keep thinking about the danger that people hear the Stop29119 campaigners as saying “don’t bother having process”. That is not the message I want to get across. I would phrase it as “beware of the dangers of prescriptive processes”, i.e. processes that spell out in detail each step. These may be required in some contexts, but not in software development or testing, where their value has been hugely overstated.

However, processes are required. The tricky problem is finding the sweet spot between providing sufficient guidance and ensuring appropriate standardisation on the one hand, and, on the other hand, going into excessive detail and preventing practitioners from using their judgement and initiative in a complex setting.

Similarly, checklists do have their place. They are important, but of limited value. They take you only so far. I certainly don’t decry them, but I do deplore excessive dependence on them at the expense, again, of judgement and initiative. I don’t disagree with the crux of Gawande’s book, though I do have reservations about the emphasis, or maybe it’s just with the way the book has been promoted. I’m also doubtful about his definition of complexity.

I agree with what Gawande is saying here in The Checklist Manifesto.

It is common to misconceive how checklists function in complex lines of work. They are not comprehensive how-to guides, whether for building a skyscraper or getting a plane out of trouble. They are quick and simple tools aimed to buttress the skills of expert professionals. And by remaining swift and usable and resolutely modest, they are saving thousands upon thousands of lives.

I would quibble about building skyscrapers being a complex activity. I would call it highly complicated. I’d prefer to stick to the Cynefin definition of complexity, which is reserved for situations where there is no obvious causality; cause and effect are obvious only in hindsight. Nevertheless, I do like Gawande’s phrase that checklists are “quick and simple tools to buttress the skills of expert professionals”.

You’re right about this being a consultants’ war, but I don’t see how it could be anything else. It’s hard to get new ideas through to people who are doing the real work, grafting away day to day. Some testers are certainly apathetic, but the bigger issues are time and priorities. Campaigning against ISO 29119 is important, but it’s unlikely to be urgent for most people.

When I was a permanent test manager I didn’t have the time to lift my head and take part in these debates. Working as a contract test manager isn’t much better, though at least it’s possible, in theory, to control time between contracts.

I think that places some responsibility on those who can campaign to do so. If enough people do it for long enough then those testers who are less publicly visible, and lay people, are more likely to stumble across arguments that will make them question their assumptions. The lack of public defence of the standard from ISO has meant that anyone searching for information about ISO 29119 is now very likely to come across serious arguments against the standard. That might tilt the balance against general acceptance of the standard as The Standard. A few more people might join the campaign. Some organisations that might have adopted the standard may think better of it. It could all have a cumulative effect.

I’m working on the idea of a counter to ISO 29119. It wouldn’t be an attempted rival to ISO 29119. Nor would it be an anti-standard. Many organisations will adopt the standard because they want to seem responsible. That would be doing the wrong thing for the right reason.

What I am thinking of doing is documenting the links between the requirements of the Institute of Internal Auditors, and the Information Systems Audit & Control Association (and maybe other regulatory sources) with a credible alternative to the traditional, document heavy approach, showing how it is possible to be entirely responsible and accountable without going down the ISO 29119 dead end.

It’s a matter of explaining that there are viable and valid choices, something that is in danger of being hidden from public view by the arrival of an ISO standard intended to cover the whole of software testing. Checklists could well play a role in this.

I smiled at the suggestion that opponents of ISO 29119 might believe they’re too smart to need a checklist. I can see why some people might think that. However, I, like most ISO 29119 opponents I suspect, am acutely aware of the limits of my knowledge and competence. Excessive reliance on checklists and standards creates the illusion, the delusion, that testers are more competent, professional and effective than they really are. I prefer a spot of realistic humility. Buttressing skill is something I can appreciate; supplanting skill is another matter altogether, and ISO 29119 goes too far in that direction.

Posted by: James Christie | November 14, 2014

Why do we think we’re different?

The longer my career lasts the more aware I am of the importance of Gerald Weinberg’s Second Law of Consulting (from his book “The Secrets of Consulting”), “No matter what the problem is, it’s always a people problem.”

The first glimmer of light that illuminated this truth was when I came across the term “goal displacement” and reflected on how many times I had seen it in action. People are given goals that aren’t quite aligned with what their work should deliver. They focus on the goals, not the real work. This isn’t just an incidental feature of working life, however. It is deeply engrained in our psychological make-up. There is a long history of academic work to explain this phenomenon.

Focal and subsidiary awareness

I’ll start with Michael Polanyi. In his book “Personal Knowledge”, Polanyi makes a distinction between focal and subsidiary awareness. Focal awareness is what we consciously think about. Subsidiary awareness is like tacit knowledge. We don’t think about the mechanics of holding a hammer to drive in a nail. We think about what we are trying to achieve. If we try to focus on the mechanics of holding the hammer correctly, and consciously aim for the nail then we are far more likely to make a painful mess of things. Focal and subsidiary awareness are therefore, in a sense, mutually exclusive. As Polanyi puts it.

“If a pianist shifts his attention from the piece he is playing to the observation of what he is doing with his fingers while he is playing it, he gets confused and may have to stop. This happens generally if we switch our focal attention to particulars of which we had previously been aware only in their subsidiary role.

Our attention can hold only one focus at a time… it would be hence contradictory to be both subsidiarily and focally aware of the same particulars at the same time.”

Does this apply in organisational life too, as well as to musicians and carpenters performing skilled physical activities? I think it does. We have often focused too closely on the process of software development and of testing and lost sight of the end we are trying to reach. Formal processes, prescriptive methods and standards encourage exactly that sort of misplaced focus.

Thorstein Veblen and trained incapacity

This problem of misplaced focus has long been observed by organisational psychologists and sociologists. A full century ago, in 1914, Thorstein Veblen identified the problem of trained incapacity.

People who are trained in specific skills tend to lose the ability to adapt. Their response has worked in the past, and they apply it regardless thereafter. They focus on responding in the way they have been trained, and cannot see that the circumstances require a different response. Their training has rendered them incapable of doing the job effectively unless it fits their mental framework. This is “trained incapacity”. They have been trained to be useless. The phenomenon applies to all workers, but especially to managers.

However, the problem that Veblen identified was worse than that. Highly specialised training and education meant that people were increasingly becoming expert in narrower fields and their areas of ignorance were increasing. When they entered the active workforce their jobs required wider skills and knowledge than their education had given them, but they were unable to contribute effectively to those other areas. They focussed on what they knew. Veblen was especially concerned about business school graduates.

“[These schools’] specialization on commerce is like other specializations in that it draws off attention and interest from other lines than those in which the specialization falls, thereby widening the candidate’s field of ignorance while it intensifies his effectiveness within his specialty. The effect, as touches the community’s interest in the matter, should be an enhancement of the candidate’s proficiency in all the futile ways and means of salesmanship and “conspiracy in restraint of trade” together with a heightened incapacity and ignorance bearing on such work as is of material use.”

A way of not seeing

In 1935 Kenneth Burke built on Veblen’s work, arguing that trained incapacity was:

“that state of affairs whereby one’s very abilities can function as blindnesses.”

People can focus on the means or the ends, not both, and their specific training in prescriptive methods or processes leads them to focus on the means. They do not even see what they are missing.

“A way of seeing is also a way of not seeing- a focus on object ‘A’ involves a neglect of object ‘B’.”

Robert Merton and goal displacement

Robert Merton made the point more explicitly in 1957 when he introduced the concept of goal displacement.

“Adherence to the rules… becomes an end in itself… Formalism, even ritualism, ensues with an unchallenged insistence upon punctilious adherence to formalised procedures. This may be exaggerated to the point where primary concern with conformity to the rules interferes with the achievement of the purposes of the organization.”

Why do we think we are different?

So the problem had been recognised before software development was even in its infancy. How did it come to be such a pervasive problem in our profession? What possible reason could there be for thinking that we are different in software development and testing? Why would we think we are immune from these problems?

I explored some of the reasons earlier this year in Teddy Bear Methods. Software development is difficult and stressful. It is tempting to seek refuge in neat, ordered structures. In that article I talked about social defences, transitional objects and how slavishly following prescriptive processes and methods can become a fetish.

Functional stupidity

However, there is an over-arching explanation: functional stupidity. This was identified by Alvesson and Spicer in a fascinating paper in the Journal of Management Studies in 2012, “A Stupidity-Based Theory of Organizations” (PDF, opens in new tab).

The concept is rather more nuanced than the headline grabbing name suggests. It is no glib piece of cod psychology; it is soundly rooted in organisational psychology and sociology and in management theory.

Organisations can function more smoothly if employees suspend their critical thinking faculties. It can actually be beneficial if they do not question the validity of management directives, if they don’t think about whether the actions they have to take are justified, and if they don’t waste cognitive effort thinking about whether their work is aligned with the objectives of the organisation.

In large organisations the goal towards which many employees are working is effectively the smooth running of the bureaucracy. Functional stupidity does help things run smoothly. It can be beneficial for compliant employees too. The people who thrive are those who play the game by the rules and don’t question whether the “game” is actually aligned with the objectives of the organisation.

However, functional stupidity is beneficial mainly in organisations operating in a fast-moving, relatively well-understood environment. In these cases fast and efficient action and reaction may be more important than reflective analysis, though even then it carries serious dangers.

On the other hand, if the environment is less well understood and there is a need to reflect and learn, then functional stupidity can be disastrous. Apart from a failure to learn from, or even detect, mistakes, functional stupidity can commit the organisation to damaging initiatives, while corroding employee morale and effectiveness. Even if the organisation as a whole might be suited to functional stupidity, there are roles where it is entirely inappropriate.

Software testing is exactly such a role. Testers must question, analyse, reflect and learn. These are all activities that functional stupidity discourages.

Management fads, lack of evidence and ISO 29119

Alvesson and Spicer refer to a further, damaging effect of functional stupidity that has particular relevance to the debate about ISO 29119. They argue that managers are prone to getting caught up in enthusiasm for unproven initiatives.

”Most managerial practices are adopted on the basis of faulty reasoning, accepted wisdom, and complete lack of evidence.

…organizations will often adopt new practices with few robust reasons beyond the fact that they make the company ‘look good’ or that ‘others are doing it’… Refraining from asking for justification beyond managerial edict, tradition or fashion, is a key aspect of functional stupidity.”

Does ISO 29119 fall into this category? Dr Stuart Reid, convener of the ISO 29119 Working Group, is a surprising source of compelling evidence to support the claim. He has conceded that there is no evidence of the standard’s efficacy and that the people who buy testing services do not understand what they are buying (see the slides from his presentation at ExpoQA14 in Madrid in May, with my added emphasis).

Yet he hopes that they will nevertheless write contracts that mandate the use of the standard (PDF, opens in new tab).

This standard will impose on testers working practices that are only loosely aligned with the real objective of testing. It will provide fertile breeding grounds for goal displacement. Will functional stupidity ease the way for ISO 29119? I fear the worst.

I asked why we think we are different in software development and testing. The question is poorly framed. It’s not that we think we are different. The problem is that we, as a global testing community, are not thinking enough. Far too many of us are simply going with the flow. Thousands have unthinkingly adopted functional stupidity as a career move. ISO 29119? That will do nicely.

“No matter what the problem is, it’s always a people problem.”

Any organisational initiative, or new methodology, or new standard that ignores that rule will not work. The lessons have been there for decades. We only have to look for them.

Posted by: James Christie | October 8, 2014

What does the audit profession have to say about testing standards?

Yesterday I attended a half day meeting of the Scottish Testing Group in Edinburgh. As ever it was an enjoyable and interesting event that allowed me to catch up with people, meet new folk and listen to interesting talks.

The first talk was by Steve Green. He started with a brisk critique of the ISO 29119 software testing standard, then followed up with his vision of how testing should be performed, based on an exploratory approach.

Steve’s objections to ISO 29119 were as follows.

  1. It is based on a flawed assumption that testing is a mature profession.
  2. ISO’s claim that consensus has been achieved is unjustified.
  3. It fosters the illusion that burying people in documentation produces “thoroughness” in testing.
  4. It is a documentation standard, not a testing standard.
  5. It will be mandated by organisations that do not understand testing; indeed it is targeted at such organisations.
  6. It misunderstands and misrepresents exploratory testing.
  7. It reinforces the factory model of testing, recommending the separation of test design from test execution, with no feedback loop.
  8. “It emphasises ass coverage over test coverage” (Michael Bolton).

After this section of the talk Steve took questions, and I joined him in answering a few. The question that intrigued me during Steve’s talk followed my observation that when you plough through the detailed guidance from ISACA (Information Systems Audit & Control Association) and the IIA (Global Institute of Internal Auditors) there is nothing whatsoever to support the use of ISO 29119, IEEE 829 or testing standards in general.

One tester asked if there was anywhere she could go that authoritatively demonstrated the lack of a link. Is there anything that maps between the various documents demonstrating that they don’t align? In effect she was asking for something that says ISO 29119 requires A, B & C, but ISACA & IIA do not require them, and say instead X, Y & Z.

There isn’t such a reference point, and that’s hardly surprising. Neither ISACA nor IIA have anything to say about the relevance of astrology to testing either. They simply ignore it. The only way to establish the lack of relevance is to plough through all the documentation and see that there is nothing to require, or even justify, the use of ISO 29119.

I’ve done that for all the ISACA and IIA material as preparation for my tutorial on auditors and testing at Eurostar last year, and I’ve just checked again prior to repeating the tutorial at Starwest next week.

However, it’s one thing sifting through all the guidance to satisfy myself and be able to state confidently that testing standards are irrelevant to auditors. It’s another matter to demonstrate that objectively to other people. It’s harder work to prove a negative than it would be to prove that there were links. It might be worthwhile taking the work I’ve already done a bit further so that testers like the one who asked yesterday’s question could say to managers “ISO 29119 is irrelevant to the auditors, and here’s the proof”. I intend to do that.

ISACA’s guidance material is particularly important because it’s all tied up in the COBIT 5 framework, which is aligned to the requirements of Sarbanes-Oxley (SOX). So if you comply with COBIT 5 then you can be extremely confident that you comply with SOX.

Maybe it’s because I was an auditor and I have an inflated sense of the importance of audit, but I think it was extremely ill-advised for ISO to develop a software testing standard while totally ignoring what ISACA were doing with COBIT 5. The ISO working group might argue that ISO 29119 is new and that the auditors have not caught up. However, the high level principles of both COBIT 5 and the IIA guidance, and the detailed recommendations within COBIT 5, are all inconsistent with the approach that ISO 29119 takes.

To quote ISACA:

“COBIT … concentrates on what should be achieved rather than how to achieve effective governance, management and control….

Implementation of good practices should be consistent with the enterprise’s governance and control framework, appropriate for the organisation, and integrated with other methods and practices that are being used. Standards and good practices are not a panacea. Their effectiveness depends on how they have been implemented and kept up to date. They are most useful when applied as a set of principles and as a starting point for tailoring specific procedures. To avoid practices becoming shelfware, management and staff should understand what to do, how to do it and why it is important.”

That quote comes from the introduction to COBIT 4.1, but the current version moves clearly in the direction signposted here.

You could comply with ISO 29119 and be compliant with COBIT 5, but why would you? You can achieve compliance with COBIT 5 without the bureaucracy and documentation implied by the standard, and astute auditors might well point out that excessive and misplaced effort was going into a process that produced little value.

It’s not that either ISACA or IIA ignore ISO standards. They refer to many of them in varying contexts. They just don’t refer to testing standards. Why? The simple answer is that ISO have taken years to produce a brand new standard that is now utterly irrelevant to testers, developers, stakeholders, and auditors.

Posted by: James Christie | September 12, 2014

ISO 29119 and “best practice”; a confused mess

In yesterday’s post about the Cynefin Framework, testing and auditing, I wrote that best practice belongs only in situations that are simple and predictable, with clear causes and effects. That aligned closely with what I’ve seen in practice and helped me to make sense of my distrust of the notion of best practice in software development. I’ve discussed that before here.

In stark contrast to the Cynefin Framework, the ISO 29119 lobby misunderstands the concept of best practice and its relevance to software development and testing.

Wikipedia defines a best practice as,

…a method or technique that has consistently shown results superior to those achieved with other means, and that is used as a benchmark.

Proponents of best practice argue that it is generally applicable. It is hard to get to grips with them in argument because they constantly shift their stance and reinterpret the meaning of words to help them evade difficult questions.

If one challenges them with examples of cases when “best practice” might not be useful they will respond that “best” does not necessarily mean best. There might indeed be situations when it doesn’t apply; there may be better choices than “best”. However, “best practice” is a valid term, they argue, even if they admit it really only means “generally the right thing to do”.

Sadly some people really do believe that when defenders of best practice say a practice is “best” they actually mean it, and write contracts or take legal action based on that naïve assumption.

This evasion and confusion is reflected in the promotion of ISO 29119. Frankly, I don’t really know whether the standard is supposed to promote best practice, because its producers don’t seem to know either. They don’t even seem clear about what best practice is.

Is ISO 29119 built on “best practice”?

These two quotes are extracts from “ISO/IEC/IEEE 29119 The New International Software Testing Standards”, written by Stuart Reid, the convener of the ISO 29119 working group. The current version of the document is dated 14th July 2014, but it dates back to last year at least.

Parts of ISO/IEC/IEEE 29119 have already been released in draft form for review (and subsequently been updated based on many thousands of comments) and are already being used within a number of multi-national organizations. These organizations are already seeing the benefits of reusing the well-defined processes and documentation provided by standards reflecting current industry best practices.

Imagine an industry where qualifications are based on accepted standards, required services are specified in contracts that reference these same standards, and best industry practices are based on the foundation of an agreed body of knowledge – this could easily be the testing industry of the near future.

This next quote comes from the ISO 29119 website itself.

A risk-based approach to testing is used throughout the standard. Risk-based testing is a best-practice approach to strategizing and managing testing…

That all seems clear enough. Standards reflect best practices. However, Stuart Reid has this to say in the YouTube video currently promoting ISO 29119 (5 minutes 25 seconds in).

A common misconception with standards is that they define best practice. This is obviously not a sensible approach as only one organisation or person can ever be best at any one time and there is only one gold medal for any Olympic event. We all know there is no one best approach to testing. It should be context driven.

If standards defined best practice then most of us would never even bother trying to achieve this unattainable goal knowing it was probably too far away from our current situation. Instead standards should define good practice with the aim of providing an achievable goal for those many organisations whose current practices are deficient in many ways.

Well, I think the world can be forgiven for the misconception that ISO 29119 is intended to define best practice since that is what the convener of the ISO 29119 working group has clearly said.

Stuart Reid repeated that last quote in Madrid in May this year, and the accompanying slide provides plenty to mull over, especially in the light of the assertion that standards are not about “best practice”.

[Slide: “quality & standards”]

The slide assumes quality improves as one moves from current practice, to good practice and on to best practice. This depiction of a neat linear progression, implied by the rising quality arrow, is flawed. It seems to be based on the assumptions that quality can be tested into products, that potential users of the standard (and every organisation is considered a potential user) are currently not even doing good things, that best practice is the preserve of an elite, and that it is by definition better than good practice.

Not only do these assumptions look indefensible to me, they are not even consistent with the words Stuart Reid uses. He says that there is no “best approach” and that testing should be “context driven”, yet uses a slide that clearly implies a standards driven progression from good to best. This is not the meaning that people usually ascribe to “best practice”, defined above. Best practice does not mean doing the best possible job in the current context regardless of what others are doing. If that were its generally accepted meaning then it would not be a controversial concept.

Muddled language, muddled thinking, or both?

It’s hard to say whether all these inconsistencies are just muddled thinking or whether ISO 29119’s defenders are starting to give way on some weak points in an attempt to evade challenge on the core message, that standards represent an accepted body of knowledge and we should buy into them. Is “best practice” now acknowledged to be an inherently unjustifiable concept in testing? Has it therefore been thrown overboard to stop it discrediting the rest of the package, and the various ISO 29119 materials have not all been brought into line yet?

The tone I infer from the various documents is a yearning to be seen as relevant and flexible. Thus we see “best practice” being suddenly disowned because it is becoming an embarrassment. The nod towards “context driven” is deeply unimpressive. It is clearly a meaningless gesture rather than evidence of any interest in real context driven testing. Trotting out the phrase glibly does not make a tester context driven and nor does it make the speaker seem any more relevant, flexible or credible.

If a standard is to mean anything then it must be clear. ISO 29119 seems hopelessly confused about best practice and how it relates to testing. Comparing the vague marketing froth on which ISO 29119 is based with the clear perspective offered by the Cynefin Framework reveals ISO 29119 as intellectually incoherent and irrelevant to the realities of development and testing.

Ironically the framework that helps us see software development as a messy, confusing affair is intellectually clear and consistent. The framework that assumes development to be a neat and orderly process is a fog of confusion. How can we take a “standard” seriously when its defenders don’t even put up a coherent defence?
