The quality gap – part 2

In my last blog, the first of two on the theme of “The Quality Gap”, I discussed the harmful conflation of quality with requirements and argued that it was part of a mindset that hindered software development for decades.

In my studies and reading of software development and its history, I’ve come to believe that academics and industry gurus misunderstood the nature of software development, regarding it as a more precise and orderly process than it really was, or at least as potentially more precise and orderly than it could reasonably be. They saw practitioners managing projects with a structured, orderly process that superficially resembled civil engineering or construction management, and that fitted their perception of what development ought to be.

They missed the point that developers were managing projects that way because it was the simplest way to manage a chaotic and unpredictable process, and not because it was the right way to produce high quality software. The needs of project management were dictating development approaches, not the needs of software development.

The pundits drew the wrong conclusion from observing the uneasy mixture of chaos and rigid management. They decided that the chaos wasn’t the result of developers struggling to cope with the realities of software development and an inappropriate management regime; it was the result of a lack of formal methods and tools, and crucially it was also the consequence of a lack of discipline.

Teach them some discipline!

The answer wasn’t to support developers in coming to grips with the problems of development; it was to crack down on them and call for greater order and formality.

Some of the comments from the time are amusing, and highly revealing. Barry Boehm approvingly quoted a survey in 1976: “the average coder…(is)…generally introverted, sloppy, inflexible, in over his head, and undermanaged”.

Even in the 1990s Paul Ward and Ed Yourdon, two of the proponents of structured methods, were berating developers for their sins and moral failings.

Ward – “the wealth of ignorance… the lack of professional discipline among the great unwashed masses of systems developers”.

Yourdon – “the majority of software development organisations operate in a ‘Realm of Darkness’, blissfully unaware of even the rudimentary concepts of structured analysis and design”.

This was pretty rich considering the lack of theoretical and practical underpinning of structured analysis and design, as promoted by Ward and Yourdon. See this part of an article I wrote a few years ago for a fuller explanation. The whole article gives a more exhaustive argument against standards than I’m providing here.

Insulting people is never a great way to influence them, but that hardly mattered. Nobody cared much about what the developers themselves thought. Senior managers were convinced and happily signed the cheques for massive consultancy fees to apply methods built on misconceived conceptual models. These methods reinforced development practices which were damaging, certainly from the perspective of quality (and particularly usability).

Quality attributes

Now we come to the problem of quality attributes. For many years there has been a consensus that a high quality application should deliver the required levels of certain quality attributes, pretty much the same set that Glass listed in the article I referred to in part 1: reliability, modifiability, understandability, efficiency, usability, testability and portability. There is debate over the members of this set, and their relative importance, but there is agreement that these are the attributes of quality.

They are also called “non-functional requirements”. I dislike the name, but it illustrates the problem. The relentless focus of traditional, engineering-obsessed development was on the function, and thus the functional requirements, supposedly in the name of quality. Yet the very attributes that a system needed in order to be of high quality were shunted to one side in the development process and barely considered.

I have never seen these quality attributes given the attention they really require. They were often considered only as a lame afterthought and specified in such a way that testing was impossible. They were vague aspirations and lacked precision. Where there were clear criteria and targets, they could usually be assessed only after the application had been running for months, by which time the developers would have been long gone. What they did not do, or did not do effectively, was shape the design.

The quality attributes are harder to specify than functional requirements; harder, but not impossible. However, the will to specify clear and measurable quality requirements was sadly lacking. All the attention was directed at the function, a matter of logical relationships, data flows and business rules.
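To illustrate what a testable quality requirement might look like (this contrast is entirely my own invention, not taken from any real specification): “the system shall be user-friendly” is an untestable aspiration, whereas something like “a first-time user shall be able to complete an order, unaided, within three minutes, and 95% of screen transitions shall complete within two seconds under normal load” gives designers something to design to and testers something to test before the application goes live.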

The result was designs that reflected what the application was supposed to do and neglected how it would do it.

This problem was not attributable to incompetent developers and designers who failed to follow the prescribed methods properly. The problem was a consequence of the method, and one of the main reasons was the difficulty of finding the right design.

The design paradox

Traditional development, and structured methods in particular, had a fundamental problem, quite apart from the neglect of quality attributes, in trying to derive the design from the requirements. Again, that same part of my article on testing and standards explains how these methods matched the mental processes of bad designers and ignored the way that successful designers think.

It’s a paradox of the traditional approach to software development that developers did their designing both too early and too late. They subconsciously fixed on design solutions too early, when they should only have been trying to understand the users’ goals and high-level requirements. The requirements would be captured in a way that assumed and constrained the solution. The analysts and designers would then work their way through detailed requirements to a design that was not exposed to testing until it was too late to change easily, if it was possible to change it at all.

Ignoring reality

So software development, in attempting to be more like a conventional engineering discipline, was adopting the trappings of formal engineering whilst ignoring its own inability to deal with issues that a civil engineer would never dream of neglecting.

If software engineering really were closely aligned with civil engineering, it would have focussed relentlessly on practical problems. Civil engineering has to work. It is a pragmatic discipline and cannot afford to ignore practical problems. Software engineering, or rather the sellers of formal methods, could be commercially highly successful by ignoring problems and targeting their sales pitch at senior managers who didn’t understand software development, but wrote the cheques.

Civil engineering has a sound scientific and mathematical grounding. The flow from requirements to design is just that: a flow, rather than a series of jumps from hidden assumptions to arbitrary solutions.

Implicit requirements (e.g. relating to safety) in civil engineering are quite emphatically as important as those that are documented. They cannot be dismissed just because the users didn’t request them. The nature of the problem engineers are trying to solve must be understood so that the implicit requirements are exposed and addressed.

In civil engineering designs are not turned into reality before anyone is certain that they will work.

These discrepancies between software development and civil engineering have been casually ignored by the proponents of the civil engineering paradigm.

So why did the civil engineering paradigm survive so long?

There are two simple reasons for the enduring survival of this deeply flawed worldview. It was comforting, and it has been hugely profitable.

Developers had to adopt formal methods to appear professional and win business. They may not have really believed in their efficacy, but it was reassuring to be able to follow an orderly process. Even the sceptics could see the value of these methods in providing commercial advantage regardless of whether they built better applications.

The situation was summed up well by Brian Fitzgerald, back in 1995.

In fact, while methodologies may contribute little to either the process or product of systems development, they continue to be used in organisations, principally as a “comfort factor” to reassure all participants that “proper” practices are being followed in the face of the stressful complexity associated with system development.

Alternately, they are being used to legitimate the development process, perhaps to win development contracts with government agencies, or to help in the quest for ISO-certification. In this role, methodologies are more a placebo than a panacea, as developers may fall victim to goal displacement, that is, blindly and slavishly following the methodology at the expense of actual systems development. In this mode, the vital insight, sensitivity and flexibility of the developer are replaced by automatic, programmed behaviour.

The particular methodologies about which Fitzgerald was writing may now be obsolete and largely discredited, but the mindset he describes is very much alive. The desire to “legitimate the development process” is still massively influential, and it is that desire that the creators of ISO 29119 are seeking to feed, and to profit from.

However, legitimising the development process, in so far as it means anything, requires only that developers should be able to demonstrate that they are accountable for the resources they use, and that they are acting in a responsible manner, delivering applications as effectively and efficiently as they can given the nature of the task. None of that requires exhaustive, prescriptive standards. Sadly many organisations don’t realise that, and the standards lobby feeds off that ignorance.

The quality equation that Robert Glass described, and which I discussed in the first post of this short series, may be no more than a simple statement of the obvious. Software quality is not simply about complying with the requirements. That should be obvious, but it is a statement that many people refuse to acknowledge. They do not see that there is a gap between what the users expected to get, and their perceptions of the application when they get it.

It is that gap which professional testers seek to illuminate. Formal standards are complicit in obscuring it. Instead of encouraging testers and developers to understand reality, they encourage a focus on documentation and on what was expected. They reinforce assumptions rather than question them. They confuse the means with the end.

I’ll sign off with a quote that sums up the problem with testing standards. It’s from “Information Systems Development: Methods-in-Action” by Fitzgerald, Russo and Stolterman. The comment came from a developer interviewed during research for their book.

“We may be doing wrong but we’re doing so in the proper and customary manner”.


The quality gap – part 1

Lately I’ve been doing a lot of reading and thinking about standards in software testing and how they contribute towards high quality applications.

I went back to read a couple of blogs written by Robert Glass in 2001: “Quality: what a fuzzy term” and “Revisiting the definition of software quality”.

I found Glass’s articles thought provoking. They set off a chain of thought that I’ve decided to put in writing. It’s maybe a bit obscure and theoretical for many readers, but I found it helpful, so, for better or worse, I’m leaving it for posterity.

Glass’s formulation of quality

In the first article Glass explains how quality is related to several factors: user satisfaction, compliance with the user requirements, and whether the application was delivered on time and within budget.

Glass presented the relationship in the form of a simple equation. Framing his argument in this way shed light on contradictions and weaknesses in the arguments for standards and for traditional development methods.

According to Glass, user satisfaction = good quality + compliant product (i.e. compliant with the requirements) + delivery within budget/schedule.

I think it’s reasonable to infer from Glass’s articles that the factors on the right of the equation actually represent levels of satisfaction derived from those factors. Therefore, “good quality” in this equation represents the amount of satisfaction the users derive from good quality. Likewise, “compliant product” represents the satisfaction users get from a product that complies with the requirements, and “delivery within budget/schedule” is the user satisfaction from a slickly run project.

S is total user satisfaction.
Q is the satisfaction from good quality.
C is the satisfaction from compliance with requirements.
B is the satisfaction from delivery within budget/schedule.

So Glass’s equation is: S=Q+C+B.

This equation means that a given level of satisfaction can be achieved by varying any of the three elements. If the requirements are poorly defined and don’t reflect the users’ real needs, but the application matches those requirements and is delivered on budget and schedule, then user satisfaction would presumably be low; with C and B both high, the equation implies that the quality term must be low.
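To make the relationship concrete, here is a purely invented illustration; the numbers have no objective meaning, they simply show how the terms interact. Suppose each source of satisfaction is scored out of 10. A project that complies fully with flawed requirements and runs to budget might score C=9 and B=9, but if the delivered application disappoints its users then Q might be only 2, giving S = 2 + 9 + 9 = 20 out of a possible 30. The project looks compliant and well run, yet overall satisfaction is dragged down by the quality term.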

Restating Glass’s equation

For my purposes the equation is better presented as: Q=S-(C+B).

The cost and schedule factors (i.e. B) are certainly important, but they apply to the quality of the project, not the quality of the product. Returning to Glass’s equation, total user satisfaction is the sum of satisfaction from the project and satisfaction from the product. Let’s call project satisfaction S1 and product satisfaction S2.

(S1+S2)=Q+C+B

By definition S1=B. Testers might have an interest in the quality of the project, which affects their ability to do their job, but their real role is providing information on the quality of the product, so the equation we are interested in is: S2=Q+C, or Q=S2-C.

The quality of the product is therefore the gap between the users’ satisfaction with the product and the satisfaction they derive from the level of compliance with the requirements. Compliance with the requirements has effectively been discounted; it’s a given. If you deliver exactly what the users require and the users are happy with that then, in golf terms, you’d have scored level par. That’s pretty good. If your users are less happy than you’d expect from hitting the requirements, the quality score is negative. If the product pleases them more than compliance alone would suggest, you get a positive quality score.

An obvious reason for a negative score would be the example I mentioned earlier, a system that matched flawed requirements. The level of satisfaction will be less than you would expect from delivering the required solution, so the quality score will be negative. If the requirements were seriously flawed, but the solution overcomes those problems then you should get a positive score.
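Using the same invented scale as before, and again purely as an illustration: if the satisfaction users would derive from pure compliance is C=8, but their actual satisfaction with the delivered product is only S2=5, then Q = 5 - 8 = -3, worse than level par. If the design overcomes the flaws in the requirements and S2 rises to 9, then Q = 9 - 8 = +1, a positive quality score.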

Of course these equations merely illustrate the relationship. It’s not possible to substitute numbers into the equations and get an objective numerical result. Well, you could do it, but it wouldn’t mean anything objectively justifiable.

The crucial point is that if anyone wants to know what the quality of the product is, they cannot find out simply by testing against the requirements. They must test the product itself to find out what it does for the users. The requirements are obviously an important input to the testing exercise, but they cannot be regarded as the sole input.

In order to learn about the product’s quality, testers have to investigate that quality gap, the difference between what the users asked for and what they actually got. You cannot provide information about the quality of the product if you look only for confirmation that the users got what was expected. You must also take account of what the product actually does, regardless of whether it was asked for, and whether it matches the users’ needs.

Is quality “conformance to requirements”?

It was fascinating to read the comments on Robert Glass’s articles. Many of the readers refused to accept that quality can ever be any more than compliance with requirements.

Intriguingly, there were two strands of criticism that boil down to the same thing in practice. The first is that quality is conflated with compliance not because that is necessarily the correct definition, but because it is the easiest concept to cope with. The definition of quality is irrelevant; all that matters is complying with the requirements. The reasoning is that we can know only what the users tell us, so we should be judged on nothing else.

The second strand of argument was to try to justify the notion that quality really is a question of complying with requirements, not on brutally pragmatic grounds, but as a matter of principle. A few commenters referred to Philip Crosby, whose maxim was “the definition of quality is conformance to requirements”.

These comments missed the point by miles. Crosby was a passionate advocate of rigour in the whole development process. He believed that this rigour was the key, and that it was therefore possible to get the requirements right: to produce requirements that were measurable and met the real business needs. I side with those who think his work is of more relevance to conventional production and engineering than to software development, and that Tom Gilb has far more to offer us.

However, it is a travesty of Crosby’s work to suggest that he ever advocated that we should take whatever requirements the users give us and treat them as sacrosanct. That is the implication of quoting his precept “quality is conformance to requirements” without saying anything about how to ensure that the requirements are right, or even acknowledging that Crosby’s precept depended on that.

These two strands come together in practice with the same result: the requirements are blithely assumed to be correct. Any user dissatisfaction can be shrugged off: “it’s not our fault, you got what you asked for”.

Glass’s equation and analysis create a dilemma for advocates of standards and traditional scripted testing. If they want to argue that quality is the same thing as compliance with requirements, they must justify the assumption that the requirements are not only correct but also complete. Yet no requirements specification can possibly detail everything an application must not do.

The implications of Glass’s equation

If the traditionalists accept that quality is distinct from compliance with the requirements, then they have a choice. Do they try to explain how advance scripting based on requirements (or even on expectations about how the application should behave) can be effective in learning about the real behaviour of an application that is liable to do different things, and more things, than the testers expected?

Or do the traditionalists accept that it is not the testers’ job to say anything about the total quality of the application, and that they provide only a narrow subset of the possible knowledge that is available?

I don’t believe that the traditionalists have ever addressed this dilemma satisfactorily. They cannot justify the effectiveness of scripted testing, and they dare not concede that their vision of testing is connected only loosely with quality, when their whole worldview is rooted in the assumption that they are the high priests of Quality.

If documentation-heavy, scripted testing (as implied by formal testing standards such as IEEE 829) is not about quality then the whole traditional edifice comes tumbling down. The only way it can survive is by ignoring the difficult questions that Glass’s equation poses.

Quality and “engineering”

The conflation of quality with requirements is either delusional or a symptom of indifference to the real needs of users. Paradoxically, it is often portrayed as hard-nosed professionalism, part of the quest to make software development more like civil engineering.

In reality it was part of a mindset that crippled developers for many years by adopting the trappings of engineering whilst ignoring the realities of software development. This mindset spawned practices that were counter-productive in developing high quality applications. They were nevertheless pursued with a wilful blindness that regarded evidence of failure as proof that the dysfunctional methods had not been applied sufficiently rigorously.

In my next blog I will try to explain how this came about, and I will develop my argument about how it is relevant to the debate about standards.