Service Virtualization interview about usability

Service VirtualizationThis interview with Service Virtualization appeared in January 2015. Initially when George Lawton approached me I wasn’t enthusiastic. I didn’t think I would have much to say. However, the questions set me thinking, and I felt they were relevant to my experience so I was happy to take part. It gave me something to do while I was waiting to fly back from EuroSTAR in Dublin!

How does usability relate to the notion of the purpose of a software project?

When I started in IT over 30 years ago I never heard the word usability. It was “user friendliness”, but that was just a nice thing to have. It was nice if your manager was friendly, but that was incidental to whether he was actually good at the job. Likewise, user friendliness was incidental. If everything else was ok then you could worry about that, but no-one was going to spend time or money, or sacrifice any functionality just to make the application user friendly. And what did “user friendly” mean anyway. “Who knows? Who cares? We’ve got serious work do do. Forget about that touchy feely stuff.”

The purpose of software development was to save money by automating clerical routines. Any online part of the system was a mildly anomalous relic of the past. It was just a way of getting the data into the system so the real work could be done. Ok, that’s an over-simplification, but I think there’s enough truth in it to illustrate why developers just didn’t much care about the users and their experience. Development moved on from that to changing the business, rather than merely changing the business’s bureaucracy, but it took a long time for these attitudes to shift.

The internet revolution turned everything upside down. Users are no longer employees who have to put up with whatever they’re given. They are more likely to be customers. They are ruthless and rightly so. Is your website confusing? Too slow to load? Your customers have gone to your rivals before you’ve even got anywhere near their credit card number.

The lesson that’s been getting hammered into the heads of software engineers over the last decade or so is that usability isn’t an extra. I hate the way that we traditionally called it a “non-functional requirement”, or one of the “quality criteria”. Usability is so important and integral to every product that telling developers that they’ve got to remember it is like telling drivers they’ve got to remember to use the steering wheel and the brakes. If they’re not doing these things as a matter of course they shouldn’t be allowed out in public. Usability has to be designed in from the very start. It can’t be considered separately.

What are the main problems in specifying for and designing for software usability?

Well, who’s using the application? Where are they? What is the platform? What else are they doing? Why are they using the application? Do they have an alternative to using your application, and if so, how do you keep them with yours? All these things can affect decisions you take that are going to have a massive impact on usability.

It’s payback time for software engineering. In the olden days it would have been easy to answer these questions, but we didn’t care. Now we have to care, and it’s all got horribly difficult.

These questions require serious research plus the experience and nous to make sound judgements with imperfect evidence.

In what ways do organisations lose track of the usability across the software development lifecycle?

I’ve already hinted at a major reason. Treating usability as a non-functional requirement or quality criterion is the wrong approach. That segregates the issue. It’s treated as being like the other quality criteria, the “…ities” like security, maintainability, portability, reliability. It creates the delusion that the core function is of primary importance and the other criteria can be tackled separately, even bolted on afterwards.

Lewis & Rieman came out with a great phrase fully 20 years ago to describe that mindset. They called it the peanut butter theory of usability. You built the application, and then at the end you smeared a nice interface over the top, like a layer of peanut butter (PDF, opens in new tab).

“Usability is seen as a spread that can be smeared over any design, however dreadful, with good results if the spread is thick enough. If the underlying functionality is confusing, then spread a graphical user interface on it. … If the user interface still has some problems, smear some manuals over it. If the manuals are still deficient, smear on some training which you force users to take.”

Of course they were talking specifically about the idea that usability was a matter of getting the interface right, and that it could be developed separately from the main application. However, this was an incredibly damaging fallacy amongst usability specialists in the 80s and 90s. There was a huge effort to try to justify this idea by experts like Hartson & Hix, Edmonds, and Green. Perhaps the arrival of Object Oriented technology contributed towards the confusion. A low level of coupling so that different parts of the system are independent of each other is a good thing. I wonder if that lured usability professionals into believing what they wanted to believe, that they could be independent from the grubby developers.

Usability professionals tried to persuaded themselves that they could operate a separate development lifecycle that would liberate them from the constraints and compromises that would be inevitable if they were fully integrated into development projects. The fallacy was flawed conceptually and architecturally. However, it was also a politically disastrous approach. The usability people made themselves even less visible, and were ignored at a time when they really needed to be getting more involved at the heart of the development process.

As I’ve explained, the developers were only too happy to ignore the usability people. They were following methods and lifecycles that couldn’t easily accommodate usability.

How can organisations incorporate the idea of usability engineering into the software development and testing process?

There aren’t any right answers, certainly none that will guarantee success. However, there are plenty of wrong answers. Historically in software development we’ve kidded ourselves thinking that the next fad, whether Structured Methods, Agile, CMMi or whatever, will transform us into rigorous, respected professionals who can craft high quality applications. Now some (like Structured Methods) suck, while others (like Agile) are far more positive, but the uncomfortable truth is that it’s all hard and the most important thing is our attitude. We have to acknowledge that development is inherently very difficult. Providing good UX is even harder and it’s not going to happen organically as a by-product of some over-arching transformation of the way we develop. We have to consciously work at it.

Whatever the answer is for any particular organisation it has to incorporate UX at the very heart of the process, from the start. Iteration and prototyping are both crucial. One of the few fundamental truths of development is that users can’t know what they want and like till they’ve seen what is possible and what might be provided.

Even before the first build there should have been some attempt to understand the users and how they might be using the proposed product. There should be walkthroughs of the proposed design. It’s important to get UX professionals involved, if at all possible. I think developers have advanced to the point that they are less likely to get it horribly wrong, but actually getting it right, and delivering good UX is asking too much. For that I think you need the professionals.

I do think that Agile is much better suited to producing good UX than traditional methods, but there are still dangers. A big one is that many Agile developers are understandably sceptical about anything that smells of Big Up-Front Analysis and Design. It’s possible to strike a balance and learn about your users and their needs without committing to detailed functional requirements and design.

How can usability relate to the notion of testable hypothesis that can lead to better software?

Usability and testability go together naturally. They’re also consistent with good development practice. I’ve worked on, or closely observed, many applications where the design had been fixed and the build had been completed before anyone realised that there were serious usability problems, or that it would be extremely difficult to detect and isolate defects, or that there would be serious performance issues arising from the architectural choices that had been made.

We need to learn from work that’s been done with complexity theory and organisation theory. Developing software is mostly a complex activity, in the sense that there are rarely predictable causes and effects. Good outcomes emerge from trialling possible solutions. These possibilities aren’t just guesswork. They’re based on experience, skill, knowledge of the users. But that initial knowledge can’t tell you the solution, because trying different options changes your understanding of the problem. Indeed it changes the problem. The trials give you more knowledge about what will work. So you have to create further opportunities that will allow you to exploit that knowledge. It’s a delusion that you can get it right first time just by running through a sequential process. It would help if people thought of good software as being grown rather than built.

“Fix on failure” – a failure to understand failure

Wikipedia is a source that should always be treated with extreme scepticism and the article on the “Year 2000 problem” is a good example. It is now being widely quoted on the subject, even though it contains some assertions that are either clearly wrong, or implausible, and lacking any supporting evidence.

Since I wrote about ”Y2K – why I know it was a real problem” last week I’ve been doing more reading around the subject. I’ve been struck by how often I’ve come across arguments, or rather assertions, that a “fix on failure” response would have been the best response. Those who argue that Y2K was a big scare and a scam usually offer a rewording of this gem from the Wikipedia article.

”Others have claimed that there were no, or very few, critical problems to begin with, and that correcting the few minor mistakes as they occurred, the “fix on failure” approach, would have been the most efficient and cost-effective way to solve the problem.”

There is nothing to back up these remarkable claims, but Wikipedia now seems to be regarded as an authoritative source on Y2K. The first objection to these assertions is that the problems that occurred tell us nothing about those that were prevented. At the site where I worked as a test manager we triaged our work so that important problems were fixed in advance and trivial ones were left to be fixed when they occurred. So using the problems that did occur to justify “fix on failure” for all problems is a facile argument at best.

However, my objection to “fix on failure” runs deeper than that. The assertion that “fix on failure was the right approach for everything is infantile. Infantile? Yes, I use the word carefully. It ignores big practical problems that would have been obvious to anyone with experience of developing and supporting large, complicated applications. Perhaps worse, it betrays a dangerously naive understanding of “failure”, a misunderstanding that it shares with powerful people in software testing nowadays. Ok, I’m talking about the standards lobby there.

”Fix on failure” – deliberate negligence

Firstly, “fix on failure” doesn’t allow for the seriousness of the failure. As Larry Burkett wrote;

“It is the same mindset that believes it is better to put an ambulance at the bottom of a cliff rather than a guardrail at the top”.

“Fix on failure” could have been justified only if the problems were few and minor. That is a contentious assumption that has to be justified. However, the only justification on offer is that those problems which occurred would have been suitable for “fix on failure”. It is a circular argument lacking evidence or credibility, and crucially ignores all the serious problems that were prevented.

Once one acknowledges that there were a huge number of problems to be fixed one has to deal with the practical consequences of “fix on failure”. That approach does not allow for the difficulty of managing masses of simultaneous failures. These failures might not have been individually serious, but the accumulation might have been crippling. It would have been impossible to fix them all within acceptable timescales. There would have been insufficient staff to do the work in time.

Release and configuration management would have posed massive problems. If anyone tells you Y2K was a scam ask them how they would have handled configuration and release management when many interfacing applications were experiencing simultaneous problems. If they don’t know what you are talking about then they don’t know what they are talking about.

Of course not all Y2K problems would have occurred on 1st January 2000. Financial applications in particular would have been affected at various points in 1999 and even earlier. That doesn’t affect my point, however. There might have been a range of critical dates across the whole economy, but for any individual organisation there would have been relatively few, each of which would have brought a massive, urgent workload.

Attempting to treat Y2K problems as if they were run of the mill, “business as usual” problems, as advocated by sceptics, betrays appalling ignorance of how a big IT shop works. They are staffed and prepared to cope with a relatively modest level of errors and enhancements in their applications. The developers who support applications aren’t readily inter-changeable. They’re not fungible burger flippers. Supporting a big complicated application requires extensive experience with that application. Staff have to be rotated in and out carefully and piecemeal so that a core of deep experience remains.

IT installations couldn’t have coped with Y2K problems in the normal course of events any more than garages could cope if all cars started to have problems. The Ford workshops would be overwhelmed when the Fords started breaking down, the Toyota dealers would seize up when the Toyotas suffered.

The idea that “fix on failure” was a generally feasible and responsible approach simply doesn’t withstand scrutiny. Code that wasn’t Y2K-compliant could be spotted at a glance. It was then possible to predict the type of error that might arise, if not always the exact consequences. Why on earth would anyone wait to see if one could detect obscure, but potentially serious distortions? Why would anyone wait to let unfortunate citizens suffer or angry customers complain?

The Y2K sceptics argue that organisations took expensive pre-emptive action because they were scared of being sued. Well, yes, that’s true, and it was responsible. The sceptics were advocating a policy of conscious, deliberate negligence. The legal consequences would quite rightly have been appalling. “Fix on failure” was never a serious contribution to the debate.

”Fix on failure” – a childlike view of failure

The practical objections to a “fix on failure” strategy were all hugely significant. However, I have a deeper, fundamental objection. “Fix on failure” is a wholly misguided notion for anything but simple applications. It is based on a childlike, binary view of failure. We are supposed to believe an application is either right or wrong; it is working or it is broken; that if there is a Y2K problem then the application obligingly falls over. Really? That is not my experience.

With complicated financial applications an honest and constructive answer to the question “is the application correct?” would be some variant on “what do you mean by correct?”, or “I don’t know. It depends”. It might be possible to say the application is definitely not correct if it is producing obvious garbage. But the real difficulty is distinguishing between the seriously inaccurate, but plausible, and the acceptably accurate. Discussion of accuracy requires understanding of critical assumptions, acceptable margins of error, confidence levels, the nature and availability of oracles, and the business context of the application.

I’ve never seen any discussion of Y2K by one of the “sceptical” conspiracy theorists that showed any awareness of these factors. There is just the naïve assumption that a “failed” application is like a patient in a doctor’s surgery, saying “I’m sick, and here are my symptons”.

Complicated applications have to be nursed and constantly monitored to detect whether some new, extraneous factor, or long hidden bug, is skewing the figures. A failing application might appear to be working as normal, but it would be gradually introducing distortions.

Testing highly complicated applications is not a simple, binary exercise of determining “pass or fail”. Testing has to be a process of learning about the application and offering an informed opinion about what it is, and what it does. That is very different from checking it against our preconceptions, which might have been seriously flawed. Determining accuracy is more a matter of judgement than inspection.

Throughout my career I have seen failures and problems of all types, with many different causes. However, if there is a single common underlying theme then the best candidate would be the illusion that development is like manufacturing, with a predictable end product that can be checked. The whole development and testing process is then distorted to try and fit the illusion.

The advocates of Y2K “fix on failure” had much in common with the ISO 29119 standards lobby. Both shared that “manufacturing” mindset, that unwillingness to recognise the complexity of development, and the difficulty of performing good, effective testing. Both looked for certainty and simplicity where it was not available.

Good testers know that an application is not necessarily “correct” just because it has passed the checks on the test script. Likewise failure is not an absolute concept. Ignoring these truths is ignoring reality, trying to redefine it so we can adopt practices that seem more efficient and effective. I suspect the mantra that “fix on failure would have been more effective and efficient” has its roots with economists, like the Australian Quiggin, who wanted to assume complexity away. See this poor paper (PDF, opens in a new tab).

Doing the wrong thing is never effective. Negligence is rarely efficient. Reality is uncomfortable. We have to understand that and know what we are talking about before coming up with simplistic, snake-oil solutions that assume simplicity where the reality is complexity.

Y2K – why I know it was a real problem

It’s confession time. I was a Y2K test manager for IBM. As far as some people are concerned that means I was party to a huge scam that allowed IT companies to make billions out of spooking poor deluded politicians and the public at large. However, my role in Y2K means I know what I am talking about, so when I saw some recent comment that it was all nothing more than hype I felt the need to set down my first hand experience. At the time, and in the immediate aftermath of Y2K, we were constrained by client confidentiality from explaining what we did, but 15 years on I feel comfortable about speaking out.

Was there a huge amount of hype? Unquestionably.

Was money wasted? Certainly, but show me huge IT programmes where that hasn’t happened.

Would it have been better to do nothing and adopt a “fix on failure” approach? No, emphatically not as a general rule and I will explain why.

There has been a remarkable lack of studies of Y2K and the effectiveness of the actions that were taken to mitigate the problem. The field has been left to those who saw few serious incidents and concluded that this must mean there could have been no serious problem to start with.

The logic runs as follows. Action was taken in an attempt to turn outcome X into outcome Y. The outcome was Y. Therefore X would not have happened anyway and the action was pointless. The fallacy is so obvious it hardly needs pointing out. If action was pointless then the critics have to demonstrate why the action that was taken had no impact and why outcome Y would have happened regardless. In all the years since 2000 I have seen only unsubstantiated assertion and reference to those countries, industries and sectors where Y2K was not going to be a signficant problem anyway. The critics always ignore the sectors where there would have been massive damage.

An academic’s flawed perspective

This quote from Anthony Finkelstein, professor of software systems engineering at University College London, on the BBC website, is typical of the critics’ reasoning.

”The reaction to what happened was that of a tiger repellent salesman in Golders Green High Street,” says Finkelstein. ‘No-one who bought my tiger repellent has been hurt. Had it not been for my foresight, they would have.’ “

The analogy is presumably flippant and it is entirely fatuous. There were no tigers roaming the streets of suburban London. There were very significant problems with computer systems. Professor Finkelstein also used the analogy back in 2000 (PDF, opens in new tab).

In that paper he made a point that revealed he had little understanding of how dates were being processed in commercial systems.

”In the period leading up to January 1st those who had made dire predictions of catastrophe proved amazingly unwilling to adjust their views in the face of what was actually happening. A good example of this was September 9th 1999 (9/9/99). On this date data marked “never to expire” (realised as expiry 9999) would be deleted bringing major problems. This was supposed to be a pre-shock that would prepare the way for the disaster of January 1st. Nothing happened. Now, if you regarded the problem as a serious threat in the first place, this should surely have acted as a spur to some serious rethinking. It did not.”

I have never seen a date stored in the way Finkelstein describes, 9th September 1999 being held as 9999. If that were done there would be no way to distinguish 1st December 2014 from 11th February 2014. Both would be 1122014. Dates are held either in the form 090999, with leading zeroes so the dates can be interpreted correctly, or with days, months and years in separate sub-fields for simpler processing. Programmers who flooded date fields with the integer 9 would have created 99/99/99, which could obviously not be interpreted as 9th September 1999.

Anyway, the main language of affected applications was Cobol, and the convention was for programmers to move “high values”, i.e. the highest possible value the compiler could handle, into the field rather than nines. “High values” doesn’t translate into any date. Why doesn’t Finkelstein know this sort of basic thing if he’s setting himself up as a Y2K expert? I never heard any concern about 9/9/99 at the time, and it certainly never featured in our planning or work. It is a straw man, quite irrelevant to the main issue.

In the same paper from 2000 Finkelstein made another claim that revealed his lack of understanding of what had actually been happening.

September 9th 1999 is only an example. Similar signs should have been evident on January 1st 1999, the beginning of the financial year 99-00, December 1st, and so on. Indeed assuming, as was frequently stated, poor progress had been made on Y2K compliance programmes we would have anticipated that such early problems would be common and severe. I see no reason to suppose that problems should not have been more frequent (or at any rate as frequent) in the period leading up to December 31st 1999 than afterwards given that transactions started in 1999 may complete in 2000, while after January 1st new transactions start and finish in the new millennium.

Finkelstein is entirely correct that the problem would not have suddenly manifested itself in January 2000, but he writes as if this is an insight the practitioners lacked at the front line. At General Accident the first critical date that we had to hit was the middle of October 1998, when renewal invitations for the first annual insurance contracts extending past December 1999 would be issued. At various points over the next 18 months until the spring of 2000 all the other applications would hit their trigger dates. Everything of significance had been fixed, tested and re-implemented by September 1999.

We knew that timetable because it was our job to know it. We were in trouble not because time was running out till 31/12/1999, but because we had little time before 15/10/1998. We made sure we did the right work at the right time so that all of the business critical applications were fixed in time. Finkelstein seems unaware of what was happening. A massed army of technical staff were dealing with a succession of large waves sweeping towards them over a long period, rather than a single tsunami at the millennium.

Academics like Finkelstein have a deep understanding of the technology and how it can, and should be used, but this is a different matter from knowing how it is being applied by practitioners acting under extreme pressure in messy and complex environments. These practitioners aren’t doing a bad job because of difficult conditions, lack of knowledge and insufficient expertise. They are usually doing a good job, despite those difficult conditions, drawing on vast experience and deep technical knowledge.

Comments such as those of Professor Finkelstein betray a lack of respect for practitioners, as if the only worthwhile knowledge is that possessed by academics.

What I did in the great Y2K “scare”

Let me tell you why I was recruited as a Y2K test manager by IBM. I had worked as a computer auditor for General Accident. A vital aspect of that role had been to understand how all the different business critical applications fitted together, so that we could provide an overview to the business. We could advise on the implications and risks of amending applications, or building new ones to interface with the existing applications.

A primary source - my report explaining the problem with a business critical application

A primary source – my report explaining the problem with a business critical application

Shortly before General Accident’s Y2K programme kicked off I was transferred to IBM under an outsourcing deal. General Accident wanted a review performed of a vital back office insurance claims system. The review had to establish whether the application should be replaced before Y2K, or converted. Senior management asked IBM that I should perform the review because I was considered the person with the deepest understanding of the business and technical issues. The review was extremely urgent, but it was delayed by a month till I had finished my previous project.

I explained in the review exactly why the system was business critical and how it was vital to the company’s reserving, and therefore the production of the company accounts. I explained how the processing was all date dependent, and showed how and when it would fail. If the system was unavailable then the accountants and premium setters would be flying blind, and the external auditors would be unable to sign off the company accounts. The risks involved in trying to replace the application in the available time were unacceptable. The best option was therefore to make the application Y2K compliant. This advice was accepted.

As soon as I’d completed the review IBM moved me into a test management position on Y2K, precisely because I had all the business and technical experience to understand how eveything fitted together, and what the implications of Y2K would be. The first thing I did was to write a suite of SAS programs that crawled through the production code libraries, job schedules and job control language libraries to track the relationship between programs, data and schedules. For the first time we had a good understanding of the inventory, and which assets depended on each other. Although I was nominally only the test manager I drew up the conversion strategy and timetable for all the applications within my remit, based on my accumulated experience and the new knowledge we’d derived from the inventory.

An insurance company’s processing is heavily date dependent. Premiums are earned on a daily basis, with the appropriate proportion being refunded if a policy is cancelled mid-term. Claims are paid only if the appropriate cover is in place on the date that the incident occurred. Income and expenditure might be paid on a certain date, but then spread over many years. If the date processing doesn’t work then the company can’t take in money, or pay it out. It cannot survive. The processing is so complex that individual errors in production often require lengthy investigation and fixing, and then careful testing. The notion that a “fix on failure” response to Y2K would have worked is risible.

We fixed the applications, taking a careful, triaged risk-based approach. The most date sensitive programs within the most critical applications received the most attention. Some applications were triaged out of sight. For these, “fix on failure” was appropriate.

We tested the converted applications in simulated runs across the end of 1999, in 2000 and again in 2004. These simulations exposed many more problems not just with our code, but also with all the utility and housekeeping routines and tools. In these test runs we overrode the mainframe system date within the test runs.

In the final stage of testing we went a step further. We booted up a mainframe LPAR (logical partition) to run with the future dates. I managed this exercise. We had a corner of the office with a sign saying “you are now entering 2000”, and everything was done with future dates. This exercise flagged up further problems with code that we had been confident would run smoothly.

December 19th 1999, Mary, her brother Malcolm & I in the snow. Not panicking about Y2K.

December 19th 1999, Mary, her brother Malcolm & I in the snow. Not panicking much about Y2K.

Y2K was a fascinating time in my career because I was at a point that I now recognise as a sweet spot. I was still sufficiently technically skilled to do anything that my team members could do, even being called on to fix overnight production problems. However, I was sufficiently confident, experienced and senior to be able to give presentations to the most senior managers explaining problems and what the appropriate solutions would be.

For these reasons I know what I’m talking about when I write that Y2K was a huge problem that had to be tackled. The UK’s financial sector would have suffered a massive blow if we had not fixed the problem. I can’t say how widespread the damage might have been, but I do know it would have been appalling.

My personal millennium experience

When I finished with Y2K in September 1999, at the end of the future mainframe exercise, at the end of a hugely pressurised 30 months, I negotiated seven weeks leave and took off to Peru. IBM could be a great employer at times! My job was done, and I knew that General Accident, or CGU as it had evolved into by then, would be okay. There would inevitably be a few glitches, but then there always are in IT.

What was on my mind on 31st December 1999

What was on my mind on 31st December 1999

I was so relaxed about Y2K that on my return from Peru it was the least of my concerns. There was much more interesting stuff going on in my life. I got engaged in December 1999, and on 31st December Mary and I bought our engagement and wedding rings. That night we were at a wonderful party with our friends, and after midnight we were on Perth’s North Inch to watch the most spectacular fireworks display I’ve ever seen. 1st January 2000? It was a great day that I’ll always remember happily. It was far from being a disaster, and that was thanks to people like me.

PS – I have written a follow up article explaining why “fix on failure” was based on an infantile view of software failure.