Testing: valuable or bullshit?

I’ve recently been thinking about automation; not specifically about test automation, but about the wider issue of machines replacing humans and how that might affect testers (Frey, C. and Osborne, M., “The Future of Employment”, Oxford University, 2013).

It started when I discussed this chart with a friend, who is a church pastor. He had spotted that there was only a probability of 0.008 that computerisation would result in job losses for the clergy in the next two decades.

I was intrigued by the list, in particular the 94% probability that robots would replace many auditors. That’s highly questionable. Auditors are now being asked to understand people and risks, and to assess corporate culture. They are moving away from the style of auditing that would have lent itself to computerisation.

Tick-and-bash compliance auditing is increasingly seen as old-fashioned and discredited. Of course, much of the auditing done prior to the financial crash was a waste of time, but the answer is to do it properly, not to replace inadequate human practices with cheap machines.

I periodically see similar claims that testing can be fully automated. The usual process is to misunderstand what a job entails, define it in a way that makes it amenable to automation, then say that automation is inevitable and desirable.

If the job of a pastor were to stand at the front of the church and read out a prepared sermon, then that could be done by a robot. However, the authors of this study correctly assumed that the job entails rather more than that.

Drill into the assumptions behind claims about automation and you will often find that they’re all hooey. Or at least that was my presumption. Confirmation bias isn’t just something that affects other people!

The future of employment

So I went off to have a look at the sources for that study. Here is the paper itself, “The future of employment” (PDF, opens in a new tab) by Frey and Osborne.

The first thing that struck me was that the table above showed only a very small selection of the jobs covered in the study. Here are the other jobs that are concerned with IT.

Job                                                                  Probability
Software developers (applications)                                      0.04
Software developers (systems software)                                  0.13
Information security analysts, web developers, network architects      0.21
Computer programmers                                                    0.48
Accountants & auditors                                                  0.94
Inspectors, testers, sorters, samplers, & weighers                      0.98

So testers are in even greater danger of being automated out of a job than auditors. You can see what’s going on. Testers have been assigned to a group that defines testing as checking (see James Bach’s and Michael Bolton’s discussion of testing and checking). Not surprisingly, the authors have then reasoned that testing can be readily automated.
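To make the checking framing concrete, here is a minimal sketch in Python. The function under test and its expected values are invented for illustration; the point is that a check compares a predefined observation with a predefined result, which is exactly the kind of task an algorithm can take over.

```python
# A "check": apply a known input, compare against a known expected output.
# `calculate_invoice_total` and its expected values are hypothetical,
# invented purely for this illustration.

def calculate_invoice_total(items, tax_rate):
    """Toy system under test: sums item prices and applies tax."""
    subtotal = sum(price for _, price in items)
    return round(subtotal * (1 + tax_rate), 2)

def check_invoice_total():
    # Mechanical, repeatable, trivially automatable.
    items = [("widget", 10.00), ("gadget", 5.50)]
    assert calculate_invoice_total(items, tax_rate=0.20) == 18.60

check_invoice_total()

# Testing, by contrast, asks questions the check cannot:
#   - Is rounding to the nearest penny acceptable to the finance team?
#   - What should happen with a negative price, or a tax rate of -1?
#   - Does "total" mean the same thing to the user as to the developer?
# None of these has a predefined expected result to assert against.
```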

There are some interesting questions raised by this list. The chances are almost 50:50 that computer programming jobs will be lost, yet 24 to 1 against applications software developers losing their jobs. Are these necessarily different jobs? Or is it just cooler to be a software developer than a computer programmer?

In fairness to the authors, they are using job categories defined by the US Department of Labor. It’s also worth explaining that the authors don’t actually refer to the probability figure as being the probability that jobs would be lost. That would have made the conclusions meaningless. How many jobs would be lost? A probability figure could apply only to a certain level of job loss, e.g. a 90% probability that 50% of the jobs would go, or a 10% probability that all jobs would be lost.

The authors are calculating the “probability of computerisation”. I think they are really using susceptibility to computerisation as a proxy for probability. That susceptibility can be inferred from the characteristics that the US Department of Labor has defined for each of these jobs.

The process of calculating the probability is summarised as follows.

“…while sophisticated algorithms and developments in MR (mobile robotics), building upon big data, now allow many non-routine tasks to be automated, occupations that involve complex perception and manipulation tasks, creative intelligence tasks, and social intelligence tasks are unlikely to be substituted by computer capital over the next decade or two. The probability of an occupation being automated can thus be described as a function of these task characteristics.”

So testing is clearly lined up with the jobs that can be automated by sophisticated algorithms that might build upon big data. It doesn’t fall in with the jobs that require complex perception, creative intelligence and social intelligence.
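To illustrate that final sentence of the quote: the paper itself fits a Gaussian process classifier to US Department of Labor job-characteristic data, but the shape of the idea can be sketched with a toy logistic model. The weights and scores below are my own inventions, not the authors’ numbers.

```python
import math

def p_computerisation(perception, creativity, social):
    """Toy stand-in for the paper's model: the more an occupation depends
    on the three 'bottleneck' task characteristics (scored here 0..1),
    the lower its susceptibility to computerisation. Weights invented."""
    bottleneck = 2.5 * perception + 3.0 * creativity + 3.0 * social
    return 1 / (1 + math.exp(bottleneck - 4.0))

# Testing defined as checking scores low on every bottleneck...
print(p_computerisation(perception=0.1, creativity=0.1, social=0.1))  # ~0.96
# ...whereas testing as a sapient, exploratory activity does not.
print(p_computerisation(perception=0.6, creativity=0.7, social=0.5))  # ~0.25
```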

Defining testing by defining the bugs that matter

Delving into the study, and its supporting sources, confirms this. The authors excitedly cite studies that have found that technology can spot bugs. Needless to say the “bugs” have been defined as those that technology can find. NB – all links are to PDFs, which open in a new tab.

Algorithms can further automatically detect bugs in software (Hangal and Lam, 2002; Livshits and Zimmermann, 2005; Kim et al., 2008), with a reliability that humans are unlikely to match. Big databases of code also offer the eventual prospect of algorithms that learn how to write programs to satisfy specifications provided by a human. Such an approach is likely to eventually improve upon human programmers, in the same way that human-written compilers eventually proved inferior to automatically optimised compilers… Such algorithmic improvements over human judgement are likely to become increasingly common.

There we have it. Testing is just checking. Algorithms are better than human judgement at performing the tasks that we’ve already framed as being more suited to algorithms. Now we can act on that and start replacing testers. Okaaay.

The supporting sources are a variety of papers outlining what are essentially tools, or possible tools, that can improve the quality of coding. A harsh verdict would be that their vision is only to improve unit testing. Note the range of dates, going back to 2002, which I think weakens the argument that one can use them to predict trends over the next two decades. If these developments are so influential, why haven’t they already started to change the world?
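For a flavour of what these papers describe, here is a minimal sketch, loosely in the spirit of dynamic invariant detection (the approach behind Hangal and Lam’s work): learn what “normal” values look like at a program point during good runs, then flag departures on later runs. Everything here is illustrative; it is not the implementation of any cited tool.

```python
class InvariantMonitor:
    """Learns a numeric range at a program point from 'good' runs, then
    reports violations on later runs. A toy illustration of dynamic
    invariant detection, not any published tool's implementation."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")
        self.training = True

    def observe(self, value):
        if self.training:
            # Training run: widen the learned invariant to cover this value.
            self.lo = min(self.lo, value)
            self.hi = max(self.hi, value)
        elif not (self.lo <= value <= self.hi):
            print(f"possible bug: {value} outside learned range "
                  f"[{self.lo}, {self.hi}]")

monitor = InvariantMonitor()
for v in (3, 7, 5):          # good runs establish the invariant 3..7
    monitor.observe(v)
monitor.training = False
monitor.observe(42)          # flagged as a candidate bug

# The catch, as argued above: this finds only the anomalies the
# instrumentation was framed to see. Whether 42 matters is a human call.
```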

However, I don’t want to go into the detail of these papers, or whether they are valid. I’m quite happy to accept that they are correct and make a useful contribution within the limits of their relevance. I do think these limits are tighter than the authors have assumed, but that’s not what concerns me.

The point is that confusing testing with checking places testers at the front of the line to be automated. If you define the bugs that matter as those that can be caught by automation then you define testing in a damaging way. That would be bad enough, but too many people in IT and the testing profession have followed policies and practices that keep testers firmly in the firing line.

There are four broad reasons for this:

  • a widely presented false choice between automated and manual testing,
  • a blindness to the value of testing that leads to value being sacrificed in attempts to cut costs,
  • testing standards, which encourage a mechanical and linear approach,
  • a general corporate willingness to create and tolerate meaningless jobs.

Automated or manual testing – a false choice

The subtext of the manual versus automated false dichotomy seems to be that manual is the poor, unprofessional relation of high quality, sophisticated automation. I wonder if part of the problem is a misplaced belief in the value of repeatability, for which CMMI has to take its full share of the blame.

The thinking goes, if something can be automated it is repeatable; it can be tailored to be precise, continually generating accurate, high quality results. Automated testing and “best practice” go hand in glove.

In contrast, manual testing seems frustratingly unpredictable. The actions of the testers are contingent. I think that is an interesting word. I like to use it in the way it is used in philosophy and logic. Certain things are not absolutely true or necessarily so; they are true or false, necessary or redundant, depending on other factors or on observation. Dictionaries offer subtly different alternative meanings. Contingent means accidental, casual or fortuitous according to dictionary.com. These are incredibly loaded words that are anathema to people trying to lever an organisation up through the CMMI levels.

I understand “contingent”, as a word and concept, as being neutral, useful and not particularly related to “repeatable”, certainly not its opposite. It is sensible and pragmatic to regard actions in testing as being contingent – it all depends. Others do regard “contingent” and “repeatable” as opposites; “contingent” then becomes evidence of chaos and unpredictability that can be cleaned up with repeatable automation.

Some people regard “it all depends” as an honest statement of uncertainty. Others regard it as a weak and evasive admission of ignorance. There has always been a destructive yearning for unattainable certainty in software development.

To be clear, I am not decrying repeatability. I mean only that putting too much emphasis on it is unhelpful. Placing too much emphasis on automation because it is repeatable, and decrying manual testing, sells true testing short. It demeans testing, and makes it more vulnerable to the prophets of full automation.

Costs chasing value downwards in a vicious cycle

There are a couple of related vicious circles. Testing that is rigid, script-driven and squeezed at the end of the project doesn’t provide much value. So unless project managers are prepared to step back and question their world view, their response, entirely rational when viewed from a narrow perspective, is to squeeze testing further.

When the standard of testing is as poor as it often is on traditional projects there is little value to be lost by squeezing testing harder because the reduced costs more than compensate for the reduced benefits. So the cycle continues.

Meanwhile, if value is low then the costs become more visible and harder to justify. There is pressure to reduce these costs, and if you’re not getting any value then just about any cost-cutting measure is going to look attractive. So we’re heading down the route of outsourcing, offshoring and the commoditization of testing. Testing is seen as an undifferentiated commodity, bought and sold on the basis of price. The inevitable pressure is to send cost and prices spiralling down to the level set by the lowest cost supplier, regardless of value.

If project managers, and their corporate masters, were prepared to liberate the testers, and ensure that they were high quality people, with highly developed skills, they could do vastly more effective and valuable work at a lower cost. But that comes back to questioning their world view. It’s always tough to make people do that when they’ve built a career on sticking to a false world view.

Standards

And now I come to the classic false world view pervading testing; the idea that it should be standardised. I have written about this before, and I’d like to quote what I wrote in my blog last November.

Standards encourage a dangerous illusion. They feed the hunger to believe, against all the evidence, that testing, and software development in general, are neat, essentially linear activities that can be rendered orderly and controllable with sufficient advance documentation. Standards feed the illusion that testing can be easier than it really is, and performed by people less skilled than are really needed.

Standards are designed to raise the status of testing. The danger is that they will have the opposite result. By focussing on aspirations towards order, repeatability and predictability, by envisaging testing as a checking exercise, the proponents of testing will inadvertently encourage others to place testing at the front of the queue for automation.

Bullshit jobs

This is probably the most controversial point. Technological innovation has created the opportunity to do things that were previously either impossible or not feasible. Modern corporations have grown so complex that merely operating the mechanics of the corporate bureaucracy has become a job in itself. Never mind what the corporation is supposed to achieve, for huge numbers of people the end towards which they are working is merely the continued running of the machine.

Put these two trends together and you get a proliferation of jobs that have no genuine value, but which are possible only because there are tools to support them. There’s no point to them. The organisation wouldn’t suffer if they were dispensed with. The possibility of rapid communication becomes the justification for rapid communication. In previous times people could have been assigned responsibility to complete a job, and left to get on with it because the technology wasn’t available to micro-manage them.

These worthless jobs have been beautifully described as “bullshit jobs” by David Graeber. His perspective is that of an anthropologist. He argues that technological progression has freed up people to do these jobs, and it has suited the power of financial capital to create jobs to keep otherwise troublesome people employed. Well, you can decide that for yourself, but I do think that these jobs are a real feature of modern life, and I firmly believe that such jobs will be early candidates for automation. If there is a nagging doubt about their value, and if they’re only possible because of technology, why not go the whole hog and automate them?

What’s that got to do with testing you may ask? Testing frequently falls into that category. Or at least testing as it is often performed; commoditized, script-driven, process-constrained testing. I’ve worked as a test manager knowing full well that my role was pointless. I wasn’t there to drive good testing. The team wasn’t being given the chance to do real testing. We were just going through the motions, and I was managing the testing process, not managing testing.

Most of my time was spent organising and writing plans that would ultimately bear little relation to the needs of the users or the testers. Then, during test execution, I would be spending all my time collating daily progress reports for more senior managers who in turn would spend all their time reading these reports, micro-managing their subordinates, and providing summaries for the next manager up the hierarchy; all pointless. No-one was prepared to admit that the “testing” was an expensive way to do nothing worthwhile. It was unthinkable to scrap testing altogether, but no-one was allowed to think their way to a model that allowed real testing.

As Graeber would put it, it was all bullshit. Heck, why not just automate the lot and get rid of these expensive wasters?

Testing isn’t meant to be easy – it’s meant to be valuable

This has been a long article, but I think the message is so important it’s worth giving some space to my arguments.

Too many people outside the testing profession think of testing as being low status checking that doesn’t provide much value. Sadly, too many people inside the profession unwittingly ally themselves with that mindset. They’d be horrified at the suggestion, but I think it’s a valid charge.

Testers should constantly fight back against attempts to define them in ways that make them susceptible to replacement by automation. They should always stress the importance of sapient, intelligent testing. It’s not easy, it’s not mechanical, it’s not something that beginners can do to an acceptable standard simply by following a standardised process.

If testers aren’t going to follow typists, telephonists and filing clerks onto the scrapheap we have to ensure that we define the profession. We must do so in such a way that no-one could seriously argue that there is a 98% chance of it being automated out of existence.

98%? We know it’s nonsense. We should be shouting that out.

9 thoughts on “Testing: valuable or bullshit?”

  1. Hi,
    I’m enjoying your blog, and this thoughtful and comprehensive explanation. I read the definitions that the government uses for these jobs, and the situation might not be as grim as it first appears.

    The government-defined “tester” role doesn’t sound like any software tester that I know. The “tester” job profile, the one with 98% susceptibility to automation, looks more like an assembly-line kind of job: “Measure dimensions of products to verify conformance to specifications, using measuring instruments such as rulers, calipers, gauges, or micrometers.”
    Link: http://www.onetonline.org/link/summary/51-9061.00

    Testers in this context inspect items against a very well defined, and objective, set of criteria. I can imagine that a software tester who exclusively executes procedures that other people developed could fall into this description. These roles do look susceptible to automation.

    The role of software testers that I’m more familiar with sounds more in line with the Software Programmer description. “Conduct trial runs of programs and software applications to be sure they will produce the desired information and that the instructions are correct.”
    Link: http://www.onetonline.org/link/summary/15-1131.00

    Most of the people that I work with are more closely aligned with the Software Developer profile. “Develop and direct software system testing and validation procedures, programming, and documentation.” I mostly work with “Quality Engineers” or “Software Developers in Test” type of profiles.
    Link: http://www.onetonline.org/link/summary/15-1132.00

  2. Thanks for your comment John. You make a good point. That job category would not have been defined with software testers in mind, and it should not apply to testing as we know it. However, testers don’t appear elsewhere. There should be a category for them, especially since there are a few for developers. It’s a pity that it uses the word “testers” when it really deals with checking.

    I do think it’s a valid concern that testers don’t have sufficient recognition of why it’s an important and difficult role that isn’t amenable to complete automation. If influential people define testing in such a way that it does fit into that Department of Labor category, then it could lead corporations to assume that testing does deserve to be automated.

  3. Pingback: Five Blogs – 10 February 2014 | 5blogs

  4. Last year I read a number of 40+ year old books about software development. One of them (“Program Test Methods”, edited by William Hetzel) referenced a series of presentations by industry, academic, and government leaders at a symposium at Chapel Hill, NC, in 1972.

    It was amusing to read how close they assumed we were, as an industry in 1972, to automating the testing of software.

    Undaunted by history or reality, this quest to eliminate manual testing and manual testers continues on – like the search for the Holy Grail, building a perpetual motion machine, or extolling the medicinal qualities of snake oil.

    Of course I’m not saying that automation or toolsmiths don’t have a place in the Software Development Lifecycle. They most certainly do. It’s the idea that manual testing’s benefit can somehow be eliminated by an expensive software program or SDETs that needs to be put to bed.

    • Thanks Mark. The history of software development is fascinating. It’s been very difficult to wean people off the assumption that it ought to be like a manufacturing process. When reality didn’t match that assumption the instinct of many people was to try to change reality rather than question their presumptions. I went into this in some detail in my masters dissertation. It was an eye opening experience tracking back through history.

  5. James,
    Yeah, pretty much Testing work is a bunch of B.S. And I say that as a 25+ yr veteran of the testing trenches, as someone who has been working with automation tools (test/defect management, test execution, performance test) for the vast majority of it.

    I know we can never replace the human in the testing equation. Automation tools, and that is all they are, only augment and aid the human in the task of testing. There are too many other parts of testing that require the human brain to be involved. I remember back in the early 90s the claims that CASE tools would make developers obsolete. Well… we know how that one turned out.

    Now we are getting the codeless/scriptless automation tool marketing hype going on. That is definitely a bunch of bullshit. All the tools do is add an abstraction layer. There is still ‘coding’ (inputting instructions) in the tools in their own languages (sounds like programming to me), and those ‘tests’ (scripts) are stored for use/reuse. Got to love the marketing people, and I thank them for keeping me employed cleaning up their messes.

    But good post, I like a lot of your points. We as testers have to realize that our job is always going to have to deal with B.S., and that others will see us as unnecessary and replaceable. That we are part of the B.S. of software development. So be it.

    • Thanks Jim. You use “bullshit” in a slightly different sense from the way I used it, but your use is equally valid in context.

      In hindsight maybe I should have incorporated your use as an additional argument and split a rather long post into a series. However, on Friday afternoon I had the time, and I was in the mood, to get it off my chest, so I just went for it.

      I replied to your comment yesterday but WordPress seems to have lost my reply. No doubt yesterday’s reply will surface from somewhere now that I have retyped my comment.

  6. Good points – well made.
    I found this great research paper that reviews the capabilities of automated accessibility testing tools: http://www.markelvigo.info/papers/w4a13t.pdf
    It shows clearly that manual testing is needed for this. We need more well structured and researched papers like this to help make the case for manual testing.
    [The URL provided no longer works. I think this is the paper. JDC 17/6/2022: https://www.researchgate.net/publication/262352732_Benchmarking_web_accessibility_evaluation_tools_Measuring_the_harm_of_sole_reliance_on_automated_tests/]
