The Volkswagen emissions scandal; responsible software testing?

The scandal blows up in Volkswagen’s face

The Volkswagen emissions scandal has been all over the media worldwide since the US Environmental Protection Agency hit VW with a notice of violation on 18th September 2015.

This is a sensational story with many important and fascinating aspects, but there is one angle I haven’t seen explored. Many of the early reports focused on the so-called “defeat device” that the EPA referred to. That gave the impression the problem was a secret, discrete piece of kit hidden away in the engine. A defeat device, however, is just EPA shorthand for any illegal means of subverting its regulations; specifically, anything that alters the emissions controls in normal running, outside a test. In the VW case the device is code within the car’s engine control software that could detect the special conditions under which emissions testing is performed. This is how the EPA reported the violation in its formal notice.

“VW manufactured and installed software in the electronic control module (ECM) of these vehicles that sensed when the vehicle was being tested for compliance with EPA emission standards. For ease of reference, the EPA is calling this the ‘switch’. The ‘switch’ senses whether the vehicle is being tested or not based on various inputs including the position of the steering wheel, vehicle speed, the duration of the engine’s operation, and barometric pressure. These inputs precisely track the parameters of the federal test procedure used for emission testing for EPA certification purposes.

During EPA emission testing, the vehicles’ ECM ran software which produced compliant emission results under an ECM calibration that VW referred to as the ‘dyno calibration’ (referring to the equipment used in emissions testing, called a dynamometer). At all other times during normal vehicle operation, the ‘switch’ was activated and the vehicle ECM software ran a separate ‘road calibration’ which reduced the effectiveness of the emission control system.”

What did Volkswagen’s testers know?

What interests me about this is that the defeat device is integral to the control system (ECM); the switch has to operate as part of the normal running of the car. The software is constantly checking the car’s behaviour to establish whether it is taking part in a federal emissions test or just running about normally. The testing of this switch would therefore have been part of the testing of the ECM. There’s no question of some separate piece of kit or software overriding the ECM.
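To make that concrete, here is a grossly simplified sketch of what such a “switch” might look like, based purely on the inputs the EPA names (steering position, speed, engine run time, barometric pressure). Every function name, threshold, and calibration label below is invented for illustration; this is not VW’s actual code.

```python
def dyno_test_suspected(steering_angle_deg, speed_kph, engine_runtime_s,
                        barometric_pressure_hpa):
    """Hypothetical heuristic: does the telemetry look like the federal
    emissions test cycle rather than real-world driving?
    All thresholds are invented for illustration only."""
    # On a dynamometer the wheels turn but the steering wheel barely moves.
    wheels_turning_but_no_steering = speed_kph > 0 and abs(steering_angle_deg) < 1.0
    # Laboratory conditions: stable, near-sea-level barometric pressure.
    stable_lab_pressure = 990 <= barometric_pressure_hpa <= 1030
    # The FTP-75 test cycle runs for roughly half an hour.
    within_test_duration = engine_runtime_s < 1900
    return (wheels_turning_but_no_steering
            and stable_lab_pressure
            and within_test_duration)


def select_calibration(telemetry):
    """The crux of the EPA notice: both calibrations ship in the ECM,
    and this selection logic runs constantly during normal operation."""
    if dyno_test_suspected(**telemetry):
        return "dyno calibration"  # emissions-compliant mode
    return "road calibration"      # reduced emission controls
```

The point of the sketch is structural: the selection between calibrations executes on every drive, as ordinary ECM control flow, which is why it is hard to see how the ECM could have been tested responsibly without someone noticing it.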

This means the software testers were presumably complicit in the conspiracy. If they were not complicit then that would mean that they were unaware of the existence of the different dyno and road calibrations of the ECM. They would have been so isolated from the development and the functionality of the ECM that they couldn’t have been performing any responsible, professional testing at all.

Passing on bad news – even to the very top

That brings me to my real interest. What does responsible and professional testing mean? That is something that the broadly defined testing community hasn’t resolved. The ISTQB/ISO community and the Context Driven School have different ideas about that, and neither has got much beyond high level aspirational statements. These say what testers believe, but don’t provide guiding principles that might help them translate their beliefs into action.

Other professions, or rather serious, established professions, have such guiding principles. After working as an IT auditor I am familiar with the demands that the Institute of Internal Auditors makes on the profession. If internal auditors were to discover the existence of the defeat device then their responsibility would be clear.

Breaking the law by cheating on environmental regulation introduces huge risk to the corporation. The auditors would have to report that and escalate their concern to the Audit Committee, on which non-executive directors should sit. In the case of VW the Audit Committee is responsible for risk management and compliance. Of its four members one is a senior trade union official and another is a Swedish banker. Such external, independent scrutiny is essential for responsible corporate governance. The internal auditors are accountable to them, not the usual management hierarchy.

Of course escalation to the Audit Committee would require some serious deliberation and would be no trivial matter. It would be the nuclear option for internal auditors, but in principle their responsibility is simple and clear; they must pursue and escalate the issue or they are guilty of professional misconduct or negligence. “In principle”; that familiar phrase that is meaningless in software testing.

If internal auditors had detected the ECM defeat device they might have done so when conducting audit tests on the software as part of a risk-based audit, having decided that the regulatory implications made the ECM extremely high-risk software. However, it is far more likely that they would have discovered it after a tip-off from a whistleblower (as is often the case with serious incidents).

What is the responsibility of testers?

This takes us back to the testers. Just what was their responsibility? I know what I would have considered my moral duty as a tester, but I know that I would have left myself in a very vulnerable position if I had been a whistleblower who exposed the existence of the defeat device. As an auditor I would have felt bullet proof. That is what auditor independence means.

So what should testers do when they’re expected to be complicit in activities that are unethical or illegal or which have the whiff of negligence? Until that question is resolved and testers can point to some accepted set of guiding principles then any attempts to create testing standards or treat testing as a profession are just window dressing.

Addendum – 30th September 2015

I thought I’d add this afterthought. I want to be clear that I don’t think the answer to the problem would be to beef up the ISTQB code of ethics and enforce certification on testers. That would be a depressingly retrograde step. ISTQB lacks any clear and accepted vision of what software testing is and should be. The code of ethics is vague and inconsistent with ISTQB’s own practices. It would therefore not be in a credible position to enforce compliance, which would inevitably be selective and arbitrary.

On a more general note, I don’t think any mandatory set of principles is viable or desirable under current and foreseeable circumstances. By “mandatory” I mean principles to which testers would have to sign up and adhere to if they wanted to work as testers.

As for ISO 29119, I don’t think that it is relevant one way or another to the VW case. The testers could have complied with the standard whilst conspiring in criminal acts. That would not take a particularly imaginative form of creative compliance.

I have followed up this article with a second post, written on 7th October.


12 thoughts on “The Volkswagen emissions scandal; responsible software testing?”

  1. How would a tester who did not know the special code have found this?
    If the code was ( totally simplified ) IF temp = 57 AND steering wheel straight for 5 minutes AND GPS position not changed AND Brakes NOT applied for 8 minutes AND RPM = 3000
    Then how would this be found?

    Presumably though, there were testers who were testing that these special conditions did trigger the cheat code…
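    A quick sketch shows why blind black-box testing is so unlikely to stumble on such a trigger. The conditions below are the commenter’s invented simplification, not VW’s real logic, and the input ranges are made up for the experiment:

```python
import random

def defeat_switch(temp_c, steering_straight_min, gps_moved, brakes_off_min, rpm):
    # The commenter's simplified trigger: a narrow conjunction of conditions.
    return (temp_c == 57 and steering_straight_min >= 5 and not gps_moved
            and brakes_off_min >= 8 and rpm == 3000)

# How often does purely random testing hit that conjunction?
random.seed(0)
hits = sum(
    defeat_switch(
        random.randint(-20, 80),    # ambient temperature (deg C)
        random.randint(0, 10),      # minutes with the wheel held straight
        random.random() < 0.5,      # has the GPS position changed?
        random.randint(0, 10),      # minutes without braking
        random.randint(500, 6000),  # engine rpm
    )
    for _ in range(100_000)
)
print(hits)  # overwhelmingly likely to be 0
```

    With these made-up ranges the chance of any single random input satisfying all five conditions is roughly one in ten million, so even a hundred thousand random trials will almost certainly never fire the switch; only someone who knew the exact conditions could test it deliberately.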

  2. You must be using a new email address since your last comment, Phil. Once someone has a comment approved, future comments go live immediately. I nearly missed that I’d have to moderate your comment.

    When I was thinking about writing the article I originally intended to leave open the question of whether the testers could have been expected to know. However, when I read the EPA Notice of Violation I decided the testers should have been expected to know about it. See the second paragraph in the quote above from the EPA. There were alternative calibrations for the ECM; the normal road calibration should have been activated only when the conditions indicating an emissions test in progress were absent.

    If the testers were somehow kept so far out of the loop that they didn’t know how the ECM (the engine control software) worked then they weren’t really testing it. But then, as you say, someone must have been doing careful testing to see whether the defeat device was working, so I think my allegation stands. I’m not expecting to get sued by an irate Volkswagen!

  3. Not at all true that the testers were complicit, as rudyregner says and the article asks. We do not know yet who did know. Some former VW testers have said that this was touted internally as a “testing only” feature, present only to make testing possible earlier in the cycle and never intended to ship to customers, but that eyewitness account has not yet been verified.

    Not only has the assumption you made here, that the testers were unethical or irresponsible, been made by the media, but a 10-year-old picture of a VW quality manager who didn’t even work on these models was used for a newspaper cover story, basically harming an innocent person.

    We need to take a moment to do some root cause analysis before the witch hunting and blame happen. There is more evidence that the testers weren’t complicit than that they were. It seems more likely they were lied to. When the high emissions on the actual road were reported in 2014, VW said it was due to “unexpected road conditions”. For all we know, they reported this and it was deferred as “not a real world scenario”. Although we don’t know yet, the blame lies with those who made the decision to cheat, and it seems like those with the info might be a touch higher up in the org than the front line testers.

    • Hi Lanette – thanks for this. First, there’s no excuse for printing a photograph of the VW quality manager. That was deplorable.

      I think my presumption that the testers were complicit is justifiable, based on the EPA notification and also my understanding of complicity. I don’t believe that the testers are off the hook if they were lied to. However, I accept that my attitude might be influenced by my training as an auditor. People lied to us, misled us with partial information, or put a misleading gloss on events. It was drummed into us that we should make up our minds based on the evidence and be wary of taking assertions at face value. Perhaps that has made me more judgmental, as well as more sceptical than testers should be.

      I think the testers would have been complicit if they failed to raise, pursue and escalate concerns. I certainly don’t think they would have been in the lead. They would have been well down the chain of guilt. Complicity would mean that they went along with the conspiracy, even if they were only closing their minds to the possibility of illegal behaviour. I did think it was possible that the defeat device might have been presented to the testers as a “testing only” feature. However, that doesn’t seem to fit well with the description of the violation in the EPA notification. That suggests the defeat device was an integral part of the normal engine control software. There were different configurations for road running and for emissions testing, and both configurations were always present in the engine control software, because the engine checks on the road for the conditions indicating an emissions test. If the testers understood how the software worked then they should have understood the implications. If they didn’t understand that then it raises questions about the general level of their testing.

      Having said all that I’m wondering if I was right to use the VW case as a good example to illustrate my concern. The more difficult question is raised in my sentence “So what should testers do when they’re expected to be complicit in activities that are unethical or illegal or which have the whiff of negligence?” Illegal is easy, in principle at least. But what should the testers do if they believe there are unethical aspects to the software, or if they think there’s a suspicion of negligence? That’s harder. I believe that testers should be ready to approach auditors and compliance professionals with concerns, and the culture should encourage that. I’m not sure if I would enjoy widespread backing on that.

      • Hi James,

        I totally agree with your concern. I’ve faced issues where I was so concerned about doing harm that I actually quit my job at one point (with notice, of course) as I couldn’t live with NOT taking action. I did try to bring the facts up with proof to the right people. Unfortunately, when we keep rewarding rule-breaking, even with promotions, and none of those involved in fraud go to jail, as in the banking scandal, we are going to end up with no one left willing to take a stand. When quality is slipping to dangerous levels, yet the company is still making money, and testing is becoming a less respected and “needed” profession to the point that huge companies brag about having no one doing it, what is a workable approach?

        I had to explain at the last conference I attended that if I had a daughter I would not want her testing. Why? She can code instead and have more respect and pay with less hassle. Coding really is the only skill we value and pay for anymore on many teams, which is a dire loss of skill and experience. The bar is so low for what we expect developers to do to test their own code, it is laughable. Yet so many companies are pretending this is the way forward, that it somehow replaces having actual testing or a balanced approach. I’m not sure how to even best stand up for testing, beyond showing the value it adds and doing the work that I love when allowed to. I try to insist that my work not be used for evil, to the point that I always have savings and am willing to leave if needed. Otherwise, it isn’t our decision. Tester has become a more powerless position over time. It is so weak and submissive it is considered useless by many companies in the US.

        What is going to happen if we keep rewarding those who cheat? I wonder if it is going to end up like the school shootings in the US, where people dying from software failure is just something we shake our heads about, yet accept as normal and do nothing to change. When you talk to other testers, they are so quick to say we just provide information but do not make decisions. I think that is a poor approach in some cases. Shouldn’t we make a recommendation based on experience? It isn’t our decision alone, but to just wash your hands of it entirely? I’m not sure that is an empowered stance I’d like to take. Contributing to the decision by giving a professional suggestion makes some sense to me. It is time that cross-functional teams with varied skills were measured alongside 100% developer-tested products, so the results can be compared.

        Having used both kinds of products, those that have good code-facing developer tests along with skilled and focused testing are much better products to use. Testers need to support tested products. For example, I do not use Uber. I use Lyft instead. Better team. Better testing. Better to the drivers. Better for end users. If we as testers don’t even care about using a product that is tested, how can we expect others to? We’ve done a terrible job, as a community, of showing the value of testing and making our work visible. With such a small profile, and a culture of being absolutely submissive bootlickers, willing to do anything at all to keep a job, I really wonder if the VW testers DID bring up the issue and were assured it wasn’t in scope. Or “It works in the test”? It will be interesting to see.

  4. Hi James,

    I expected to read about the problems of using a standardized selection of tests, such as the EPA emissions tests, which are (easily) circumvented and not consistent with reality.
    Maybe a topic for part three?

    I’ve consulted at a firm that makes good money by using gray-zones of the legislation.
    Sure, there was no (direct) harm to people and/or the environment, but one could compare those ‘workarounds’ to what happened at VW.
    I have raised some of these issues to the team and management and have been overruled. They were deliberate decisions.

    Who’s to blame is probably the least interesting question. Why it happened and how we can prevent further abuse are much more interesting questions. I’m unsure; testing (by internal people) is probably not the answer. I’d start with the standardized test.

    • Hi Beren – that’s a good point. The US EPA has serious questions to answer about how it conducted its emissions testing. Independent emissions testers doing tests on the road discovered the problem. It seems reasonable to assume that the EPA should have been more aware of the limitations of its approach, which facilitated manipulation. The EPA does appear to have been naive and perhaps complacent. I would like to write about that, but it would take me out of my area of expertise and I’d have to do a lot of research. I have a long list of other topics I want to follow up first.

  5. Hi James,

    I’d like to weigh in.

    Reading this article gave me an interesting perspective:
    VW Phpunit is an extension for PHPUnit, a popular PHP unit-testing framework.[..] When the VW Phpunit extension is activated, that code will pass automatic quality tests, no matter how bad or buggy that code is. Basically, it detects when those tests are being done, and tricks them into giving a passing grade.
    “Your primary objective is to ship more code to the world. No need to be slowed down by regressions or new bugs that happen during development,” writes the VW Phpunit author.

    Taking that as a working assumption, I come to a number of conclusions.

    1. It’s entirely possible for testers to be unaware of such a feature if it’s a hidden part of the framework activated by some flag or configuration parameter.
    2. So much for automation and TDD as a replacement of skilled human testing. Automation can’t be trusted! Programmer’s self-testing adds value, but does not mitigate risks if there’s no second pair of eyes.
    3. “Tightening” testers’ code of ethics is just another layer of isolation! Where’s the programmers’ code of ethics? Or, better, a unification of all – coders, testers, .. – as developers because they are – and introducing a unified developers’ code of ethics?


  6. Hi Albert – these are certainly thought-provoking points. The VW Phpunit is a jokey stunt that does make a good point, but I’d be wary of taking it too far. As for your conclusions:

    “1. It’s entirely possible for testers to be unaware of such a feature if it’s a hidden part of the framework activated by some flag or configuration parameter.”

    Yes, I accept that as a general point, but I’m sceptical about whether such a trick would have been possible in the VW case. The defeat device wasn’t a bug, it was a planned function. The design seems to have been that the engine control software would select different calibrations depending on whether it detected that the car was on the road or hooked up to an emissions testing harness. That takes me back to the point I made in the original post: the VW testers would surely have had to be unaware of the functionality of the engine control software if they knew nothing of the defeat device, which raises different questions about a failure of the wider testing role, as I argued in my follow up.
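    As a general illustration of the kind of hidden, flag-activated behaviour Albert describes, consider this minimal sketch; the flag name, function, and limit are all invented for the example, and have nothing to do with any real framework:

```python
import os

def emissions_within_limits(nox_g_per_km):
    """Hypothetical check used by an automated quality gate."""
    # Invented 'test hook': if a hidden configuration flag is set,
    # the check silently passes regardless of the measurement.
    if os.environ.get("QUALITY_GATE_BYPASS") == "1":
        return True
    return nox_g_per_km <= 0.08  # invented limit, for illustration only
```

    A tester who never looks inside the framework, and never sees the environment in which it runs, could run this check indefinitely without suspecting the bypass exists. That is Albert’s general point; my scepticism is only about whether it fits the VW case, where the alternative calibrations were part of the ECM’s normal design.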

    “2. So much for automation and TDD as a replacement of skilled human testing. Automation can’t be trusted! Programmer’s self-testing adds value, but does not mitigate risks if there’s no second pair of eyes.”

    I’ve no argument with that.

    “3. ‘Tightening’ testers’ code of ethics is just another layer of isolation! Where’s the programmers’ code of ethics? Or, better, a unification of all – coders, testers, .. – as developers because they are – and introducing a unified developers’ code of ethics?”

    I do think testers should consider the ethics of the role, but I’m sceptical about the value of a code of ethics. My angle is slightly different from yours. My concern is more that such codes tend to be bland “motherhood and apple pie” statements that everyone can happily sign up to, secure in the knowledge they could then safely forget about as they do the real work. I’m not sure a code of ethics would do much harm. I just don’t think it would do much good. The ISTQB code isn’t really much more than window dressing.

    Your point about the lack of a code of ethics for coders is valid, but although I do agree that testers and coders are both developers, I also think that the testing role is distinct. I don’t think testers as a community are clear about that distinction. I don’t think we are clear about what we are trying do do. I shudder at the memory of the testing teams I’ve seen going through the motions, “faking the tesing” as James Bach has memorably called it. These teams were really just running through a process called testing, so that the project could plausibly say “we’ve done testing”. The idea of finding out about the product so the testers could give the stakeholders valuable information was foreign to these teams. Context driven testers think deeply about these matters, but many other testers are oblivious. I think testing lacks agreed guiding principles that would make it easier to call out testing that is heading down the fakery route. Guiding principles are a different matter from a code of ethics. I tried to develop that point in this article, see section 1 “objections based on regulatory theory and practice”.
