When software is built, operated and used, the question arises as to who is responsible for potential negative consequences. This is not so much about liability: we are concerned with doing the right thing in advance, the right thing in an ethical sense. This text is part of a presentation at the Bavarian Representation in Brussels on March 24, 2021.
Prof. Dr. Alexander Pretschner
Chairman of bidt's Board of Directors and the Executive Committee | Chair of Software & Systems Engineering, Technical University of Munich | Scientific Director, fortiss
An obvious question is who makes decisions and is responsible accordingly. Society may choose to use facial recognition technology. An organisation can decide to use a particular technology-enabled application process. A company can decide to build weapons – and the software that is built today can be used as a weapon, in a very different sense of course. If an employee is not comfortable with this, i.e. if he cannot support the objective for ethical reasons, he must ultimately leave the company. Love it or leave it. A developer, however, can decide within certain framework conditions how to implement a certain functionality. A quality assurance engineer can decide which testing measures to use. And a user of the system can decide to use or not use certain features of the system. You may remember the cynical slogan of the National Rifle Association: Guns don’t kill people, people with guns kill people.
I want to focus here on the developer of software. This does not mean that the other actors are not also responsible. But it also doesn’t mean that responsibility evaporates by resting on too many shoulders. I don’t believe at all that it is acceptable to have “just done what I was told”. However, I realise that a single developer is not responsible for everything that a system they have worked on potentially does. We need to keep in mind that the business goals or mission of an organisation are not normally defined by the developer. I am concerned here with the specific, individual responsibility of the engineer. It is clear that relevant ethical considerations cannot take place in a vacuum, but must always be balanced against other objectives.
What interests us here is a specific quality of systems: Quality in the sense that no unethical consequences arise. It is not entirely clear to me how far-reaching, difficult-to-foresee consequences have to be considered here. When the software for Airbnb was written, should one have taken into account that flats in city centres might be largely occupied by tourists instead of tenants ten years later? Should anyone have foreseen that Uber would be so successful that people switching from public transport to Uber cars would increase individual traffic? Should we have foreseen that Bitcoin would eventually consume more electricity than Argentina? I don’t think so. In Germany in particular, we could perhaps sometimes afford to be a little more confident that we will manage to get such problems under control in the future. Conversely, however, engineers should perhaps not write software that recognises Uyghurs in pictures. Or defeat-device software for diesel engines. Or software that supposedly recognises a person’s sexual orientation. It’s all been done.
An interesting and perhaps surprising aspect of the quality of software products is that this quality is very difficult to measure. We can’t really say for sure whether such a product always works correctly. This does not mean that software is always of poor quality – because that is not the case. Of course we get annoyed about software errors from time to time. But if you consider how often, in comparison, software functions silently and as desired, you will admit that errors are the absolute exception. We can also analyse and evaluate the security of systems – but for very different reasons we cannot normally give any quality guarantees. By the way, the same applies to the response speed of a system, its maintainability, etc.
One reason why the quality of a software-based product is difficult to measure is the following: This product does not actually exist. By this I mean that software products today usually change continuously. Software updates take place that fix bugs and add new functionality. In the case of machine-learning-based AI, the data basis on which learning takes place is constantly evolving: Because new, more up-to-date data is collected; because we better understand what the relevant data is; and also because the world it represents is evolving. It makes little sense today, therefore, to certify a single finished product, especially when machine learning is involved.
My point is that it is difficult for us to measure the quality of a software product. That’s why certifications for software-intensive systems usually also refer to the development process, for example in the area of functional safety. I believe this is also the right starting point for requirements in the ethical area. There are already approaches that go in this direction, such as the development standard VDE-AR-E 2842-61 developed by fortiss, among others, or the IEEE’s P7000 process model, which puts ethical aspects in the foreground. By extension, one can also certify whole organisations in this way, if they always organise all their projects in a way that has been found to be good. You all know ISO 9000.
I think it is extremely important to realise that a software product never exists on its own. Its use is embedded in the business processes of an organisation. It may be operated by another organisation. And it is used by very different people. Software is always just one part of socio-technical systems. Just as security or privacy problems can never be solved with purely technical approaches, ethical values cannot be implemented through technology alone. Accompanying measures are always necessary: Development measures, organisational measures, training measures, liability measures, legal measures – and of course, in extreme cases, the decision not to do something. The context of use of software is always anticipated during development. An overview of appropriate measures under the heading of “operational AI”, which in my view leads to good proposals, can be found here.
An alternative to the certification of products would hypothetically be to certify software engineers, just as there are licensed physicians who have completed the appropriate training and have to undergo further training. Christopher Wylie, arguably one of the crucial programmers in the Cambridge Analytica scandal, has suggested something like this. I haven’t found any actual figures, but based on my own experience I would guess that fewer than 30% of the programmers I know are trained computer scientists. In view of the fact that there are too few software engineers anyway, this proposal should perhaps be reconsidered for that reason alone.
In the area of information security, there are corresponding individual certifications, which in my eyes are particularly beneficial to the purses of the certification organisations. Overall, I think that certifications of individual projects or possibly entire organisations are the better option than certifying engineers. But perhaps it is not always necessary to “certify”. Aviation, for example, relies on certification. Alternatively, it can be agreed that in the event of a claim, a manufacturer must prove that it developed according to the specifications of a standard that defines the currently best accepted development steps. If the manufacturer cannot do that, it is liable. This is what the automotive industry does with the relevant standard on functional safety. This standard, by the way, comes from the automotive industry itself and not from outside, and is quite successful overall.
My understanding is that the EU has in mind a kind of external certification for particularly critical AI applications and a kind of self-assessment for less critical applications. Maybe that’s a good idea, if you can define the boundary between “critical” and “non-critical”. It should be noted, however, that for the reasons mentioned, mandatory or voluntary “certification” must refer to the – not necessarily final – development process and, where necessary, to accompanying non-technical measures, and not to the product!
There are different ways of organising the development process for software. One idea is so-called agile processes – such as Scrum, you may have heard of that. In fact, agile is much more than a development process for software: it is a culture. A culture in which you plan very intensively in the short term, for the next two weeks, the so-called sprints. You don’t tend to plan for the long term here, because experience shows that everything always changes! Agility is a culture in which you have a functioning subsystem at all times, into which newly created functionality is immediately integrated and carefully tested. A culture in which employees are empowered to make their own decisions and take responsibility. A culture in which learning from mistakes is a central component.
For this kind of software development, we at bidt have been thinking about how ethical considerations can be incorporated. The idea is this. There are about 120 codes of ethics that describe what values should be considered when developing software-intensive systems. These are values like transparency, fairness, sustainability, etc. These values are all important, but not always helpful for the engineer, because they are simply very abstract. How are you going to incorporate “fairness” into a face recognition system?
I now believe that codes are forced to formulate only abstract values because software is extremely heterogeneous and context-dependent. Think of the software that plays a role in our interaction right now: Zoom; microphone and speaker controls; control software for transporting video and audio streams; the mobile phones on your desk; possibly your hearing aids or pacemakers or insulin pumps. These are all subject to completely different ethical requirements. That is why we should include ethical considerations in the individual development process.
We do this in our interdisciplinary approach with the EDAP scheme, which stands for Ethical Deliberation in Agile Processes. It happens once at the beginning, when we know roughly what the system should do: we think about values, about what the consequences will be if we decide on certain implementations, and about whether and how these values collide with other goals. So we weigh things up, and these values then serve as guidelines. And then in each two-week phase, the so-called sprints, we consider again and again whether and how we implement these values for the current concrete state of the system, if necessary by applying appropriate mechanisms. Roughly speaking, if an important value is transparency, then sophisticated logging technology can be used. If the value is fairness, then one of today’s roughly 25 definitions can be selected and the data relevant to machine learning can be examined accordingly. When it comes to operations that can massively interfere with the lives of individuals, we might want to install a four-eyes principle. And in concrete terms, perhaps: let’s assume that our goal is a system that automatically pre-sorts job applicants. Then we decide at the beginning that we want to be non-discriminatory and fair, among other things. In Sprint 25, a developer is then supposed to implement the application form. Now the question is how to represent gender. Man/woman? Man/woman/diverse? Is it perhaps also possible to do without a form of address altogether? All solutions have advantages and disadvantages. In any case, the mechanisms lie between a perhaps uncritical “yes” and an over-cautious “no”.
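To make the fairness mechanism a little more tangible, here is a minimal sketch of what a sprint-level check might look like for the hypothetical applicant pre-sorting system. It uses demographic parity, just one of the many competing fairness definitions mentioned above; the data layout, function names and tolerance threshold are assumptions made purely for illustration, not part of the EDAP scheme.

```python
from collections import defaultdict

def selection_rates(applicants, group_key="gender", decision_key="shortlisted"):
    """Compute the share of positive decisions per group (hypothetical data layout)."""
    totals, positives = defaultdict(int), defaultdict(int)
    for a in applicants:
        group = a[group_key]
        totals[group] += 1
        positives[group] += 1 if a[decision_key] else 0
    return {g: positives[g] / totals[g] for g in totals}

def violates_demographic_parity(applicants, tolerance=0.05):
    """Flag the pre-sorting if selection rates across groups differ by more than `tolerance`.

    Demographic parity is only one possible fairness definition; which one applies
    is a deliberation decision, not a coding one.
    """
    rates = selection_rates(applicants)
    return max(rates.values()) - min(rates.values()) > tolerance

# Example: a check like this could run in the test suite after each model update.
applicants = [
    {"gender": "f", "shortlisted": True},
    {"gender": "f", "shortlisted": False},
    {"gender": "m", "shortlisted": True},
    {"gender": "m", "shortlisted": True},
]
print(violates_demographic_parity(applicants))  # True: rates 0.5 vs. 1.0 differ by 0.5
```

In practice, the chosen definition, the groups to compare and the acceptable tolerance are exactly the kind of decisions that have to come out of the deliberation, not out of the code.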
We have very carefully structured and broken down this process of reflection, so-called deliberation, and the identification of values and mechanisms for their implementation. The reason I mentioned agility in the first place is this: It fits perfectly with this idea of ethics in software development! Short-term planning makes it possible to think about ethical aspects again and again. Incremental development makes it possible to uncover contradictions when adding new functionality, which can be fixed. And empowerment is very important: in agile, developers work in a highly creative and self-determined way and do not develop exactly what they are told. Instead, they are given a rough problem to solve – and within the freedoms allowed, adequate ethical mechanisms can be developed.
This approach has the advantage that it works both for certifying projects and organisations in critical cases and for informal self-assessment in less critical cases: For this, the deliberation steps simply have to be recorded. For certification, a baseline of criteria and perhaps value structures needs to be established in advance. Another advantage of this approach is that it works for software in general as well as for AI: Because the central challenge in my eyes is not AI and ethics, but software and ethics! What also speaks for this approach is that it is lightweight. I – and many others, for that matter – am not sure whether compulsory ethics lectures for engineers are enough to really make the world a better place. In our approach, you live ethics concretely and don’t learn it abstractly. (Of course, I am not speaking against ethics lectures and seminars for software engineers, in which I myself occasionally participate together with philosophers.) Finally, I like the fact that our approach does not unduly restrict the ability to innovate. If we were to define in advance what exactly must or must not be done for ethical reasons, we would create ever stronger framework conditions that the engineer would quickly perceive as a corset and that are detrimental to innovative strength. I like the approach because it defines ethics as a value proposition and not as a catalogue of prohibitions: We can decide on a case-by-case basis which values we want to implement and thus give expression to our European identity. And, to return to the beginning, it gives the engineer exactly the responsibility that he can and must assume within a whole chain of responsibility.
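Since the only hard requirement for the informal variant is that deliberation steps be recorded, one could imagine keeping such a record in a lightweight, machine-readable form alongside the code. The following sketch is purely illustrative: the field names and structure are my own assumptions, not a format prescribed by EDAP or any certification scheme.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DeliberationEntry:
    """One recorded deliberation step: which value, in which sprint, what was decided and why.

    Hypothetical structure for illustration; a real project would agree on its own fields.
    """
    sprint: int
    value: str              # e.g. "fairness", "transparency"
    question: str           # the concrete design question that raised the value
    options: List[str]      # alternatives that were considered
    decision: str           # the mechanism that was chosen
    rationale: str          # why the trade-off was resolved this way

log: List[DeliberationEntry] = [
    DeliberationEntry(
        sprint=25,
        value="fairness / non-discrimination",
        question="How should the application form represent gender?",
        options=["man/woman", "man/woman/diverse", "no gender field at all"],
        decision="no gender field at all",
        rationale="Gender is not needed for pre-sorting; omitting it avoids both bias and over-collection.",
    ),
]
```

A record like this makes the self-assessment auditable after the fact and gives an external certifier something concrete to check against the agreed baseline.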
At the beginning I talked about the fact that decisions are of course made within the framework of what has been decided, or is the consensus, in the respective organisation and society. Whether and which form of automatic facial recognition we allow is first of all a social question! (Interestingly, the ethical aspects of facial recognition still receive more attention internationally today than those of the recognition and synthesis of speech and text.) If such a consensus does not yet exist, the standards of an organisation and those of the individual development teams will probably apply – but these standards will often not be explicitly formulated. IT engineers can do the right thing here, weighing up all the relevant factors. They are in a powerful position vis-à-vis their employer because, in view of the shortage of skilled workers in our industry, they cannot be blackmailed at will. And conversely, this power obliges them to take responsibility and to exercise what is, perhaps surprisingly, called accountability, a term that has no German translation.
If we consider state regulation necessary, we should not underestimate the power of self-regulation in enforcing this accountability. And we should certainly also try in Europe to complement laws that restrict software and AI with laws that promote them – a consideration that arises directly from reading an overview of US legislative projects on the subject.
The guest contributions published by bidt reflect the views of the authors; they do not reflect the position of the Institute as a whole.