Generative AI is making great strides almost on a weekly basis. It is becoming increasingly impressive at generating texts, images and code. At the same time, the debates surrounding such tools are becoming more heated: not only the creative process, but also entire creative professions may be in a state of upheaval. After all, the models behind these tools are fuelled by existing knowledge and works: they are trained on billions of pieces of intellectual and creative work created by humans, such as newspaper articles, scientific papers, photos and videos. So are (commercial) operators allowed to collect content online in order to train their AI models? For some time now, publishers and authors have been taking legal action to clarify this question. The issue at stake is the copyright in the materials that are incorporated into the AI models through their training.
First of all, it is important to realise what the purpose of copyright is: It turns a work (such as text, image, film, music) into a tradable economic good by excluding others from using it and allowing rights of use to be granted in return for remuneration. There is also an entitlement to the protection of personal rights, in particular the prohibition of distortion of works. At the same time, copyright law does not want to stand in the way of inspiration and innovation. The knowledge contained in a work is also free. Likewise, copyright law does not want to prevent works being created in the style of something that already exists. Only concrete works are protected, but never styles, facts or theories. Changing the law at this point would be a break with previous principles.
Comparability with analogue phenomena
How is it that the training of AI models is sometimes labelled as theft of intellectual property? After all, one could also compare it with people being inspired by ‘enjoying works’ and absorbing the knowledge of the world. In legal terms at least, however, this is something different: if works are collected en masse for the training of AI models, this constitutes reproduction. For example, before an image model is trained, the web is scraped, i.e. copies of images are made from many sources to serve as training data. The training itself consists of processing these large amounts of data, which is technically possible today.
Whether consent must be obtained for such reproductions is currently very controversial. This is because copyright law does not make certain acts of use that are in the public interest dependent on individual consent, but instead permits them by statute. This balancing of interests has long been applied, for example when a scientist copies an essay for personal reading or when records were copied onto tape for private purposes. In the context of AI training, the so-called text and data mining exception (§ 44b UrhG) is regularly invoked. According to the declared intention of the legislator, it was also created for innovations in the field of AI, and there are good reasons to conclude that it covers the training of AI models. This was also the view of the Hamburg Regional Court in September 2024, the first court in Germany to deal with the question (case no. 310 O 227/23). However, the extent to which this decision can be generalised remains to be seen, since the case in question involved an open source image model. Ultimately, only the European Court of Justice will bring clarity in Europe.
The main use case of (generative) AI, and the main question it raises, is likely to be not imitation or style copying but the fact that AI is “well-read”: it has absorbed as much knowledge and culture as possible in order to process it in interaction with users and reproduce it in a different form. As mentioned above, this actually has nothing to do with copyright and the generation of works, but rather with the acquisition and processing of information.
Finally, a rather minor question is whether AI-generated works can enjoy copyright protection. There is a broad consensus among legal scholars: works generated solely by AI do not enjoy copyright protection, because a work requires a personal intellectual creation, and this can only come from a human being. However, a distinction must be made where the AI is merely a tool and the creative contribution of the human being takes centre stage. In these cases, copyright protection does arise.
Social relevance
Whatever one’s opinion of generative AI, the underlying value judgements are of great importance, and it is to be hoped that the courts and legislators will provide clarity. Of course, this powerful technology poses huge challenges for many professions. However, it is not at all the case that generative AI automatically harms creators; it also helps many creatives in their work. Copyright law is just one of many instruments that can be used to negotiate the social consequences of generative AI and to define the conditions under which such tools are provided. The technology, which is already so powerful, will presumably not be stopped by copyright law. Finally, it should also be mentioned that copyright holders themselves can declare a reservation of use against AI training, which is binding at least for commercial AI providers.