Automated journalism (also known as algorithmic journalism or robot journalism) refers to the fully automated production of journalistic texts based on human instructions. In some cases, these terms are also used when algorithms take over only certain work steps, such as formulation (Mooshammer 2022). Here, the term means the complete automation of all sub-steps. The idea of using machines to generate texts for journalistic purposes is not fundamentally new (Bernhart/Richter 2021), but its use has increased significantly since the turn of the millennium. The product, producers, legal framework and business model are presented below.
Diverse products
On the one hand, algorithms are replacing articles written by humans; on the other hand, they supplement human journalism by producing genuinely new goods. Because marginal costs are low, algorithms can be used to write articles for small target groups, such as reports on clubs in the local league. Chatbots that respond to user enquiries, such as Hey_ or Fragen Sie ZEIT ONLINE, are another option. However, the technology is rarely used for topics that lack a digital data basis, or for research-intensive or opinionated formats such as editorials. The use of algorithms therefore complements the range of journalistic offerings rather than replacing it (Dörr 2023, p. 210). In contrast to previous digitisation projects in journalism, digital technology is not used here to expand the distribution or presentation of human-generated products (Weber/Steffl/Buschow 2021), but to create a genuinely new digital product.
Producers of algorithmic journalism
The software for automated journalism mainly comes from non-journalistic actors with a background in information technology. It is used not only by established media organisations but also by new players, including the operators of online portals such as NewsGPT. This continues a trend that has been ongoing for some time: new, not necessarily journalistic competitors are entering the market through the use of digital technology.
Legal framework
One obstacle to market development is legal uncertainty. The right to exclusive exploitation of their intellectual property, such as their texts, is essential for market participants. Automated journalism challenges this right in several ways. Firstly, the most widely used software, ChatGPT, was trained with data to which third parties, including prominent media companies from the USA, have asserted claims. These companies accuse the provider OpenAI of having used their content unlawfully and without payment. Secondly, European data protection organisations, for example, are examining complaints that OpenAI circumvents data protection rules, and thus violates personal rights, by processing sensitive data. The software as a means of production could therefore still be legally restricted. Thirdly, the software can be used to quickly reproduce content and copy styles. Competitors have accused CNET of exploiting their intellectual property in this way. The issue of labelling is a particular focus of public attention. There is no law on this, and the Press Council's press code does not include a labelling requirement (see Deutscher Presserat 2023). Previous unlabelled uses have been uncovered either by competitors or by monitoring organisations (Süddeutsche Zeitung 2023). However, the Council of Europe's "Guidelines on the responsible implementation of artificial intelligence systems in journalism" call for transparency, and many publishers have published publicly accessible guidelines on their use of artificial intelligence. So far, the use of artificial intelligence has therefore been only lightly regulated, and legal restrictions and obligations are currently being negotiated.
Business model
The market for journalistic services is characterised by competition on price and quality. Software costs were initially high for deterministic systems, but have fallen significantly with generative technology. In deterministic systems, people use instructions to tell the system in detail how the data should be processed. Generative systems automatically derive the processing steps from sample data (machine learning). Humans can then no longer trace how the algorithm works, and the same input can lead to different outputs. Furthermore, although investments in the data and in programming the technology are required, the subsequent marginal costs of creating individual articles are low. The economies of scale that characterise the digital economy can be observed here: once enough articles are produced, automated journalism is more cost-effective than human-made journalism.
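The deterministic case described above can be illustrated with a minimal sketch. The data fields, team names and the function `write_match_report` are hypothetical and chosen only for illustration; the sketch does not reproduce any particular vendor's software. Every rule that turns data into text is written by a human and can be read in the code, so the same input always produces the same sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchResult:
    """Hypothetical structured input, e.g. one row of a local league results feed."""
    home: str
    away: str
    home_goals: int
    away_goals: int


def write_match_report(result: MatchResult) -> str:
    """Turn one match result into a sentence using fixed, human-written rules."""
    diff = result.home_goals - result.away_goals
    # The score is always written with the higher number first.
    high = max(result.home_goals, result.away_goals)
    low = min(result.home_goals, result.away_goals)
    score = f"{high}:{low}"

    if diff == 0:
        return f"{result.home} and {result.away} drew {score}."

    if diff > 0:
        winner, loser = result.home, result.away
    else:
        winner, loser = result.away, result.home

    # Illustrative wording rule: a margin of three or more goals counts as a rout.
    verb = "thrashed" if abs(diff) >= 3 else "beat"
    return f"{winner} {verb} {loser} {score}."


if __name__ == "__main__":
    print(write_match_report(MatchResult("SV Example", "FC Sample", 4, 1)))
```

A generative system would replace these hand-written rules with patterns derived from sample data, which is why its processing cannot be inspected in the same way and why identical input may yield different text.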
Judgements of quality differ. While algorithms are said to write faster and to process large amounts of data with fewer errors, humans are said to write more creatively and elegantly and to be able to make conscious value judgements (cf. Dörr 2023, p. 205). However, the technology continues to advance in this area, with some systems now even able to recognise sarcasm (Băroiu/Trăuşan-Matu 2023).
In addition to the legal uncertainty, it is unclear to what extent revenue can be generated from automated articles. Although recipients recognise the qualities of algorithms, they are critical of the technology, especially in the high-price segment (Meinungsmonitor KI 2022; Wellbrock/Buschow 2020). Haim et al. (2018), for example, therefore consider it likely that automated journalism will be monetised via the advertising market rather than via subscriptions. Algorithms are used to create large volumes of search engine-optimised content, which in total attract more clicks and thus increase advertising revenue. However, these findings are snapshots of a customer base that still has little personal experience with, knowledge of and trust in generative AI, even if there are slight socio-economic differences (Kero/Akyürek/Flaßhoff 2023). The extent to which automated journalism will generate more revenue and new sources of income will depend on recipients' judgement and their assessment of quality.
Niche formation on the provider side
Three strategies for dealing with automated journalism can currently be observed on the market. These are embedded in a fundamental economic crisis in journalism: income from the advertising market is falling, it is becoming more difficult to differentiate journalistic content and, most recently, distribution costs for print content have risen sharply due to the minimum wage, rural isolation and high paper costs. When it comes to the question of how to deal with automated journalism, economic constraints are therefore relevant in addition to considerations of quality. Firstly, some, predominantly new, providers rely entirely on automation and provide content free of charge (e.g. NewsGPT). Secondly, another group uses the technology only for specific purposes and under human supervision. In a survey conducted by BDZV (2024), 63 per cent of the newspaper publishers surveyed stated that they pursue this strategy. Thirdly, a niche is formed by established journalistic players who refuse to use the technology at all. In the BDZV survey, this applied to 36 per cent of respondents.
Comparability with analogue phenomena
In terms of results, human editors and algorithms can be compared, although they differ considerably in the production process. Algorithms can only process information that is available as digital data. The amount and availability of such data is increasing enormously, but information that has not been captured as data, such as human nuances, still cannot be processed. Furthermore, algorithms cannot establish a link between the data and what the data is supposed to represent, so they cannot validate or criticise the data in this respect. As a result, social inequalities inscribed in historically collected data are not recognised, reflected on or corrected by the algorithm (Barocas/Selbst 2016). In addition, data can only be assessed statistically; an evaluation based on values or subjective relevance structures is not possible. This also harbours potential, as it allows discriminatory learned heuristics and evaluation schemes to be overcome. There is also a difference in transparency: in non-deterministic systems, the prompt and, in some cases, the data basis of an article can be shown, but the processing remains opaque. In deterministic systems, the processing rules are visible because they are written in the code. Humans can also disclose their sources and their processing, but reach their limits with common knowledge and unconsciously absorbed information. Finally, systems based on machine learning can hallucinate, that is, produce factually incorrect information.
Social relevance
Journalism is (still) relevant because people essentially receive information about their environment via the media (Luhmann 1998). Firstly, in the absence of verification, there is an acute risk of falsehoods being spread due to incorrect data or hallucinations. Algorithms then become fake news machines. Secondly, there is competition from automated providers that produce free content but deviate from established ethical and journalistic quality standards. NewsGPT, for example, explicitly accepts no responsibility for the accuracy of its information (NewsGPT 2024). Competitors that value established ethical standards, on the other hand, will probably only be able to survive if consumers are willing to (continue to) pay for the quality of this content. Thirdly, there are effects on the journalistic field itself. The profile of competences expected of journalists is already changing and, alongside further training, there are also layoffs. Displacement effects can thus be observed, regardless of how the total number of jobs develops.
Sources
- Barocas, S./Selbst, A. (2016): Big data's disparate impact. In: Calif. Law Rev. 104: 671.
- Băroiu, A./Trăuşan-Matu, Ş. (2023): How capable are state-of-the-art language models to cope with sarcasm? 24th International Conference on Control Systems and Computer Science (CSCS). Bucharest, Romania, 399–402. https://doi.org/10.1109/CSCS59211.2023.00069 [01.07.2024].
- BDZV/Highberg (2024): Trends der Zeitungsbranche [01.07.2024].
- Bernhart, T./Richter, S. (2021): Frühe digitale Poesie. In: Informatik-Spektrum 44(1), 11–18. https://doi.org/10.1007/s00287-021-01329-z [01.07.2024].
- BILD. Hey_Ihr Helfer mit KI. https://hey.bild.de/ [01.07.2024].
- Deutscher Presserat (2023): Jahresbericht 2023. [02.07.2024].
- Dörr, K. (2023): Algorithmische Werkzeuge–Chancen und Herausforderungen für den Journalismus. Journalismusforschung. Nomos Verlagsgesellschaft mbH & Co. KG: 210.
- Haim, M./Graefe, A./Brosius, H.-B. (2018): Wertschöpfung mithilfe von Algorithmen. Ansatzpunkte für die Veränderung von Geschäftsmodellen durch Computational Journalism. Medien Wirtschaft 15(3), 36–43.
- Kero, S./Akyürek, S./Flaßhoff, F. (2023): Bekanntheit und Akzeptanz von ChatGPT in Deutschland. Meinungsmonitor Künstliche Intelligenz. https://www.cais-research.de/wp-content/uploads/Factsheet-10-ChatGPT.pdf [01.07.2024].
- Landesanstalt für Medien NRW (2024): Algorithmen und KI im Aufwind. Ist der automatisierte Journalismus die Zukunft? https://www.medienanstalt-nrw.de/fileadmin/fyi-Vol13_Newsletter_Informationsintermediaere.pdf [01.07.2024].
- Luhmann, N. (1998): Die Gesellschaft der Gesellschaft. Frankfurt am Main.
- Meinungsmonitor Künstliche Intelligenz (2022): Künstliche Intelligenz im Journalismus. https://www.cais-research.de/wp-content/uploads/Factsheet-4-Journalismus.pdf [01.07.2024].
- Mooshammer, S. (2022): There Are (Almost) No Robots in Journalism: An Attempt at a Differentiated Classification and Terminology of Automation in Journalism on the Base of the Concept of Distributed and Gradualised Action. In: Publizistik 67(4), 487–515. https://doi.org/10.1007/s11616-022-00757-5 [01.07.2024].
- NewsGPT (2024): NewsGPT, Terms of Use. https://newsgpt.ai/terms-of-use/ [01.07.2024].
- Süddeutsche Zeitung (12.05.2023): BURDA-Verlag. KI mit Soße.
- Futurism (27.11.2023): Sports Illustrated Published Articles by Fake, AI-Generated Writers. https://futurism.com/sports-illustrated-ai-generated-writers [01.07.2024].
- Weber, J./Steffl, J./Buschow, C. (2021): Plattformen für digitalen Journalismus in Deutschland. In: MedienWirtschaft 18(2), 20–34. https://doi.org/10.15358/1613-0669-2021-2-20 [01.07.2024].
- Wellbrock, C./Buschow, C. (2020): Money for nothing and content for free? Schriftenreihe Medienforschung der Landesanstalt für Medien NRW. Baden-Baden.
- Zeit online (2024): Fragen Sie ZEIT ONLINE. https://www.zeit.de/beta/fragen-sie-zeit-online-news [02.07.2024].