Why has generative AI become indispensable nowadays?
Text generation and the extraction of information from large, heterogeneous databases in particular are increasingly used in companies, but also by pupils and students. There are reports of efficiency gains in marketing, for example, and in simple programming tasks such as creating overview graphics for a company’s sales figures.
I see great potential in the area of scientific discovery, that is, in supporting scientific research, for example in the development of new pharmaceutical agents or materials. Here it is worth putting a lot of effort into prompting and giving very detailed instructions, for example through so-called chain-of-thought prompting. Prompt chaining, i.e. generating complex content step by step, is also used as a co-creation process between humans and AI.
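To make this concrete, here is a minimal Python sketch of prompt chaining. The `generate` function is purely hypothetical and stands in for whatever text-generation model is used; the point is that each step’s output is fed into the next prompt, so a human can inspect and correct intermediate results along the way.

```python
def generate(prompt: str) -> str:
    """Placeholder for a call to any text-generation model (hypothetical)."""
    raise NotImplementedError("Connect this to the model of your choice.")


def chained_generation(topic: str) -> str:
    # Step 1: a chain-of-thought style instruction asks the model to reason
    # step by step before producing an outline.
    outline = generate(
        f"Think step by step and draft a structured outline on: {topic}"
    )
    # A human can inspect or correct the outline before the next step.
    # Step 2: the (possibly corrected) outline becomes part of the next prompt.
    draft = generate(f"Using this outline, write a detailed first draft:\n{outline}")
    # Step 3: a separate revision step, again open to human intervention.
    return generate(f"Revise the following draft for clarity and accuracy:\n{draft}")
```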
And where do the applications reach their limits?
You have to be aware that generated content can be incorrect. Whenever something is really important, human control is essential. For example, a Nature article from 2024 showed that larger and more instructable language models become less reliable (Zhou et al. 2024).
In several benchmark datasets, incorrect output was produced for questions on geographical knowledge and scientific topics, but also for something as supposedly simple as addition. The problem is that you can only check whether an output is correct once it has been generated. Large language models are not yet able to say “I’m not sure about this” or “I don’t know”. There are currently hardly any approaches that allow generative AI to assess the reliability of its own output.
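One pragmatic approximation of such a reliability estimate is to sample the same prompt several times and treat disagreement between the samples as a warning sign. The sketch below again assumes a hypothetical `generate` function and uses simple majority agreement as the confidence proxy; it illustrates the idea, it is not a guarantee of correctness.

```python
from collections import Counter


def generate(prompt: str) -> str:
    """Placeholder for a call to any text-generation model (hypothetical)."""
    raise NotImplementedError("Connect this to the model of your choice.")


def answer_with_agreement(prompt: str, n_samples: int = 5) -> tuple[str, float]:
    """Sample the same prompt several times and report how often answers agree.

    Low agreement does not prove the majority answer wrong, but it flags
    output that deserves closer human checking.
    """
    answers = [generate(prompt).strip() for _ in range(n_samples)]
    majority_answer, count = Counter(answers).most_common(1)[0]
    return majority_answer, count / n_samples


# Usage sketch: route low-agreement answers to a human reviewer.
# answer, agreement = answer_with_agreement("What is 17 * 23?")
# if agreement < 0.8:
#     print("Low agreement between samples - have a human check this output.")
```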
At bidt, you therefore consider trust to be an essential condition for the success of collaboration between humans and artificial intelligence. When do people trust AI?
Trust is a term defined in social psychology for interpersonal relationships. But we also recognise trust in institutions, such as the rule of law or the TÜV, in professional groups or even in technologies. This refers to trust in their reliability. However, trust in generative AI is more complex than trust in the reliability of an email program or an airbag control system. With these technologies, we have long experience of using them and a mental model of what the system does. The mental model does not have to be technically correct, but it must enable us to assess the effect of an interaction with the system, that is, what the system can and cannot do.
Most people know much less about AI than they do about a car. We know that a car can’t fly, and we have hypotheses as to why the car won’t start. We have no comparable mental models and concepts for AI technologies. As a result, there are sometimes serious misconceptions, for example that chatbots actually understand the subject matter about which they generate content.
Trust in AI systems can be viewed on different levels: Do we trust the underlying method, the provider, the expertise of the AI developers, the training data, the specific AI tool we use or the output generated for a specific prompt?
These complex levels of trust in generative AI are the subject of interdisciplinary research at bidt in the research focus area “Humans and Generative Artificial Intelligence: Trust in Co-Creation”. What questions are you dealing with exactly?
The research focus looks at the conditions for trust, in particular the necessary skills, evaluation criteria for the quality of generated content, as well as technical, ethical and normative framework conditions. The process of co-creation is addressed from the perspectives of the production, interaction and reception of generated content. For example, the pAIrProg project, which I lead, is researching the generation of program code.
From the perspectives of computer science and psychology, we want to develop and evaluate trustworthy interfaces for the use of code generators in programming education and professional software development.
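Purely as an illustration (not a description of pAIrProg itself), one building block of such an interface could be to run generated code against the programmer’s own tests before presenting it as accepted, so that trust rests on an observable check rather than on the generator alone. The sketch assumes Python with pytest available; the helper name is made up for the example.

```python
import subprocess
import tempfile
from pathlib import Path


def vet_generated_code(generated_code: str, test_code: str) -> bool:
    """Run user-written tests against AI-generated code before accepting it.

    Returns True only if the tests pass; otherwise the interface should mark
    the suggestion as unverified and show the test output to the programmer.
    """
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "candidate.py").write_text(generated_code)
        Path(tmp, "test_candidate.py").write_text(test_code)
        result = subprocess.run(
            ["python", "-m", "pytest", "test_candidate.py", "-q"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
```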
Keyword “trustworthiness”: With generative AI, there is a lot of discussion about quality and reliability when the AI hallucinates and presents invented information. Under what conditions can people trust AI?
In some cases, people have a generalised over- or under-trust in such systems. Instead, it would be important to “calibrate” trust appropriately. This means that, depending on the particular task we delegate to a system, we can assess how much effort we should put into critically checking the generated content.
In medicine in particular, generative AI has great potential on the one hand and high risks on the other. Doctor’s letters could be created faster or relevant information could be extracted more quickly. However, there is a risk that important information could be overlooked. In highly complex and safety-critical areas such as diagnostics, therapy planning or medical research, co-creative processes and human control are indispensable. This means that expertise is required in order to use generative AI sensibly.
When should caution be exercised with regard to trust in AI in general – are we thinking of deepfakes, for example?
Deepfakes are about indirect trust in images and texts that we encounter on social media or in other media. We need methods that help us to assess whether content is factually correct or plausible. Fake news and images that do not correspond to the facts have always existed. Regardless of whether false content is manipulated or deliberately generated by a person or by an AI system, the strategy remains the same: checking against independent sources. If we see a photo of Angela Merkel and Barack Obama on the beach, for example, we would use media we trust to check whether the two have actually met. In my opinion, one of the most urgent tasks of media education is to teach such judgement and verification skills.
So we need in-depth knowledge and skills when dealing with new technology. How do these skills relate to the human category of trust?
I would differentiate between trust as it is understood in interpersonal relationships and trust in AI systems. For example, I can say: “I trust my mother’s advice”. “I trust the AI” is a rather inappropriate attribution. Instead, it is about trust in the functioning of a certain technology or a certain system.
In the sense of the Dagstuhl triangle for digital competences formulated by computer science education researchers, we need the following as the basis for attributions of trust: (1) knowledge of basic concepts and methods of AI systems. These form the basis for (2) the confident and reflective use of specific AI tools. Basic skills and user experience together form the prerequisite for (3) critical reflection on the impact of AI methods on one’s own life, society and the environment.
It therefore seems urgently necessary to deal with ChatGPT and other AI applications as early as possible. Should AI be taught at school?
In my opinion, the topic of AI belongs in school curricula and in teacher training programmes. I am in favour of a compulsory school subject in computer science, in which AI methods are also taught. In general, all three aspects of the Dagstuhl triangle should be addressed.
An increasing digital divide can already be observed. An international study from 2023 (ICILS 2023, the International Computer and Information Literacy Study) shows a significant decline among eighth-graders compared to 2018: 40 per cent achieve only rudimentary digital skills. In Germany, performance is particularly unevenly distributed with respect to the educational background of parents.
When it comes to generative AI in the school context, the main topic of discussion at the moment is how to prevent pupils from doing their homework with AI tools. Yet hardly anyone has ever been bothered by the fact that homework may have been done by a parent. For this reason, teaching basic AI skills is a central task of school education, as this is the only way to level out such inequalities.
Thank you very much for the interview!
In the media: Ute Schmid in the BR podcast
“When people should trust in AI” was the topic of the Bayern 2 programme series “Aktuelle Interviews” on 11 April 2025. Ute Schmid discusses the question of trust in generative AI and what reasonable interaction can look like. The programme is available in the BR podcast and in the ARD audio library.
Literature
Jansen, P. et al. (2024). DiscoveryWorld: A virtual environment for developing and evaluating automated scientific discovery agents. In: Advances in Neural Information Processing Systems 37, 10088–10116.
Schmid, U. (2024). Bildungskanon für eine digitale Gesellschaft: In der Schule müssen digitale Grundkompetenzen vermittelt werden. In: Akademie Aktuell 1 (82). https://badw.de/fileadmin/pub/akademieAktuell/2024/82/AA0124_32_Fokus_5_Schmid.pdf
Schmid, U. (2024). Trustworthy Artificial Intelligence: Comprehensible, Transparent and Correctable. In: Werthner, H. et al. (Eds.): Introduction to Digital Humanism. Springer, Cham. https://doi.org/10.1007/978-3-031-45304-5_10
Zhou, L. et al. (2024). Larger and more instructable language models become less reliable. In: Nature 634 (8032), 61–68. https://doi.org/10.1038/s41586-024-07930-y