Dr. Thilo Hagendorff

18. April 2024

Podcast on my latest research

I recently had the pleasure of engaging in an extensive conversation with Stephan Dalügge for his podcast “Prioritäten”. If you’re interested, you can listen to the first of two episodes here (in German):

15. February 2024

Mapping the ethics of generative AI

The advent of generative artificial intelligence and its widespread adoption in society have engendered intensive debates about its ethical implications and risks. These risks often differ from those associated with traditional discriminative machine learning. To synthesize the recent discourse and map its normative concepts, I conducted a scoping review on the ethics of generative artificial intelligence, focusing especially on large language models and text-to-image models. The paper is available as a preprint on arXiv, accessible via this link. It provides a taxonomy of 378 normative issues in 19 topic areas and ranks them according to their prevalence in the literature. The study offers a comprehensive overview for scholars, practitioners, and policymakers, condensing the ethical debates surrounding fairness, safety, harmful content, hallucinations, privacy, interaction risks, security, alignment, societal impacts, and others.

6. January 2024

New paper on “fairness hacking”

My recent publication, written in collaboration with Kristof Meding, delves into the concept of “fairness hacking” in machine learning. This research dissects the mechanics of fairness hacking, revealing how it can make biased algorithms seem fair. We also touch upon the ethical considerations, real-world applications, and future prospects of this approach. To explore the full details, check out the article here.

10. November 2023 / 31. January 2024

Recent media appearances

MIT Technology Review reported on OpenAI’s first empirical research on superalignment and included some comments of mine. Sentient Media as well as Green Queen reported on our research regarding speciesist biases in AI systems. I was interviewed about our work on the implementation of cognitive biases in AI systems for an Outlook article in Nature. A Medium contribution discussed my research on deception abilities in LLMs. Also, an article from Insights by Stanford Business covered our research on human-like intuitions in LLMs. This was also covered in a radio show at Deutschlandfunk, to which you can listen here:

9. November 2023 / 10. November 2023

Discussing the regulation of generative AI

A panel discussion about regulating generative AI systems, in which I took part, is available for viewing on YouTube (in German).

26. October 2023

Join my team!

I am seeking applications for a second student research assistant position (f/m/d) in my research group at the University of Stuttgart. For more details on how to apply, visit this link.

12. October 2023 / 13. October 2023

Superhuman intuitions in language models

Our most recent paper on human-like intuitive decision-making in language models was published in Nature Computational Science. The research is also featured in a newspaper article (in German). We show that large language models, most notably GPT-3, exhibit behavior that strikingly resembles human-like intuition – and the cognitive errors that come with it. However, language models with higher cognitive capabilities, in particular ChatGPT, have learned to avoid these errors and perform in a hyperrational, superhuman manner. For our experiments, we probe language models with tests that were originally designed to investigate intuitive decision-making in humans.

10. October 2023 / 30. January 2024

Media coverage on my research on deception abilities in language models

It was a pleasure to be invited to the Data Skeptic podcast, where I discussed my latest research on deception abilities in large language models with Kyle Polich. You can listen to the episode using this link or right here:

The research was also featured in an article at FAZ as well as on a radio program (in German), to which you can listen here. Furthermore, I authored an article (also in German) for Golem. Unfortunately, this content is behind a paywall.

1. August 2023

Language models have deception abilities

Aligning large language models (LLMs) with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are suspected of becoming able to deceive human operators and to use this ability to bypass monitoring efforts. As a prerequisite for this, LLMs need to possess a conceptual understanding of deception strategies. My latest research project reveals that such strategies have emerged in state-of-the-art LLMs, such as GPT-4. This is one of the most fascinating findings I have made since I started researching LLMs, and I’m excited to share a preprint describing the results here. I’ll continue working on this project.

23. June 2023 / 6. December 2023

Re:publica and Oxford talk

I gave a talk addressing speciesist machine bias at this year’s re:publica, which is available for viewing on YouTube.

Furthermore, I presented on the same subject at the Oxford Animal Ethics Summer School, which offers a short film about the event (my part starts at 2:30 min).

24. May 2023 / 1. September 2023

News

+++ Sarah Fabi and I updated the paper on human-like intuitive decision-making and errors in large language models by testing ChatGPT, GPT-4, BLOOM, and other models – here’s the new manuscript +++ I co-authored a paper on privacy literacy for the new Routledge Handbook of Privacy and Social Media +++ Together with Leonie Bossert, I published a paper on the ethics of sustainable AI +++ I got my own article series at Golem, called KI-Insider, where I will regularly publish new articles (in German) +++ I attended two further Science Slams in Friedrichshafen and Tübingen and won both of them +++ I was interviewed for a podcast about different AI-related topics (in German) +++

16. May 2023

I’m hiring, again!

This time, I am seeking applications for a student research assistant position (f/m/d) in my independent research group at the University of Stuttgart. For more details on how to apply, visit this link.

27. March 2023 / 12. October 2023

Using psychology to investigate behavior in language models

Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Therefore, it is of great importance to thoroughly assess and scrutinize their capabilities. Due to increasingly complex and novel behavioral patterns in current LLMs, this can be done by treating them as participants in psychology experiments that were originally designed to test humans. For this purpose, I wrote a new paper introducing the field of “machine psychology”. It aims to discover emergent abilities in LLMs that cannot be detected by most traditional natural language processing benchmarks. A preprint of the paper can be read here.

15. March 2023 / 28. March 2023

I’m hiring!

Looking for an exciting opportunity to explore the ethical implications of AI, specifically generative AI and large language models? I am seeking applications for a Ph.D. position (f/m/d) in my independent research group at the University of Stuttgart. For more details on how to apply, visit this link.

14. February 2023

Why we need biased AI

In a new paper I co-authored with my wonderful colleague Sarah Fabi, we stress the importance of biases in the field of artificial intelligence (AI). To foster efficient algorithmic decision-making in complex, unstable, and uncertain real-world environments, we argue for the implementation of human cognitive biases in learning algorithms. We use insights from cognitive science and apply them to the AI field, combining theoretical considerations with tangible examples depicting promising bias implementation scenarios. Ultimately, this paper is the first tentative step toward explicitly putting forth the idea of implementing cognitive biases in machines.

PS: We also wrote a short paper on AI alignment. Check it out here.

3. February 2023 / 3. April 2023

ChatGPT

Recently, I had several opportunities to be interviewed about ChatGPT and GPT-4, the powerful language models from OpenAI. You can read or listen to the coverage at heise, MDR, SWR, NDR, Süddeutsche Zeitung, FAZ, Wirtschaftswoche, Stimme, Der Standard, taz, rnd, and WELT. I also wrote an article myself.

31. August 2022 / 16. October 2022

New paper with Peter Singer on speciesist bias in AI

Somehow, this paper must be something special. It got desk-rejected without review not by one, not by two, but by three different journals! This has never happened to me before, and I can only speculate about the underlying reasons. However, I am grateful to the editors of AI and Ethics, who had the guts to let our research be peer-reviewed and published. But what is it all about? Massive efforts are made to reduce machine biases in order to render AI applications fair. However, the AI fairness field has a blind spot, namely its insensitivity to discrimination against animals. To address this, I wrote a paper together with Peter Singer and colleagues about “speciesist bias” in AI. We investigated several different datasets and AI systems, in particular computer vision models trained on ImageNet, word embeddings, and large language models like GPT-3, revealing significant speciesist biases in them. Our conclusion: AI technologies currently play a significant role in perpetuating and normalizing violence against animals, especially farmed animals. This can only change when AI fairness frameworks widen their scope and include mitigation measures for speciesist biases.

PS: I had the opportunity to publish an op-ed article in the German tech magazine Golem as well as a research summary at The AI Ethics Brief regarding the paper.

28. June 2022 / 29. June 2022

New papers

Paper #1 – AI ethics and its side-effects (Link)

I wrote a critical article about my own discipline, AI ethics, in which I argue that the assumption that AI ethics automatically decreases the likelihood of unethical outcomes in the AI field is flawed. The article lists risks that originate either from AI ethicists themselves or from the consequences of their embedding in AI organizations. The compilation of risks comprises psychological considerations concerning the cognitive biases of AI ethicists themselves as well as biased reactions to their work, subject-specific and knowledge constraints AI ethicists often succumb to, negative side effects of ethics audits for AI applications, and many more.

Paper #2 – A virtue-based framework for AI ethics (Link)

Many ethics initiatives have stipulated standards for good technology development in the AI sector. I contribute to that endeavor by proposing a new approach that is based on virtue ethics. It defines four “basic AI virtues”, namely justice, honesty, responsibility, and care, all of which represent specific motivational settings that constitute the very precondition for ethical decision-making in the AI field. Moreover, it defines two “second-order AI virtues”, prudence and fortitude, that support the basic virtues by helping to overcome bounded ethicality or hidden psychological forces that can impair ethical decision-making and that have hitherto been disregarded in AI ethics. Lastly, the paper describes measures for successfully cultivating the mentioned virtues in organizations dealing with AI research and development.

Paper #3 – Ethical and methodological challenges in building morally informed AI systems (Link)

Recent progress in large language models has led to applications that can (at least) simulate possession of full moral agency due to their capacity to report context-sensitive moral assessments in open-domain conversations. However, automating moral decision-making faces several methodological as well as ethical challenges. In the paper, we comment on all these challenges and provide critical considerations for future research on full artificial moral agency.

21. May 2022 / 22. May 2022

Science Slam

In 2019, I competed in my first science slam. Then came Covid. But finally, public events are possible again. Thus, I had the pleasure of being invited to a slam for the second time. In the end, the clapometer decided on a draw, and I happily shared the win with Aysel Ahadova.

9. May 2022 / 14. May 2022

Racing is back

Finally, after two years of race cancellations due to Covid, I was able to compete in my first MTB race this year. I finished 1st in my class and 12th overall (280 participants). My aim was to finish in the top 10, but due to an injury that hindered my race preparation, I couldn’t perform at my best. Next time then.