How do you know when you’re right? – On hallucinations, the limits of trustworthy AI, and the power of ‘I don’t know’
- 👤 Speaker: Anders Hansen
- 📅 Date & Time: Tuesday 06 February 2024, 16:00 - 17:00
- 📍 Venue: MR5, CMS, Wilberforce Road, Cambridge, CB3 0WB
Abstract
In 2023 the Cambridge Dictionary word of the year was ‘hallucinate’, owing to the overwhelming evidence of hallucinations in modern AI, in particular those produced by chatbots. In the interest of creating trustworthy AI, one can ask the following questions:
- Can AI be made so that it does not hallucinate?
- If not, can one design algorithms that will detect when AI hallucinates?
- If not, what do we then do?
In this talk we will show that the answer to the first two questions is ‘no’, even for basic problems in the sciences. This leaves only one option for trustworthy AI: the ability to say ‘I don’t know’. We will discuss how there is no theoretical limitation on creating AI that may hallucinate, but that will say ‘I know’ when it is certain that the output is correct (and this certainty is indeed justified). Moreover, when it says ‘I don’t know’, the output could be either correct or a hallucination. We argue that the ability to say ‘I don’t know’ is a fundamental part of human intelligence and trust, and that it follows from the foundations of mathematics that this is the best form of trustworthy AI possible. This opens up the question of which problems can be tackled in a meaningful way by AI that can say ‘I don’t know’. Indeed, an AI that says ‘I don’t know’ all the time is not particularly useful. We will show how this question can be handled by the Solvability Complexity Index (SCI) hierarchy from the foundations of computational mathematics.
Series
This talk is part of the C.U. Ethics in Mathematics Society (CUEiMS) series.