Testimony of a machine.
Text: Angelika Jacobs
Smart software constantly monitors us and provides key evidence in court cases. But does it always tell the truth? This is a question the justice system now faces with increasing regularity.
When the company OpenAI released its language model ChatGPT, the program was received with astonishment and enthusiasm – but negative reports were quick to follow. According to its detractors, some of the answers given by the artificial intelligence (AI) sounded plausible, but – to put it in human terms – were also bald-faced lies. In short, you shouldn’t believe everything the AI tells you. But what if it were responsible for providing evidence in a court case and could potentially affect the verdict?
In his doctoral dissertation at the Faculty of Law, Jannik Di Gallo developed recommendations for how courts might best handle this type of evidence. Of course, he wouldn’t use the term “lying” when it comes to AI: “When human witnesses give false testimony, they might in some cases be trying to deceive the court, so they’ll lie about the appearance of the suspect, for instance.” But the person could also simply be mistaken. Their perception may be skewed by nearby objects, lighting, or their own biases and preconceptions.
Probable word order.
Di Gallo tends to equate incorrect statements made by the AI with the latter scenario. “Language models like ChatGPT are based on statistics. The system simply appends the most probable next word to those preceding it and builds an answer from there,” he says. But that answer may be wrong.
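To illustrate the principle Di Gallo describes, here is a minimal sketch in Python. The probability table is invented for illustration; a real language model estimates such probabilities with a neural network trained on vast amounts of text. The point is the same: the program picks whichever word is statistically most likely to come next, with no regard for whether the result is true.

```python
# Toy illustration of next-word prediction. The probabilities below are
# made up; a real language model learns them from huge text corpora.
toy_model = {
    ("the", "driver"): {"was": 0.6, "fell": 0.3, "sang": 0.1},
    ("driver", "was"): {"tired": 0.7, "awake": 0.2, "green": 0.1},
}

def next_word(context):
    """Return the most probable continuation given the last two words."""
    candidates = toy_model.get(tuple(context[-2:]), {})
    if not candidates:
        return None
    return max(candidates, key=candidates.get)

sentence = ["the", "driver"]
for _ in range(2):
    word = next_word(sentence)
    if word is None:
        break
    sentence.append(word)

print(" ".join(sentence))  # "the driver was tired" – plausible, but not necessarily true
```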
So, how can the court ensure that the AI is a reliable witness? Imagine the following scenario: The driver-assistance system in a car warns the driver that she is tired, but she keeps driving anyway and is involved in an accident that injures others. Yet she argues that she wasn’t feeling tired at all. Who should the court believe?
The first question the court needs to clarify is whether the AI system can be tested and how that would be done, explains Di Gallo. Was this AI developed to handle cases like the one in question? Di Gallo provides the following example: “Say the driver was wearing sunglasses. Was the system trained to correctly identify signs of fatigue based on facial expressions in that case?”
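What such a test might look like can be sketched in a few lines of Python. Everything here is hypothetical: the fatigue-detection model, the data fields and the function names are invented for illustration, and a real audit would have to work with the vendor’s actual system and test data. The idea is simply to measure how often the system is right on exactly the kind of situation at issue, such as drivers wearing sunglasses.

```python
# Hypothetical subgroup check for a fatigue-detection system.
# "model", "samples" and their fields are placeholders for illustration only.

def subgroup_accuracy(model, samples, condition):
    """Share of correct fatigue predictions among samples matching a condition."""
    relevant = [s for s in samples if condition(s)]
    if not relevant:
        return None  # the system was never evaluated on this situation
    correct = sum(1 for s in relevant if model.predict(s["image"]) == s["is_tired"])
    return correct / len(relevant)

# Compare overall performance with the subgroup relevant to the case, e.g.:
# subgroup_accuracy(model, test_samples, lambda s: True)
# subgroup_accuracy(model, test_samples, lambda s: s["wears_sunglasses"])
```

If the second figure is markedly lower than the first, or the subgroup was never tested at all, the system’s warning carries correspondingly less weight as evidence.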
Data-driven justice.
Such scenarios still seem like abstractions, but the U.S. justice system has already seen high-profile cases in which verdicts were influenced by algorithms. The COMPAS software made headlines because, according to a study, it overestimated the risk of reoffending far more often for black defendants than for white defendants. Since the algorithms behind the software remain a trade secret, it is almost impossible for those affected to dispute the system’s predictions.
Cases in Switzerland, too, have been affected by unintentionally generated data, even where no advanced AI was directly involved. In early 2022, the Bern-Mittelland regional court came to doubt the alibi of a murder suspect because of the pedometer data on his mobile phone.
Learning algorithms are becoming increasingly pervasive in our everyday lives, so the judiciary must decide how to verify their “testimonies,” emphasizes Di Gallo. To that end, his recommendations, developed as part of a project* of the Swiss National Science Foundation under the leadership of Sabine Gless, professor of criminal justice and procedural law, could be viewed as a kind of checklist. For instance, courts should ask: Is the AI system recognized by the scientific community and considered reliable? Could it be malfunctioning? Does it need to be updated? “Just because an AI system is ready for the consumer market doesn’t mean it can provide reliable evidence,” says Di Gallo. Ultimately, he says, it depends on what conclusions can be drawn from the AI system’s analysis in each individual case.
* Human-Robot Interaction: A Digital Shift in Law and its Narratives? Legal Blame, Criminal Law, and Procedure