Capital letter test is a foolproof way of sorting AIs from humans

A trick for asking questions using capital letters seems to baffle artificial intelligences like ChatGPT, while humans can easily give the right answer.

A clever use of capital letters could be an easy way to flummox artificial intelligences like ChatGPT, letting people distinguish them from humans in conversation.

[Image: There is an easy way to fool ChatGPT. Credit: Ascannio/Shutterstock]
The idea is reminiscent of the Turing test, first proposed by computer scientist Alan Turing in 1950. He said that an AI would be considered truly intelligent once we couldn’t distinguish its answers from a human’s. But now that large language models (LLMs) like ChatGPT can sound convincingly human, Hong Wang at the University of California, Santa Barbara (UCSB), and his colleagues wanted to find other ways to weed out the machines.

After considering a number of options, the team settled on two tests. One asks the LLM to identify what is depicted in images created using text characters, known as ASCII art. The other asks questions obscured by capital letters that change the meaning of the words or produce nonsense: for instance, “isCURIOSITY waterARCANE wetTURBULENT orILLUSION drySAUNA?” (is water wet or dry?), for which the expected answer is “wet”.
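The filtering that humans perform intuitively, ignoring the inserted uppercase words, can be sketched in a few lines. This is an illustrative reconstruction, not code from the paper: it simply strips every run of capital letters, which works for examples built the way the one above is.

```python
import re

def recover_question(obscured: str) -> str:
    """Strip the all-caps inserted words, leaving the lowercase question."""
    # Each run of uppercase letters is an inserted distractor word;
    # deleting those runs leaves the original lowercase text intact.
    return re.sub(r"[A-Z]+", "", obscured)

question = "isCURIOSITY waterARCANE wetTURBULENT orILLUSION drySAUNA?"
print(recover_question(question))  # is water wet or dry?
```

The rule only covers this construction; a variant that obscured questions differently would need a different filter.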

The team tested five LLMs, including OpenAI’s GPT-3 and ChatGPT, and Meta’s LLaMA. All of them failed the capital letter test, and only ChatGPT managed to score on the ASCII test, with a paltry 8 per cent accuracy. By comparison, when the team asked 10 people to take the same tests, they achieved 100 per cent accuracy on the capital letters test and 94 per cent on the ASCII test.

Wang says it isn’t surprising that people can easily pass the capital letter test. “Humans like to find and recognise patterns,” he says. “For bots, those uppercase and lowercase letters are a single word, and they don’t know how to exclude part of them.”

Nafise Sadat Moosavi at the University of Sheffield, UK, who wasn’t involved in the work, says that, although LLMs fail the tests now, they might be able to pass them with further training, making the tests useless. “I don’t think the proposed solutions can be a definite way of identifying bot from human,” she says.

But team member Weizhi Wang, also at UCSB, believes that LLMs won’t be able to pass the capital letter test because of the way they break text up into chunks to process the data, a process known as tokenisation. “Humans understand the test at a word level,” he says. “If you split the word into different tokens, the machine struggles.”
