For chatbots, mathematics is the final frontier. AI language models use statistics to generate the response that is most likely to be satisfying. That works well when the goal is a passable sentence, but it means chatbots struggle with questions, like math problems, that have only one correct answer.
There’s growing evidence that you can get better results if you give AI some friendly encouragement, but a new study takes this strange reality a step further. Research from software company VMware shows that chatbots perform better on math problems when you tell the model to pretend it’s on Star Trek.
“It is both surprising and irritating that trivial modifications to the prompt can lead to such dramatic swings in performance,” the authors wrote in the paper, first reported by New Scientist.
The research, published on arXiv, didn’t set out with Star Trek as its prime directive. Previous studies found that chatbots answer math problems more accurately when you offer friendly motivation such as “take a deep breath and work on this problem step by step.” Others found you can trick ChatGPT into breaking its own safety guidelines if you threaten to kill it or offer the AI money.
Rick Battle and Teja Gollapudi of VMware’s Natural Language Processing Lab set out to test the effects of framing questions with “positive thinking.” The study looked at three AI tools: two versions of Meta’s Llama 2 and a model from French company Mistral AI.
They developed a series of encouraging prompts, including openers such as “You are as smart as ChatGPT” and “You are a math whiz,” and closers like “This will be fun!” and “Take a deep breath and think carefully.” The researchers then tested the results using GSM8K, a standard dataset of grade-school math problems.
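To make that setup concrete: scoring a prompt on GSM8K amounts to asking the model each question and checking the final number in its reply against the known answer. Below is a minimal sketch of such a check in Python; `ask_model` is a hypothetical stand-in for whatever chatbot is being tested, and none of this code comes from the VMware paper.

```python
import re

def ask_model(prompt: str, question: str) -> str:
    # Hypothetical stand-in for a chatbot call; not part of the VMware study.
    raise NotImplementedError("wire this up to a language model of your choice")

def extract_final_number(text: str) -> str | None:
    """Pull the last number out of a free-form answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def gsm8k_accuracy(prompt: str, problems: list[dict]) -> float:
    """Score a prompt by exact-match accuracy on GSM8K-style problems.

    Each problem is assumed to look like {"question": ..., "answer": ...}.
    """
    correct = 0
    for p in problems:
        reply = ask_model(prompt, p["question"])
        if extract_final_number(reply) == str(p["answer"]):
            correct += 1
    return correct / len(problems)
```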
In the first phase, the results were mixed. Some prompts improved answers, others had only marginal effects, and there was no consistent pattern across the board. But then the researchers asked the AI to help their efforts to help the AI. There, the results got more interesting.
The study used an automated process to try out numerous prompt variations and tweak the wording based on how much it improved the chatbots’ accuracy. Unsurprisingly, this automated process was more effective than the researchers’ hand-written attempts at positive thinking. But the most effective prompts “demonstrate a level of peculiarity that goes well beyond expectations.”
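The paper doesn’t reproduce its optimizer here, but the loop it describes (propose rewordings, measure accuracy, keep the winner) can be sketched as a toy hill-climber. In this sketch, `propose_variations` is a hypothetical placeholder and `gsm8k_accuracy` is the scorer sketched above; the researchers’ actual tooling was more sophisticated than this.

```python
def propose_variations(prompt: str, n: int = 5) -> list[str]:
    # Hypothetical mutation step: in practice the rewriting can itself be
    # done by a language model; this is only a placeholder.
    raise NotImplementedError("e.g. ask an LLM for n rewordings of the prompt")

def optimize_prompt(seed: str, problems: list[dict], rounds: int = 10) -> str:
    """Greedy hill-climb over prompt wordings, scored by GSM8K accuracy."""
    best_prompt = seed
    best_score = gsm8k_accuracy(best_prompt, problems)
    for _ in range(rounds):
        for candidate in propose_variations(best_prompt):
            score = gsm8k_accuracy(candidate, problems)
            if score > best_score:  # keep a rewording only if accuracy improves
                best_prompt, best_score = candidate, score
    return best_prompt
```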
For one of the models, the most accurate results came from a prompt that asked the AI to start its response with the phrase “Captain’s Log, Stardate [insert date here]:”.
“Surprisingly, the model’s proficiency in mathematical reasoning appears to be enhanced by expressing an affinity for Star Trek,” the researchers wrote.
The authors wrote that they have no idea why Star Trek references improve the AI’s performance. There is some logic to the idea that positive thinking or threats lead to better answers. These chatbots are trained on billions of lines of text gathered from the real world. Out in the wild, it’s likely that the humans who wrote the language used to build the AI gave more accurate answers to questions when they were pressured with violence or offered encouragement. The same goes for bribery; people are more likely to follow instructions when there’s money on the table. Large language models may have picked up on that phenomenon, so they behave the same way.
But it’s hard to imagine that, in the datasets used to train chatbots, the most accurate answers tended to begin with the phrase “Captain’s Log.” The researchers don’t even have a theory about why that prompt produced better results. It speaks to one of the strangest facts about AI language models: even the people who build and study them don’t really understand how they work.