Large language models can’t reason – study

Everyone knows that artificial intelligence can make mistakes and hallucinate. But recent Apple research has revealed even more significant flaws in the mathematical models that AI uses to “reason.”

The scientists asked a model the same question several times, slightly changing the wording each time. The model's answers changed as well, especially when the questions involved numbers.

Author: Daria Sidorova

Research published on arxiv.org showed that the model's answers change significantly when the same question is worded differently. According to the scientists, “this calls into question the reliability of current GSM8K results, which rely on single-point accuracy metrics.” GSM8K is a dataset used to test models. It contains more than 8,000 grade-school math questions with answers.

Apple researchers found that performance could differ by as much as 10%. Even minor changes in the input data can seriously affect the reliability of the model's answers.
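
To see why a single accuracy figure can mislead, consider the rough Python sketch below. It is not the researchers' code: the question template, the `toy_model` function and its 20% error rate are all hypothetical stand-ins, and the simulated errors are random rather than caused by wording. The sketch only shows how scoring many sets of reworded questions gives a range of results instead of one number.

```python
import random
import statistics


def make_variant(rng):
    """Build one variant of a templated word problem with fresh numbers (hypothetical template)."""
    a = rng.randint(10, 99)
    b = rng.randint(10, 99)
    question = (
        f"Sam picks {a} apples on Monday and {b} on Tuesday. "
        "How many apples does Sam have?"
    )
    return question, a + b


def toy_model(question, rng, error_rate=0.2):
    """Hypothetical stand-in for a model: usually right, sometimes off by one."""
    numbers = [int(tok) for tok in question.replace(".", " ").split() if tok.isdigit()]
    answer = sum(numbers)
    return answer + 1 if rng.random() < error_rate else answer


def accuracy_on_variant_set(rng, n_questions=100):
    """Score the toy model on one freshly generated set of question variants."""
    correct = 0
    for _ in range(n_questions):
        question, expected = make_variant(rng)
        if toy_model(question, rng) == expected:
            correct += 1
    return correct / n_questions


def accuracy_spread(seed=0, n_sets=50):
    """Return (lowest, highest, mean) accuracy across many variant sets."""
    rng = random.Random(seed)
    scores = [accuracy_on_variant_set(rng) for _ in range(n_sets)]
    return min(scores), max(scores), statistics.mean(scores)


if __name__ == "__main__":
    lowest, highest, mean = accuracy_spread()
    print(f"accuracy ranges from {lowest:.2f} to {highest:.2f}, mean {mean:.2f}")
```

Reporting the lowest, highest and mean score together gives a fuller picture than the result from any one version of the test.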


The reason is that AI relies on pattern recognition rather than logical reasoning. The Apple research shows that changing just a few unimportant words is enough to affect that pattern recognition.

One example is a problem that asks for the number of kiwis collected over several days. Apple researchers ran a control experiment and then added irrelevant information about the size of the kiwis.

Meta*’s Llama and OpenAI’s o1 changed their responses compared to the control experiment, although the kiwi size data did not affect the result. GPT-4o also encountered problems.
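
The kiwi problem can be illustrated with a short Python sketch. The numbers are made up for illustration, and `distracted_total` shows only one plausible way to go wrong (subtracting the irrelevant figure), not how any particular model actually works.

```python
def correct_total(daily_counts, smaller_than_average=0):
    """Count the kiwis picked. Size does not change the count, so the extra detail is ignored."""
    return sum(daily_counts)


def distracted_total(daily_counts, smaller_than_average=0):
    """Mimic the mistake: treat the irrelevant number as something to subtract."""
    return sum(daily_counts) - smaller_than_average


# Illustrative numbers only: kiwis picked on three days, five of them "smaller than average".
daily_counts = [30, 40, 50]

print(correct_total(daily_counts))                             # 120 (control question)
print(correct_total(daily_counts, smaller_than_average=5))     # 120 (size detail ignored)
print(distracted_total(daily_counts, smaller_than_average=5))  # 115 (wrong)
```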

Large language models are becoming part of our lives, and these findings highlight the need to verify the information they provide.

* Meta and its platforms Facebook and Instagram are recognized as extremist organizations whose activities are prohibited in the Russian Federation.

Cover photo: Paper Boat Creative / Getty Images

Source: rb.ru