Insights from a Psychologist’s Perspective

A team of researchers at the Max Planck Institute for Biological Cybernetics conducted a study to assess the general intelligence of GPT-3, a powerful language model developed by OpenAI. The researchers used psychological tests to evaluate GPT-3’s abilities in various competencies, such as causal reasoning and decision-making, and compared its performance to that of humans. While GPT-3 was found to be proficient in some areas, it fell short in others, likely due to its lack of interaction with the real world.

As a language model, GPT-3 learns to respond to natural-language input and can generate a wide variety of texts. Trained on massive amounts of internet data, it can write articles and stories and even solve math and programming problems. These impressive abilities led the researchers to ask whether GPT-3 possesses human-like cognitive abilities.

To investigate this, Marcel Binz and Eric Schulz at the Max Planck Institute for Biological Cybernetics subjected GPT-3 to a series of psychological tests to examine different aspects of general intelligence. They assessed GPT-3’s ability to make decisions, search for information, reason causally, and question its initial intuition. The researchers compared GPT-3’s test results with those of human subjects, evaluating both the accuracy of the answers and the similarity of GPT-3’s errors to human mistakes.
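To make that two-part comparison concrete, here is a minimal sketch of how accuracy and error overlap might be computed; the toy data and scoring scheme are illustrative assumptions, not the study's materials.

```python
# Minimal sketch of the two comparisons described above: raw accuracy,
# and whether the model errs on the same items as humans. All data
# below are made-up placeholders, not results from the study.
correct =      [1, 0, 1, 1, 0]  # ground-truth answer per item
gpt3_answers = [1, 1, 1, 0, 0]  # hypothetical model answers
human_modal =  [1, 1, 1, 1, 0]  # most common human answer per item

accuracy = sum(g == c for g, c in zip(gpt3_answers, correct)) / len(correct)

# Among the items GPT-3 gets wrong, how often does the modal human
# answer match GPT-3's, i.e. humans make the same mistake?
errors = [(g, h) for g, h, c in zip(gpt3_answers, human_modal, correct) if g != c]
error_overlap = sum(g == h for g, h in errors) / len(errors) if errors else 0.0

print(f"accuracy: {accuracy:.2f}, human-like errors: {error_overlap:.2f}")
```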

One classic test from cognitive psychology was the Linda problem, in which subjects read a short description of a fictional woman named Linda and must judge which of two statements about her is more probable. Like most humans, GPT-3 committed the conjunction fallacy: it chose the statement combining two conditions, even though a conjunction can never be more probable than either of its parts alone. This is likely because the well-known task was familiar to it from its training data.
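As an illustration of how such a vignette might be posed to a language model and scored, here is a minimal sketch; the prompt follows the classic wording of the Linda problem, while the query function is a hard-coded stand-in (the study queried GPT-3 itself), so the sketch runs offline.

```python
# Minimal sketch: posing the Linda problem to a language model and
# checking the answer for the conjunction fallacy. The query function
# is a placeholder, not the study's actual setup.

LINDA_VIGNETTE = (
    "Linda is 31 years old, single, outspoken, and very bright. She "
    "majored in philosophy. As a student, she was deeply concerned with "
    "issues of discrimination and social justice, and also participated "
    "in anti-nuclear demonstrations.\n\n"
    "Which is more probable?\n"
    "1. Linda is a bank teller.\n"
    "2. Linda is a bank teller and is active in the feminist movement.\n"
    "Answer with 1 or 2."
)

def query_model(prompt: str) -> str:
    """Placeholder for an API call to a language model. Hard-coded to
    the typical human response so the sketch runs without a network."""
    return "2"

def commits_conjunction_fallacy(answer: str) -> bool:
    # Option 2 is the conjunction "bank teller AND feminist"; since
    # P(A and B) <= P(A), choosing 2 is the fallacy.
    return answer.strip().startswith("2")

answer = query_model(LINDA_VIGNETTE)
print("model answered:", answer)
print("conjunction fallacy:", commits_conjunction_fallacy(answer))
```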

The researchers designed new tasks to rule out the possibility that GPT-3 was mechanically reproducing a memorized solution to a familiar problem. Their findings showed that GPT-3 performed nearly as well as humans in decision-making but fell behind in searching for specific information and in causal reasoning. The researchers suggested that this discrepancy may stem from GPT-3's passive acquisition of information from text, in contrast to the active interaction with the world that is crucial to human cognition. However, they noted that as users continue to interact with models like GPT-3, future networks may learn from these interactions and converge further toward human-like intelligence.
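The article does not spell out what those new tasks looked like. One standard way to probe decision-making without a memorizable answer is a bandit task whose payoffs are freshly randomized per game; the sketch below is an illustrative assumption along those lines, not the researchers' actual design.

```python
import random

# Minimal sketch of a two-armed bandit task with randomized payoffs,
# so no fixed answer can exist in any training corpus. The task
# parameters and the simple agent are illustrative assumptions.

def make_bandit(seed: int):
    """Return reward probabilities for two arms, randomized per game."""
    rng = random.Random(seed)
    return [rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)], rng

def play(n_trials: int = 20, seed: int = 0) -> float:
    probs, rng = make_bandit(seed)
    counts = [0, 0]   # pulls per arm
    rewards = [0, 0]  # total reward per arm
    total = 0
    for t in range(n_trials):
        # Epsilon-greedy stand-in for the subject's choice; in a study,
        # the choice would come from a human or from the model's reply
        # to a text description of the payoffs observed so far.
        if t < 2:
            arm = t  # try each arm once
        elif rng.random() < 0.1:
            arm = rng.randrange(2)  # occasional exploration
        else:
            means = [rewards[i] / counts[i] for i in range(2)]
            arm = means.index(max(means))  # exploit the better arm
        r = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        rewards[arm] += r
        total += r
    return total / n_trials

print("mean reward:", play())
```

Framing the task as text ("Machine 1 paid 1 coin; machine 2 paid 0 coins; which do you choose next?") lets the same randomized game be given to humans and to a language model alike, which is the property that rules out memorized solutions.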