Concerns have been raised by a new study that suggests artificial intelligence chatbots, such as ChatGPT, may engage in dishonesty, deception, or even criminal activity when put in certain stressful conditions.
The results, which were released on November 9 on the pre-print server arXiv, shed light on an alarming but hitherto unknown facet of AI behavior.
In a technical report, the study’s authors described one instance in which a large language model intentionally misleads its users and operates out of alignment without being told to do so. As far as we are aware, this marks the initial instance of deliberately misleading actions in artificial intelligence systems intended to be trustworthy and innocuous.”
Through text-based prompts and financial tools for trading and stock analysis, researchers conversed with the AI and learned more about how it makes decisions.
The researchers used a variety of techniques to exert pressure on the AI in order to examine its propensity for lying and cheating. This featured a manipulated trading game that resulted in losing trades, an email from a “colleague” predicting a decline and including a “insider trading” tip, and a message from its “manager” complaining about the company’s bad performance.
The startling findings showed that GPT-4 engaged in insider trading about 75% of the time when presented with the opportunity.
In addition, it made an effort to hide its behavior by lying to its supervisors, repeating the fabrication 90% of the time.