The creator of ChatGPT, OpenAI, has announced its plan to improve the mathematical reasoning of artificial intelligence (AI) chatbots as a strategy to reduce the rate of hallucinations.
OpenAI’s battle against AI hallucinations
OpenAI released ChatGPT in November 2022, and the chatbot quickly took the world by storm thanks to its capabilities and breadth of knowledge. The company has since developed GPT-4, which is more advanced and gives more up-to-date responses.
However, despite the advancements, generative AI applications have continued to suffer from ‘hallucinations’ where they provide false information that lacks support from real-world facts or sources.
OpenAI has acknowledged these issues saying “Even state-of-the-art models are prone to producing falsehoods —they exhibit a tendency to invent facts in moments of uncertainty.”
The AI research organization added:
“These hallucinations are particularly problematic in domains that require multi-step reasoning since a single logical error is enough to derail a much larger solution. Detecting and mitigating hallucinations is essential to improve reasoning capabilities.”
Following numerous complaints from users, OpenAI attached a disclaimer to the application that reads, “ChatGPT may produce inaccurate information about people, places, or facts,” while it continued to look for ways to address the problem.
After in-depth research, OpenAI has finally announced that it has uncovered a way to fight hallucinations in chatbots. According to the announcement, the tech company looked into two techniques, outcome supervision and process supervision, before settling on the latter.
The two techniques differ in where the feedback is applied. Models trained with outcome supervision receive feedback based only on the final result, whereas models trained with process supervision receive feedback on each individual step in a chain of thought.
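To make the distinction concrete, here is a minimal Python sketch of the two kinds of feedback. It is an illustration only, not OpenAI's implementation: the `Step` class, the function names, and the 0/1 reward values are all hypothetical, and the per-step labels stand in for human judgments of each reasoning step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One step in a chain of thought (hypothetical structure)."""
    text: str
    is_correct: bool  # assumed human label for this step


def outcome_feedback(final_answer: str, reference_answer: str) -> float:
    """Outcome supervision: a single signal based only on the final result."""
    return 1.0 if final_answer.strip() == reference_answer.strip() else 0.0


def process_feedback(steps: List[Step]) -> List[float]:
    """Process supervision: one signal per reasoning step."""
    return [1.0 if step.is_correct else 0.0 for step in steps]


def first_error_index(step_rewards: List[float]) -> Optional[int]:
    """Return the position of the first step marked wrong, or None."""
    for index, reward in enumerate(step_rewards):
        if reward == 0.0:
            return index
    return None


if __name__ == "__main__":
    # Toy problem: (2 + 3) * 4, whose correct answer is 20.
    solution = [
        Step("2 + 3 = 5", is_correct=True),
        Step("5 * 4 = 24", is_correct=False),  # the single logical error
    ]
    print(outcome_feedback("24", "20"))   # only says the answer is wrong
    step_rewards = process_feedback(solution)
    print(step_rewards)                   # says which step is wrong
    print(first_error_index(step_rewards))
```

In this toy case, outcome feedback only reports that the final answer is wrong, while process feedback also points to the step where the reasoning went off track.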
We trained an AI using process supervision — rewarding the thought process rather than the outcome — to achieve new state-of-art in mathematical reasoning. Encouraging sign for alignment of advanced AIs: …https://t.co/ryaODghohn
— OpenAI (@OpenAI) May 31, 2023
The trained models were then tested on the MATH dataset. According to OpenAI, the model trained with process supervision performed significantly better, because process supervision directly rewards the model for following a human-approved process and pays attention to every step of that process, unlike outcome supervision.
Although OpenAI admitted that results outside of mathematics are not yet known, it suggested that process supervision might offer a better mix of performance and alignment than outcome supervision if the observed results hold in wider contexts.
Therefore, to aid in research, the corporation made the entire collection of process supervision data available to the public, encouraging investigation and study in this field.
While the exact reason OpenAI was compelled to conduct this research is not known, hallucinations by chatbots have so far had very negative effects on users as well as companies.
For instance, in a demonstration for reporters, the ChatGPT-like technology in Microsoft’s Bing search engine examined financial reports from Gap and Lululemon. When its responses were compared with the actual reports, some figures turned out to be underreported, while others appeared to be fabricated.
This drew considerable criticism from attendees, including independent search researcher Dmitri Brereton, who wrote:
“I am shocked that the Bing team created this pre-recorded demo filled with inaccurate information, and confidently presented it to the world as if it were good. I am even more shocked that this trick worked, and everyone jumped on the Bing AI hype train without doing an ounce of due diligence.”
In another case, Jonathan Turley, an American criminal defense lawyer and law professor, claimed that ChatGPT had accused him of committing sexual assault. Not only did the chatbot falsely accuse him, but it also backed the accusation with a fabricated Washington Post article.