CriticGPT: OpenAI's New AI Model That Identifies Errors in ChatGPT's Code

OpenAI's CriticGPT, based on GPT-4, improves code review outcomes by 60% and outperforms human reviewers at identifying bugs, though it still struggles with complex tasks.

Punam Singh


OpenAI has developed a new AI model, 'CriticGPT'. Based on GPT-4, it is designed to identify and critique errors in code generated by ChatGPT. The main objective of developing CriticGPT was to help human trainers catch mistakes in ChatGPT's code output during the reinforcement learning from human feedback (RLHF) process.


The AI model assists AI trainers in evaluating outputs from advanced AI systems and enhances the accuracy and reliability of code generated by ChatGPT. CriticGPT was reportedly trained on data built by human trainers who manually inserted errors into code generated by ChatGPT and then wrote feedback describing those mistakes. This process helped CriticGPT learn to identify and critique errors more accurately.
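The tampering step described above can be sketched in a few lines. This is only a conceptual illustration, not OpenAI's actual pipeline: the mutation below (flipping a comparison operator) and the record format are invented for the example; in the real training data, humans inserted the bugs and wrote the critiques themselves.

```python
import random

def insert_bug(code_lines):
    """Tamper with one line of working code to create a subtle bug.
    Returns the tampered code plus a record of what was changed, which
    serves as the reference for a bug-describing critique."""
    mutations = {"<=": "<", ">=": ">", "==": "!="}
    candidates = [i for i, line in enumerate(code_lines)
                  if any(op in line for op in mutations)]
    if not candidates:
        return code_lines, None
    idx = random.choice(candidates)
    line = code_lines[idx]
    for op, bad_op in mutations.items():
        if op in line:
            tampered = code_lines.copy()
            tampered[idx] = line.replace(op, bad_op, 1)
            return tampered, {"line": idx, "original": line, "buggy": tampered[idx]}
    return code_lines, None

# Build one (tampered code, reference critique) training record.
original = ["def is_adult(age):", "    return age >= 18"]
tampered, bug = insert_bug(original)
if bug is not None:
    record = {
        "code": "\n".join(tampered),
        "critique": f"Line {bug['line'] + 1}: comparison was altered; "
                    f"expected `{bug['original'].strip()}`.",
    }
```

A critic model trained on such pairs learns to point at the tampered line and explain the discrepancy, which is exactly the behavior human trainers then rate.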

How does CriticGPT differ from traditional AI code reviewers?

CriticGPT was found to outperform human reviewers in 63% of cases when identifying naturally occurring bugs in code generated by ChatGPT. Teams using CriticGPT produced more comprehensive critiques, with fewer false positives, than reviewers working alone, improving code review outcomes by 60%.


CriticGPT uses a technique called "Force Sampling Beam Search" that lets users tune how aggressively it flags potential errors, trading comprehensiveness against false positives.

Although CriticGPT excels at identifying simple bugs, it struggles with longer, more complex coding tasks, partly because it was trained on relatively short ChatGPT responses.

Looking ahead


While CriticGPT is not infallible and can produce its own hallucinations, it helps human trainers write more comprehensive critiques than they would write alone. Human reviewers assisted by CriticGPT outperform unassisted trainers by 60% when assessing ChatGPT's code output.

However, the model is still maturing: it may struggle with longer, more complex tasks, and it cannot always pinpoint errors that are spread across multiple sections of code.

OpenAI is currently integrating CriticGPT-like models into its RLHF labeling pipeline, where they will assist human trainers in evaluating outputs from advanced AI systems like ChatGPT, helping to improve the accuracy and reliability of those outputs.