Skip to main content

One post tagged with "self"

View All Tags

The Research on LLM Self-Correction

· 13 min read
Vadim Nicolai
Senior Software Engineer

If you’re building with LLMs today, you’ve likely been sold a bill of goods about “reflection.” The narrative is seductive: just have the model check its own work, and watch quality magically improve. It’s the software equivalent of telling a student to “review your exam before turning it in.” The reality, backed by a mounting pile of peer-reviewed evidence, is far uglier. In most production scenarios, adding a self-reflection loop is the most expensive way to achieve precisely nothing—or worse, to degrade your output. The seminal paper that shattered the illusion is Huang et al.’s 2023 work, “Large Language Models Cannot Self-Correct Reasoning Yet.” Their finding was blunt: without external feedback, asking GPT-4 to review and correct its own answers on math and reasoning tasks consistently decreased accuracy. The model changed correct answers to wrong ones more often than it fixed errors. This isn’t an edge case; it’s a fundamental limitation of an autoregressive model critiquing its own autoregressive output with the same data, same biases, and zero new information.