
AIGCode’s AutoCoder Outshines GPT-4 Turbo in Code Generation Accuracy

AutoCoder uses AIEV-INSTRUCT as a novel training strategy that enhances code quality while reducing dependency on substantial proprietary models.


Researchers from the University of Connecticut and AIGCode have introduced AutoCoder, a new large language model (LLM) for code generation, which has achieved a 90.9% pass rate on the HumanEval benchmark, surpassing OpenAI's GPT-4 Turbo at 90.2%.

In comparisons with GPT-4 Turbo and GPT-4o, AutoCoder leads on the pass@1 metric for the HumanEval benchmark, indicating higher coding precision and efficiency. AutoCoder's ability to handle external packages also gives it a notable edge over its predecessors, which are restricted to built-in packages.
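For context, pass@1 comes from the unbiased pass@k estimator introduced with the HumanEval benchmark: given n generated samples per problem of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k). A minimal sketch:

```python
# Unbiased pass@k estimator from the HumanEval benchmark (Chen et al., 2021):
# for n samples per problem with c passing, pass@k = 1 - C(n-c, k) / C(n, k).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k sampled solutions passes."""
    if n - c < k:
        return 1.0  # fewer failing samples than k: some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass@1 reduces to the fraction of passing samples:
print(pass_at_k(10, 9, 1))  # prints 0.9
```

A model's benchmark score is this value averaged over all problems in the suite.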

AIEV-INSTRUCT: A New Training Approach

AutoCoder is trained with AIEV-INSTRUCT, a novel strategy that improves code quality while reducing dependence on large proprietary models, pointing toward more sustainable and open advances in LLM-based coding.

AIEV-INSTRUCT uses an interactive process employing a pair of agents—a questioner and a coder—to engage in simulated coding dialogues. Initially, proprietary models create and validate instructions, with GPT-4 Turbo as the supervisor. Through iterative interactions, the generated code undergoes continuous refinement. When the student model exceeds the teacher model in performance, it enters a self-learning phase, independently generating and verifying code. This innovative method minimizes reliance on expensive models, boosting both the quality and robustness of the datasets produced.
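The dialogue-and-execution loop described above can be sketched roughly as follows; `run_unit_tests`, the model interfaces, and `benchmark_score` are hypothetical stand-ins, not the authors' actual implementation:

```python
# Illustrative sketch of the AIEV-INSTRUCT loop: a questioner poses a
# problem, a coder answers, and execution feedback drives refinement.
# All helper names here are assumptions for illustration only.

def aiev_instruct_round(problem, model, run_unit_tests, max_iters=3):
    """One questioner/coder dialogue: generate code, execute it,
    and feed failures back until the unit tests pass."""
    transcript = [("questioner", problem)]
    code = model.generate(problem)
    for _ in range(max_iters):
        passed, feedback = run_unit_tests(code)
        transcript.append(("coder", code))
        if passed:
            return transcript  # validated instruction/code pair
        transcript.append(("questioner", f"Execution failed: {feedback}"))
        code = model.generate(problem, feedback=feedback)
    return None  # discard samples that never pass

def build_dataset(problems, teacher, student, benchmark_score, run_unit_tests):
    """Teaching stage: the proprietary teacher supervises. Once the
    student outscores the teacher, switch to self-learning, where the
    student generates and verifies code on its own."""
    dataset = []
    for problem in problems:
        model = (student if benchmark_score(student) > benchmark_score(teacher)
                 else teacher)
        sample = aiev_instruct_round(problem, model, run_unit_tests)
        if sample is not None:
            dataset.append(sample)
    return dataset
```

The key cost saving is that the expensive teacher model is only queried during the teaching stage; every retained sample is grounded by actual test execution rather than model judgment alone.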

Performance and Versatility

Trained via AIEV-INSTRUCT, AutoCoder has shown exceptional performance, not only surpassing GPT-4 Turbo on the HumanEval benchmark but also demonstrating significant prowess in code interpretation, including the installation of external packages. This capability greatly broadens AutoCoder's utility in practical coding environments. AutoCoder has been evaluated across multiple datasets, including HumanEval+, MBPP, MBPP+, MultiPL-E, and DS-1000, where it has secured top positions across various benchmarks. Even the smaller AutoCoder-S version, with 6.7 billion parameters, has performed remarkably well, proving effective and accurate despite a reduced parameter count.
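One way an interpreter can support external packages, sketched below as an assumption rather than AutoCoder's actual mechanism, is to execute the generated snippet, detect a missing-module error, install the package, and retry:

```python
# Hypothetical sketch: run generated code and pip-install missing
# modules on demand. Not AutoCoder's actual implementation; note that
# a module's import name and its pip package name can differ.
import subprocess
import sys

def run_with_auto_install(snippet: str, max_installs: int = 3) -> bool:
    """Execute a Python snippet, installing missing packages as needed."""
    for _ in range(max_installs + 1):
        proc = subprocess.run([sys.executable, "-c", snippet],
                              capture_output=True, text=True)
        if proc.returncode == 0:
            return True
        marker = "ModuleNotFoundError: No module named "
        line = next((l for l in proc.stderr.splitlines() if marker in l), None)
        if line is None:
            return False  # failure unrelated to a missing package
        package = line.split(marker)[-1].strip("'\"")
        subprocess.run([sys.executable, "-m", "pip", "install", package],
                       check=False)
    return False
```

A sandbox like this lets generated code use libraries such as NumPy or pandas instead of being limited to the standard library.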

AutoCoder has the potential to significantly enhance automated code generation. Its superior performance points to a more accessible and accurate tool for developers worldwide. The underlying study presents a cost-effective, precise method for generating code instruction datasets, improving the overall efficiency of code generation tasks. For more details, the research paper is available on arXiv, and the code is accessible online via GitHub.

Markus Kasanmascheff
Markus is the founder of WinBuzzer and has been playing with Windows and technology for more than 25 years. He holds a Master's degree in International Economics and previously worked as Lead Windows Expert for Softonic.com.