EXAONE Deep Sets New AI Benchmark in Math, Science & Coding

In a major leap for artificial intelligence, LG AI Research has introduced EXAONE Deep, a groundbreaking reasoning model designed to excel at solving complex problems in mathematics, science, and coding. This new model positions LG among a select few global players capable of building advanced AI models focused on deep reasoning.

Creating highly capable reasoning models remains one of the most challenging frontiers in AI. Only a handful of organizations worldwide are working at this level. With EXAONE Deep, LG aims to compete directly with the industry’s best, setting a new benchmark for reasoning ability across multiple disciplines.

LG AI Research invested significant resources into enhancing EXAONE Deep’s reasoning power, enabling it to perform impressively across core knowledge areas. According to the company, the model not only excels in math but also shows remarkable understanding and application in diverse subjects, including science and programming.

Benchmark results released by LG speak volumes about the model’s capabilities:

In mathematics, the EXAONE Deep 32B model outshone a much larger competitor, achieving better results despite being just 5% of its size. Additionally, the smaller 7.8B and 2.4B versions topped all key math benchmarks within their respective categories.
For science and coding, both the 7.8B and 2.4B models secured first place across major tests, proving their strength in professional-level tasks.
On the MMLU (Massive Multitask Language Understanding) benchmark, the 32B model scored 83.0, making it the best-performing domestic model in South Korea.

This stellar performance quickly earned EXAONE Deep global recognition. It was named among the ‘Notable AI Models’ by Epoch AI, a US-based nonprofit research group. This honor places LG’s latest model alongside its predecessor, EXAONE 3.5, and makes LG the only Korean entity to appear on the prestigious list for two consecutive years.

Where EXAONE Deep truly shines is in its mathematical reasoning. Evaluated against the rigorous 2025 academic year math curriculum, the model outperformed leading global reasoning models of similar sizes.

The flagship 32B model achieved a 94.5 score in general math competency and 90.0 in the 2024 American Invitational Mathematics Examination (AIME), a qualifying test for the US Mathematical Olympiad. Even more impressive, on AIME 2025 it matched the performance of the DeepSeek-R1 model, which at 671B parameters is over 20 times larger. This indicates that EXAONE Deep can deliver strong reasoning without needing massive computational resources.

Even the lighter 7.8B and 2.4B models led their respective categories:

7.8B Model: Scored 94.8 on MATH-500 and 59.6 on AIME 2025.
2.4B Model: Delivered 92.3 on MATH-500 and 47.9 on AIME 2025.

These results highlight EXAONE Deep’s ability to offer powerful reasoning in compact models, suitable even for on-device applications.

EXAONE Deep’s strengths go beyond math. In professional science reasoning and software coding, the model once again set new standards.

The 32B version scored 66.1 on the GPQA Diamond test, which evaluates PhD-level knowledge in physics, chemistry, and biology. It also scored 59.5 on LiveCodeBench, a benchmark designed to measure coding proficiency.

Meanwhile, the 7.8B and 2.4B models repeated their success, claiming top spots in both GPQA Diamond and LiveCodeBench in their categories. This performance builds on the earlier success of EXAONE 3.5’s 2.4B model, which previously topped Hugging Face’s LLM Leaderboard for edge models.

Beyond specialized reasoning, EXAONE Deep also excels at general knowledge tasks. The 32B model’s 83.0 score on the MMLU benchmark underlines its broad understanding of diverse topics. This achievement solidifies its position as South Korea’s top domestic AI model in this area.

LG AI Research believes these advancements in EXAONE Deep mark a major step forward in developing AI systems capable of tackling high-level problem-solving. With continuous research and innovation, the company aims to shape a future where AI not only solves complex tasks but also contributes to enriching human lives.
