Coding Reasoning Maths

Why Qwen QwQ 32B Could Be the Future of AI for Coding and Math Tasks

Have you ever found yourself frustrated by the limitations of AI models when tackling complex tasks like coding or solving intricate math problems? It’s a common struggle—balancing the need for ...

Geeky Gadgets

OpenAI o3-Mini Review & Performance Tested : Coding, Math and Logical Reasoning

Whether it’s automating tedious coding tasks, solving complex logic puzzles, or even weighing in on ethical dilemmas, AI tools like OpenAI’s o3-Mini promise to make our lives easier. But let’s be ...

VentureBeat

Microsoft’s GRIN-MoE AI model takes on coding and math, beating competitors in key benchmarks

Microsoft has unveiled a groundbreaking artificial intelligence model, ...

SiliconANGLE

Imandra’s new AI coding assistant CodeLogician uses ‘reasoning’ to guarantee the accuracy of its code

A startup called Imandra Inc. says it’s taking artificial intelligence-driven code completion to the next level with the launch of an entirely new and automated reasoning system called CodeLogician.

SiliconANGLE

Google revamps Gemini 2.5 Pro again, claiming superiority in coding and math

Google LLC has launched another, even more capable preview of its powerful Gemini 2.5 Pro model, proclaiming it to be the “most intelligent” large language model it has released so far. Today’s is the ...

scmp.com

Alibaba upgrades flagship Qwen3 model to outperform OpenAI, DeepSeek in maths, coding

Alibaba Group Holding unveiled an upgraded version of its third-generation Qwen3 family of large language models (LLMs), improving one of its members to score higher in maths and coding than products ...

NextBigFuture

OpenAI Releases O3 Model With High Performance and High Cost

OpenAI o3 sets new records in several key areas, particularly in reasoning, coding and mathematical problem-solving. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task in ...

TMCnet

Logical Intelligence Tops Leading AI Verification Benchmarks as Verified Code Generation Nears Reality with Aleph

Aleph, an AI coding agent, sets new records on four major formal reasoning benchmarks, proving that automated code generation can be formally verified for mission-critical systems.

TechSpot

OpenAI's newest o3 and o4-mini models excel at coding and math – but hallucinate more often

A hot potato: OpenAI's latest artificial intelligence models, o3 and o4-mini, have set new benchmarks in coding, math, and multimodal reasoning. Yet, despite these advancements, the models are drawing ...

Bloomberg L.P.

OpenAI Releases New Reasoning Models for Coding and Visual Tasks

OpenAI is rolling out a pair of new artificial intelligence models that mimic the process of human reasoning to field more complicated coding questions and visual tasks, the latest in a flurry of ...
