Tagged

#machine-learning

9 articles

DiffusionGemma: Google's open model that writes text in parallel

Google DeepMind shipped an open-weight model that generates text by denoising a block of tokens instead of predicting one at a time. Here's how it works and where it fits.

12 Jun 2026 10 min read ai machine-learning llm

GLM-5 vs Kimi K2.5 vs MiniMax M2.5: a practical coding showdown

Three Chinese MoE models claim frontier-class coding at a fraction of Opus pricing. Here's how they actually perform in Claude Code, Cursor, and real developer workflows.

18 Mar 2026 11 min read ai llm developer-tools

Nemotron 3 Super: NVIDIA's 120B open-weight model built for agentic workloads

NVIDIA's Nemotron 3 Super has 120B total parameters with only 12B active, combining Mamba-2, Transformers, and a novel LatentMoE, all open-weight and purpose-built for multi-agent systems.

12 Mar 2026 8 min read ai machine-learning llm

Kimi K2.5 vs Claude Opus 4.6: a practical comparison for developers

Kimi K2.5 delivers 95% of Opus 4.6's coding capability at 10-25× lower cost. But the benchmarks don't tell the whole story.

2 Mar 2026 8 min read ai llm developer-tools

Anthropic studied 1.5 million conversations and found we're not checking AI's work

Two new Anthropic studies reveal that users voluntarily cede judgment to AI — and feel good about it until things go wrong.

20 Feb 2026 7 min read ai machine-learning ai-safety

Gemini 3.1 Pro: Google's reasoning leap

Google's latest frontier model more than doubles its predecessor's reasoning score in three months, leads 13 of 16 benchmarks, and ships at the same price. The adaptive compute architecture is the interesting part.

19 Feb 2026 6 min read machine-learning llm ai

Claude Sonnet 4.6: Opus-level performance at a fifth of the price

Anthropic's second major launch in two weeks puts near-flagship capability at $3/$15 per million tokens. The mid-tier label is starting to feel like a misnomer.

17 Feb 2026 9 min read machine-learning llm ai

MiniMax M2.5: the agent-native model that costs a dollar an hour

A 230B MoE model with 10B active parameters hits 80.2% on SWE-Bench Verified at 1/20th the cost of Opus. Here's what's real and what's hype.

12 Feb 2026 10 min read machine-learning llm open-source

GLM-5: the open-weight model that's rewriting the frontier economics

Zhipu AI's 744B mixture-of-experts model ships under MIT license with frontier-class benchmarks and aggressive pricing. Here's what actually matters.

11 Feb 2026 8 min read machine-learning llm open-source