Yuki Tanaka

@yukitanaka

ML Researcher @ DeepMind

AI/ML researcher at DeepMind. Writing about transformers, reinforcement learning, and the future of AI.

London, UK · yukitanaka.ai

Yuki Tanaka✓@yukitanaka·95d

GPT-5 just leaked on Hugging Face and the benchmarks are... underwhelming?

MMLU: +2% over GPT-4o
HumanEval: +4%
Math: -1% (regression!)

Hot take: We're hitting diminishing returns on scale alone. The next breakthrough needs a new architecture, not more tokens.

#ai #gpt5 #llm #opinion

Yuki Tanaka✓@yukitanaka·95d

New paper dropped: "Attention Is All You Need... Again", a 2025 retrospective on transformer architectures.

Key findings:
• Sparse attention scales better than we thought
• MoE is eating the world
• KV cache optimization is the next frontier

Thread with my annotations 👇

#ai #transformers #research #machinelearning

Yuki Tanaka✓@yukitanaka·95d

Quick benchmark comparison: fine-tuning Llama 3 70B on 8x A100s vs 4x H100s. H100s finished 2.3x faster at 1.8x the cost. The price/performance sweet spot depends heavily on your batch size and sequence length. Full results in the repo.

#llm #benchmarks #ai #gpu
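
The post does not say whether "1.8x the cost" refers to the hourly price of the 4x H100 setup or to the total bill for the run. A rough sketch of the arithmetic under both readings (the function names are hypothetical, and only the 2.3x and 1.8x figures come from the post):

```python
# Figures quoted in the post.
SPEEDUP = 2.3      # 4x H100 finished 2.3x faster than 8x A100
COST_RATIO = 1.8   # "1.8x the cost" (ambiguous: hourly price or total bill)


def total_cost_ratio_if_hourly(speedup: float, hourly_ratio: float) -> float:
    """Reading A (assumption): 1.8x is the hourly price.

    Total cost = hourly price * runtime, and runtime shrinks by `speedup`,
    so the total bill relative to the A100 run is hourly_ratio / speedup.
    """
    return hourly_ratio / speedup


def hourly_ratio_if_total(speedup: float, total_ratio: float) -> float:
    """Reading B (assumption): 1.8x is the total bill for the run.

    The run is `speedup` times shorter, so the implied hourly price
    relative to the A100 setup is total_ratio * speedup.
    """
    return total_ratio * speedup


reading_a = total_cost_ratio_if_hourly(SPEEDUP, COST_RATIO)
reading_b = hourly_ratio_if_total(SPEEDUP, COST_RATIO)

print(f"Reading A: H100 run costs {reading_a:.2f}x the A100 run in total")
print(f"Reading B: H100 setup is {reading_b:.2f}x the A100 hourly price")
```

Under reading A the H100 run is both faster and cheaper overall. Under reading B you pay a premium for the shorter wall-clock time. Neither reading accounts for batch size or sequence length, which the post says shift the sweet spot.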

Yuki Tanaka✓@yukitanaka·96d

New tutorial: Fine-tuning Llama 3 on your own data using QLoRA.

Total cost: $4.20 on Lambda Cloud
Time: 47 minutes
Quality: Surprisingly good on domain-specific tasks

Step-by-step guide with code in the repo.

#llm #finetuning #tutorial #ai
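
For a rough sense of what the quoted numbers imply, here is a small sketch that backs out an hourly rate from the $4.20 and 47 minutes in the post. It assumes the $4.20 covers exactly those 47 minutes of billed time with no storage, setup, or minimum-billing charges, which the post does not confirm. The function name is hypothetical.

```python
# Figures quoted in the post.
TOTAL_COST_USD = 4.20
RUNTIME_MINUTES = 47


def implied_hourly_rate(total_cost_usd: float, runtime_minutes: float) -> float:
    """Hourly rate implied by a total cost and a runtime in minutes.

    Assumption: billing is purely proportional to runtime.
    """
    return total_cost_usd / (runtime_minutes / 60)


rate = implied_hourly_rate(TOTAL_COST_USD, RUNTIME_MINUTES)
print(f"Implied rate: ${rate:.2f}/hour")
```

This is only a back-of-the-envelope figure. The actual instance type and pricing are not stated in the post.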