Niklas Muennighoff

I'm building Composer at Cursor, a frontier AI coding model. I'm also a PhD student & Knight-Hennessy Scholar at Stanford, advised by Yejin Choi & Andrew Ng.

My work includes MTEB, a widely used AI evaluation framework with 18M+ downloads; s1, which helped define test-time scaling; and Scaling Data-Constrained LMs, which helped establish multi-epoch pretraining, now standard at frontier labs.

Awards include NeurIPS Outstanding Paper Runner-Up, CVPR Best Paper Honorable Mention, ACL Best Paper + Best Theme Paper + Best Resource Paper, and 2nd/3300+ in Meta's Hateful Memes Challenge.

I did my bachelor's at Peking University & am comfortable working in Chinese.

Select AI Research

posttraining reasoning
EMNLP & ICLR 2025 Reasoning Workshop · Best Paper Award

TL;DR: Training LLMs to reason with just 1K training samples & a simple technique to control reasoning duration called "budget forcing".

Contact

Questions on papers I’ve co-authored: GitHub issues on the relevant code repository are usually the best place :)

Starting AI Research: If you want to get started in research, I recommend contributing to MTEB. We’re a community building the go-to place for everything embeddings, with 400K monthly users on our leaderboard & regular publications you can co-author! Example papers from our community: MMTEB, MIEB, MAEB, HUME, SEB.

My email is n.muennighoff@gmail.com :)

Other

Health: I'm pretty into health optimization; my fav sports are swimming/beachvb/tennis :)

Languages: I've worked in Chinese, Japanese, English, German & French. I also took extensive AI coursework in Chinese at Peking University & passed their Chinese placement test with 100/100.

Arts: As a kid I worked as a voice-over artist for 8 years dubbing German voices for Peter Pan (Disney), Pokemon, Game of Thrones (HBO), Dracula (NBC) & others (sample: Gortimer here/here & Victor here) 🎬
