Recent Posts
You and Your Big Heart Will Win
For the last 6 years I’ve dabbled in many things, but only one thing consistently. I’ve been running this weekly event, DLCT, rarely missing a week, for 6 years straight. The format of DLCT is simply “talks”: a speaker comes to talk about a deep learning paper (usually one of their own) and engages with the audience (between 40 and 80 people on average) for an hour.
I Hope You Still Try
Hello, future you.
Two Years of MLC: My Protests
A little over a year ago, I wrote a 6000-word retrospective, “A Year of MLC: Selfish Takes Only,” reflecting on building ML Collective, the non-profit and non-traditional researchers community, for a full year.
Recent Publications
- 2026: The Topological Trouble With Transformers
- 2025: Enhancing LLM Planning Capabilities through Intrinsic Self-Critique
- 2024: TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models (ICCV 2025)
- 2024: Logit Scaling for Out-of-Distribution Detection
- 2024: Training language models on the knowledge graph: Insights on hallucinations and their detectability (COLM 2024)
- 2024: Improve mathematical reasoning in language models by automated process supervision
- 2024: Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation
- 2024: Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
- 2023: Beyond human data: Scaling self-training for problem-solving with language models (TMLR)
- 2023: Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
- 2022: Character-Aware Models Improve Visual Text Rendering (ACL 2023)
- 2022: Extremely Simple Activation Shaping for Out-of-Distribution Detection (ICLR 2023)