January 9, 2026
Deception Detection is Hard and Here's Why
Findings from my sprint with Jake Ward in Neel Nanda's 9.0 training phase.
AI Safety & Alignment Research
December 2, 2025
A brief post on one of my mini-projects from Neel Nanda's 9.0 training phase.
December 1, 2025
The extension of our MNIST digit MI exploration: SAE-ception.
August 3, 2025
In this post, I explore the intersection of deep supervision and mechanistic interpretability on MNIST digit classification.