Home

I'm a senior at the University of Pennsylvania working with Tomek Korbak through MATS on reducing risks from misaligned agents. Previously, I worked with Alex Turner and Alex Cloud on capability scoping.

Before moving to the US, I served in the Republic of Korea Marines, where I operated helicopters. Now, I'm on Penn's men's lightweight rowing team.

I'm learning to think more in public! I document things that interest me on this website. If you only read one thing, I recommend Bitter Lessons from "Distillation Robustifies Unlearning".

Papers Go to Google Scholar →

Distillation Robustifies Unlearning Bruce W. Lee, Addie Foote, Alex Infanger, Leni Shor, Harish Kamath, Jacob Goldman-Wetzler, Bryce Woodworth, Alex Cloud, Alexander Matt Turner NeurIPS 2025 (Spotlight)

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs Mantas Mazeika, Xuwang Yin, Rishub Tamirisa, Jaehyuk Lim, Bruce W. Lee, Richard Ren, Long Phan, Norman Mu, Adam Khoja, Oliver Zhang, Dan Hendrycks NeurIPS 2025 (Spotlight)

Programming Refusal with Conditional Activation Steering Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Erik Miehling, Pierre Dognin, Manish Nagireddy, Amit Dhurandhar ICLR 2025 (Spotlight)

Informal ThoughtsBrowse all →

Blog Posts

Bitter Lessons from "Distillation Robustifies Unlearning" December, 2025

Neural Networks, Strange Attractors, and Orderliness in Chaos August, 2025

On Getting Started in Research November, 2024

Mechanistically Programming a Language Model's Behavior September, 2024

Technical Blurbs

Loss for an example is just the negative of the log-probability that the model assigned to the true class? December, 2025

Error Bars as Degrees of Belief November, 2025

Reasoning about Neural Network Training with Bias-Variance Tradeoff November, 2025

Information Theory and Logistic Regression Both Arrive at Cross-Entropy October, 2025

Equivariance and Invariance Explain So Much of Deep Learning October, 2025

Why Not Initialize Neural Network Weight to Zero? September, 2025