Hi! I'm a senior at the University of Pennsylvania studying computer science, currently working with Tomek Korbak through ML Alignment & Theory Scholars on reducing risks from AI.

I'm working to think more in public. When papers or talks significantly shape my thinking, I try to document them in my Reading Notes. I also write about ideas that interest me in my Blog Posts.

I used to operate helicopters in the Korean Marine Corps. Now, I'm on Penn's varsity rowing team. Before discovering ML, I competed in physics olympiads and tournaments.

Recent Papers

Distillation Robustifies Unlearning. Lee, B.W., Foote, A., Infanger, A., Shor, L., Kamath, H., Goldman-Wetzler, J., Woodworth, B., Cloud, A., and Turner, A.M. NeurIPS 2025 (Spotlight).
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs. Mazeika, M., Yin, X., Tamirisa, R., Lim, J., Lee, B.W., Ren, R., Phan, L., Mu, N., Khoja, A., Zhang, O., and Hendrycks, D. NeurIPS 2025 (Spotlight).
Programming Refusal with Conditional Activation Steering. Lee, B.W., Padhi, I., Ramamurthy, K.N., Miehling, E., Dognin, P., Nagireddy, M., and Dhurandhar, A. ICLR 2025 (Spotlight).

Writings

Neural Networks, Strange Attractors, and Orderliness in Chaos. Why do neural networks appear chaotic at the level of individual neurons yet produce high-level representations that can be probed linearly? Strange attractors may offer a useful intuition for this apparent paradox.
On Getting Started in Research. What does it mean to become a scientist? What makes science so attractive?
Mechanistically Programming a Language Model's Behavior. Can we do activation steering with fewer side effects? Can we program, rather than optimize, model behavior? This post introduces a technique for identifying and manipulating specific activation patterns to steer model outputs.
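
As a rough illustration of the underlying mechanism (not the conditional activation steering method from the post or paper), here is a minimal, generic activation-steering sketch: a fixed vector is added to one transformer block's hidden states through a PyTorch forward hook. The model name ("gpt2"), the layer index, and the random steering vector are arbitrary placeholders; in practice the steering direction would be derived from contrasting activations.

```python
# Minimal, generic activation-steering sketch (illustrative placeholders throughout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layer = model.transformer.h[6]             # a mid-depth block (GPT-2-specific path)
steer = torch.randn(model.config.n_embd)   # stand-in for a derived steering direction
steer = 4.0 * steer / steer.norm()         # the scale controls steering strength

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # add the steering vector to every position's residual-stream activation.
    if isinstance(output, tuple):
        return (output[0] + steer.to(output[0].dtype),) + output[1:]
    return output + steer.to(output.dtype)

handle = layer.register_forward_hook(add_steering)
try:
    ids = tok("The weather today is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=20, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # detach the hook so later generations are unsteered
```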