I'm a senior studying computer science at the University of Pennsylvania, currently working with Tomek Korbak through ML Alignment & Theory Scholars on reducing risks from AI.

Previously, I spent two years leading NLP research at an EdTech startup. I've also done research at IBM and NAVER Cloud, where I proposed conditional activation steering and curriculum instruction tuning.

I've been fortunate to learn from diverse mentors throughout my journey: Alex Cloud, Alex Turner, Mantas Mazeika, Inkit Padhi, Karthikeyan N. Ramamurthy, Hyunsoo Cho, and Kang Min Yoo.

Besides research, I spend a lot of time building and improving research tooling. Some of my open source projects include Activation-Steering (2024), LFTK (2023), and LingFeat (2021).

I'm working to think more in public. When papers significantly shape my thinking, I try to document them in my Reading Notes. I also write about ideas that interest me in my Blog Posts.

Outside of research, I served as cabin crew on a military utility helicopter in the Korean Marine Corps. Now, I compete on Penn's varsity rowing team. Before discovering ML, I competed in physics olympiads and tournaments throughout middle and high school.

Papers Browse all →

Distillation Robustifies Unlearning. Lee, B.W., Foote, A., Infanger, A., Shor, L., Kamath, H., Goldman-Wetzler, J., Woodworth, B., Cloud, A. and Turner, A.M. NeurIPS 2025 (Spotlight)
Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs. Mazeika, M., Yin, X., Tamirisa, R., Lim, J., Lee, B.W., Ren, R., Phan, L., Mu, N., Khoja, A., Zhang, O. and Hendrycks, D. NeurIPS 2025 (Spotlight)
Programming Refusal with Conditional Activation Steering. Lee, B.W., Padhi, I., Ramamurthy, K.N., Miehling, E., Dognin, P., Nagireddy, M. and Dhurandhar, A. ICLR 2025 (Spotlight)

Writings Browse all →

Neural Networks, Strange Attractors, and Orderliness in Chaos. Why do neural networks appear chaotic at the neuron level yet produce high-level representations that can be probed linearly? Strange attractors might offer a useful intuition for this apparent paradox.
On Getting Started in Research. What does it mean to become a scientist? What makes science so attractive?
Mechanistically Programming a Language Model's Behavior. Can we do activation steering with fewer side effects? Can we program, rather than optimize, model behavior? This post introduces a technique for identifying and manipulating specific activation patterns to steer model outputs.
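For context on what that means in practice, here is a minimal, illustrative sketch of plain activation steering, not the conditional method from the post or the exact API of the Activation-Steering library: a steering vector is computed as the difference of hidden states on two contrasting prompts, then added into a middle layer at inference time. The model name, layer index, and steering strength below are assumptions chosen for demonstration.

```python
# Minimal sketch of activation steering under assumed choices of model,
# layer, and strength; not the conditional-steering method from the post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed small model, purely for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

LAYER = 6       # assumed block index to steer
STRENGTH = 4.0  # assumed steering strength

def mean_hidden(text: str) -> torch.Tensor:
    """Mean hidden state after block LAYER for a single prompt."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so block LAYER is index LAYER + 1.
    return out.hidden_states[LAYER + 1].mean(dim=1)  # shape: (1, d_model)

# Steering vector: difference of activations on a contrasting prompt pair.
steer = mean_hidden("I am warm, polite, and helpful.") \
      - mean_hidden("I am cold, rude, and dismissive.")

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # shift them by the steering vector and pass the rest through unchanged.
    return (output[0] + STRENGTH * steer,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
prompt = tok("Tell me about your day.", return_tensors="pt")
print(tok.decode(model.generate(**prompt, max_new_tokens=30)[0]))
handle.remove()  # detach the hook so later calls run unsteered
```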