Currently, I am interested in building the mathematical foundations of large language models and in mechanistic interpretability.
Mathematical Foundations of Large Language Models: In recent years, researchers have sought to understand why modern neural networks perform so well on a wide range of tasks. I study what LLMs (mostly transformers) can or cannot do from an expressivity viewpoint - that is, whether there exist weights with which a transformer can efficiently perform certain algorithmic tasks.
Mechanistic Interpretability: Mechanistic interpretability aims to understand the internal workings of neural networks. I have worked on understanding the implicit bias of transformers - which solutions gradient-based training tends to reach for various tasks.
See my papers below for details! (Authors of papers marked with an asterisk* are listed in alphabetical order.)