Author order is alphabetical unless marked with an asterisk*.
Foundations of Large Language Models
Two Heads are Better than One: Simulating Large Transformers with Small Ones* [arXiv]
Hantao Yu, Josh Alman
Fast Attention Mechanisms: a Tale of Parallelism*
Jingwen Liu, Hantao Yu, Clayton Sanford, Alexandr Andoni, Daniel Hsu
Fundamental Limitations on Subquadratic Alternatives to Transformers [arXiv]
Josh Alman, Hantao Yu
International Conference on Learning Representations (ICLR), 2025
Machine Learning Theory
Robust Empirical Risk Minimization with Tolerance* [arXiv]
Robi Bhattacharjee, Max Hopkins, Akash Kumar, Hantao Yu, Kamalika Chaudhuri
International Conference on Algorithmic Learning Theory (ALT), 2023
Active Learning Polynomial Threshold Functions [arXiv]
Omri Ben-Eliezer, Max Hopkins, Chutong Yang, Hantao Yu
Neural Information Processing Systems (NeurIPS), 2022
Matrices and Tensors
Improving the Leading Constant of Matrix Multiplication [arXiv]
Josh Alman, Hantao Yu
Symposium on Discrete Algorithms (SODA), 2025
Tensor Ranks and the Fine-Grained Complexity of Dynamic Programming [arXiv] [my 20-min talk]
Josh Alman, Ethan Turok, Hantao Yu, Hengzhi Zhang
Innovations in Theoretical Computer Science (ITCS), 2024
Differential Privacy
Differentially Private Shortest Distances in Continual Release Model
Rachel Cummings, Tamalika Mukherjee, Jalaj Upadhyay, Hantao Yu, Zongrui Zou
Theory and Practice of Differential Privacy Workshop (TPDP), 2025