April 19 | Pengkun Yang: Two phases of scaling laws for nearest neighbor classifiers

Time: April 19, 2024, 9:15-10:00

Venue: Room A1114, Science Building, Putuo Campus

Speaker: Pengkun Yang, Assistant Professor, Tsinghua University

Host: Zhou Yu (於州), Professor, East China Normal University

Abstract:

A scaling law refers to the observation that the test performance of a model improves as the amount of training data increases. A fast scaling law implies that one can solve machine learning problems by simply increasing the data and model sizes. Yet, in many cases, the benefit of adding more data can be negligible. In this work, we study the rate of scaling laws of nearest neighbor classifiers. We show that a scaling law can have two phases: in the first phase, the generalization error depends polynomially on the data dimension and decreases fast, whereas in the second phase, the error depends exponentially on the data dimension and decreases slowly. Our analysis highlights the role of the complexity of the data distribution in determining the generalization error. When the data are benignly distributed, our result suggests that a nearest neighbor classifier can achieve a generalization error that depends polynomially, instead of exponentially, on the data dimension.

About the speaker:

Pengkun Yang is an assistant professor at the Center for Statistical Science at Tsinghua University. Prior to joining Tsinghua, he was a Postdoctoral Research Associate at the Department of Electrical Engineering at Princeton University. He received a Ph.D. degree (2018) and a master's degree (2016) from the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, and a B.E. degree (2013) from the Department of Electronic Engineering at Tsinghua University. His research interests include statistical inference, learning, optimization, and systems. He is a recipient of the Thomas M. Cover Dissertation Award in 2020 and of the Jack Keil Wolf ISIT Student Paper Award at the 2015 IEEE International Symposium on Information Theory (semi-plenary talk).

Posted by: Zhang Ying (张瑛) | Posted on: 2024-04-10