October 21 | Wenjia Wang (王文佳): Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates

Time: 2025-10-21 (Tuesday), 15:00 - 16:00

Venue: Room A1314, Science Building (理科大楼), Zhongbei Campus

Speaker: Wenjia Wang (王文佳), Assistant Professor, The Hong Kong University of Science and Technology (Guangzhou)

Host: Yaping Wang (王亚平), Professor, East China Normal University

Abstract:

Personalized services are fundamental to today's digital economy, with their online decision-making often framed as contextual bandit problems. Modern applications present two significant challenges for this framework: high-dimensional covariates and the need for nonparametric models that accurately reflect the complex relationships between rewards and covariates. We propose a new contextual bandit algorithm based on a sparse additive reward model that addresses both challenges via: (i) a double penalization method for nonparametric reward function estimation, and (ii) an epoch-based structure that effectively balances exploration and exploitation. We prove that the cumulative regret of our algorithm is sublinear in the time horizon $T$ and grows linearly with $\log(d)$, the logarithm of the covariate dimensionality $d$. Through extensive numerical experiments, we show our algorithm's superior performance in high-dimensional settings compared to existing algorithms.

Speaker bio:

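Wenjia Wang is an Assistant Professor in the Data Science and Analytics Thrust of the Information Hub at The Hong Kong University of Science and Technology (Guangzhou), and received a Ph.D. in industrial engineering from the Georgia Institute of Technology in August 2018. Wang's research interests include uncertainty quantification, stochastic simulation, machine learning, nonparametric statistics, and computer experiments. Wang has published dozens of papers in top journals and conferences in statistics, machine learning, and management science, including Journal of the American Statistical Association, Journal of Machine Learning Research, Management Science, Technometrics, NeurIPS, ICLR, and ICML.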

Posted by: Ying Zhang (张瑛) | Posted on: 2025-10-20