Academic Seminar

October 21 | Wenjia Wang (王文佳): Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates

Time: 2025-10-21 (Tuesday), 15:00 - 16:00

Venue: Room A1314, Science Building (理科大楼), Zhongbei Campus

Speaker: Wenjia Wang (王文佳), Assistant Professor, The Hong Kong University of Science and Technology (Guangzhou)

Host: Yaping Wang (王亚平), Professor, East China Normal University

Abstract:

Personalized services are fundamental to today's digital economy, with their online decision-making often framed as contextual bandit problems. Modern applications present two significant challenges for this framework: high-dimensional covariates and the necessity for nonparametric models to accurately reflect the complex relationships between rewards and covariates. We propose a new contextual bandit algorithm based on a sparse additive reward model that addresses both challenges via: (i) a double penalization method for nonparametric reward function estimation, and (ii) an epoch-based structure that effectively balances exploration and exploitation. We prove that the cumulative regret of our algorithm is sublinear in the time horizon $T$ and grows linearly in $\log(d)$, the logarithm of the covariate dimension $d$. Through extensive numerical experiments, we show that our algorithm outperforms existing algorithms in high-dimensional settings.

Speaker bio:

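Wenjia Wang (王文佳) is an Assistant Professor in the Data Science and Analytics Thrust of the Information Hub at The Hong Kong University of Science and Technology (Guangzhou), and received a Ph.D. from the industrial engineering department of the Georgia Institute of Technology in August 2018. Wang's research interests include uncertainty quantification, stochastic simulation, machine learning, nonparametric statistics, and computer experiments. Wang has published dozens of papers in top journals and conferences in statistics, machine learning, and management science, including the Journal of the American Statistical Association, the Journal of Machine Learning Research, Management Science, Technometrics, NeurIPS, ICLR, and ICML.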


Posted by: Ying Zhang (张瑛) | Date posted: 2025-10-20