October 11: Mingyao Ai | Optimal Subsampling Algorithm for Big Data Generalized Linear Models

Posted: 2018-10-09

Time: October 11, 14:00-15:00

Venue: Conference Room 135, Fashang South Building (法商南楼), Minhang Campus

Title: Optimal Subsampling Algorithm for Big Data Generalized Linear Models

Speaker: Mingyao Ai (Peking University)

Abstract: To quickly approximate the maximum likelihood estimator with massive data, Wang et al. (JASA, 2017) proposed an optimal subsampling method under the A-optimality criterion (OSMAC) for logistic regression. This paper extends the scope of the OSMAC framework to include generalized linear models with canonical link functions. The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using a Frobenius-norm matrix concentration inequality, finite-sample properties of the subsample estimator based on the optimal subsampling probabilities are derived. Since the optimal subsampling probabilities depend on the full-data estimate, an adaptive two-step algorithm is developed. Asymptotic normality and optimality of the estimator from this adaptive algorithm are established.

The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.
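
For readers unfamiliar with the two-step idea mentioned in the abstract, the following is a minimal sketch for the logistic regression case, not the speaker's implementation. It assumes subsampling probabilities proportional to |y_i - p_i| * ||x_i||, the form usually associated with the L-optimality criterion in logistic regression; that formula, the function names `fit_logistic` and `two_step_subsample`, and the sizes `r0` and `r` are illustrative assumptions rather than details taken from the talk.

```python
import numpy as np

def sigmoid(z):
    # Clip the linear predictor to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logistic(X, y, w=None, n_iter=50, tol=1e-8):
    """Weighted logistic-regression MLE via Newton's method."""
    n, d = X.shape
    # Rescaling the weights does not change the maximizer.
    w = np.ones(n) if w is None else w / w.mean()
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.linalg.norm(step) < tol:
            break
    return beta

def two_step_subsample(X, y, r0, r, rng):
    """Hypothetical two-step subsampling estimator (illustrative only)."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample gives a rough estimate beta0.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = fit_logistic(X[idx0], y[idx0])
    # Step 2: approximate the optimal probabilities using beta0
    # (assumed form: pi_i proportional to |y_i - p_i| * ||x_i||).
    score = np.abs(y - sigmoid(X @ beta0)) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx1 = rng.choice(n, size=r, replace=True, p=pi)
    # Combine both subsamples, weighting by inverse sampling probability
    # (uniform pilot draws have probability 1/n, hence weight n).
    idx = np.concatenate([idx0, idx1])
    w = np.concatenate([np.full(r0, float(n)), 1.0 / pi[idx1]])
    return fit_logistic(X[idx], y[idx], w)
```

A call such as `two_step_subsample(X, y, r0=500, r=2000, rng=np.random.default_rng(0))` only fits models on the `r0 + r` sampled rows, while the probabilities themselves require a single pass over the full data.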

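About the speaker: Mingyao Ai (艾明要) received his Ph.D. from Nankai University in 2003 and has worked at the School of Mathematical Sciences, Peking University, ever since. From August 2007 to January 2009 he was a visiting scholar at the Georgia Institute of Technology, USA. He is currently Professor, doctoral supervisor, and Director of the Statistics Teaching and Research Section at the School of Mathematical Sciences, Peking University. He also serves as an executive council member of the Chinese Association for Applied Statistics (中国现场统计研究会), chair of its Experimental Design Branch, vice chair of its High-Dimensional Data Statistics Branch, and secretary-general of its Spatial Statistics Branch; as an associate editor of the international statistics journals Statistica Sinica, Journal of Statistical Planning and Inference, Statistics and Probability Letters, and STAT; and as an editorial board member of the Chinese mathematics journal Journal of Systems Science and Mathematical Sciences (《系统科学与数学》). His teaching and research focus on the design and analysis of experiments, computer experiments, big data analysis, and applied statistics. He has published more than sixty papers in leading Chinese and international journals, including Ann Statist, JASA, Biometrika, Technometrics, and Statist Sinica. He has led five General Program projects and one Key Program sub-project of the National Natural Science Foundation of China, and has participated in two projects under the Ministry of Science and Technology's 973 Program.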