Academic Seminar of the Institute of Interdisciplinary Mathematical Sciences (Prof. Jinshan Zeng, Jiangxi Normal University)

Source: System Administrator    Published: 2024-10-21

Title: On ADMM in Deep Learning: Convergence and Saturation-Avoidance

Speaker: Prof. Jinshan Zeng, Jiangxi Normal University

Time: Thursday, October 24, 2024, 10:30-11:30

Venue: 20-200

Abstract: In this talk, we introduce an alternating direction method of multipliers (ADMM) for training deep neural networks with sigmoid-type activation functions (called the sigmoid-ADMM pair). The approach is mainly motivated by the gradient-free nature of ADMM, which avoids the saturation of sigmoid-type activations, and by the advantages in approximation of deep neural networks with sigmoid-type activations (called deep sigmoid nets) over their rectified linear unit (ReLU) counterparts (called deep ReLU nets). In particular, we prove that the approximation capability of deep sigmoid nets is no worse than that of deep ReLU nets by showing that the ReLU activation function can be well approximated by deep sigmoid nets with two hidden layers and finitely many free parameters, but not vice versa. We also establish the global convergence of the proposed ADMM for the nonlinearly constrained formulation of deep sigmoid net training to a Karush-Kuhn-Tucker (KKT) point at a rate of order O(1/k). Compared with the widely used stochastic gradient descent (SGD) algorithm for training deep ReLU nets (called the ReLU-SGD pair), the proposed sigmoid-ADMM pair is practically stable with respect to algorithmic hyperparameters, including the learning rate and initialization schemes. Moreover, we find that the proposed sigmoid-ADMM pair numerically outperforms the ReLU-SGD pair in approximating and learning simple but important functions.
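As a small illustration of the saturation issue mentioned in the abstract, the following Python sketch evaluates the derivative of the logistic sigmoid at increasingly large pre-activations and shows how quickly a product of such derivatives shrinks with depth. It is not taken from the talk: the function names are hypothetical, and the "chained" product is only a toy stand-in for the factors that appear in backpropagation through a deep sigmoid net.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s(x) * (1 - s(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def chained_grad(x: float, depth: int) -> float:
    """Toy model (an assumption for illustration): the product of `depth`
    sigmoid derivatives, all evaluated at the same pre-activation x."""
    return sigmoid_grad(x) ** depth


if __name__ == "__main__":
    # The derivative decays rapidly as |x| grows (saturation), and the
    # product over several layers decays faster still.
    for x in (0.0, 2.0, 5.0, 10.0):
        print(f"x={x:5.1f}  grad={sigmoid_grad(x):.3e}  "
              f"5-layer product={chained_grad(x, 5):.3e}")
```

Gradient-based training relies on these factors staying away from zero, which is the difficulty that a gradient-free scheme such as ADMM is meant to sidestep.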

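About the speaker: Jinshan Zeng is a professor and doctoral supervisor at the School of Computer and Information Engineering, Jiangxi Normal University. He currently serves as associate dean of the school, director of the Jiangxi Provincial Key Laboratory of High-Performance Computing, and executive committee member of the CCF Science Innovation Forum. He received his PhD from Xi'an Jiaotong University in 2015 under the supervision of Zongben Xu. He has held postdoctoral or visiting research positions at the Institute of Electronics of the Chinese Academy of Sciences, the University of California, Los Angeles, the Hong Kong University of Science and Technology, and City University of Hong Kong. In 2017 he was selected for the first cohort of Jiangxi Normal University's high-end talent program, and in 2019 for a major talent program of Jiangxi Province. He has led 3 projects funded by the National Natural Science Foundation of China and participated in several others. He has published more than 70 papers in mainstream journals and conferences in fields related to artificial intelligence, including more than 20 papers in JMLR and the IEEE Transactions series and 20 CCF Class A papers. Two of his papers received the Best Paper Award of the International Consortium of Chinese Mathematicians (2018 and 2020); one paper was listed for two consecutive years in the top ten of the "Hot Papers in Mathematics in China" list (ranked 5th in 2022 and 4th in 2023); and his most-cited paper has more than 1,300 citations (Google Scholar). He holds 15 granted invention patents and 9 registered software copyrights. Students under his supervision have won more than 10 national-level academic competition awards, including a national grand prize in the "Challenge Cup", and the related research has been widely covered by mainstream media including People's Daily, China Youth Net, and the Xuexi Qiangguo platform. He has twice been invited to give 45-minute talks at the International Congress of Chinese Mathematicians, and has served more than 10 times as vice chair or forum chair of high-level international conferences. His main research interests are the theory of artificial intelligence algorithms and its applications.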

Host: Daohong Xiang
