Institute of Interdisciplinary Mathematical Sciences Academic Seminar (Prof. Shaogao Lv, Nanjing Audit University)

Source: System Administrator  Published: 2024-11-21

Title: Decentralized TD Learning with Spatio-temporal Information Dependence

Speaker: Prof. Shaogao Lv, Nanjing Audit University

Time: Tuesday, November 26, 2024, 14:30-15:30

Venue: 20-308

Abstract: We study the decentralized policy evaluation problem in multi-agent reinforcement learning using the temporal-difference (TD) method with linear function approximation. Unlike prior stochastic optimization methods, we develop an improved dual averaging method for estimating the state-value function. At each agent, the direction of each update incorporates historical information about the agent's own gradient, as well as information about the global gradient of the network at the current time. The proposed policy evaluation method thus fully exploits gradient information across both space and time, which can improve the communication efficiency between agents. Moreover, we provide a finite-time theoretical analysis under both independent and identically distributed (i.i.d.) data and Markovian data.
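For readers unfamiliar with the setting, the following is a minimal sketch of how a TD(0) update with linear function approximation can be combined with a textbook distributed dual averaging step. It is not the speaker's improved method, whose details are not given in the abstract. The names `td_direction` and `dual_averaging_round`, the mixing matrix `W`, and the fixed `step` parameter are all assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td_direction(theta, phi_s, reward, phi_next, gamma):
    """TD(0) semi-gradient direction delta * phi(s) for a linear value estimate.

    The value of a state is approximated as dot(theta, phi(state)).
    """
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi_s)
    return [delta * x for x in phi_s]


def dual_averaging_round(z, thetas, samples, W, gamma, step):
    """One synchronous round of standard distributed dual averaging (illustrative).

    z[i]       : agent i's accumulated update directions (its own history)
    thetas[i]  : agent i's current parameter vector
    samples[i] : agent i's local transition (phi_s, reward, phi_next)
    W          : doubly stochastic mixing matrix (assumed), W[i][j] > 0 only
                 when agents i and j communicate
    step       : step size applied to the accumulated directions (assumed)
    """
    n = len(z)
    dim = len(z[0])
    new_z = []
    for i in range(n):
        # Spatial information: mix the accumulated directions of neighbours.
        mixed = [sum(W[i][j] * z[j][k] for j in range(n)) for k in range(dim)]
        # Temporal information: add the new local TD direction to the history.
        phi_s, reward, phi_next = samples[i]
        g = td_direction(thetas[i], phi_s, reward, phi_next, gamma)
        new_z.append([m + gk for m, gk in zip(mixed, g)])
    # With a squared-norm regularizer, the dual averaging iterate is just the
    # scaled accumulated direction.
    new_thetas = [[step * zk for zk in zi] for zi in new_z]
    return new_z, new_thetas
```

In practice the step size would typically decay over rounds and the samples would come from each agent's own trajectory; both are left to the caller in this sketch.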

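About the speaker: Shaogao Lv is a professor and doctoral supervisor at the School of Statistics and Data Science, Nanjing Audit University. He received his Ph.D. in science in 2011 from the joint doctoral program of the University of Science and Technology of China and City University of Hong Kong. His main research area is statistical machine learning, and his current interests include federated learning, statistical reinforcement learning, and the theoretical analysis of deep learning. To date he has published more than 30 papers in SCI-indexed international journals, including statistics and artificial intelligence journals such as Annals of Statistics and Journal of Machine Learning Research, as well as at top artificial intelligence conferences such as NeurIPS and IJCAI. He has led four projects funded by the National Natural Science Foundation of China. He has long served as a program committee member or reviewer for the top artificial intelligence conferences NeurIPS, ICML, AAAI, and AISTATS.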

Host: Daohong Xiang