[Conference News] Roundup of AI Events (May 9-14)

Demystifying (Deep) Reinforcement Learning

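Speaker: Zhuoran Yang, Princeton University

Time: Friday, May 14, 11:00 am-12:00 pm

How to attend: (in person) Room 101, Jingyuan Courtyard No. 5 (静园五院101)

(online) Tencent Meeting

https://meeting.tencent.com/s/LDapGj0QxMBq

Meeting ID: 363 421 047

Audience registration: scan the QR code (the meeting password is provided after registration)

Talk Information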

Title

Demystifying (Deep) Reinforcement Learning: The Pessimist, The Optimist, and Their Provable Efficiency

Abstract

Coupled with powerful function approximators such as deep neural networks, reinforcement learning (RL) has achieved tremendous empirical success. However, its theoretical understanding lags behind. In particular, it remains unclear how to provably attain the optimal policy with finite regret or sample complexity. In this talk, we will present two sides of the same coin, which reveal an intriguing duality between pessimism and optimism.

- In the offline setting, we aim to learn the optimal policy based on a dataset collected a priori. Due to a lack of active interactions with the environment, we suffer from the insufficient coverage of the dataset. To maximally exploit the dataset, we propose a pessimistic least-squares value iteration algorithm, which achieves a minimax-optimal sample complexity.

- In the online setting, we aim to learn the optimal policy by actively interacting with an environment. To strike a balance between exploration and exploitation, we propose an optimistic least-squares value iteration algorithm, which achieves a \sqrt{T} regret in the presence of linear, kernel, and neural function approximators.
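The two bullets above can be illustrated with the linear function approximation case. The block below is a minimal sketch, not taken from the talk: the feature map \phi, ridge parameter \lambda, bonus scale \beta, horizon H, and the assumption that rewards lie in [0, 1] are all notation introduced here for illustration. Both variants share the same ridge-regression estimate and differ only in the sign of the uncertainty term.

```latex
% Illustrative notation (assumed, not from the abstract):
% \phi(s,a) \in \mathbb{R}^d is a feature map, \tau indexes observed transitions
% (s_h^\tau, a_h^\tau, r_h^\tau, s_{h+1}^\tau) at step h, rewards lie in [0,1],
% \hat V_h(s) = \max_a \hat Q_h(s,a), and \hat V_{H+1} \equiv 0.
\begin{align*}
\Lambda_h &= \lambda I + \sum_{\tau} \phi(s_h^\tau, a_h^\tau)\,\phi(s_h^\tau, a_h^\tau)^\top, \\
\hat w_h &= \Lambda_h^{-1} \sum_{\tau} \phi(s_h^\tau, a_h^\tau)\,
            \bigl[\, r_h^\tau + \hat V_{h+1}(s_{h+1}^\tau) \,\bigr], \\
\Gamma_h(s,a) &= \beta \sqrt{\phi(s,a)^\top \Lambda_h^{-1}\, \phi(s,a)}, \\
\text{optimistic (online):}\quad
\hat Q_h(s,a) &= \min\bigl\{\, \phi(s,a)^\top \hat w_h + \Gamma_h(s,a),\; H-h+1 \,\bigr\}, \\
\text{pessimistic (offline):}\quad
\hat Q_h(s,a) &= \max\bigl\{\, \min\{\, \phi(s,a)^\top \hat w_h - \Gamma_h(s,a),\; H-h+1 \,\},\; 0 \,\bigr\}.
\end{align*}
```

In this sketch the term \Gamma_h(s,a) is large for state-action pairs that the data covers poorly. Adding it encourages exploration of such pairs online, while subtracting it discourages relying on them when only a fixed dataset is available.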

Biography

Zhuoran Yang is a final-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University, advised by Professor Jianqing Fan and Professor Han Liu. Before attending Princeton, he obtained a Bachelor of Mathematics degree from Tsinghua University. His research interests lie at the interface of machine learning, statistics, and optimization. The primary goal of his research is to design a new generation of machine learning algorithms for large-scale and multi-agent decision-making problems, with both statistical and computational guarantees. He is also interested in applying learning-based decision-making algorithms to real-world problems that arise in robotics, personalized medicine, and computational social science.

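May 14: CCF C³-04 @ Baidu: AI + Open Source (in-person event)

CCF C³ is a series of sessions launched by the CCF CTO Club in which technical experts from industry share popular technologies and strategy. The first stop, at JD.com, covered technologies for intelligent customer service; the second, at Xiaomi, discussed trends in the smart home; the third, at Sogou, explored deep semantic learning and web search. The fourth stop will be at Baidu, on the topic "AI + Open Source".

Scan the QR code below to register. For the in-person event, pre-registered applicants are invited to attend only after CCF reviews and approves their registration.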

Topic

AI + Open Source

Time

Friday, May 14, 18:30-21:30

Venue

Building 2, Baidu Technology Park, No. 10 Xibeiwang East Road, Haidian District, Beijing

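Agenda

18:30-19:00

Working dinner and networking

19:00-19:05

Zhou Ming (周明), CCF Vice President, Chief Scientist of Sinovation Ventures

Opening remarks from CCF

19:05-19:10

Wu Hua (吴华), Executive Member of the CCF Enterprise Working Committee, Chair of the Baidu Technical Committee

Opening remarks from the host organization

19:10-19:35

Yu Dianhai (于佃海), CCF Senior Member, Chief Architect of PaddlePaddle (飞桨), Baidu's open-source deep learning platform

Talk: "AI + Open Source: Baidu PaddlePaddle's Thinking and Practice"

19:35-20:20

Panel on "AI + Open Source"

Panelists (ordered by surname in pinyin):

Cui Baoqiu (崔宝秋), Director of the CCF Enterprise Working Committee, Vice President of Xiaomi Group

Huang Dongxu (黄东旭), Co-founder and CTO of PingCAP, founder of the top open-source project TiDB (joining remotely)

Ma Yue (马越), Chairman of Hengtuo Open Source (恒拓开源), CEO of OSCHINA (开源中国)

Ma Yanjun (马艳军), Senior Director of Baidu's Deep Learning Technology Platform Department

Wu Sheng (吴晟), Founding Engineer at Tetrate, first Chinese director of the Apache Software Foundation, founder and VP of Apache SkyWalking

20:20-20:23

Thanks to the speakers

20:23-20:25

Host flag handover ceremony

20:25-20:45

Q&A

20:45-21:30

Open networking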

Source | 人工智能哲学探索 (Philosophical Explorations of Artificial Intelligence)
