Title: Learning Dynamics of Some High-Dimensional Optimization Algorithms
Speaker: 王乐达, Yale University
Time: June 5, 14:30-15:30
Venue: Room 5407, Fifth Teaching Building (五教5407)
Abstract:
We study the learning dynamics of (1) a multi-pass, mini-batch stochastic gradient descent (SGD) procedure for empirical risk minimization in high-dimensional multi-index models with isotropic random data, using dynamical mean-field theory; and (2) some possible extensions to spin glasses with orthogonally invariant couplings and replica-symmetric free energy, using similar techniques. Our analyses imply that the limiting dynamics of SGD are the same for all batch sizes that share the same scaling, and that under a commensurate scaling of the learning rate, the dynamics of SGD, SME, and gradient flow are mutually distinct. Based on joint work with Zhou Fan. https://arxiv.org/abs/2601.21093
