This course provides students with an in-depth understanding of optimization methods tailored for machine learning. It covers the foundations of convex analysis and optimization algorithms widely used in machine learning, such as gradient descent, subgradient, and projected gradient methods. Proximal methods and regularized optimization are covered for handling problems with non-smooth objectives or constraints. Stochastic gradient methods, including stochastic (sub)gradient descent and variance-reduced stochastic gradient, are discussed for handling large datasets. The course concludes by exploring non-convex optimization using coordinate descent, Newton's and quasi-Newton methods, and adaptive methods. The course emphasizes both the theoretical foundations and the practical applications of these techniques within the context of machine learning algorithms. By the end of the course, students will be well-versed in advanced optimization methods and equipped to apply them effectively in machine learning scenarios to improve model performance, convergence rates, and robustness. (3.0 credit units) PREREQUISITES: Working knowledge of linear algebra and probability. Prior exposure to optimization is a plus but not necessary.