Statistical Learning Theory And Stochastic Optimization PDF Download

Are you looking for read ebook online? Search for your book and save it on your Kindle device, PC, phones or tablets. Download Statistical Learning Theory And Stochastic Optimization PDF full book. Access full book title Statistical Learning Theory And Stochastic Optimization.

Statistical Learning Theory and Stochastic Optimization

Statistical Learning Theory and Stochastic Optimization
Author: Olivier Catoni
Publisher: Springer
Total Pages: 278
Release: 2004-08-30
Genre: Mathematics
ISBN: 3540445072

Download Statistical Learning Theory and Stochastic Optimization Book in PDF, ePub and Kindle

Statistical learning theory is aimed at analyzing complex data with necessarily approximate models. This book is intended for an audience with a graduate background in probability theory and statistics. It will be useful to any reader wondering why it may be a good idea, to use as is often done in practice a notoriously "wrong'' (i.e. over-simplified) model to predict, estimate or classify. This point of view takes its roots in three fields: information theory, statistical mechanics, and PAC-Bayesian theorems. Results on the large deviations of trajectories of Markov chains with rare transitions are also included. They are meant to provide a better understanding of stochastic optimization algorithms of common use in computing estimators. The author focuses on non-asymptotic bounds of the statistical risk, allowing one to choose adaptively between rich and structured families of models and corresponding estimators. Two mathematical objects pervade the book: entropy and Gibbs measures. The goal is to show how to turn them into versatile and efficient technical tools, that will stimulate further studies and results.


Stochastic Optimization for Multi-Agent Statistical Learning and Control

Stochastic Optimization for Multi-Agent Statistical Learning and Control
Author: Alec Koppel
Publisher:
Total Pages: 0
Release: 2017
Genre:
ISBN:

Download Stochastic Optimization for Multi-Agent Statistical Learning and Control Book in PDF, ePub and Kindle

The goal of this thesis is to develop a mathematical framework for optimal, accurate, and affordable complexity statistical learning among networks of autonomous agents. We begin by noting the connection between statistical inference and stochastic programming, and consider extensions of this setup to settings in which a network of agents each observes a local data stream and would like to make decisions that are good with respect to information aggregated across the entire network. There is an open-ended degree of freedom in this problem formulation, however: the selection of the estimator function class which defines the feasible set of the stochastic program. Our central contribution is the design of stochastic optimization tools in reproducing kernel Hilbert spaces that yield optimal, accurate, and affordable complexity statistical learning for a multi-agent network. To obtain this result, we first explore the relative merits and drawbacks of different function class selections. In Part I, we consider multi-agent expected risk minimization this problem setting for the case that each agent seems to learn a common globally optimal generalized linear models (GLMs) by developing a stochastic variant of Arrow-Hurwicz primal-dual method. We establish convergence to the primal-dual optimal pair when either consensus or "proximity" constraints encode the fact that we want all agents' to agree, or nearby agents to make decisions that are close to one another. Empirically, we observe that these convergence results are substantiated but that convergence may not translate into statistical accuracy. More broadly, optimality within a given estimator function class is not the same as one that makes minimal inference errors. The optimality-accuracy tradeoff of GLMs motivates subsequent efforts to learn more sophisticated estimators based upon learned feature encodings of the data that is fed into the statistical model. The specific tool we turn to in Part II is dictionary learning, where we optimize both over regression weights and an encoding of the data, which yields a non-convex problem. We investigate the use of stochastic methods for online task-driven dictionary learning, and obtain promising performance for the task of a ground robot learning to anticipate control uncertainty based on its past experience. Heartened by this implementation, we then consider extensions of this framework for a multi-agent network to each learn globally optimal task-driven dictionaries based on stochastic primal-dual methods. However, it is here the non-convexity of the optimization problem causes problems: stringent conditions on stochastic errors and the duality gap limit the applicability of the convergence guarantees, and impractically small learning rates are required for convergence in practice. Thus, we seek to learn nonlinear statistical models while preserving convexity, which is possible through kernel methods (Part III). However, the increased descriptive power of nonparametric estimation comes at the cost of infinite complexity. Thus, we develop a stochastic approximation algorithm in reproducing kernel Hilbert spaces (RKHS) that ameliorates this complexity issue while preserving optimality: we combine the functional generalization of stochastic gradient method (FSGD) with greedily constructed low-dimensional subspace projections based on matching pursuit. We establish that the proposed method yields a controllable trade-off between optimality and memory, and yields highly accurate parsimonious statistical models in practice. Then, we develop a multi-agent extension of this method by proposing a new node-separable penalty function and applying FSGD together with low-dimensional subspace projections. This extension allows a network of autonomous agents to learn a memory-efficient approximation to the globally optimal regression function based only on their local data stream and message passing with neighbors. In practice, we observe agents are able to stably learn highly accurate and memory-efficient nonlinear statistical models from streaming data. From here, we shift focus to a more challenging class of problems, motivated by the fact that true learning is not just revising predictions based upon data but augmenting behavior over time based on temporal incentives. This goal may be described by Markov Decision Processes (MDPs): at each point, an agent is in some state of the world, takes an action and then receives a reward while randomly transitioning to a new state. The goal of the agent is to select the action sequence to maximize its long-term sum of rewards, but determining how to select this action sequence when both the state and action spaces are infinite has eluded researchers for decades. As a precursor to this feat, we consider the problem of policy evaluation in infinite MDPs, in which we seek to determine the long-term sum of rewards when starting in a given state when actions are chosen according to a fixed distribution called a policy. We reformulate this problem as a RKHS-valued compositional stochastic program and we develop a functional extension of stochastic quasi-gradient algorithm operating in tandem with the greedy subspace projections mentioned above. We prove convergence with probability 1 to the Bellman fixed point restricted to this function class, and we observe a state of the art trade off in memory versus Bellman error for the proposed method on the Mountain Car driving task, which bodes well for incorporating policy evaluation into more sophisticated, provably stable reinforcement learning techniques, and in time, developing optimal collaborative multi-agent learning-based control systems.


First-order and Stochastic Optimization Methods for Machine Learning

First-order and Stochastic Optimization Methods for Machine Learning
Author: Guanghui Lan
Publisher: Springer Nature
Total Pages: 591
Release: 2020-05-15
Genre: Mathematics
ISBN: 3030395685

Download First-order and Stochastic Optimization Methods for Machine Learning Book in PDF, ePub and Kindle

This book covers not only foundational materials but also the most recent progresses made during the past few years on the area of machine learning algorithms. In spite of the intensive research and development in this area, there does not exist a systematic treatment to introduce the fundamental concepts and recent progresses on machine learning algorithms, especially on those based on stochastic optimization methods, randomized algorithms, nonconvex optimization, distributed and online learning, and projection free methods. This book will benefit the broad audience in the area of machine learning, artificial intelligence and mathematical programming community by presenting these recent developments in a tutorial style, starting from the basic building blocks to the most carefully designed and complicated algorithms for machine learning.


Statistical Learning Theory and Stochastic Optimization

Statistical Learning Theory and Stochastic Optimization
Author: Olivier Catoni
Publisher: Springer
Total Pages: 284
Release: 2004-08-25
Genre: Mathematics
ISBN: 9783540225720

Download Statistical Learning Theory and Stochastic Optimization Book in PDF, ePub and Kindle

Statistical learning theory is aimed at analyzing complex data with necessarily approximate models. This book is intended for an audience with a graduate background in probability theory and statistics. It will be useful to any reader wondering why it may be a good idea, to use as is often done in practice a notoriously "wrong'' (i.e. over-simplified) model to predict, estimate or classify. This point of view takes its roots in three fields: information theory, statistical mechanics, and PAC-Bayesian theorems. Results on the large deviations of trajectories of Markov chains with rare transitions are also included. They are meant to provide a better understanding of stochastic optimization algorithms of common use in computing estimators. The author focuses on non-asymptotic bounds of the statistical risk, allowing one to choose adaptively between rich and structured families of models and corresponding estimators. Two mathematical objects pervade the book: entropy and Gibbs measures. The goal is to show how to turn them into versatile and efficient technical tools, that will stimulate further studies and results.


Distributionally Robust Learning

Distributionally Robust Learning
Author: Ruidi Chen
Publisher:
Total Pages: 258
Release: 2020-12-23
Genre: Mathematics
ISBN: 9781680837728

Download Distributionally Robust Learning Book in PDF, ePub and Kindle


The Nature of Statistical Learning Theory

The Nature of Statistical Learning Theory
Author: Vladimir Vapnik
Publisher: Springer Science & Business Media
Total Pages: 324
Release: 2013-06-29
Genre: Mathematics
ISBN: 1475732643

Download The Nature of Statistical Learning Theory Book in PDF, ePub and Kindle

The aim of this book is to discuss the fundamental ideas which lie behind the statistical theory of learning and generalization. It considers learning as a general problem of function estimation based on empirical data. Omitting proofs and technical details, the author concentrates on discussing the main results of learning theory and their connections to fundamental problems in statistics. This second edition contains three new chapters devoted to further development of the learning theory and SVM techniques. Written in a readable and concise style, the book is intended for statisticians, mathematicians, physicists, and computer scientists.


Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers

Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers
Author: Stephen Boyd
Publisher: Now Publishers Inc
Total Pages: 138
Release: 2011
Genre: Computers
ISBN: 160198460X

Download Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers Book in PDF, ePub and Kindle

Surveys the theory and history of the alternating direction method of multipliers, and discusses its applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others.