- ISBN: 9780470596692 | 0470596694
- Cover: Hardcover
- Copyright: 4/17/2012
This text presents optimal learning techniques with applications in energy, homeland security, health, sports, transportation science, biomedical research, biosurveillance, stochastic optimization, high technology, and complex resource allocation problems. The coverage utilizes a relatively new class of algorithmic strategies known as approximate dynamic programming, which merges dynamic programming (Markov decision processes), math programming (linear, nonlinear, and integer), simulation, and statistics. It features mathematical techniques that are applicable to a variety of situations, from identifying promising drug candidates to figuring out the best evacuation plan in the event of a natural disaster.
Warren B. Powell, PhD, is Professor of Operations Research and Financial Engineering at Princeton University, where he is founder and Director of CASTLE Laboratory, a research unit that works with industrial partners to test new ideas found in operations research. The recipient of the 2004 INFORMS Fellow Award, Dr. Powell is the author of Approximate Dynamic Programming: Solving the Curses of Dimensionality, Second Edition (Wiley). Ilya O. Ryzhov, PhD, is Assistant Professor in the Department of Decision, Operations, and Information Technologies at the Robert H. Smith School of Business at the University of Maryland. He has made fundamental contributions bridging ranking and selection with multiarmed bandits, and optimal learning with mathematical programming.
Preface | p. xv |
Acknowledgments | p. xix |
The Challenges of Learning | p. 1 |
Learning the Best Path | p. 2 |
Areas of Application | p. 4 |
Major Problem Classes | p. 12 |
The Different Types of Learning | p. 13 |
Learning from Different Communities | p. 16 |
Information Collection Using Decision Trees | p. 18 |
A Basic Decision Tree | p. 18 |
Decision Tree for Offline Learning | p. 20 |
Decision Tree for Online Learning | p. 21 |
Discussion | p. 25 |
Website and Downloadable Software | p. 26 |
Goals of this Book | p. 26 |
Problems | p. 27 |
Adaptive Learning | p. 31 |
The Frequentist View | p. 32 |
The Bayesian View | p. 33 |
The Updating Equations for Independent Beliefs | p. 34 |
The Expected Value of Information | p. 36 |
Updating for Correlated Normal Priors | p. 38 |
Bayesian Updating with an Uninformative Prior | p. 41 |
Updating for Non-Gaussian Priors | p. 42 |
The Gamma-Exponential Model | p. 43 |
The Gamma-Poisson Model | p. 44 |
The Pareto-Uniform Model | p. 45 |
Models for Learning Probabilities* | p. 46 |
Learning an Unknown Variance* | p. 49 |
Monte Carlo Simulation | p. 51 |
Why Does It Work?* | p. 54 |
Derivation of σ̃ | p. 54 |
Derivation of Bayesian Updating Equations for Independent Beliefs | p. 55 |
Bibliographic Notes | p. 57 |
Problems | p. 57 |
The Economics of Information | p. 61 |
An Elementary Information Problem | p. 61 |
The Marginal Value of Information | p. 65 |
An Information Acquisition Problem | p. 68 |
Bibliographic Notes | p. 70 |
Problems | p. 70 |
Ranking and Selection | p. 71 |
The Model | p. 72 |
Measurement Policies | p. 75 |
Deterministic Versus Sequential Policies | p. 75 |
Optimal Sequential Policies | p. 76 |
Heuristic Policies | p. 77 |
Evaluating Policies | p. 81 |
More Advanced Topics* | p. 83 |
An Alternative Representation of the Probability Space | p. 83 |
Equivalence of Using True Means and Sample Estimates | p. 84 |
Bibliographic Notes | p. 85 |
Problems | p. 85 |
The Knowledge Gradient | p. 89 |
The Knowledge Gradient for Independent Beliefs | p. 90 |
Computation | p. 91 |
Some Properties of the Knowledge Gradient | p. 93 |
The Four Distributions of Learning | p. 94 |
The Value of Information and the S-Curve Effect | p. 95 |
Knowledge Gradient for Correlated Beliefs | p. 98 |
Anticipatory Versus Experiential Learning | p. 103 |
The Knowledge Gradient for Some Non-Gaussian Distributions | p. 105 |
The Gamma-Exponential Model | p. 105 |
The Gamma-Poisson Model | p. 108 |
The Pareto-Uniform Model | p. 109 |
The Beta-Bernoulli Model | p. 111 |
Discussion | p. 113 |
Relatives of the Knowledge Gradient | p. 114 |
Expected Improvement | p. 114 |
Linear Loss* | p. 115 |
The Problem of Priors | p. 118 |
Discussion | p. 120 |
Why Does It Work?* | p. 120 |
Derivation of the Knowledge Gradient Formula | p. 120 |
Bibliographic Notes | p. 125 |
Problems | p. 125 |
Bandit Problems | p. 139 |
The Theory and Practice of Gittins Indices | p. 141 |
Gittins Indices in the Beta-Bernoulli Model | p. 142 |
Gittins Indices in the Normal-Normal Model | p. 145 |
Approximating Gittins Indices | p. 147 |
Variations of Bandit Problems | p. 148 |
Upper Confidence Bounding | p. 149 |
The Knowledge Gradient for Bandit Problems | p. 151 |
The Basic Idea | p. 151 |
Some Experimental Comparisons | p. 153 |
Non-Normal Models | p. 156 |
Bibliographic Notes | p. 157 |
Problems | p. 157 |
Elements of a Learning Problem | p. 163 |
The States of Our System | p. 164 |
Types of Decisions | p. 166 |
Exogenous Information | p. 167 |
Transition Functions | p. 168 |
Objective Functions | p. 168 |
Designing Versus Controlling | p. 169 |
Measurement Costs | p. 170 |
Objectives | p. 170 |
Evaluating Policies | p. 175 |
Discussion | p. 177 |
Bibliographic Notes | p. 178 |
Problems | p. 178 |
Linear Belief Models | p. 181 |
Applications | p. 182 |
Maximizing Ad Clicks | p. 182 |
Dynamic Pricing | p. 184 |
Housing Loans | p. 184 |
Optimizing Dose Response | p. 185 |
A Brief Review of Linear Regression | p. 186 |
The Normal Equations | p. 186 |
Recursive Least Squares | p. 187 |
A Bayesian Interpretation | p. 188 |
Generating a Prior | p. 189 |
The Knowledge Gradient for a Linear Model | p. 191 |
Application to Drug Discovery | p. 192 |
Application to Dynamic Pricing | p. 196 |
Bibliographic Notes | p. 200 |
Problems | p. 200 |
Subset Selection Problems | p. 203 |
Applications | p. 205 |
Choosing a Subset Using Ranking and Selection | p. 207 |
Setting Prior Means and Variances | p. 207 |
Two Strategies for Setting Prior Covariances | p. 208 |
Larger Sets | p. 209 |
Using Simulation to Reduce the Problem Size | p. 210 |
Computational Issues | p. 212 |
Experiments | p. 213 |
Very Large Sets | p. 214 |
Bibliographic Notes | p. 216 |
Problems | p. 216 |
Optimizing a Scalar Function | p. 219 |
Deterministic Measurements | p. 219 |
Stochastic Measurements | p. 223 |
The Model | p. 223 |
Finding the Posterior Distribution | p. 224 |
Choosing the Measurement | p. 226 |
Discussion | p. 229 |
Bibliographic Notes | p. 229 |
Problems | p. 229 |
Optimal Bidding | p. 231 |
Modeling Customer Demand | p. 233 |
Some Valuation Models | p. 233 |
The Logit Model | p. 234 |
Bayesian Modeling for Dynamic Pricing | p. 237 |
A Conjugate Prior for Choosing Between Two Demand Curves | p. 237 |
Moment Matching for Nonconjugate Problems | p. 239 |
An Approximation for the Logit Model | p. 242 |
Bidding Strategies | p. 244 |
An Idea From Multi-Armed Bandits | p. 245 |
Bayes-Greedy Bidding | p. 245 |
Numerical Illustrations | p. 247 |
Why Does It Work?* | p. 251 |
Moment Matching for Pareto Prior | p. 251 |
Approximating the Logistic Expectation | p. 252 |
Bibliographic Notes | p. 253 |
Problems | p. 254 |
Stopping Problems | p. 255 |
Sequential Probability Ratio Test | p. 255 |
The Secretary Problem | p. 261 |
Setup | p. 261 |
Solution | p. 262 |
Bibliographic Notes | p. 266 |
Problems | p. 266 |
Active Learning in Statistics | p. 269 |
Deterministic Policies | p. 270 |
Sequential Policies for Classification | p. 274 |
Uncertainty Sampling | p. 274 |
Query by Committee | p. 275 |
Expected Error Reduction | p. 277 |
A Variance-Minimizing Policy | p. 277 |
Mixtures of Gaussians | p. 280 |
Estimating Parameters | p. 280 |
Active Learning | p. 282 |
Bibliographic Notes | p. 283 |
Simulation Optimization | p. 285 |
Indifference Zone Selection | p. 288 |
Batch Procedures | p. 288 |
Sequential Procedures | p. 290 |
The 0-1 Procedure: Connection to Linear Loss | p. 292 |
Optimal Computing Budget Allocation | p. 293 |
Indifference-Zone Version | p. 293 |
Linear Loss Version | p. 295 |
When Does It Work? | p. 295 |
Model-Based Simulated Annealing | p. 296 |
Other Areas of Simulation Optimization | p. 298 |
Bibliographic Notes | p. 299 |
Learning in Mathematical Programming | p. 301 |
Applications | p. 303 |
Piloting a Hot Air Balloon | p. 303 |
Optimizing a Portfolio | p. 308 |
Network Problems | p. 309 |
Discussion | p. 313 |
Learning on Graphs | p. 313 |
Alternative Edge Selection Policies | p. 317 |
Learning Costs for Linear Programs* | p. 318 |
Bibliographic Notes | p. 324 |
Optimizing Over Continuous Measurements | p. 325 |
The Belief Model | p. 327 |
Updating Equations | p. 328 |
Parameter Estimation | p. 330 |
Sequential Kriging Optimization | p. 332 |
The Knowledge Gradient for Continuous Parameters* | p. 334 |
Maximizing the Knowledge Gradient | p. 334 |
Approximating the Knowledge Gradient | p. 335 |
The Gradient of the Knowledge Gradient | p. 336 |
Maximizing the Knowledge Gradient | p. 338 |
The KGCP Policy | p. 339 |
Efficient Global Optimization | p. 340 |
Experiments | p. 341 |
Extension to Higher-Dimensional Problems | p. 342 |
Bibliographic Notes | p. 343 |
Learning With a Physical State | p. 345 |
Introduction to Dynamic Programming | p. 347 |
Approximate Dynamic Programming | p. 348 |
The Exploration vs. Exploitation Problem | p. 350 |
Discussion | p. 351 |
Some Heuristic Learning Policies | p. 352 |
The Local Bandit Approximation | p. 353 |
The Knowledge Gradient in Dynamic Programming | p. 355 |
Generalized Learning Using Basis Functions | p. 355 |
The Knowledge Gradient | p. 358 |
Experiments | p. 361 |
An Expected Improvement Policy | p. 363 |
Bibliographic Notes | p. 364 |
Index | p. 381 |
Table of Contents provided by Ingram. All Rights Reserved. |
What is included with this book?
The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.
The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.