- ISBN-13: 9780470682289 | ISBN-10: 0470682280
- Cover: Paperback
- Copyright: 11/7/2011
Statistical Pattern Recognition provides an introduction to statistical pattern recognition theory and techniques, with material drawn from a wide range of fields, including engineering, statistics, computer science and the social sciences.

The book describes techniques for analysing data comprising measurements made on individuals or objects. The techniques are used to make predictions such as the disease of a patient, the type of object illuminated by a radar, or an economic forecast. Emphasis is placed on techniques for classification, a term used for predicting the class or group to which an object belongs (based on a set of exemplars), and on methods that seek to discover natural groupings in a data set.

Each section concludes with a description of the wide range of practical applications that have been addressed, a discussion of further developments of the theoretical techniques, and a variety of exercises, from 'open book' questions to more lengthy projects. New material is presented, including the analysis of complex networks and basic techniques for analysing the properties of datasets. The book also introduces readers to the use of variational methods for Bayesian density estimation, and looks at new applications in biometrics and security.
Dr Andrew Robert Webb, Senior Researcher, QinetiQ Ltd, Malvern, UK.
Dr Keith Derek Copsey, Senior Researcher, QinetiQ Ltd, Malvern, UK.
Preface | p. xix |
Notation | p. xxiii |
Introduction to Statistical Pattern Recognition | p. 1 |
Statistical Pattern Recognition | p. 1 |
Introduction | p. 1 |
The Basic Model | p. 2 |
Stages in a Pattern Recognition Problem | p. 4 |
Issues | p. 6 |
Approaches to Statistical Pattern Recognition | p. 7 |
Elementary Decision Theory | p. 8 |
Bayes' Decision Rule for Minimum Error | p. 8 |
Bayes' Decision Rule for Minimum Error - Reject Option | p. 12 |
Bayes' Decision Rule for Minimum Risk | p. 13 |
Bayes' Decision Rule for Minimum Risk - Reject Option | p. 15 |
Neyman-Pearson Decision Rule | p. 15 |
Minimax Criterion | p. 18 |
Discussion | p. 19 |
Discriminant Functions | p. 20 |
Introduction | p. 20 |
Linear Discriminant Functions | p. 21 |
Piecewise Linear Discriminant Functions | p. 23 |
Generalised Linear Discriminant Function | p. 24 |
Summary | p. 26 |
Multiple Regression | p. 27 |
Outline of Book | p. 29 |
Notes and References | p. 29 |
Exercises | p. 31 |
Density Estimation - Parametric | p. 33 |
Introduction | p. 33 |
Estimating the Parameters of the Distributions | p. 34 |
Estimative Approach | p. 34 |
Predictive Approach | p. 35 |
The Gaussian Classifier | p. 35 |
Specification | p. 35 |
Derivation of the Gaussian Classifier Plug-In Estimates | p. 37 |
Example Application Study | p. 39 |
Dealing with Singularities in the Gaussian Classifier | p. 40 |
Introduction | p. 40 |
Naïve Bayes | p. 40 |
Projection onto a Subspace | p. 41 |
Linear Discriminant Function | p. 41 |
Regularised Discriminant Analysis | p. 42 |
Example Application Study | p. 44 |
Further Developments | p. 45 |
Summary | p. 46 |
Finite Mixture Models | p. 46 |
Introduction | p. 46 |
Mixture Models for Discrimination | p. 48 |
Parameter Estimation for Normal Mixture Models | p. 49 |
Normal Mixture Model Covariance Matrix Constraints | p. 51 |
How Many Components? | p. 52 |
Maximum Likelihood Estimation via EM | p. 55 |
Example Application Study | p. 60 |
Further Developments | p. 62 |
Summary | p. 63 |
Application Studies | p. 63 |
Summary and Discussion | p. 66 |
Recommendations | p. 66 |
Notes and References | p. 67 |
Exercises | p. 67 |
Density Estimation - Bayesian | p. 70 |
Introduction | p. 70 |
Basics | p. 72 |
Recursive Calculation | p. 72 |
Proportionality | p. 73 |
Analytic Solutions | p. 73 |
Conjugate Priors | p. 73 |
Estimating the Mean of a Normal Distribution with Known Variance | p. 75 |
Estimating the Mean and the Covariance Matrix of a Multivariate Normal Distribution | p. 79 |
Unknown Prior Class Probabilities | p. 85 |
Summary | p. 87 |
Bayesian Sampling Schemes | p. 87 |
Introduction | p. 87 |
Summarisation | p. 87 |
Sampling Version of the Bayesian Classifier | p. 89 |
Rejection Sampling | p. 89 |
Ratio of Uniforms | p. 90 |
Importance Sampling | p. 92 |
Markov Chain Monte Carlo Methods | p. 95 |
Introduction | p. 95 |
The Gibbs Sampler | p. 95 |
Metropolis-Hastings Algorithm | p. 103 |
Data Augmentation | p. 107 |
Reversible Jump Markov Chain Monte Carlo | p. 108 |
Slice Sampling | p. 109 |
MCMC Example - Estimation of Noisy Sinusoids | p. 111 |
Summary | p. 115 |
Notes and References | p. 116 |
Bayesian Approaches to Discrimination | p. 116 |
Labelled Training Data | p. 116 |
Unlabelled Training Data | p. 117 |
Sequential Monte Carlo Samplers | p. 119 |
Introduction | p. 119 |
Basic Methodology | p. 121 |
Summary | p. 125 |
Variational Bayes | p. 126 |
Introduction | p. 126 |
Description | p. 126 |
Factorised Variational Approximation | p. 129 |
Simple Example | p. 131 |
Use of the Procedure for Model Selection | p. 135 |
Further Developments and Applications | p. 136 |
Summary | p. 137 |
Approximate Bayesian Computation | p. 137 |
Introduction | p. 137 |
ABC Rejection Sampling | p. 138 |
ABC MCMC Sampling | p. 140 |
ABC Population Monte Carlo Sampling | p. 141 |
Model Selection | p. 142 |
Summary | p. 143 |
Example Application Study | p. 144 |
Application Studies | p. 145 |
Summary and Discussion | p. 146 |
Recommendations | p. 147 |
Notes and References | p. 147 |
Exercises | p. 148 |
Density Estimation - Nonparametric | p. 150 |
Introduction | p. 150 |
Basic Properties of Density Estimators | p. 150 |
k-Nearest-Neighbour Method | p. 152 |
k-Nearest-Neighbour Classifier | p. 152 |
Derivation | p. 154 |
Choice of Distance Metric | p. 157 |
Properties of the Nearest-Neighbour Rule | p. 159 |
Linear Approximating and Eliminating Search Algorithm (LAESA) | p. 159 |
Branch and Bound Search Algorithms: kd-Trees | p. 163 |
Branch and Bound Search Algorithms: Ball-Trees | p. 170 |
Editing Techniques | p. 174 |
Example Application Study | p. 177 |
Further Developments | p. 178 |
Summary | p. 179 |
Histogram Method | p. 180 |
Data Adaptive Histograms | p. 181 |
Independence Assumption (Naïve Bayes) | p. 181 |
Lancaster Models | p. 182 |
Maximum Weight Dependence Trees | p. 183 |
Bayesian Networks | p. 186 |
Example Application Study - Naïve Bayes Text Classification | p. 190 |
Summary | p. 193 |
Kernel Methods | p. 194 |
Biasedness | p. 197 |
Multivariate Extension | p. 198 |
Choice of Smoothing Parameter | p. 199 |
Choice of Kernel | p. 201 |
Example Application Study | p. 202 |
Further Developments | p. 203 |
Summary | p. 203 |
Expansion by Basis Functions | p. 204 |
Copulas | p. 207 |
Introduction | p. 207 |
Mathematical Basis | p. 207 |
Copula Functions | p. 208 |
Estimating Copula Probability Density Functions | p. 209 |
Simple Example | p. 211 |
Summary | p. 212 |
Application Studies | p. 213 |
Comparative Studies | p. 216 |
Summary and Discussion | p. 216 |
Recommendations | p. 217 |
Notes and References | p. 217 |
Exercises | p. 218 |
Linear Discriminant Analysis | p. 221 |
Introduction | p. 221 |
Two-Class Algorithms | p. 222 |
General Ideas | p. 222 |
Perceptron Criterion | p. 223 |
Fisher's Criterion | p. 227 |
Least Mean-Squared-Error Procedures | p. 228 |
Further Developments | p. 235 |
Summary | p. 235 |
Multiclass Algorithms | p. 236 |
General Ideas | p. 236 |
Error-Correction Procedure | p. 237 |
Fisher's Criterion - Linear Discriminant Analysis | p. 238 |
Least Mean-Squared-Error Procedures | p. 241 |
Regularisation | p. 246 |
Example Application Study | p. 246 |
Further Developments | p. 247 |
Summary | p. 248 |
Support Vector Machines | p. 249 |
Introduction | p. 249 |
Linearly Separable Two-Class Data | p. 249 |
Linearly Nonseparable Two-Class Data | p. 253 |
Multiclass SVMs | p. 256 |
SVMs for Regression | p. 257 |
Implementation | p. 259 |
Example Application Study | p. 262 |
Summary | p. 263 |
Logistic Discrimination | p. 263 |
Two-Class Case | p. 263 |
Maximum Likelihood Estimation | p. 264 |
Multiclass Logistic Discrimination | p. 266 |
Example Application Study | p. 267 |
Further Developments | p. 267 |
Summary | p. 268 |
Application Studies | p. 268 |
Summary and Discussion | p. 268 |
Recommendations | p. 269 |
Notes and References | p. 270 |
Exercises | p. 270 |
Nonlinear Discriminant Analysis - Kernel and Projection Methods | p. 274 |
Introduction | p. 274 |
Radial Basis Functions | p. 276 |
Introduction | p. 276 |
Specifying the Model | p. 278 |
Specifying the Functional Form | p. 278 |
The Positions of the Centres | p. 279 |
Smoothing Parameters | p. 281 |
Calculation of the Weights | p. 282 |
Model Order Selection | p. 284 |
Simple RBF | p. 285 |
Motivation | p. 286 |
RBF Properties | p. 288 |
Example Application Study | p. 288 |
Further Developments | p. 289 |
Summary | p. 290 |
Nonlinear Support Vector Machines | p. 291 |
Introduction | p. 291 |
Binary Classification | p. 291 |
Types of Kernel | p. 292 |
Model Selection | p. 293 |
Multiclass SVMs | p. 294 |
Probability Estimates | p. 294 |
Nonlinear Regression | p. 296 |
Example Application Study | p. 296 |
Further Developments | p. 297 |
Summary | p. 298 |
The Multilayer Perceptron | p. 298 |
Introduction | p. 298 |
Specifying the MLP Structure | p. 299 |
Determining the MLP Weights | p. 300 |
Modelling Capacity of the MLP | p. 307 |
Logistic Classification | p. 307 |
Example Application Study | p. 310 |
Bayesian MLP Networks | p. 311 |
Projection Pursuit | p. 313 |
Summary | p. 313 |
Application Studies | p. 314 |
Summary and Discussion | p. 316 |
Recommendations | p. 317 |
Notes and References | p. 318 |
Exercises | p. 318 |
Rule and Decision Tree Induction | p. 322 |
Introduction | p. 322 |
Decision Trees | p. 323 |
Introduction | p. 323 |
Decision Tree Construction | p. 326 |
Selection of the Splitting Rule | p. 327 |
Terminating the Splitting Procedure | p. 330 |
Assigning Class Labels to Terminal Nodes | p. 332 |
Decision Tree Pruning - Worked Example | p. 332 |
Decision Tree Construction Methods | p. 337 |
Other Issues | p. 339 |
Example Application Study | p. 340 |
Further Developments | p. 341 |
Summary | p. 342 |
Rule Induction | p. 342 |
Introduction | p. 342 |
Generating Rules from a Decision Tree | p. 345 |
Rule Induction Using a Sequential Covering Algorithm | p. 345 |
Example Application Study | p. 350 |
Further Developments | p. 351 |
Summary | p. 351 |
Multivariate Adaptive Regression Splines | p. 351 |
Introduction | p. 351 |
Recursive Partitioning Model | p. 351 |
Example Application Study | p. 355 |
Further Developments | p. 355 |
Summary | p. 356 |
Application Studies | p. 356 |
Summary and Discussion | p. 358 |
Recommendations | p. 358 |
Notes and References | p. 359 |
Exercises | p. 359 |
Ensemble Methods | p. 361 |
Introduction | p. 361 |
Characterising a Classifier Combination Scheme | p. 362 |
Feature Space | p. 363 |
Level | p. 366 |
Degree of Training | p. 368 |
Form of Component Classifiers | p. 368 |
Structure | p. 369 |
Optimisation | p. 369 |
Data Fusion | p. 370 |
Architectures | p. 370 |
Bayesian Approaches | p. 371 |
Neyman-Pearson Formulation | p. 373 |
Trainable Rules | p. 374 |
Fixed Rules | p. 375 |
Classifier Combination Methods | p. 376 |
Product Rule | p. 376 |
Sum Rule | p. 377 |
Min, Max and Median Combiners | p. 378 |
Majority Vote | p. 379 |
Borda Count | p. 379 |
Combiners Trained on Class Predictions | p. 380 |
Stacked Generalisation | p. 382 |
Mixture of Experts | p. 382 |
Bagging | p. 385 |
Boosting | p. 387 |
Random Forests | p. 389 |
Model Averaging | p. 390 |
Summary of Methods | p. 396 |
Example Application Study | p. 398 |
Further Developments | p. 399 |
Application Studies | p. 399 |
Summary and Discussion | p. 400 |
Recommendations | p. 401 |
Notes and References | p. 401 |
Exercises | p. 402 |
Performance Assessment | p. 404 |
Introduction | p. 404 |
Performance Assessment | p. 405 |
Performance Measures | p. 405 |
Discriminability | p. 406 |
Reliability | p. 413 |
ROC Curves for Performance Assessment | p. 415 |
Population and Sensor Drift | p. 419 |
Example Application Study | p. 421 |
Further Developments | p. 422 |
Summary | p. 423 |
Comparing Classifier Performance | p. 424 |
Which Technique is Best? | p. 424 |
Statistical Tests | p. 425 |
Comparing Rules When Misclassification Costs are Uncertain | p. 426 |
Example Application Study | p. 428 |
Further Developments | p. 429 |
Summary | p. 429 |
Application Studies | p. 429 |
Summary and Discussion | p. 430 |
Recommendations | p. 430 |
Notes and References | p. 430 |
Exercises | p. 431 |
Feature Selection and Extraction | p. 433 |
Introduction | p. 433 |
Feature Selection | p. 435 |
Introduction | p. 435 |
Characterisation of Feature Selection Approaches | p. 439 |
Evaluation Measures | p. 440 |
Search Algorithms for Feature Subset Selection | p. 449 |
Complete Search - Branch and Bound | p. 450 |
Sequential Search | p. 454 |
Random Search | p. 458 |
Markov Blanket | p. 459 |
Stability of Feature Selection | p. 460 |
Example Application Study | p. 462 |
Further Developments | p. 462 |
Summary | p. 463 |
Linear Feature Extraction | p. 463 |
Principal Components Analysis | p. 464 |
Karhunen-Loève Transformation | p. 475 |
Example Application Study | p. 481 |
Further Developments | p. 482 |
Summary | p. 483 |
Multidimensional Scaling | p. 484 |
Classical Scaling | p. 484 |
Metric MDS | p. 486 |
Ordinal Scaling | p. 487 |
Algorithms | p. 490 |
MDS for Feature Extraction | p. 491 |
Example Application Study | p. 492 |
Further Developments | p. 493 |
Summary | p. 493 |
Application Studies | p. 493 |
Summary and Discussion | p. 495 |
Recommendations | p. 495 |
Notes and References | p. 496 |
Exercises | p. 497 |
Clustering | p. 501 |
Introduction | p. 501 |
Hierarchical Methods | p. 502 |
Single-Link Method | p. 503 |
Complete-Link Method | p. 506 |
Sum-of-Squares Method | p. 507 |
General Agglomerative Algorithm | p. 508 |
Properties of a Hierarchical Classification | p. 508 |
Example Application Study | p. 509 |
Summary | p. 509 |
Quick Partitions | p. 510 |
Mixture Models | p. 511 |
Model Description | p. 511 |
Example Application Study | p. 512 |
Sum-of-Squares Methods | p. 513 |
Clustering Criteria | p. 514 |
Clustering Algorithms | p. 515 |
Vector Quantisation | p. 520 |
Example Application Study | p. 530 |
Further Developments | p. 530 |
Summary | p. 531 |
Spectral Clustering | p. 531 |
Elementary Graph Theory | p. 531 |
Similarity Matrices | p. 534 |
Application to Clustering | p. 534 |
Spectral Clustering Algorithm | p. 535 |
Forms of Graph Laplacian | p. 535 |
Example Application Study | p. 536 |
Further Developments | p. 538 |
Summary | p. 538 |
Cluster Validity | p. 538 |
Introduction | p. 538 |
Statistical Tests | p. 539 |
Absence of Class Structure | p. 540 |
Validity of Individual Clusters | p. 541 |
Hierarchical Clustering | p. 542 |
Validation of Individual Clusterings | p. 542 |
Partitions | p. 543 |
Relative Criteria | p. 543 |
Choosing the Number of Clusters | p. 545 |
Application Studies | p. 546 |
Summary and Discussion | p. 549 |
Recommendations | p. 551 |
Notes and References | p. 552 |
Exercises | p. 553 |
Complex Networks | p. 555 |
Introduction | p. 555 |
Characteristics | p. 557 |
Properties | p. 557 |
Questions to Address | p. 559 |
Descriptive Features | p. 560 |
Outline | p. 560 |
Mathematics of Networks | p. 561 |
Graph Matrices | p. 561 |
Connectivity | p. 562 |
Distance Measures | p. 562 |
Weighted Networks | p. 563 |
Centrality Measures | p. 563 |
Random Graphs | p. 564 |
Community Detection | p. 565 |
Clustering Methods | p. 565 |
Girvan-Newman Algorithm | p. 568 |
Modularity Approaches | p. 570 |
Local Modularity | p. 571 |
Clique Percolation | p. 573 |
Example Application Study | p. 574 |
Further Developments | p. 575 |
Summary | p. 575 |
Link Prediction | p. 575 |
Approaches to Link Prediction | p. 576 |
Example Application Study | p. 578 |
Further Developments | p. 578 |
Application Studies | p. 579 |
Summary and Discussion | p. 579 |
Recommendations | p. 580 |
Notes and References | p. 580 |
Exercises | p. 580 |
Additional Topics | p. 581 |
Model Selection | p. 581 |
Separate Training and Test Sets | p. 582 |
Cross-Validation | p. 582 |
The Bayesian Viewpoint | p. 583 |
Akaike's Information Criterion | p. 583 |
Minimum Description Length | p. 584 |
Missing Data | p. 585 |
Outlier Detection and Robust Procedures | p. 586 |
Mixed Continuous and Discrete Variables | p. 587 |
Structural Risk Minimisation and the Vapnik-Chervonenkis Dimension | p. 588 |
Bounds on the Expected Risk | p. 588 |
The VC Dimension | p. 589 |
References | p. 591 |
Index | p. 637 |
Table of Contents provided by Publisher. All Rights Reserved.
What is included with this book?
The New copy of this book will include any supplemental materials advertised. Please check the title of the book to determine if it should include any access cards, study guides, lab manuals, CDs, etc.
The Used, Rental and eBook copies of this book are not guaranteed to include any supplemental materials. Typically, only the book itself is included. This is true even if the title states it includes any access cards, study guides, lab manuals, CDs, etc.