Title:

Statistical Analysis Techniques in Particle Physics : Fits, Density Estimation and Supervised Learning.

Author:

Narsky, Ilya.

ISBN:

9783527677313

Personal Author:

Narsky, Ilya.

Edition:

1st ed.

Physical Description:

1 online resource (461 pages)

Contents:

Statistical Analysis Techniques in Particle Physics -- Contents -- Acknowledgements -- Notation and Vocabulary -- 1 Why We Wrote This Book and How You Should Read It -- 2 Parametric Likelihood Fits -- 2.1 Preliminaries -- 2.1.1 Example: CP Violation via Mixing -- 2.1.2 The Exponential Family -- 2.1.3 Confidence Intervals -- 2.1.4 Hypothesis Tests -- 2.2 Parametric Likelihood Fits -- 2.2.1 Nuisance Parameters -- 2.2.2 Confidence Intervals from Pivotal Quantities -- 2.2.3 Asymptotic Inference -- 2.2.4 Profile Likelihood -- 2.2.5 Conditional Likelihood -- 2.3 Fits for Small Statistics -- 2.3.1 Sample Study of Coverage at Small Statistics -- 2.3.2 When the pdf Goes Negative -- 2.4 Results Near the Boundary of a Physical Region -- 2.5 Likelihood Ratio Test for Presence of Signal -- 2.6 sPlots -- 2.7 Exercises -- References -- 3 Goodness of Fit -- 3.1 Binned Goodness of Fit Tests -- 3.2 Statistics Converging to Chi-Square -- 3.3 Univariate Unbinned Goodness of Fit Tests -- 3.3.1 Kolmogorov-Smirnov -- 3.3.2 Anderson-Darling -- 3.3.3 Watson -- 3.3.4 Neyman Smooth -- 3.4 Multivariate Tests -- 3.4.1 Energy Tests -- 3.4.2 Transformations to a Uniform Distribution -- 3.4.3 Local Density Tests -- 3.4.4 Kernel-based Tests -- 3.4.5 Mixed Sample Tests -- 3.4.6 Using a Classifier -- 3.5 Exercises -- References -- 4 Resampling Techniques -- 4.1 Permutation Sampling -- 4.2 Bootstrap -- 4.2.1 Bootstrap Confidence Intervals -- 4.2.2 Smoothed Bootstrap -- 4.2.3 Parametric Bootstrap -- 4.3 Jackknife -- 4.4 BCa Confidence Intervals -- 4.5 Cross-Validation -- 4.6 Resampling Weighted Observations -- 4.7 Exercises -- References -- 5 Density Estimation -- 5.1 Empirical Density Estimate -- 5.2 Histograms -- 5.3 Kernel Estimation -- 5.3.1 Multivariate Kernel Estimation -- 5.4 Ideogram -- 5.5 Parametric vs. Nonparametric Density Estimation -- 5.6 Optimization.

5.6.1 Choosing Histogram Binning -- 5.7 Estimating Errors -- 5.8 The Curse of Dimensionality -- 5.9 Adaptive Kernel Estimation -- 5.10 Naive Bayes Classification -- 5.11 Multivariate Kernel Estimation -- 5.12 Estimation Using Orthogonal Series -- 5.13 Using Monte Carlo Models -- 5.14 Unfolding -- 5.14.1 Unfolding: Regularization -- 5.15 Exercises -- References -- 6 Basic Concepts and Definitions of Machine Learning -- 6.1 Supervised, Unsupervised, and Semi-Supervised -- 6.2 Tall and Wide Data -- 6.3 Batch and Online Learning -- 6.4 Parallel Learning -- 6.5 Classification and Regression -- References -- 7 Data Preprocessing -- 7.1 Categorical Variables -- 7.2 Missing Values -- 7.2.1 Likelihood Optimization -- 7.2.2 Deletion -- 7.2.3 Augmentation -- 7.2.4 Imputation -- 7.2.5 Other Methods -- 7.3 Outliers -- 7.4 Exercises -- References -- 8 Linear Transformations and Dimensionality Reduction -- 8.1 Centering, Scaling, Reflection and Rotation -- 8.2 Rotation and Dimensionality Reduction -- 8.3 Principal Component Analysis (PCA) -- 8.3.1 Theory -- 8.3.2 Numerical Implementation -- 8.3.3 Weighted Data -- 8.3.4 How Many Principal Components Are Enough? -- 8.3.5 Example: Apply PCA and Choose the Optimal Number of Components -- 8.4 Independent Component Analysis (ICA) -- 8.4.1 Theory -- 8.4.2 Numerical implementation -- 8.4.3 Properties -- 8.5 Exercises -- References -- 9 Introduction to Classification -- 9.1 Loss Functions: Hard Labels and Soft Scores -- 9.2 Bias, Variance, and Noise -- 9.3 Training, Validating and Testing: The Optimal Splitting Rule -- 9.4 Resampling Techniques: Cross-Validation and Bootstrap -- 9.4.1 Cross-Validation -- 9.4.2 Bootstrap -- 9.4.3 Sampling with Stratification -- 9.5 Data with Unbalanced Classes -- 9.5.1 Adjusting Prior Probabilities -- 9.5.2 Undersampling the Majority Class -- 9.5.3 Oversampling the Minority Class.

9.5.4 Example: Classification of Forest Cover Type Data -- 9.6 Learning with Cost -- 9.7 Exercises -- References -- 10 Assessing Classifier Performance -- 10.1 Classification Error and Other Measures of Predictive Power -- 10.2 Receiver Operating Characteristic (ROC) and Other Curves -- 10.2.1 Empirical ROC curve -- 10.2.2 Other Performance Measures -- 10.2.3 Optimal Operating Point -- 10.2.4 Area Under Curve -- 10.2.5 Smooth ROC Curves -- 10.2.6 Confidence Bounds for ROC Curves -- 10.3 Testing Equivalence of Two Classification Models -- 10.4 Comparing Several Classifiers -- 10.5 Exercises -- References -- 11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression -- 11.1 Discriminant Analysis -- 11.1.1 Estimating the Covariance Matrix -- 11.1.2 Verifying Discriminant Analysis Assumptions -- 11.1.3 Applying LDA When LDA Assumptions Are Invalid -- 11.1.4 Numerical Implementation -- 11.1.5 Regularized Discriminant Analysis -- 11.1.6 LDA for Variable Transformation -- 11.2 Logistic Regression -- 11.2.1 Binomial Logistic Regression: Theory and Numerical Implementation -- 11.2.2 Properties of the Binomial Model -- 11.2.3 Verifying Model Assumptions -- 11.2.4 Logistic Regression with Multiple Classes -- 11.3 Classification by Linear Regression -- 11.4 Partial Least Squares Regression -- 11.5 Example: Linear Models for MAGIC Telescope Data -- 11.6 Choosing a Linear Classifier for Your Analysis -- 11.7 Exercises -- References -- 12 Neural Networks -- 12.1 Perceptrons -- 12.2 The Feed-Forward Neural Network -- 12.3 Backpropagation -- 12.4 Bayes Neural Networks -- 12.5 Genetic Algorithms -- 12.6 Exercises -- References -- 13 Local Learning and Kernel Expansion -- 13.1 From Input Variables to the Feature Space -- 13.1.1 Kernel Regression -- 13.2 Regularization -- 13.2.1 Kernel Ridge Regression.

13.3 Making and Choosing Kernels -- 13.4 Radial Basis Functions -- 13.4.1 Example: RBF Classification for the MAGIC Telescope Data -- 13.5 Support Vector Machines (SVM) -- 13.5.1 SVM with Weighted Data -- 13.5.2 SVM with Probabilistic Outputs -- 13.5.3 Numerical Implementation -- 13.5.4 Multiclass Extensions -- 13.6 Empirical Local Methods -- 13.6.1 Classification by Probability Density Estimation -- 13.6.2 Locally Weighted Regression -- 13.6.3 Nearest Neighbors and Fuzzy Rules -- 13.7 Kernel Methods: The Good, the Bad and the Curse of Dimensionality -- 13.8 Exercises -- References -- 14 Decision Trees -- 14.1 Growing Trees -- 14.2 Predicting by Decision Trees -- 14.3 Stopping Rules -- 14.4 Pruning Trees -- 14.4.1 Example: Pruning a Classification Tree -- 14.5 Trees for Multiple Classes -- 14.6 Splits on Categorical Variables -- 14.7 Surrogate Splits -- 14.8 Missing Values -- 14.9 Variable importance -- 14.10 Why Are Decision Trees Good (or Bad)? -- 14.11 Exercises -- References -- 15 Ensemble Learning -- 15.1 Boosting -- 15.1.1 Early Boosting -- 15.1.2 AdaBoost for Two Classes -- 15.1.3 Minimizing Convex Loss by Stagewise Additive Modeling -- 15.1.4 Maximizing the Minimal Margin -- 15.1.5 Nonconvex Loss and Robust Boosting -- 15.1.6 Boosting for Multiple Classes -- 15.2 Diversifying the Weak Learner: Bagging, Random Subspace and Random Forest -- 15.2.1 Measures of Diversity -- 15.2.2 Bagging and Random Forest -- 15.2.3 Random Subspace -- 15.2.4 Example: K/pi Separation for BaBar PID -- 15.3 Choosing an Ensemble for Your Analysis -- 15.4 Exercises -- References -- 16 Reducing Multiclass to Binary -- 16.1 Encoding -- 16.2 Decoding -- 16.3 Summary: Choosing the Right Design -- References -- 17 How to Choose the Right Classifier for Your Analysis and Apply It Correctly -- 17.1 Predictive Performance and Interpretability.

17.2 Matching Classifiers and Variables -- 17.3 Using Classifier Predictions -- 17.4 Optimizing Accuracy -- 17.5 CPU and Memory Requirements -- 18 Methods for Variable Ranking and Selection -- 18.1 Definitions -- 18.1.1 Variable Ranking and Selection -- 18.1.2 Strong and Weak Relevance -- 18.2 Variable Ranking -- 18.2.1 Filters: Correlation and Mutual Information -- 18.2.2 Wrappers: Sequential Forward Selection (SFS), Sequential Backward Elimination (SBE), and Feature-based Sensitivity of Posterior Probabilities (FSPP) -- 18.2.3 Embedded Methods: Estimation of Variable Importance by Decision Trees, Neural Networks, Nearest Neighbors, and Linear Models -- 18.3 Variable Selection -- 18.3.1 Optimal-Set Search Strategies -- 18.3.2 Multiple Testing: Backward Elimination by Change in Margin (BECM) -- 18.3.3 Estimation of the Reference Distribution by Permutations: Artificial Contrasts with Ensembles (ACE) Algorithm -- 18.4 Exercises -- References -- 19 Bump Hunting in Multivariate Data -- 19.1 Voronoi Tessellation and SLEUTH Algorithm -- 19.2 Identifying Box Regions by PRIM and Other Algorithms -- 19.3 Bump Hunting Through Supervised Learning -- References -- 20 Software Packages for Machine Learning -- 20.1 Tools Developed in HEP -- 20.2 R -- 20.3 MATLAB -- 20.4 Tools for Java and Python -- 20.5 What Software Tool Is Right for You? -- References -- Appendix A: Optimization Algorithms -- A.1 Line Search -- A.2 Linear Programming (LP) -- Index.

Abstract:

Modern analysis of HEP data needs advanced statistical tools to separate signal from background. This is the first book that focuses on machine learning techniques. It will be of interest to almost every high-energy physicist and, owing to its coverage, is suitable for students.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Condensed matter.

Particles (Nuclear physics) -- Statistical methods.
