Title:

Making Sense of Data II : A Practical Guide to Data Visualization, Advanced Data Mining Methods, and Applications.

Author:

Myatt, Glenn J.

ISBN:

9780470417393

Edition:

1st ed.

Physical Description:

1 online resource (307 pages)

Contents:

MAKING SENSE OF DATA II -- CONTENTS -- PREFACE -- 1 INTRODUCTION -- 1.1 Overview -- 1.2 Definition -- 1.3 Preparation -- 1.3.1 Overview -- 1.3.2 Accessing Tabular Data -- 1.3.3 Accessing Unstructured Data -- 1.3.4 Understanding the Variables and Observations -- 1.3.5 Data Cleaning -- 1.3.6 Transformation -- 1.3.7 Variable Reduction -- 1.3.8 Segmentation -- 1.3.9 Preparing Data to Apply -- 1.4 Analysis -- 1.4.1 Data Mining Tasks -- 1.4.2 Optimization -- 1.4.3 Evaluation -- 1.4.4 Model Forensics -- 1.5 Deployment -- 1.6 Outline of Book -- 1.6.1 Overview -- 1.6.2 Data Visualization -- 1.6.3 Clustering -- 1.6.4 Predictive Analytics -- 1.6.5 Applications -- 1.6.6 Software -- 1.7 Summary -- 1.8 Further Reading -- 2 DATA VISUALIZATION -- 2.1 Overview -- 2.2 Visualization Design Principles -- 2.2.1 General Principles -- 2.2.2 Graphics Design -- 2.2.3 Anatomy of a Graph -- 2.3 Tables -- 2.3.1 Simple Tables -- 2.3.2 Summary Tables -- 2.3.3 Two-Way Contingency Tables -- 2.3.4 Supertables -- 2.4 Univariate Data Visualization -- 2.4.1 Bar Chart -- 2.4.2 Histograms -- 2.4.3 Frequency Polygram -- 2.4.4 Box Plots -- 2.4.5 Dot Plot -- 2.4.6 Stem-and-Leaf Plot -- 2.4.7 Quantile Plot -- 2.4.8 Quantile-Quantile Plot -- 2.5 Bivariate Data Visualization -- 2.5.1 Scatterplot -- 2.6 Multivariate Data Visualization -- 2.6.1 Histogram Matrix -- 2.6.2 Scatterplot Matrix -- 2.6.3 Multiple Box Plot -- 2.6.4 Trellis Plot -- 2.7 Visualizing Groups -- 2.7.1 Dendrograms -- 2.7.2 Decision Trees -- 2.7.3 Cluster Image Maps -- 2.8 Dynamic Techniques -- 2.8.1 Overview -- 2.8.2 Data Brushing -- 2.8.3 Nearness Selection -- 2.8.4 Sorting and Rearranging -- 2.8.5 Searching and Filtering -- 2.9 Summary -- 2.10 Further Reading -- 3 CLUSTERING -- 3.1 Overview -- 3.2 Distance Measures -- 3.2.1 Overview -- 3.2.2 Numeric Distance Measures -- 3.2.3 Binary Distance Measures.

3.2.4 Mixed Variables -- 3.2.5 Other Measures -- 3.3 Agglomerative Hierarchical Clustering -- 3.3.1 Overview -- 3.3.2 Single Linkage -- 3.3.3 Complete Linkage -- 3.3.4 Average Linkage -- 3.3.5 Other Methods -- 3.3.6 Selecting Groups -- 3.4 Partitioned-Based Clustering -- 3.4.1 Overview -- 3.4.2 k-Means -- 3.4.3 Worked Example -- 3.4.4 Miscellaneous Partitioned-Based Clustering -- 3.5 Fuzzy Clustering -- 3.5.1 Overview -- 3.5.2 Fuzzy k-Means -- 3.5.3 Worked Examples -- 3.6 Summary -- 3.7 Further Reading -- 4 PREDICTIVE ANALYTICS -- 4.1 Overview -- 4.1.1 Predictive Modeling -- 4.1.2 Testing Model Accuracy -- 4.1.3 Evaluating Regression Models' Predictive Accuracy -- 4.1.4 Evaluating Classification Models' Predictive Accuracy -- 4.1.5 Evaluating Binary Models' Predictive Accuracy -- 4.1.6 ROC Charts -- 4.1.7 Lift Chart -- 4.2 Principal Component Analysis -- 4.2.1 Overview -- 4.2.2 Principal Components -- 4.2.3 Generating Principal Components -- 4.2.4 Interpretation of Principal Components -- 4.3 Multiple Linear Regression -- 4.3.1 Overview -- 4.3.2 Generating Models -- 4.3.3 Prediction -- 4.3.4 Analysis of Residuals -- 4.3.5 Standard Error -- 4.3.6 Coefficient of Multiple Determination -- 4.3.7 Testing the Model Significance -- 4.3.8 Selecting and Transforming Variables -- 4.4 Discriminant Analysis -- 4.4.1 Overview -- 4.4.2 Discriminant Function -- 4.4.3 Discriminant Analysis Example -- 4.5 Logistic Regression -- 4.5.1 Overview -- 4.5.2 Logistic Regression Formula -- 4.5.3 Estimating Coefficients -- 4.5.4 Assessing and Optimizing Results -- 4.6 Naive Bayes Classifiers -- 4.6.1 Overview -- 4.6.2 Bayes Theorem and the Independence Assumption -- 4.6.3 Independence Assumption -- 4.6.4 Classification Process -- 4.7 Summary -- 4.8 Further Reading -- 5 APPLICATIONS -- 5.1 Overview -- 5.2 Sales and Marketing -- 5.3 Industry-Specific Data Mining.

5.3.1 Finance -- 5.3.2 Insurance -- 5.3.3 Retail -- 5.3.4 Telecommunications -- 5.3.5 Manufacturing -- 5.3.6 Entertainment -- 5.3.7 Government -- 5.3.8 Pharmaceuticals -- 5.3.9 Healthcare -- 5.4 microRNA Data Analysis Case Study -- 5.4.1 Defining the Problem -- 5.4.2 Preparing the Data -- 5.4.3 Analysis -- 5.5 Credit Scoring Case Study -- 5.5.1 Defining the Problem -- 5.5.2 Preparing the Data -- 5.5.3 Analysis -- 5.5.4 Deployment -- 5.6 Data Mining Nontabular Data -- 5.6.1 Overview -- 5.6.2 Data Mining Chemical Data -- 5.6.3 Data Mining Text -- 5.7 Further Reading -- APPENDIX A MATRICES -- A.1 Overview of Matrices -- A.2 Matrix Addition -- A.3 Matrix Multiplication -- A.4 Transpose of a Matrix -- A.5 Inverse of a Matrix -- APPENDIX B SOFTWARE -- B.1 Software Overview -- B.1.1 Software Objectives -- B.1.2 Access and Installation -- B.1.3 User Interface Overview -- B.2 Data Preparation -- B.2.1 Overview -- B.2.2 Reading in Data -- B.2.3 Searching the Data -- B.2.4 Variable Characterization -- B.2.5 Removing Observations and Variables -- B.2.6 Cleaning the Data -- B.2.7 Transforming the Data -- B.2.8 Segmentation -- B.2.9 Principal Component Analysis -- B.3 Tables and Graphs -- B.3.1 Overview -- B.3.2 Contingency Tables -- B.3.3 Summary Tables -- B.3.4 Graphs -- B.3.5 Graph Matrices -- B.4 Statistics -- B.4.1 Overview -- B.4.2 Descriptive Statistics -- B.4.3 Confidence Intervals -- B.4.4 Hypothesis Tests -- B.4.5 Chi-Square Test -- B.4.6 ANOVA -- B.4.7 Comparative Statistics -- B.5 Grouping -- B.5.1 Overview -- B.5.2 Clustering -- B.5.3 Associative Rules -- B.5.4 Decision Trees -- B.6 Prediction -- B.6.1 Overview -- B.6.2 Linear Regression -- B.6.3 Discriminant Analysis -- B.6.4 Logistic Regression -- B.6.5 Naive Bayes -- B.6.6 kNN -- B.6.7 CART -- B.6.8 Neural Networks -- B.6.9 Apply Model -- BIBLIOGRAPHY -- INDEX.

Abstract:

Glenn J. Myatt, PhD, is cofounder of Leadscope, Inc. and a Partner of Myatt & Johnson, Inc., a consulting company that focuses on business intelligence application development delivered through the Internet. Dr. Myatt is the author of Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining, also published by Wiley. Wayne P. Johnson, MSc, is cofounder of Leadscope, Inc. and a Partner of Myatt & Johnson, Inc. Mr. Johnson has over two decades of experience in the design and development of large software systems, and his current professional interests include human-computer interaction, information visualization, and methodologies for contextual inquiry.

Local Note:

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.

Subject Term:

Data mining.

Electronic books. -- local.

Information visualization.
