
A Practical Guide to Data Mining for Business and Industry.
Title:
A Practical Guide to Data Mining for Business and Industry.
Author:
Ahlemeyer-Stubbe, Andrea.
ISBN:
9781118763728
Edition:
1st ed.
Physical Description:
1 online resource (325 pages)
Contents:
A Practical Guide to Data Mining for Business and Industry -- Copyright -- Contents -- Glossary of terms -- Part I Data Mining Concept -- 1 Introduction -- 1.1 Aims of the Book -- 1.2 Data Mining Context -- 1.2.1 Domain Knowledge -- 1.2.2 Words to Remember -- 1.2.3 Associated Concepts -- 1.3 Global Appeal -- 1.4 Example Datasets Used in This Book -- 1.5 Recipe Structure -- 1.6 Further Reading and Resources -- 2 Data mining definition -- 2.1 Types of Data Mining Questions -- 2.1.1 Population and Sample -- 2.1.2 Data Preparation -- 2.1.3 Supervised and Unsupervised Methods -- 2.1.4 Knowledge-Discovery Techniques -- 2.2 Data Mining Process -- 2.3 Business Task: Clarification of the Business Question behind the Problem -- 2.4 Data: Provision and Processing of the Required Data -- 2.4.1 Fixing the Analysis Period -- 2.4.2 Basic Unit of Interest -- 2.4.3 Target Variables -- 2.4.4 Input Variables/Explanatory Variables -- 2.5 Modelling: Analysis of the Data -- 2.6 Evaluation and Validation during the Analysis Stage -- 2.7 Application of Data Mining Results and Learning from the Experience -- Part II Data Mining Practicalities -- 3 All about data -- 3.1 Some Basics -- 3.1.1 Data, Information, Knowledge and Wisdom -- 3.1.2 Sources and Quality of Data -- 3.1.3 Measurement Level and Types of Data -- 3.1.4 Measures of Magnitude and Dispersion -- 3.1.5 Data Distributions -- 3.2 Data Partition: Random Samples for Training, Testing and Validation -- 3.3 Types of Business Information Systems -- 3.3.1 Operational Systems Supporting Business Processes -- 3.3.2 Analysis-Based Information Systems -- 3.3.3 Importance of Information -- 3.4 Data Warehouses -- 3.4.1 Topic Orientation -- 3.4.2 Logical Integration and Homogenisation -- 3.4.3 Reference Period -- 3.4.4 Low Volatility -- 3.4.5 Using the Data Warehouse.
3.5 Three Components of a Data Warehouse: DBMS, DB and DBCS -- 3.5.1 Database Management System (DBMS) -- 3.5.2 Database (DB) -- 3.5.3 Database Communication Systems (DBCS) -- 3.6 Data Marts -- 3.6.1 Regularly Filled Data Marts -- 3.6.2 Comparison between Data Marts and Data Warehouses -- 3.7 A Typical Example from the Online Marketing Area -- 3.8 Unique Data Marts -- 3.8.1 Permanent Data Marts -- 3.8.2 Data Marts Resulting from Complex Analysis -- 3.9 Data Mart: Do's and Don'ts -- 3.9.1 Do's and Don'ts for Processes -- 3.9.2 Do's and Don'ts for Handling -- 3.9.3 Do's and Don'ts for Coding/Programming -- 4 Data Preparation -- 4.1 Necessity of Data Preparation -- 4.2 From Small and Long to Short and Wide -- 4.3 Transformation of Variables -- 4.4 Missing Data and Imputation Strategies -- 4.5 Outliers -- 4.6 Dealing with the Vagaries of Data -- 4.6.1 Distributions -- 4.6.2 Tests for Normality -- 4.6.3 Data with Totally Different Scales -- 4.7 Adjusting the Data Distributions -- 4.7.1 Standardisation and Normalisation -- 4.7.2 Ranking -- 4.7.3 Box-Cox Transformation -- 4.8 Binning -- 4.8.1 Bucket Method -- 4.8.2 Analytical Binning for Nominal Variables -- 4.8.3 Quantiles -- 4.8.4 Binning in Practice -- 4.9 Timing Considerations -- 4.10 Operational Issues -- 5 Analytics -- 5.1 Introduction -- 5.2 Basis of Statistical Tests -- 5.2.1 Hypothesis Tests and P Values -- 5.2.2 Tolerance Intervals -- 5.2.3 Standard Errors and Confidence Intervals -- 5.3 Sampling -- 5.3.1 Methods -- 5.3.2 Sample Sizes -- 5.3.3 Sample Quality and Stability -- 5.4 Basic Statistics for Pre-analytics -- 5.4.1 Frequencies -- 5.4.2 Comparative Tests -- 5.4.3 Cross Tabulation and Contingency Tables -- 5.4.4 Correlations -- 5.4.5 Association Measures for Nominal Variables -- 5.4.6 Examples of Output from Comparative and Cross Tabulation Tests.
5.5 Feature Selection/Reduction of Variables -- 5.5.1 Feature Reduction Using Domain Knowledge -- 5.5.2 Feature Selection Using Chi-Square -- 5.5.3 Principal Components Analysis and Factor Analysis -- 5.5.4 Canonical Correlation, PLS and SEM -- 5.5.5 Decision Trees -- 5.5.6 Random Forests -- 5.6 Time Series Analysis -- 6 Methods -- 6.1 Methods Overview -- 6.2 Supervised Learning -- 6.2.1 Introduction and Process Steps -- 6.2.2 Business Task -- 6.2.3 Provision and Processing of the Required Data -- 6.2.4 Analysis of the Data -- 6.2.5 Evaluation and Validation of the Results (during the Analysis) -- 6.2.6 Application of the Results -- 6.3 Multiple Linear Regression for Use When Target is Continuous -- 6.3.1 Rationale of Multiple Linear Regression Modelling -- 6.3.2 Regression Coefficients -- 6.3.3 Assessment of the Quality of the Model -- 6.3.4 Example of Linear Regression in Practice -- 6.4 Regression When the Target is Not Continuous -- 6.4.1 Logistic Regression -- 6.4.2 Example of Logistic Regression in Practice -- 6.4.3 Discriminant Analysis -- 6.4.4 Log-Linear Models and Poisson Regression -- 6.5 Decision Trees -- 6.5.1 Overview -- 6.5.2 Selection Procedures of the Relevant Input Variables -- 6.5.3 Splitting Criteria -- 6.5.4 Number of Splits (Branches of the Tree) -- 6.5.5 Symmetry/Asymmetry -- 6.5.6 Pruning -- 6.6 Neural Networks -- 6.7 Which Method Produces the Best Model? A Comparison of Regression, Decision Trees and Neural Networks -- 6.8 Unsupervised Learning -- 6.8.1 Introduction and Process Steps -- 6.8.2 Business Task -- 6.8.3 Provision and Processing of the Required Data -- 6.8.4 Analysis of the Data -- 6.8.5 Evaluation and Validation of the Results (during the Analysis) -- 6.8.6 Application of the Results -- 6.9 Cluster Analysis -- 6.9.1 Introduction -- 6.9.2 Hierarchical Cluster Analysis -- 6.9.3 K-Means Method of Cluster Analysis.
6.9.4 Example of Cluster Analysis in Practice -- 6.10 Kohonen Networks and Self-Organising Maps -- 6.10.1 Description -- 6.10.2 Example of SOMs in Practice -- 6.11 Group Purchase Methods: Association and Sequence Analysis -- 6.11.1 Introduction -- 6.11.2 Analysis of the Data -- 6.11.3 Group Purchase Methods -- 6.11.4 Examples of Group Purchase Methods in Practice -- 7 Validation and application -- 7.1 Introduction to Methods for Validation -- 7.2 Lift and Gain Charts -- 7.3 Model Stability -- 7.4 Sensitivity Analysis -- 7.5 Threshold Analytics and Confusion Matrix -- 7.6 ROC Curves -- 7.7 Cross-Validation and Robustness -- 7.8 Model Complexity -- Part III Data Mining in Action -- 8 Marketing: Prediction -- 8.1 Recipe 1: Response Optimisation: To Find and Address the Right Number of Customers -- 8.2 Recipe 2: To Find the x% of Customers with the Highest Affinity to an Offer -- 8.3 Recipe 3: To Find the Right Number of Customers to Ignore -- 8.4 Recipe 4: To Find the x% of Customers with the Lowest Affinity to an Offer -- 8.5 Recipe 5: To Find the x% of Customers with the Highest Affinity to Buy -- 8.6 Recipe 6: To Find the x% of Customers with the Lowest Affinity to Buy -- 8.7 Recipe 7: To Find the x% of Customers with the Highest Affinity to a Single Purchase -- 8.8 Recipe 8: To Find the x% of Customers with the Highest Affinity to Sign a Long-Term Contract in Communication Areas -- 8.9 Recipe 9: To Find the x% of Customers with the Highest Affinity to Sign a Long-Term Contract in Insurance Areas -- 9 Intra-Customer Analysis -- 9.1 Recipe 10: To Find the Optimal Amount of Single Communication to Activate One Customer -- 9.2 Recipe 11: To Find the Optimal Communication Mix to Activate One Customer -- 9.3 Recipe 12: To Find and Describe Homogeneous Groups of Products.
9.4 Recipe 13: To Find and Describe Groups of Customers with Homogeneous Usage -- 9.5 Recipe 14: To Predict the Order Size of Single Products or Product Groups -- 9.6 Recipe 15: Product Set Combination -- 9.7 Recipe 16: To Predict the Future Customer Lifetime Value of a Customer -- 10 Learning from a Small Testing Sample and Prediction -- 10.1 Recipe 17: To Predict Demographic Signs (Like Sex, Age, Education and Income) -- 10.2 Recipe 18: To Predict the Potential Customers of a Brand New Product or Service in Your Databases -- 10.3 Recipe 19: To Understand Operational Features and General Business Forecasting -- 11 Miscellaneous -- 11.1 Recipe 20: To Find Customers Who Will Potentially Churn -- 11.2 Recipe 21: Indirect Churn Based on a Discontinued Contract -- 11.3 Recipe 22: Social Media Target Group Descriptions -- 11.4 Recipe 23: Web Monitoring -- 11.5 Recipe 24: To Predict Who is Likely to Click on a Special Banner -- 12 Software and Tools: A Quick Guide -- 12.1 List of Requirements When Choosing a Data Mining Tool -- 12.2 Introduction to the Idea of Fully Automated Modelling (FAM) -- 12.2.1 Predictive Behavioural Targeting -- 12.2.2 Fully Automatic Predictive Targeting and Modelling Real-Time Online Behaviour -- 12.3 FAM Function -- 12.4 FAM Architecture -- 12.5 FAM Data Flows and Databases -- 12.6 FAM Modelling Aspects -- 12.7 FAM Challenges and Critical Success Factors -- 12.8 FAM Summary -- 13 Overviews -- 13.1 To Make Use of Official Statistics -- 13.2 How to Use Simple Maths to Make an Impression -- 13.2.1 Approximations -- 13.2.2 Absolute and Relative Values -- 13.2.3 % Change -- 13.2.4 Values in Context -- 13.2.5 Confidence Intervals -- 13.2.6 Rounding -- 13.2.7 Tables -- 13.2.8 Figures -- 13.3 Differences between Statistical Analysis and Data Mining -- 13.3.1 Assumptions -- 13.3.2 Values Missing Because 'Nothing Happened' -- 13.3.3 Sample Sizes.
Abstract:
Data mining is well on its way to becoming a recognized discipline in the overlapping areas of IT, statistics, machine learning, and AI. A Practical Guide to Data Mining for Business and Industry presents a user-friendly approach to data mining methods and covers the typical applications of each. The methodology is complemented by case studies to create a versatile reference book in which readers can look up specific methods as well as specific applications. The book is formatted so that statisticians, computer scientists, and economists can cross-reference from a particular application or method to sectors of interest.
Local Note:
Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2017. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.