• Title: Statistical and Machine-Learning Data Mining, 3rd Edition
  • Author: Bruce Ratner
  • Length: 696 pages
  • Edition: 3
  • Language: English
  • Publisher: Chapman and Hall/CRC
  • Publication Date: 2017-06-01
  • ISBN-10: 1498797601
  • ISBN-13: 9781498797603
  • Sales Rank: #939733


Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Third Edition

The third edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. It is a compilation of new and creative data mining techniques that address scaling up the framework of classical and modern statistical methodology for predictive modeling and analysis of big data. SM-DM provides proper solutions to common problems facing the newly minted data scientist in the data mining discipline. Its presentation focuses on the needs of data scientists (commonly known as statisticians, data miners, and data analysts), delivering practical yet powerful, simple yet insightful quantitative techniques, most of which use the "old" statistical methodologies improved upon by the new machine-learning influence.

Table of Contents

Chapter 1: Introduction
Chapter 2: Science Dealing with Data: Statistics and Data Science
Chapter 3: Two Basic Data Mining Methods for Variable Assessment
Chapter 4: CHAID-Based Data Mining for Paired-Variable Assessment
Chapter 5: The Importance of Straight Data: Simplicity and Desirability for Good Model-Building Practice
Chapter 6: Symmetrizing Ranked Data: A Statistical Data Mining Method for Improving the Predictive Power of Data
Chapter 7: Principal Component Analysis: A Statistical Data Mining Method for Many-Variable Assessment
Chapter 8: Market Share Estimation: Data Mining for an Exceptional Case
Chapter 9: The Correlation Coefficient: Its Values Range between Plus and Minus 1, or Do They?
Chapter 10: Logistic Regression: The Workhorse of Response Modeling
Chapter 11: Predicting Share of Wallet without Survey Data
Chapter 12: Ordinary Regression: The Workhorse of Profit Modeling
Chapter 13: Variable Selection Methods in Regression: Ignorable Problem, Notable Solution
Chapter 14: CHAID for Interpreting a Logistic Regression Model
Chapter 15: The Importance of the Regression Coefficient
Chapter 16: The Average Correlation: A Statistical Data Mining Measure for Assessment of Competing Predictive Models and the Importance of the Predictor Variables
Chapter 17: CHAID for Specifying a Model with Interaction Variables
Chapter 18: Market Segmentation Classification Modeling with Logistic Regression
Chapter 19: Market Segmentation Based on Time-Series Data Using Latent Class Analysis
Chapter 20: Market Segmentation: An Easy Way to Understand the Segments
Chapter 21: The Statistical Regression Model: An Easy Way to Understand the Model
Chapter 22: CHAID as a Method for Filling in Missing Values
Chapter 23: Model Building with Big Complete and Incomplete Data
Chapter 24: Art, Science, Numbers, and Poetry
Chapter 25: Identifying Your Best Customers: Descriptive, Predictive, and Look-Alike Profiling
Chapter 26: Assessment of Marketing Models
Chapter 27: Decile Analysis: Perspective and Performance
Chapter 28: Net T-C Lift Model: Assessing the Net Effects of Test and Control Campaigns
Chapter 29: Bootstrapping in Marketing: A New Approach for Validating Models
Chapter 30: Validating the Logistic Regression Model: Try Bootstrapping
Chapter 31: Visualization of Marketing Models: Data Mining to Uncover Innards of a Model
Chapter 32: The Predictive Contribution Coefficient: A Measure of Predictive Importance
Chapter 33: Regression Modeling Involves Art, Science, and Poetry, Too
Chapter 34: Opening the Dataset: A Twelve-Step Program for Dataholics
Chapter 35: Genetic and Statistic Regression Models: A Comparison
Chapter 36: Data Reuse: A Powerful Data Mining Effect of the GenIQ Model
Chapter 37: A Data Mining Method for Moderating Outliers Instead of Discarding Them
Chapter 38: Overfitting: Old Problem, New Solution
Chapter 39: The Importance of Straight Data: Revisited
Chapter 40: The GenIQ Model: Its Definition and an Application
Chapter 41: Finding the Best Variables for Marketing Models
Chapter 42: Interpretation of Coefficient-Free Models
Chapter 43: Text Mining: Primer, Illustration, and TXTDM Software
Chapter 44: Some of My Favorite Statistical Subroutines