- Title: Statistical and Machine-Learning Data Mining, 3rd Edition
- Author: Bruce Ratner
- Length: 696 pages
- Edition: 3
- Language: English
- Publisher: Chapman and Hall/CRC
- Publication Date: 2017-06-01
- ISBN-10: 1498797601
- ISBN-13: 9781498797603
- Sales Rank: #939733

## Description

Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data, Third Edition

The third edition of a bestseller, Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data is still the only book, to date, to distinguish between statistical data mining and machine-learning data mining. The book is a compilation of new and creative data mining techniques that address the scaling-up of the framework of classical and modern statistical methodology for predictive modeling and analysis of big data. SM-DM provides proper solutions to common problems facing the newly minted data scientist in the data mining discipline. Its presentation focuses on the needs of data scientists (commonly known as statisticians, data miners, and data analysts), delivering practical yet powerful, simple yet insightful quantitative techniques, most of which use the "old" statistical methodologies improved upon by the new machine-learning influence.

### Table of Contents

Chapter 1: Introduction

Chapter 2: Science Dealing with Data: Statistics and Data Science

Chapter 3: Two Basic Data Mining Methods for Variable Assessment

Chapter 4: CHAID-Based Data Mining for Paired-Variable Assessment

Chapter 5: The Importance of Straight Data: Simplicity and Desirability for Good Model-Building Practice

Chapter 6: Symmetrizing Ranked Data: A Statistical Data Mining Method for Improving the Predictive Power of Data

Chapter 7: Principal Component Analysis: A Statistical Data Mining Method for Many-Variable Assessment

Chapter 8: Market Share Estimation: Data Mining for an Exceptional Case

Chapter 9: The Correlation Coefficient: Its Values Range between Plus and Minus 1, or Do They?

Chapter 10: Logistic Regression: The Workhorse of Response Modeling

Chapter 11: Predicting Share of Wallet without Survey Data

Chapter 12: Ordinary Regression: The Workhorse of Profit Modeling

Chapter 13: Variable Selection Methods in Regression: Ignorable Problem, Notable Solution

Chapter 14: CHAID for Interpreting a Logistic Regression Model

Chapter 15: The Importance of the Regression Coefficient

Chapter 16: The Average Correlation: A Statistical Data Mining Measure for Assessment of Competing Predictive Models and the Importance of the Predictor Variables

Chapter 17: CHAID for Specifying a Model with Interaction Variables

Chapter 18: Market Segmentation Classification Modeling with Logistic Regression

Chapter 19: Market Segmentation Based on Time-Series Data Using Latent Class Analysis

Chapter 20: Market Segmentation: An Easy Way to Understand the Segments

Chapter 21: The Statistical Regression Model: An Easy Way to Understand the Model

Chapter 22: CHAID as a Method for Filling in Missing Values

Chapter 23: Model Building with Big Complete and Incomplete Data

Chapter 24: Art, Science, Numbers, and Poetry

Chapter 25: Identifying Your Best Customers: Descriptive, Predictive, and Look-Alike Profiling

Chapter 26: Assessment of Marketing Models

Chapter 27: Decile Analysis: Perspective and Performance

Chapter 28: Net T-C Lift Model: Assessing the Net Effects of Test and Control Campaigns

Chapter 29: Bootstrapping in Marketing: A New Approach for Validating Models

Chapter 30: Validating the Logistic Regression Model: Try Bootstrapping

Chapter 31: Visualization of Marketing Models: Data Mining to Uncover Innards of a Model

Chapter 32: The Predictive Contribution Coefficient: A Measure of Predictive Importance

Chapter 33: Regression Modeling Involves Art, Science, and Poetry, Too

Chapter 34: Opening the Dataset: A Twelve-Step Program for Dataholics

Chapter 35: Genetic and Statistic Regression Models: A Comparison

Chapter 36: Data Reuse: A Powerful Data Mining Effect of the GenIQ Model

Chapter 37: A Data Mining Method for Moderating Outliers Instead of Discarding Them

Chapter 38: Overfitting: Old Problem, New Solution

Chapter 39: The Importance of Straight Data: Revisited

Chapter 40: The GenIQ Model: Its Definition and an Application

Chapter 41: Finding the Best Variables for Marketing Models

Chapter 42: Interpretation of Coefficient-Free Models

Chapter 43: Text Mining: Primer, Illustration, and TXTDM Software

Chapter 44: Some of My Favorite Statistical Subroutines