stat.columbia.edu

Stop talking about “statistical significance and practical significance”

You’ve heard it a million times already: “statistical significance” is not the same as “practical significance.” The idea you’re supposed to have in mind, I think, is some effect size estimated at 0.003 with a standard error of 0.001. It’s statistically significant but not practically significant! I’m assuming here that these numbers have some interpretable scale, for example normalized based on total sales or average...
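The arithmetic behind this example can be sketched in a few lines (the 0.003 estimate and 0.001 standard error are from the excerpt; the practical-significance threshold of 0.01 is a hypothetical value for illustration, assuming a normal approximation):

```python
import math

est, se = 0.003, 0.001
z = est / se  # z = 3.0: well past the conventional 1.96 cutoff
p = math.erfc(z / math.sqrt(2))  # two-sided p-value, normal approximation
print(f"z = {z:.1f}, p = {p:.4f}")  # p is about 0.003: "statistically significant"

# But significance says nothing about magnitude. If, hypothetically,
# effects smaller than 0.01 on this scale are too small to matter,
# the estimate is statistically significant yet practically negligible.
practical_threshold = 0.01
print(est >= practical_threshold)  # False: not practically significant
```

The point of the sketch is that z depends only on the ratio of estimate to standard error, while practical significance depends on the estimate's position on an interpretable scale.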


Estimates of “false positive” rates in various scientific fields

Uli Schimmack writes: I am curious what you think about our recent attempts to estimate the false discovery risk (maximum rate under assumption of 100% power) based on estimates of a bias-corrected discovery rate? We applied this method to medicine (similar to Jager and Leek, 2014) and psychology (hand-coding). Results are very similar with an FDR estimate between 10 and 20 percent. Based on results we recommend an alpha of .01 to...
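The "maximum rate under assumption of 100% power" idea can be made concrete with Soric's (1989) upper bound on the false discovery rate, which I believe underlies this kind of estimate; the discovery rates below are illustrative numbers, not the paper's:

```python
def max_false_discovery_rate(discovery_rate, alpha=0.05):
    """Soric's (1989) upper bound on the false discovery rate.
    Assumes every true effect is detected (100% power), so it is a
    worst-case bound given the observed (bias-corrected) discovery rate."""
    return (1 / discovery_rate - 1) * alpha / (1 - alpha)

# Illustrative: a bias-corrected discovery rate of 30% at alpha = .05
# bounds the FDR at roughly 12%...
print(max_false_discovery_rate(0.30, alpha=0.05))
# ...and lowering alpha to .01 shrinks the bound to about 2%.
print(max_false_discovery_rate(0.30, alpha=0.01))
```

This is why a recommendation like "use alpha = .01" falls out of such estimates: the bound scales roughly linearly in alpha for small alpha.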


Wilkinson’s contribution to interactive visualization

This is Jessica. Upon learning this morning that Lee Wilkinson passed away, I also felt compelled to write something on the extent to which his work has influenced interactive visualization research. The Grammar of Graphics was an incredibly ambitious undertaking – Wilkinson set out to create a system that could produce any statistical graphic he’d ever seen, and that could deepen understanding of the meaning of graphics. The GoG...


When confidence intervals include unreasonable values . . . When confidence intervals include only unreasonable values . . .

Robert Kaestner writes: Economists’ love affair with randomized controlled trials (RCTs) is growing stronger by the day. But what should we make of an RCT that produces a point estimate and confidence interval that largely includes values that most would consider implausible? The Goldin et al. article on effects of health insurance on mortality (QJE) provides a good example. Point estimate suggests that 6 months of extra insurance...
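The concern can be made concrete with the usual interval arithmetic. The numbers below are hypothetical, not Goldin et al.'s; the point is only how much of a wide interval can sit in implausible territory:

```python
# Hypothetical: estimated mortality reduction (in proportion terms) from
# 6 months of extra coverage, with a large standard error.
est, se = -0.06, 0.025
lo, hi = est - 1.96 * se, est + 1.96 * se
print(f"95% CI: ({lo:.3f}, {hi:.3f})")

# Suppose subject-matter knowledge says reductions larger than 2
# percentage points are implausible for an intervention of this size.
# Then most of the interval lies below that floor:
plausible_floor = -0.02
frac_implausible = (plausible_floor - lo) / (hi - lo)
print(f"fraction of CI below the plausible floor: {frac_implausible:.2f}")
```

With these (made-up) numbers, roughly 90% of the interval covers values that prior knowledge rules out, which is Kaestner's point: the interval is "significant" yet dominated by implausible effect sizes.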


Importance of understanding variation when considering how a treatment effect will scale

Art Owen writes: I saw the essay, “Nothing Scales,” by Jason Kerwin, which might be a good topic for one of your blog posts. Maybe a bunch of other people sent it to you already. He seems to think we just need more and better data and methods to get things to generalize/scale. It’s not clear to me that we’ll get enormously better data per subject on education or behavior. Maybe we will get better sets of subjects (more...
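One way to see why effects fail to scale, in the spirit of the variation point above: if true effects vary across sites, and pilot sites are not a random draw from the deployment population, the pilot average overstates the at-scale average. A minimal simulation (all numbers hypothetical):

```python
import random

random.seed(1)
# Hypothetical population of 1000 sites with heterogeneous true effects.
effects = [random.gauss(0.1, 0.3) for _ in range(1000)]

# A pilot that (implicitly or explicitly) runs at favorable sites
# overstates what scaling up will deliver.
pilot = sorted(effects, reverse=True)[:20]  # the 20 most favorable sites
pilot_avg = sum(pilot) / len(pilot)
population_avg = sum(effects) / len(effects)
print(f"pilot estimate: {pilot_avg:.2f}, at-scale average: {population_avg:.2f}")
```

No amount of extra per-subject data at the pilot sites fixes this gap; it comes from the variation across sites, which is why understanding that variation matters for scaling.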


Comparing bias and overfitting in learning from data across social psych and machine learning

This is Jessica. Not too long ago, I wrote about research on when claims in machine learning (ML)-oriented research are not reproducible, where I found it useful to try to contrast the nature of claims, and threats to their validity, in artificial intelligence and machine learning versus in a social science like psych where there’s been plenty of public discussion of ways authors can overclaim.  Writing 20 years ago on the “two...


Holland academic statistics horror story

X points us to this news article by Mark Reid and Susan Wichgers, which “reads like a murder mystery, the victim being the best stats department in the Netherlands.” It’s a horrible story involving what appears to be the intentional destruction of data—a true statistical crime. Speaking as a political scientist, I’m reminded of an earlier discussion of academic misconduct, where I wrote: Academic and corporate environments...


Responding to Richard Morey on p-values and inference

Jonathan Falk points to this post by Richard Morey, who writes: I [Morey] am convinced that most experienced scientists and statisticians have internalized statistical insights that frequentist statistics attempts to formalize: how you can be fooled by randomness; how what we see can be the result of biasing mechanisms; the importance of understanding sampling distributions. In typical scientific practice, the “null hypothesis...
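The "fooled by randomness" insight can be demonstrated directly: under a true null, p-values are uniform, so about 5% of pure-noise comparisons clear alpha = .05. A quick simulation (two-sample comparison on pure noise, using a normal approximation rather than an exact t-test for simplicity):

```python
import math
import random
from statistics import mean, stdev

random.seed(0)

def two_sample_p(x, y):
    """Approximate two-sided p-value for a difference in means
    (normal approximation; adequate for n = 50 noise draws per group)."""
    se = math.sqrt(stdev(x)**2 / len(x) + stdev(y)**2 / len(y))
    z = (mean(x) - mean(y)) / se
    return math.erfc(abs(z) / math.sqrt(2))

# 1000 comparisons where the null hypothesis is true by construction.
pvals = [two_sample_p([random.gauss(0, 1) for _ in range(50)],
                      [random.gauss(0, 1) for _ in range(50)])
         for _ in range(1000)]
frac = sum(p < 0.05 for p in pvals) / len(pvals)
print(f"'significant' results from pure noise: {frac:.1%}")  # roughly 5%
```

This is the sampling-distribution intuition Morey describes as internalized by experienced researchers: significance at the 5% level is exactly what randomness alone delivers 5% of the time.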