Most modeling failures are caused by flawed (and often implicit) assumptions.
Statistical pragmatism recognizes that all forms of statistical inference make assumptions, assumptions that can be tested only very crudely (with such things as goodness-of-fit methods) and can almost never be verified. This is not only at the heart of statistical inference; it is also the great wisdom of our field.
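To make "crudely" concrete, here is a minimal sketch (assuming Python with numpy and scipy; the setup is mine, not the paper's): a standard goodness-of-fit check for normality often fails to flag data generated by a heavier-tailed model.

```python
# Minimal sketch (assumes numpy and scipy; illustrative, not from the paper):
# a goodness-of-fit test can probe a normality assumption only crudely.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=50)   # truth: Student-t with 5 df, not normal

stat, p = stats.shapiro(x)          # Shapiro-Wilk test of normality
print(f"Shapiro-Wilk p-value: {p:.3f}")
# Across repeated draws like this, the test frequently fails to reject.
# Failing to reject does not verify normality; it only means this particular
# departure was too small for the check to detect at this sample size.
```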
This is also what we discuss in the Data Centricity Lab (see datacentricity.org for an overview). We underline the role of assumptions in the modeling process and how they dictate the usefulness of models (and the decisions they support).
This paper defends pragmatism over dogma:
- Using both frequentist (e.g., p-values, confidence intervals) and Bayesian (e.g., posterior probabilities) tools, depending on the problem (a sketch follows this list).
- Prioritizing the assumptions that connect models to real-world data rather than debating the “true” nature of probability.
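For the first point, here is a minimal sketch of what "both toolkits on the same problem" can look like (the binomial data, the Wald interval, and the uniform Beta prior are my illustrative choices, not the paper's):

```python
# Minimal sketch (assumes numpy and scipy): the same binomial data summarized
# with a frequentist confidence interval and a Bayesian posterior.
import numpy as np
from scipy import stats

successes, trials = 40, 120
p_hat = successes / trials

# Frequentist: 95% Wald confidence interval for the success probability
se = np.sqrt(p_hat * (1 - p_hat) / trials)
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

# Bayesian: Beta(1, 1) prior -> Beta(1 + successes, 1 + failures) posterior
posterior = stats.beta(1 + successes, 1 + trials - successes)
cred = posterior.interval(0.95)

print(f"95% confidence interval: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% credible interval:   ({cred[0]:.3f}, {cred[1]:.3f})")
```

Both summaries rest on the same substantive assumption, that the trials are independent draws with a common success probability; the philosophical framing differs more than the practical inputs.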
One implication is that we rethink how we frame the relationship between a sample (reality) and the population (hypothetical). We often describe statistical inference as random sampling from a finite population, but that can be misleading. The paper suggests we call the estimand “theoretical mean” rather than “population mean.”
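In notation (mine, not the paper's), the renaming separates the mean of an assumed data-generating model $F$ from the average over a finite list of $N$ units:

$$
\mu_F = \mathbb{E}_F[Y] \;\; (\text{theoretical mean})
\qquad \text{vs.} \qquad
\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} y_i \;\; (\text{finite-population mean}).
$$

The first exists only inside the theoretical world defined by $F$; the second presumes an actual, enumerable population, which is often a misleading picture of how the data arose.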
Why does it matter? The more we emphasize the role of assumptions, the more modelers question whether the theoretical world aligns well with the real world that produced the data. As we discuss at Data Duets, when assumptions are sidelined, a misconception takes hold: the idea that methodological rigor can substitute for conceptual accuracy. And causal (semi-)parametric solutions are often more sensitive to this misconception than predictive ones (as we further discuss here).
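A minimal simulation can illustrate the asymmetry (the confounding setup and coefficients are mine, purely for illustration): a model whose theoretical world omits a confounder still predicts reasonably well within the world that generated the data, but its treatment coefficient is far from the true causal effect.

```python
# Illustrative simulation (assumes numpy): omitted-confounder bias hurts the
# causal reading of a coefficient much more than it hurts prediction.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
u = rng.normal(size=n)                       # unobserved confounder
d = 0.8 * u + rng.normal(size=n)             # treatment depends on U
y = 1.0 * d + 2.0 * u + rng.normal(size=n)   # true causal effect of D is 1.0

# Analyst's model ignores U: regress Y on D alone (OLS via least squares).
X = np.column_stack([np.ones(n), d])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print(f"Estimated effect of D: {beta[1]:.2f}  (truth: 1.00)")  # biased upward
print(f"Prediction RMSE: {np.std(resid):.2f}  vs. outcome SD {np.std(y):.2f}")
# The fitted model still predicts Y noticeably better than the baseline SD,
# yet it reads the confounded association as roughly double the causal effect.
```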