
P-hacking

Run enough analyses on any dataset and something will look significant. That's not discovery — that's fishing.

In research, a result is considered "statistically significant" when the p-value falls below 0.05. That threshold is supposed to mean that, if there were no real effect, random noise alone would produce a result this extreme less than 5% of the time. But here's the trick: if you test enough variables, slice the data enough ways, or tweak your methods enough times, you'll almost certainly stumble on something that crosses that line. Test twenty unrelated variables at that threshold and the odds that at least one comes up "significant" by chance alone are roughly 64%. This is p-hacking, and it's everywhere.
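
A quick simulation makes the point concrete. The sketch below (Python with NumPy and SciPy, written here purely for illustration; the choice of twenty t-tests and thirty samples per group is an assumption, not something from the text) runs independent tests on pure noise and counts how many cross 0.05 anyway.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_tests = 20       # number of unrelated "variables" tested (illustrative choice)
n_per_group = 30   # sample size per group (illustrative choice)
alpha = 0.05

significant = 0
for _ in range(n_tests):
    # Two groups drawn from the same distribution: no real effect exists.
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        significant += 1

# With 20 independent tests at alpha = 0.05, the chance of at least one
# false positive is 1 - 0.95**20, roughly 64%.
print(f"{significant} of {n_tests} tests came out 'significant' on pure noise")
print(f"Chance of at least one false positive: {1 - 0.95**n_tests:.0%}")
```

Rerun it with different seeds and the "significant" findings move around, which is exactly what you'd expect from noise.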

Sometimes it's deliberate. A researcher knows which result their funder wants and massages the numbers until it appears. But more often it's unconscious. You try one analysis, it doesn't work, so you try another. You drop a few outliers. You redefine your groups. Each step feels reasonable in isolation. But each step also moves you further from honest inquiry and closer to producing a predetermined answer.
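
That forking-path process can also be sketched as one null dataset reanalyzed several "reasonable" ways. In this hypothetical Python example (the subgroup cuts and the outlier rule are invented for illustration), the outcome is pure noise, yet every extra analysis choice is another chance for some p-value to dip under 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# One null dataset: the outcome is pure noise, unrelated to treatment.
n = 100
treatment = rng.integers(0, 2, size=n)   # 0 = control, 1 = treated
outcome = rng.normal(size=n)
age = rng.integers(18, 70, size=n)

def p_value(mask):
    """Two-sample t-test of outcome by treatment within a chosen subset."""
    a = outcome[mask & (treatment == 1)]
    b = outcome[mask & (treatment == 0)]
    return stats.ttest_ind(a, b).pvalue

# Each "reasonable" choice is another draw from the same lottery.
attempts = {
    "all participants":  p_value(np.full(n, True)),
    "drop outliers":     p_value(np.abs(outcome) < 2),
    "only under 40":     p_value(age < 40),
    "only 40 and over":  p_value(age >= 40),
}
for name, p in attempts.items():
    print(f"{name:18s} p = {p:.3f}")
```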

The damage compounds with publication bias. Only the hacked "significant" result gets published. The twenty failed attempts stay hidden. What reaches the public looks like clean science. It's really the survivor of a selection process designed to produce exactly that outcome.

A p-value under 0.05 isn't proof. It's a starting point — and only if the analysis was planned before the data was collected.

