A claim: 2.2e-16 is the most popular p-value in research papers, even more popular than 0.05 (or if you’re being cynical 0.049).

Why?

2.2e-16 happens to be the epsilon of a double-precision float (i.e. a decimal number stored using 64 bits). Roughly, this means that if you try to calculate 1 – epsilon, with anything smaller than epsilon, the answer will be 1.

In R, you can calculate this by running the following code (+2 due to convention):

i <- 1 while (1-2^(-i) != 1) {i <- i + 1} 2^(-i + 2)

Which you’ll find is the same as the built in constant .Machine$double.eps (which is 2.2e-16).

Note that if you are trying to add 1e-17 to 1 you won’t get anything useful, but adding numbers with more similar exponents everything works just fine.

> 2.2e-16 + 1.3e-17 [1] 2.33e-16

In fact, doubles can store between 2^-1023 and 2^1023, around 1e308, so why are they being truncated at 2.2e-16?

R stores all of its decimal numbers in the double format, and can be quite stubborn about not reporting them below this *upper bound* on rounding error. You can happily compare, add and subtract values of this 2.2e-16 magnitude, but if you started working with values out of this range you might run into problems. So R truncates it for you.

With datasets of even modest size, it’s easy to get p-values below this enforced bound:

> chisq.test(c(80, 1)) Chi-squared test for given probabilities data: c(80, 1) X-squared = 77.049, df = 1, p-value < 2.2e-16

So what should you do? Firstly, I’d start by asking whether the p-value at this level is actually meaningful. Is the model you’ve used to calculate it accurate enough to be this confident about the p calculation? Can we really make any meaningful interpretation of what say p < 1E-10 means (the rarest event I know of that has happened is around p < 1E-12). In this case, rather than reporting 2.2e-16, I’d just round your value and use something like p < 1E-10 (or p << 1E-10).

Sometimes this order of magnitude actually is needed. When you have a huge number of tests and apply a correction. To compare p-values from genome-wide association studies, the log(p-value) is very useful. R, being a sneaky so-and-so, actually has calculated the value without rounding, and you can usually get it by indexing into p.value in either the summary() or print() return of your test

print(chisq.test(c(80,1)))$p.value [1] 1.667363e-18

A friend replies: looking for more ‘rarest events’? See this page: https://improbability-principle.com/tales-of-strange-coincidences/