1 The Riemann Hypothesis (see also this paper) reflects the fact that all currently known approximations of the number $\pi(n)$ of primes $\leq n$ are heuristic

All known approximations of the number $\pi(n)$ of primes $\leq n$ are currently derived from real-valued functions that are asymptotic to $\pi(x)$, such as $\frac{x}{log_{e}x}$, $Li(x)$ and Riemann’s function $R(x) = \sum_{n=1}^{\infty}\frac{\mu(n)}{(n)}li(x^{1/n})$, where the degree of approximation for finite values of $n$ is determined only heuristically by conjecturing upon an error term in the asymptotic relation that can be seen’, but not yet proven, to yield good’ approximations of $\pi(n)$ (as described, for instance, by Barry Mazur and William Stein on p.36 of their absorbingly written, but without sacrificing rigour, entry-level book Prime Numbers and the Riemann Hypothesis).

The above graph compares the actual number $\pi(x)$ (red) of primes $\leq x$ with the distribution of primes as estimated variously by the functions $Li(x)$ (blue), $R(x)$ (black), and $\frac{x}{log_{e}x}$ (green), where $R(x)$ is Riemann’s function $\sum_{n=1}^{\infty}\frac{\mu(n)}{(n)}li(x^{1/n})$.

1A Two non-heuristic approximations of the number $\pi(n)$ of primes $\leq n$

The following introduces, and compares, two non-heuristic prime counting functions (investigated more fully here) that, prima facie, should apparently yield square-root accurate approximations of $\pi(n)$:

(i) $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green); and

(ii) $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red);

Based on manual and spreadsheet calculations, Figs.1 and 2 compare the values of the two non-heuristically estimated prime counting functions (both of which have binomial standard deviations) with the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 1500$ and $4 \leq n \leq 3000$.

Fig.1: The above graph compares the non-heuristically estimated values of $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green) and $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red) vs the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 1500$.

Fig.2: The above graph compares the non-heuristically estimated values of $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green) and $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red) vs the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 3000$. Note that the gradient of both $y = \pi_{_{H}}(x)$ and of $y = \pi_{_{L}}(x)$ in the interval $(p_{_{n}}^{2},\ p_{_{n+1}}^{2})$ is $\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) \rightarrow 0$.

1B Three intriguing queries

Since $\pi(n) \sim \frac{n}{log_{e}(n)}$ (by the Prime Number Theorem), whilst $\pi_{_{H}}(n) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$ (by Mertens’ Theorem, where $\gamma$ is the Euler-Mascheroni constant and $2.e^{-\gamma} \approx 1.12292 \ldots$), and it can be shown (see Corollary 3.2 here)  that $\pi_{_{L}}(n) \sim ae^{-\lambda}\frac{n}{log_{_{e}}n}$ for some $a \geq 2$, this raises the interesting queries (see also $\S$2 of this initial investigation):

Fig.3. The overlapping rectangles $A, B, C, D, \ldots$ represent $\pi_{_{H}}(p_{_{j+1}}^{2}) = p_{_{j+1}}^{2}.\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})$ for $j \geq 1$. Figures within each rectangle are the primes and estimated primes corresponding to the functions $\pi(n)$ and $\pi_{_{H}}(n)$, respectively, within the interval $(1,\ p_{_{j+1}}^{2})$ for $j \geq 2$.

Fig.4: Graph of $y = \prod_{i = 1}^{\pi(\sqrt{x})}(1 - \frac{1}{p_{_{i}}})$. The rectangles represent $(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})$ for $j \geq 1$. Figures within each rectangle are the primes corresponding to the functions $\pi(n)$ and $\pi_{_{L}}(n)$ within the interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ for $j \geq 2$. The area under the curve is $\pi_{_{L}}(x) = (x - p_{_{n}}^{2})\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) + \sum_{j = 1}^{n-1}(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}}) + 2$.

(a) Which is the least $n$, if any, such that $\pi_{_{H}}(n) > \pi(n)$?

(b) Which is the largest $n$, if any, such that $\pi(n) > \pi_{_{H}}(n)$?

(c) Is $\pi(n) / \frac{n}{log_{_{e}}n}$ an algorithmically verifiable, but not algorithmically computable (see Definitions 1 and 2 here), oscillating function which is discontinuous in the limit $n \rightarrow \infty$ (in the sense of Zeno’s paradox in 2-dimensions considered in Case 4 here)?

2 Density of integers not divisible by primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$

The above queries arise since (formal proofs for the following are detailed in Chapters 34 and 35 here):

Lemma 2.1: The asymptotic density of the set of all integers that are not divisible by any of a given set of primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$ is:

$\prod_{q \in Q}(1-1/q)$.

It follows that:

Lemma 2.2: The expected number of integers in any interval $(a,b)$ that are not divisible by any of a given set of primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$ is:

$(b-a)\prod_{q \in Q}(1-1/q)$.

2A The function $\pi_{_{H}}(n)$

In particular, the expected number $\pi_{_{H}}(n)$ of integers $\leq n$ that are not divisible by any of the first $k$ primes $p_{_{1}}, p_{_{2}}, \ldots, p_{_{k}}$ is:

Corollary 2.3: $\pi_{_{H}}(n) = n.\prod_{i = 1}^{k}(1 - \frac{1}{p_{_{i}}})$

It follows that:

Corollary 2.4: The expected number of primes $\leq p_{_{\pi(\sqrt n)+1}}^{2}$ is:

$\pi_{_{H}}(p_{_{\pi(\sqrt n)+1}}^{2}) = p_{_{\pi(\sqrt n) +1}}^{2}\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}})$

with cumulative standard deviation (in the direction of the defined probability distribution, which is orthogonal to $n \rightarrow \infty$):

$p_{_{\pi(\sqrt n)+1}}\sqrt{\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}})(1-\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}}))}$.

We conclude that $\pi_{_{H}}(n)$ is a non-heuristic approximation of the number of primes $\leq n$:

Lemma 2.5: $\pi(n) \approx \pi_{_{H}}(n) = n.\prod_{i = 1}^{\pi(\sqrt{n})}(1 - \frac{1}{p_{_{i}}}) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$.

2B The function $\pi_{_{L}}(n)$

We note similarly that:

Corollary 2.6: The expected number of primes in the interval ($p_{_{\pi(\sqrt n)}}^{2},\ p_{_{\pi(\sqrt n)+1}}^{2}$) is:

$(p_{_{\pi(\sqrt n)+1}}^{2} - p_{_{\pi(\sqrt n)}}^{2})\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}})$.

It further follows from Lemma 2.2 and Corollary 2.6 that:

Corollary 2.7: The number $\pi(p_{_{\pi(\sqrt n)+1}}^{2})$ of primes less than $p_{_{\pi(\sqrt n)+1}}^{2}$ is cumulatively approximated by:

$\pi_{_{L}}(p_{_{\pi(\sqrt n)+1}}^{2}) = \sum_{j=1}^{\pi(\sqrt n)}\{(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})\}$.

We conclude that $\pi_{_{L}}(n)$ is also a cumulative non-heuristic approximation of the number of primes $\leq n$:

Lemma 2.8: $\pi(n) \approx \pi_{_{L}}(n) = \sum_{j = 1}^{n}\prod_{i = 1}^{\pi(\sqrt{j})}(1 - \frac{1}{p_{_{i}}})$.

2C An apparent paradox

We note that since the above non-heuristic estimates appeal to an asymptotic density, and the density of primes tends to $0$ as $n \rightarrow \infty$, Figs.1-4 suggest that:

(i) $\pi_{_{L}}(n)$ overestimates $\pi(n)$; whilst

(ii) $\pi_{_{H}}(n)$ underestimates $\pi(n)$.

For instance, Fig.3 suggests that, prima facie, $\pi(p_{_{n}}^{2}) > \pi_{_{H}}(p_{_{n}}^{2})$ for all $n > 1$, since:

$\bullet$ The number of primes in any interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ for any given $j < n$ is constant as $n \rightarrow \infty$; but

$\bullet$ The contribution of the expected number of primes in the interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ to the total expected number (see Corollary 2.4), $\pi_{_{H}}(p_{_{n+1}}^{2}) = p_{_{n+1}}^{2}.\prod_{i = 1}^{n+1}(1 - \frac{1}{p_{_{i}}})$, of primes in the interval $(1,\ p_{_{n+1}}^{2})$ decreases monotonically.

However, an apparent paradox then surfaces when we express $\pi(n)$ and the function $\pi_{_{H}}(n)$ in terms of the number of primes determined by each function respectively in each interval $(p_{_{n}}^{2},\ p_{_{n+1}}^{2})$ as follows:

$\pi(p_{_{n+1}}^{2}) = \sum_{j=1}^{n}(\pi(p_{_{j+1}}^{2}) - \pi(p_{_{j}}^{2})) + \pi(p_{_{1}}^{2})$

$\pi_{_{H}}(p_{_{n+1}}^{2}) = p_{_{n+1}}^{2}.\prod_{i = 1}^{\pi(\sqrt{p_{_{n+1}}^{2}})}(1 - \frac{1}{p_{_{i}}})$

$= (\sum_{j=1}^{n}(p_{_{j+1}}^{2} - p_{_{j}}^{2}) + p_{_{1}}^{2}).\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})$

$= \sum_{j=1}^{n}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})$

Since $\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) \rightarrow 0$ as $n \rightarrow \infty$, we can, for any given $k$, define $n_{_{k}} > k$ such that:

$\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2}) = \sum_{j=1}^{k}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}))\ +$

$\sum_{j=k+1}^{n_{_{k}}}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})$

$< \sum_{j=1}^{k}(\pi(p_{_{j+1}}^{2}) - \pi(p_{_{j}}^{2}))\ +$

$\sum_{j=k+1}^{n_{_{k}}}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})$

We conclude that:

Theorem For any given $k$, we can find $n_{_{k}} > k$ such that the estimated number $\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2})$ of primes $\leq p_{_{n_{_{k}}+1}}^{2}$ is always less than the actual number of primes less than $p_{_{k+1}}^{2}$ plus the number of primes between $p_{_{k+1}}^{2}$ and $p_{_{n_{_{k}}+1}}^{2}$ as estimated by $\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2})$.

Since $\pi(n) \sim \frac{n}{log_{e}(n)}$ (by the Prime Number Theorem), whilst $\pi_{_{H}}(n) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$ (where $2.e^{-\gamma} \approx 1.12292 \ldots$), and $\pi_{_{L}}(n) > \pi_{_{H}}(n)$ for all $n \geq 9$ (see Figs.3 and 4), we note the apparent paradox:

Corollary The Prime Number Theorem implies that:

$\lim_{n \rightarrow \infty} \frac{\pi_{_{H}}(p_{_{n+1}}^{2})}{\pi(p_{_{n+1}}^{2})} = 2.e^{^{-\gamma}} > 1.$

3 Conventional estimates of $\pi(x)$ for finite $x > 2$ are heuristic

We note that Guy Robin (Robin, G. (1983). “Sur l’ordre maximum de la fonction somme des diviseurs”. Séminaire Delange-Pisot-Poitou, Théorie des nombres (1981-1982). Progress in Mathematics. 38: 233-244.) proved (implicit assumptions?) that the following changes sign infinitely often:

$(log_{e}\ n).\prod_{p \leq n}(1 - \frac{1}{p}) - e^{^{-\gamma}}$

Robin’s result is analogous to Littlewood’s curious theorem (implicit assumptions?) that the difference $\pi(x) - Li(x)$ changes sign infinitely often. No analogue of the Skewes number (an upper bound on the first natural number $x$ for which $\pi(x) > Li(x)$) is, however, known for Robin’s result.

Littlewood’s theorem is `curious’ since:

(a) There is no explicitly defined arithmetical formula that, for any $x > 2$, will yield $\pi(x)$. Hence, Littlewood’s proof deduces the behaviour of $\pi(x) - Li(x)$ for finite values of $x > 2$ by implicitly appealing to the relation of $\pi(x)$ to $\zeta(s)$, defined as $\sum\frac{1}{n^{s}}$ over all integers $n \geq 1$, through the identity of the infinite summation with the Euler product $\prod(1 - \frac{1}{p^{s}})^{^{-1}}$ over all primes, which is valid only for Real$(s) > 1$;

(b) Littlewood’s proof deduces the behaviour of $\pi(x) - Li(x)$ for finite values of $x > 0$ by (uncritically, as I argue here?) appealing to the analytically continued behaviour of $\zeta(s)$ in areas where $\pi(x)$ is not defined!

Moreover, we note that—unlike $\pi_{_{H}}(n)$ and $\pi_{_{L}}(n)$—all conventional estimates of $\pi(x)$ for finite values of $x > 0$ can be treated as heuristic, since they appeal only to the limiting behaviour (Theorem 420, p.345, Hardy and Wright, Introduction to the Theory of Numbers, 4th ed, 1960, Oxford University Press) of a formally undefined arithmetical function, $\pi(n)$, to the limiting behaviours of formally defined arithmetical functions $\phi(n)$ and $\psi(n)$:

$\pi(x) \sim \frac{\phi(x)}{log\ x} \sim \frac{\psi(x)}{log\ x}$

where:

$\phi(x) = \sum_{_{p \leq x}}log\ p = log \prod_{_{p \leq x}}p$

$\psi(x) = \sum_{_{p^{m} \leq x}}log\ p$