$Updated: 05/11/2018$

1 The Riemann Hypothesis (see also this paper) reflects the fact that all currently known approximations of the number $\pi(n)$ of primes $\leq n$ are heuristic

All known approximations of the number $\pi(n)$ of primes $\leq n$ are currently derived from real-valued functions that are asymptotic to $\pi(x)$, such as $\frac{x}{log_{e}x}$, $Li(x)$ and Riemann’s function $R(x) = \sum_{n=1}^{\infty}\frac{\mu(n)}{(n)}li(x^{1/n})$, where the degree of approximation for finite values of $n$ is determined only heuristically by conjecturing upon an error term in the asymptotic relation that can be seen’, but not yet proven, to yield good’ approximations of $\pi(n)$ (as described, for instance, by Barry Mazur and William Stein on p.36 of their absorbingly written, but without sacrificing rigour, entry-level book Prime Numbers and the Riemann Hypothesis).

The above graph compares the actual number $\pi(x)$ (red) of primes $\leq x$ with the distribution of primes as estimated variously by the functions $Li(x)$ (blue), $R(x)$ (black), and $\frac{x}{log_{e}x}$ (green), where $R(x)$ is Riemann’s function $\sum_{n=1}^{\infty}\frac{\mu(n)}{(n)}li(x^{1/n})$.

1A Two non-heuristic approximations of the number $\pi(n)$ of primes $\leq n$

The following introduces, and compares, two non-heuristic prime counting functions (investigated more fully here) that, prima facie, yield non-heuristic approximations of $\pi(n)$:

(i) $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green); and

(ii) $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red);

Based on manual and spreadsheet calculations, Figs.1 and 2 compare the values of the two non-heuristically estimated prime counting functions with the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 1500$ and $4 \leq n \leq 3000$.

Fig.1: The above graph compares the non-heuristically estimated values of $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green) and $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red) vs the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 1500$.

Fig.2: The above graph compares the non-heuristically estimated values of $\pi_{_{H}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}}) = n.\prod_{i=1}^{\sqrt{n}}(1-\frac{1}{p_{_{i}}})$ (green) and $\pi_{_{L}}(n) = \sum_{j=1}^{n}\prod_{i=1}^{\sqrt{j}}(1-\frac{1}{p_{_{i}}})$ (red) vs the actual values of $\pi(n)$ (blue) for $4 \leq n \leq 3000$. Note that the gradient of both $y = \pi_{_{H}}(x)$ and of $y = \pi_{_{L}}(x)$ in the interval $(p_{_{n}}^{2},\ p_{_{n+1}}^{2})$ is $\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) \rightarrow 0$.

1B Three intriguing queries

Since $\pi(n) \sim \frac{n}{log_{e}(n)}$ (by the Prime Number Theorem), whilst $\pi_{_{H}}(n) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$ (by Mertens’ Theorem, where $\gamma$ is the Euler-Mascheroni constant and $2.e^{-\gamma} \approx 1.12292 \ldots$), and it can be shown (see Corollary 3.2 here)  that $\pi_{_{L}}(n) \sim ae^{-\lambda}\frac{n}{log_{_{e}}n}$ for some $a \geq 2$, this raises the interesting queries (see also $\S$2 of this initial investigation):

Fig.3. The overlapping rectangles $A, B, C, D, \ldots$ represent $\pi_{_{H}}(p_{_{j+1}}^{2}) = p_{_{j+1}}^{2}.\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})$ for $j \geq 1$. Figures within each rectangle are the primes and estimated primes corresponding to the functions $\pi(n)$ and $\pi_{_{H}}(n)$, respectively, within the interval $(1,\ p_{_{j+1}}^{2})$ for $j \geq 2$.

Fig.4: Graph of $y = \prod_{i = 1}^{\pi(\sqrt{x})}(1 - \frac{1}{p_{_{i}}})$. The rectangles represent $(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})$ for $j \geq 1$. Figures within each rectangle are the primes corresponding to the functions $\pi(n)$ and $\pi_{_{L}}(n)$ within the interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ for $j \geq 2$. The area under the curve is $\pi_{_{L}}(x) = (x - p_{_{n}}^{2})\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) + \sum_{j = 1}^{n-1}(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}}) + 2$.

(a) Which is the least $n$, if any, such that $\pi_{_{H}}(n) > \pi(n)$?

(b) Which is the largest $n$, if any, such that $\pi(n) > \pi_{_{H}}(n)$?

(c) Is $\pi(n) / \frac{n}{log_{_{e}}n}$ an algorithmically verifiable, but not algorithmically computable (see Definitions 1 and 2 here), function which is discontinuous in the limit $n \rightarrow \infty$ (in the sense of Zeno’s paradox in 2-dimensions considered in Case 4 here)?

1C Three intriguing observations

Fig.5: The above graph compares the actual values of $\pi(n)$ in the range $4 \leq n \leq 100$, with $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$ in the range $4 \leq n \leq 110$, and $\frac{n}{log_{_{e}}\ n}$ in the range $4 \leq n \leq 120$.

Fig.6: The above graph compares the actual values of $\pi(n)$ in the range $4 \leq n \leq 1000$, with $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$ in the range $4 \leq n \leq 1100$, and $\frac{n}{log_{_{e}}\ n}$ in the range $4 \leq n \leq 1200$.

Fig.7: The above graph compares the actual values of $\pi(n)$ in the range $4 \leq n \leq 10000$, with $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$ in the range $4 \leq n \leq 11000$, and $\frac{n}{log_{_{e}}\ n}$ in the range $4 \leq n \leq 12000$.

Fig.8: The above graph compares the actual values of $\pi(n)$ in the range $4 \leq n \leq 100000$, with $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$ in the range $4 \leq n \leq 110000$, and $\frac{n}{log_{_{e}}\ n}$ in the range $4 \leq n \leq 120000$.

Fig.9: The above graph compares the actual values of $\pi(n)$ in the range $4 \leq n \leq 1000000$, with $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$ in the range $4 \leq n \leq 1100000$, and $\frac{n}{log_{_{e}}\ n}$ in the range $4 \leq n \leq 1200000$.

Query (a) is answered by Figs.8-9, which show that the least $n$ such that $\pi_{_{H}}(n) > \pi(n)$ occurs somewhere around $n = 100000$.

As regards Query (b) however, we conjecture that Fig.9 suggests that there is no larger $n$ such that $\pi(n) > \pi_{_{H}}(n)$.

Finally, we conjecture that—despite what is suggested by the Prime Number Theorem—Query (c) can be answered affirmatively if the three functions $\pi(n)$, $n.\prod_{_{j=i}}^{^{\pi(\sqrt{n})}}(1-\frac{1}{p_{_{j}}})$, and $\frac{n}{log_{_{e}}\ n}$ are as well-behaved as Fig.9 suggests!

2 Density of integers not divisible by primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$

The above observations reflect the circumstance that (formal proofs for the following are detailed in Chapters 34 and 35 here):

Lemma 2.1: The asymptotic density of the set of all integers that are not divisible by any of a given set of primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$ is:

$\prod_{q \in Q}(1-1/q)$.

It follows that:

Lemma 2.2: The expected number of integers in any interval $(a,b)$ that are not divisible by any of a given set of primes $Q = \{q_{_{1}}, q_{_{2}}, \ldots, q_{_{k}}\}$ is:

$(b-a)\prod_{q \in Q}(1-1/q)$.

2A The function $\pi_{_{H}}(n)$

In particular, the expected number $\pi_{_{H}}(n)$ of integers $\leq n$ that are not divisible by any of the first $k$ primes $p_{_{1}}, p_{_{2}}, \ldots, p_{_{k}}$ is:

Corollary 2.3: $\pi_{_{H}}(n) = n.\prod_{i = 1}^{k}(1 - \frac{1}{p_{_{i}}})$

It follows that:

Corollary 2.4: The expected number of primes $\leq p_{_{\pi(\sqrt n)+1}}^{2}$ is:

$\pi_{_{H}}(p_{_{\pi(\sqrt n)+1}}^{2}) = p_{_{\pi(\sqrt n) +1}}^{2}\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}})$

We conclude that $\pi_{_{H}}(n)$ is a non-heuristic approximation of the number of primes $\leq n$:

Lemma 2.5: $\pi(n) \approx \pi_{_{H}}(n) = n.\prod_{i = 1}^{\pi(\sqrt{n})}(1 - \frac{1}{p_{_{i}}}) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$.

2B The function $\pi_{_{L}}(n)$

We note similarly that:

Corollary 2.6: The expected number of primes in the interval ($p_{_{\pi(\sqrt n)}}^{2},\ p_{_{\pi(\sqrt n)+1}}^{2}$) is:

$(p_{_{\pi(\sqrt n)+1}}^{2} - p_{_{\pi(\sqrt n)}}^{2})\prod_{i = 1}^{\pi(\sqrt n)}(1 - \frac{1}{p_{_{i}}})$.

It further follows from Lemma 2.2 and Corollary 2.6 that:

Corollary 2.7: The number $\pi(p_{_{\pi(\sqrt n)+1}}^{2})$ of primes less than $p_{_{\pi(\sqrt n)+1}}^{2}$ is cumulatively approximated by:

$\pi_{_{L}}(p_{_{\pi(\sqrt n)+1}}^{2}) = \sum_{j=1}^{\pi(\sqrt n)}\{(p_{_{j+1}}^{2} - p_{_{j}}^{2})\prod_{i = 1}^{j}(1 - \frac{1}{p_{_{i}}})\}$.

We conclude that $\pi_{_{L}}(n)$ is also a cumulative non-heuristic approximation of the number of primes $\leq n$:

Lemma 2.8: $\pi(n) \approx \pi_{_{L}}(n) = \sum_{j = 1}^{n}\prod_{i = 1}^{\pi(\sqrt{j})}(1 - \frac{1}{p_{_{i}}})$.

We note that since the above non-heuristic estimates appeal to an asymptotic density, and the density of primes tends to $0$ as $n \rightarrow \infty$, Figs.1-9 suggest that:

(i) $\pi_{_{L}}(n)$ overestimates $\pi(n)$ cumulatively over each of the intervals; whilst

(ii) $\pi_{_{H}}(n)$ overestimates $\pi(n)$ over a single cumulative interval.

even though Fig.3 suggests that:

$\bullet$ The number of primes in any interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ for any given $j < n$ is constant as $n \rightarrow \infty$; but

$\bullet$ The contribution of the expected number of primes in the interval $(p_{_{j}}^{2},\ p_{_{j+1}}^{2})$ to the total expected number (see Corollary 2.4), $\pi_{_{H}}(p_{_{n+1}}^{2}) = p_{_{n+1}}^{2}.\prod_{i = 1}^{n+1}(1 - \frac{1}{p_{_{i}}})$, of primes in the interval $(1,\ p_{_{n+1}}^{2})$ decreases monotonically.

In other words, an apparent paradox surfaces (as suggested by Fig.9) when we express $\pi(n)$ and the function $\pi_{_{H}}(n)$ in terms of the number of primes determined by each function respectively in each interval $(p_{_{n}}^{2},\ p_{_{n+1}}^{2})$ as follows:

$\pi(p_{_{n+1}}^{2}) = \sum_{j=1}^{n}(\pi(p_{_{j+1}}^{2}) - \pi(p_{_{j}}^{2})) + \pi(p_{_{1}}^{2})$

$\pi_{_{H}}(p_{_{n+1}}^{2}) = p_{_{n+1}}^{2}.\prod_{i = 1}^{\pi(\sqrt{p_{_{n+1}}^{2}})}(1 - \frac{1}{p_{_{i}}})$

$= (\sum_{j=1}^{n}(p_{_{j+1}}^{2} - p_{_{j}}^{2}) + p_{_{1}}^{2}).\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})$

$= \sum_{j=1}^{n}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}})$

Since $\prod_{i = 1}^{n}(1 - \frac{1}{p_{_{i}}}) \rightarrow 0$ as $n \rightarrow \infty$, we can, for any given $k$, define $n_{_{k}} > k$ such that:

$\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2}) = \sum_{j=1}^{k}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}))\ +$

$\sum_{j=k+1}^{n_{_{k}}}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})$

$< \sum_{j=1}^{k}(\pi(p_{_{j+1}}^{2}) - \pi(p_{_{j}}^{2}))\ +$

$\sum_{j=k+1}^{n_{_{k}}}(p_{_{j+1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}}) - p_{_{j}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})) + p_{_{1}}^{2}.\prod_{i = 1}^{n_{_{k}}}(1 - \frac{1}{p_{_{i}}})$

We conclude that:

Theorem For any given $k$, we can find $n_{_{k}} > k$ such that the estimated number $\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2})$ of primes $\leq p_{_{n_{_{k}}+1}}^{2}$ is always less than the actual number of primes less than $p_{_{k+1}}^{2}$ plus the number of primes between $p_{_{k+1}}^{2}$ and $p_{_{n_{_{k}}+1}}^{2}$ as estimated by $\pi_{_{H}}(p_{_{n_{_{k}}+1}}^{2})$.

Since $\pi(n) \sim \frac{n}{log_{e}(n)}$ (by the Prime Number Theorem), whilst $\pi_{_{H}}(n) \sim 2e^{-\gamma}\frac{n}{log_{_{e}}n}$ (where $2.e^{-\gamma} \approx 1.12292 \ldots$), and $\pi_{_{L}}(n) > \pi_{_{H}}(n)$ for all $n \geq 9$ (see Figs.3 and 4), we note the apparent paradox:

Corollary The Prime Number Theorem implies that:

$\lim_{n \rightarrow \infty} \frac{\pi_{_{H}}(p_{_{n+1}}^{2})}{\pi(p_{_{n+1}}^{2})} = 2.e^{^{-\gamma}} > 1.$

3 Conventional estimates of $\pi(x)$ for finite $x > 2$ are heuristic

We note that Guy Robin (Robin, G. (1983). “Sur l’ordre maximum de la fonction somme des diviseurs”. Séminaire Delange-Pisot-Poitou, Théorie des nombres (1981-1982). Progress in Mathematics. 38: 233-244.) proved (implicit assumptions?) that the following changes sign infinitely often:

$(log_{e}\ n).\prod_{p \leq n}(1 - \frac{1}{p}) - e^{^{-\gamma}}$

Robin’s result is analogous to Littlewood’s curious theorem (implicit assumptions?) that the difference $\pi(x) - Li(x)$ changes sign infinitely often. No analogue of the Skewes number (an upper bound on the first natural number $x$ for which $\pi(x) > Li(x)$) is, however, known for Robin’s result.

Littlewood’s theorem is `curious’ since:

(a) There is no explicitly defined arithmetical formula that, for any $x > 2$, will yield $\pi(x)$. Hence, Littlewood’s proof deduces the behaviour of $\pi(x) - Li(x)$ for finite values of $x > 2$ by implicitly appealing to the relation of $\pi(x)$ to $\zeta(s)$, defined as $\sum\frac{1}{n^{s}}$ over all integers $n \geq 1$, through the identity of the infinite summation with the Euler product $\prod(1 - \frac{1}{p^{s}})^{^{-1}}$ over all primes, which is valid only for Re$(s) > 1$; as is its consequence (which involves the re-arrangement of an infinite summation that, too, is valid only for Re$(s) > 1$):

$log_{_{e}}\ \zeta(s) = s.\int_{_{2}}^{^{\infty}} \frac{\pi (x)}{x(x^{s}-1)}dx = s.\int_{_{2}}^{^{\infty}} \frac{\pi (x)/(1 - \frac{1}{x^{s}})}{x^{s+1}}dx$

(b) Littlewood’s proof deduces the behaviour of $\pi(x) - Li(x)$ for finite values of $x > 0$ by (uncritically, as I argue here?) appealing to the analytically continued behaviour of $\zeta(s)$ in areas where $\pi(x)$ is not defined!

Moreover, we note that—unlike $\pi_{_{H}}(n)$ and $\pi_{_{L}}(n)$conventional estimates of $\pi(x)$ for finite values of $x > 0$ can be treated as heuristic, since they appeal only to the limiting behaviour (Theorem 420, p.345, Hardy and Wright, Introduction to the Theory of Numbers, 4th ed, 1960, Oxford University Press) of a formally (i.e., explicitly) undefined arithmetical function, $\pi(n)$, to the limiting behaviours of formally defined arithmetical functions $\phi(n)$ and $\psi(n)$:

$\pi(x) \sim \frac{\phi(x)}{log_{_{e}}\ x} \sim \frac{\psi(x)}{log_{_{e}}\ x}$

where:

$\phi(x) = \sum_{_{p \leq x}}log_{_{e}}\ p = log_{_{e}} \prod_{_{p \leq x}}p$

$\psi(x) = \sum_{_{p^{m} \leq x}}log_{_{e}}\ p$

and, as detailed here, the latter is curiously stipulated as valid only for $x > 1$, but definable in terms of a summation over the zeros of the zeta function in the critical strip $0 < R(\rho) < 1$:

$\psi(x)$ is given by the so-called explicit formula

$\psi(x) = x - \sum_{_{\rho}} \frac{x^{^{\rho}}}{\rho} - log_{_{e}}\ (2.\pi) - \frac{1}{2}log_{_{e}}\ (1 - x^{^{2}})$

for $x > 1$ and $x$ not a prime or prime power, and the sum is over all nontrivial zeros $\rho$ of the Riemann zeta function $\zeta(s)$, i.e., those in the critical strip so $0 < R(\rho) < 1$, and interprets as

$lim_{_{t \rightarrow \infty}} \sum_{_{|I(\rho)| < t}} \frac{x^{^{\rho}}}{\rho}$.”