It's often said that \(1+2+3+4+\cdots = -\frac{1}{12}\), but taken at face value this is obviously rubbish. Yet it gives the right answer in some famous physics calculations. Why is that? We're going to answer that question in the context of the Casimir effect (in one dimension).
Firstly, we'll briefly introduce the physics of the calculation, then we'll explain a precise mathematical calculation that gives a useful answer, and see how it is related to
\[ 1+2+3+4+\cdots = -\frac{1}{12}\]
We'll see this famous formula is, in fact, not all wrong!
In physics jargon, we want to compute the zero-point energy of a quantum field with zero boundary conditions in one dimension. That's a bit of a mouthful though.
Suppose you have a smooth (sine) wave that goes to zero at the edges of a box of size \(L\). For example, you might be looking at possible light waves being reflected from the two walls. The possible waves are just \(\sin\left(\frac{n \pi x}{L}\right)\) for \(n=1,2,3,\ldots\) and they have wavelengths
\[ \lambda = 2L, \frac{2L}{2}, \frac{2L}{3}, \frac{2L}{4}, \ldots\]
If the waves travel at the speed of light \(c \approx 3\times 10^8 \mathrm{m}\mathrm{s}^{-1}\), then they have frequencies
\[ f = \frac{c}{\lambda} = \frac{c}{2L}, 2\times\frac{c}{2L}, 3\times\frac{c}{2L}, 4\times\frac{c}{2L}, \ldots\]
Now in quantum mechanics, the energy of a photon of frequency \(f\) is
\[ E = h f = \frac{hc}{2L}, 2\times\frac{hc}{2L}, 3\times\frac{hc}{2L}, 4\times\frac{hc}{2L}, \ldots\]
where \(h \approx 6.6\times 10^{-34} \mathrm{J}\mathrm{s}\) is Planck's constant.
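(If you like seeing numbers, here's a quick Python sketch of the first few mode energies, taking \(L = 1\,\mu\textrm{m}\) purely as an example size.)

```python
# First few standing-wave modes in a box, with L = 1 micrometre chosen
# purely as an illustrative size.
h = 6.626e-34  # Planck's constant, J s
c = 2.998e8    # speed of light, m / s
L = 1e-6       # box size, m

for n in range(1, 5):
    wavelength = 2 * L / n        # lambda_n = 2L / n
    frequency = c / wavelength    # f_n = n c / (2L)
    energy = h * frequency        # E_n = h f_n
    print(f"n={n}: lambda = {wavelength:.2e} m, E = {energy:.2e} J")
```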
Also, in quantum mechanics, the number of particles fluctuates: even empty space carries a zero-point energy of \(\frac{1}{2}hf\) for each possible frequency, as if there were half a particle of each frequency in the box. So the total energy is apparently
\[ E = \left(1 + 2 + 3 + 4 + \ldots \right) \times \frac{hc}{4L}\]
... and there's the problem!
The key idea for dealing with this is that we probably only got the right answer for the low-energy modes. We don't even know the laws of very high-energy physics! More realistically, the walls of any box won't be perfectly reflective to even modestly high-energy waves.
One way to construct a sum which looks like \(1+2+3+4+\cdots\) for the first terms, but which makes the later terms smaller in a simple, consistent way, is to multiply the first term by \(x\), which is just a tiny bit smaller than 1, then the next term by \(x^2\), which is a tiny bit smaller again, then the next by \(x^3\), and so on.
Let's give that a name and write
\[ S(x) = x + 2x^2 + 3x^3 + 4x^4 + \cdots\]
It turns out you can work out exactly what this sum is for any \(x\)! The trick is to multiply by \((1-x)\)...
\[ \begin{align} (1-x)S &= x + 2x^2 + 3x^3 + 4x^4 + \cdots \\ & \qquad - \,\,x^2 - 2x^3 - 3x^4 - \cdots \\ & = x + x^2 + x^3 + x^4 + \cdots \end{align}\]
... and again...
\[ \begin{align} (1-x)(1-x)S &= x + x^2 + x^3 + x^4 + \cdots \\ & \qquad - x^2 - x^3 - x^4 - \cdots \\ & = x \end{align}\]
... which means that
\[ S = \frac{x}{(1-x)^2}\]
This blows up when \(x=1\) as we expect.
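You can check all of this numerically. Here's a minimal Python sketch confirming that the partial sums really do settle down to \(\frac{x}{(1-x)^2}\) when \(x < 1\):

```python
# Check that x + 2x^2 + 3x^3 + ... matches x / (1 - x)^2 for |x| < 1.
def S_series(x, terms=100_000):
    return sum(n * x**n for n in range(1, terms + 1))

def S_closed(x):
    return x / (1 - x)**2

for x in [0.5, 0.9, 0.99]:
    print(f"x = {x}: series = {S_series(x):.6f}, closed form = {S_closed(x):.6f}")
```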
Maybe the most striking thing about this is that if you plug in \(x=-1\) you apparently get
\[ -1 + 2 - 3 + 4 - 5 + 6 - \cdots \overset{?}{=} \frac{-1}{(1-(-1))^2} = -\frac{1}{4} \]
But really the unsmoothed sum on the left takes the values \(-1, 1, -2, 2, -3, 3, \ldots\) as you add successive terms, so it doesn't really approach \(-\frac{1}{4}\). But if you smooth it out using the multiplication by powers of \(x\) trick, then the total sum is always finite, and it gets closer to \(-\frac{1}{4}\) as \(x\) approaches \(1\).
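That's also easy to see numerically; the smoothed sum stays finite for every \(x < 1\) and creeps towards \(-\frac{1}{4}\):

```python
# The smoothed alternating sum -x + 2x^2 - 3x^3 + ... is finite for
# every x < 1 and approaches -1/4 as x -> 1.
def S_series(x, terms=100_000):
    return sum(n * x**n for n in range(1, terms + 1))

for x in [0.9, 0.99, 0.999]:
    print(f"x = {x}: smoothed sum = {S_series(-x):.6f}")
```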
This suggests one way you might try to get a finite answer for \(1+2+3+4+\cdots\):
\[ \begin{align} a &= \ \ \ 1 + 2 + 3 + 4 + 5 + 6 + \cdots \\ b &= -1 + 2 - 3 + 4 - 5 + 6 - \cdots \\ a+b &= \ \ \qquad 4 \ \ \ +\ \ \ \ 8 \ \ \ + \ \ \ 12 \ \ \ + \cdots \\ &= 4\left(1+2+3+\cdots\right) \\ &= 4a \end{align}\]
which would imply that \(b = 3a\). Then since you decided \(b = -\frac{1}{4}\), you get \(a = -\frac{1}{12}\), which is the slightly dodgy result we started out with.
But obviously this isn't very precise, right? These sums all blow up, so the sorts of manipulations we just used aren't really allowed.
Yet... there's something about it that seems appealing. If we use our \(x\)s to make everything nice and finite, then what happens? We get
\[ \begin{align} S(x) &= \ \ \ x + 2x^2 + 3x^3 + 4x^4 + 5x^5 + 6x^6 + \cdots \\ S(-x) &= -x + 2x^2 - 3x^3 + 4x^4 - 5x^5 + 6x^6 - \cdots \\ S(x)+S(-x) &= \qquad \ \ 4x^2 \ \quad + \ \quad \ 8x^4 \quad + \ \quad 12x^6 \ \quad + \ \quad \cdots \\ &= 4\left(x^2+2(x^2)^2+3(x^2)^3+\cdots\right) \\ &= 4S(x^2) \end{align}\]
and that's now a proper, legal mathematical result! (If you wanted, you could actually prove this using the formula \(S(x) = \frac{x}{(1-x)^2}\) that we came up with earlier.)
Remembering that \(S(-x) \to -\frac{1}{4}\) as we take \(x \to 1\), we see that
\[ 4S(x^2) - S(x) \to -\frac{1}{4}\]
This shows where the cheat was in our earlier attempt to get \(-\frac{1}{12}\). We were happy to set \(x=1\) and \(x^2=1\) in this formula and get \(3S(1) = -\frac{1}{4}\) on the left-hand side. But actually, the way that \(S(x^2)\) and \(S(x)\) behave as \(x \to 1\) is a bit different!
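You can watch both things happen at once using the closed form (a quick Python sketch): \(S(x^2)\) and \(S(x)\) individually explode, but the combination settles down to \(-\frac{1}{4}\).

```python
# 4 S(x^2) - S(x) tends to -1/4 as x -> 1, even though S(x^2) and
# S(x) separately blow up.
def S(x):
    return x / (1 - x)**2

for x in [0.9, 0.99, 0.999, 0.9999]:
    print(f"x = {x}: S(x) = {S(x):.1f}, 4*S(x^2) - S(x) = {4 * S(x**2) - S(x):.6f}")
```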
But we could write
\[ S(x) \approx -\frac{1}{12} + f(x)\]
where the extra piece \(f(x)\) has to obey a nice, short formula:
\[ 4f(x^2) = f(x)\]
So there's our \(-\frac{1}{12}\), much more honestly! So where's the infinity hiding? Well, it's hiding in the term \(f(x)\), which is finite when \(x < 1\) but blows up as \(x\) approaches \(1\).
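Here's a small numerical illustration: defining the extra piece as \(f(x) = S(x) + \frac{1}{12}\) (which is what the approximation above amounts to), \(f(x)\) itself explodes, while \(4f(x^2) - f(x)\) dies away.

```python
# f(x) = S(x) + 1/12 blows up as x -> 1, but 4 f(x^2) - f(x) -> 0,
# which is the scaling rule above (it holds exactly only in the limit).
def S(x):
    return x / (1 - x)**2

def f(x):
    return S(x) + 1 / 12

for x in [0.9, 0.99, 0.999]:
    print(f"x = {x}: f(x) = {f(x):.3f}, 4*f(x^2) - f(x) = {4 * f(x**2) - f(x):.6f}")
```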
There's a formula for \(f(x)\) given lower down.
From our formula for \(S(x)\), the smoothed version of the total energy is \(E(L) \approx \frac{hc}{4L} S(x)\), so we now know
\[ E(L) = - \frac{1}{12} \times \frac{hc}{4L} + E_{\textrm{extra}}(L)\]
where \(E_{\textrm{extra}}(L)\) comes from \(f(x)\). The \(-\frac{1}{12}\)... comes from the \(-\frac{1}{12}\).
There's a nice way to deal with this which is inspired by the physics setup. Notice that because \(S(x^2)\) gets contributions only from even powers of \(x\), which are those that come from 'even' frequencies \(f = 2n \times \frac{c}{2L} = n \times \frac{c}{2(L/2)}\), it is naturally related to the energies at the length \(L/2\). In fact,
\[ E\left(\frac{L}{2}\right) \approx \frac{hc}{4L} \times \left(2x^2 + 4x^4 + 6x^6 + \cdots \right) = 2 \frac{hc}{4L} S(x^2)\]
This means that the formula \(4f(x^2) = f(x)\) translates into
\[ 2 E_{\textrm{extra}}\left(\frac{L}{2}\right) = E_{\textrm{extra}}(L)\]
The only possible conclusion is that
\[ E_{\textrm{extra}}(L) = AL\]
for some number \(A\).
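To see why, divide both sides by \(L\): the relation says the extra energy per unit length is the same at size \(L/2\) as at size \(L\),
\[ \frac{E_{\textrm{extra}}(L/2)}{L/2} = \frac{E_{\textrm{extra}}(L)}{L}\]
and repeating the halving as often as we like shows that (assuming \(E_{\textrm{extra}}\) behaves smoothly) the energy per unit length is the same constant, \(A\), at every scale.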
But there's a good physical reason to ignore \(E_{\textrm{extra}}(L)\). We already mentioned that very high energy photons would just punch into the conductor, rather than being reflected. Their contribution to the energy per unit length is more or less the same whether there's a conductor there or not - that's why they contribute the energy \(E_{\textrm{extra}}(L)\) proportional to \(L\).
Because this contribution to the total energy is always present everywhere, the separation \(L\) between the conductors won't actually change its size. So it's ultimately an overall constant in the total energy!
You could check this by including the spaces on the far side of the conductors. Then you would get a contribution \(A \times (\text{total size of space} - L)\) from outside, and its \(-L\) term cancels the \(+L\) term above. So the total energy doesn't change when \(L\) changes!
This means that it doesn't give rise to any forces, and isn't important to us. (Where did the infinity go? \(A\) actually goes to infinity - but it doesn't matter because it cancels out!)
Therefore, we focus only on the energy without \(A\), obtaining the 1D Casimir energy
\[ E_{\textrm{Casimir}}(L) = -\frac{1}{12} \times \frac{hc}{4L}\]
In 3D, you need to include a slightly more complicated sum because there are photons travelling at all angles. You also have to include the fact that there are 2 polarizations (e.g. clockwise and anti-clockwise) for a photon. The result for two big plates facing each other is similar though:
\[ E_{\textrm{3D Casimir}}(L) = -\frac{1}{120} \times \frac{hc \pi}{12L^3} \times \text{Area of plate}\]
(Along the way, this involves the similar sum \(1^3 + 2^3 + 3^3 + \cdots = \frac{1}{120}\), understood in the same smoothed sense.)
The 3D force is
\[ F_{\textrm{3D Casimir}}(L) = -\frac{1}{120} \times \frac{hc \pi}{4L^4} \times \text{Area of plate}\]
For a distance \(L = 10 \mathrm{nm}\), or about the size of 100 atoms, this comes out at about 1.3 times atmospheric pressure!
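If you want to check that number, here's the arithmetic in Python, using rounded SI values for \(h\) and \(c\):

```python
# Casimir force per unit area for plates separated by L = 10 nm,
# compared with atmospheric pressure (about 1.013e5 Pa).
import math

h = 6.626e-34  # Planck's constant, J s
c = 2.998e8    # speed of light, m / s
L = 10e-9      # plate separation, m

pressure = (1 / 120) * h * c * math.pi / (4 * L**4)  # force per area, Pa
print(f"{pressure:.3g} Pa = {pressure / 1.013e5:.2f} atmospheres")
```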
From a mathematical point of view, you might still be interested in what the contribution \(f(x)\) (which is blowing up) looks like.
The answer is most easily worked out using Taylor series from calculus; the result is that \(f(x) = \frac{1}{(\log x)^2}\), so the blow-up is given in terms of a logarithm. This means that
\[ S(x) = x + 2x^2 + 3x^3 + 4x^4 + \cdots = \frac{x}{(1-x)^2} = \frac{1}{(\log x)^2} - \frac{1}{12} + \cdots\]
where the terms in the dots get smaller and smaller as \(x\) gets closer and closer to \(1\). This logarithmic term is the one which we are dropping.
The reason writing it in terms of a logarithm is a good idea is that when you change the frequencies by rescaling the interval, you have to change the powers of \(x\). For example, earlier we saw that we get \(S(x^2)\) when we halve the interval. The logarithm changes in a very simple way when we do this, because \(\log (x^2) = 2 \log(x)\). So
\[ S(x^2) = \frac{1}{4} \times \frac{1}{(\log x)^2} - \frac{1}{12} + \cdots\]
This is what gives rise to a simple formula relating \(S(x^2)\) and \(S(x)\):
\[ 4S(x^2) - S(x) = \left(4\times\frac{1}{4} - 1\right) \times \frac{1}{(\log x)^2} - \left(4 - 1\right)\frac{1}{12} + \cdots \approx - \frac{1}{4}\]
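As a final sanity check, here's \(S(x) - \frac{1}{(\log x)^2}\) creeping towards \(-\frac{1}{12} \approx -0.0833\) in Python:

```python
# S(x) minus its divergent logarithmic piece approaches -1/12 = -0.08333...
import math

def S(x):
    return x / (1 - x)**2

for x in [0.9, 0.99, 0.999]:
    print(f"x = {x}: S(x) - 1/log(x)^2 = {S(x) - 1 / math.log(x)**2:.6f}")
```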