This is an exposition on series, loosely following § 1.4 of Applied Scientific Computing With Python by Peter R. Turner, Thomas Arildsen, and Kathleen Kavanagh.
The goal here is to stress that we can do rigorous computations in Python even with infinite quantities. However, to do so, we need to combine mathematical arguments, with careful coding.
Mathematics reviewed:
Programming topics used:
Recall that a series is the sum of infitely many terms in a sequence, $$X = \sum_{k=0}^{+\infty} a_k.$$ Formally, the value a series takes is a limit of the partial sums. By definition the $N$-th partial sum is the sum $S_N$ of the first $N$ terms in the series. Assuming the series starts at zero (which will not always be true), we have $$S_N = \sum_{k=0}^{N-1} a_k.$$ Thus $X = \lim_{N \to +\infty} S_N.$
The $N$-th remainder is $$R_N = X-S_N = \sum_{k=N}^{\infty} a_k.$$ We call $|R_N|$ the truncation error. A series converges if and only if $\lim_{N \to \infty} |R_N|=0$.
Usually to evaluate a series, we are happy with an estimate for an answer using a partial sum, but we'd like to know how accurate our estimate is. There are two sources of errors:
First consider only truncation error. If you know that the $N$-th truncation error satisfies $|R_N| \leq \epsilon$, then you can conclude that the value $X$ taken by a convergent series satisfies $$S_N-\epsilon \leq X=S_N+R_N \leq S_N +\epsilon.$$
Saying $|R_N| \leq \epsilon$ is equivalent to $-\epsilon \leq R_N \leq \epsilon$. As we will see below, sometimes you are able to confine the $N$-th remainder to a interval $a \leq R_N \leq b$. In this case, we can conclude that $$S_N+a \leq X = S_N+R_N \leq S_N + b.$$
Usually, we can get away with ignoring rounding errors. (But, for example, subtracting numbers that agree to $n$-significant digits results in a loss of precision of approximately $n$ digits.) If you want to be careful about rounding error, you can use interval arithmetic. Interval arithmetic can be used to confine a partial sum $S_N$ to an interval, say $c \leq S_n \leq d$. Then depending on which case we are in above, we have $$c-\epsilon \leq X \leq d+\epsilon \quad \text{or} \quad a+c \leq X \leq b+d$$ (with interval arithmetic used to round these bounds outward).
Define the sequence $a_k=\frac{1}{k(k+1)}$ for $k \geq 1$. Then the sequence has the form $$a_1=\frac{1}{2},~ a_2 = \frac{1}{6},~ a_3 = \frac{1}{12},~ a_4 = \frac{1}{20},~ \ldots.$$ The following function returns the $k$-th term as a float.
def a(k):
return 1/(k*(k+1))
We now print the first $10$ terms.
for k in range(1, 11):
print("a_{}={:.5f}".format(k, a(k)), end=", ")
The following function computes the partial sum $S_N = \sum_{k=1}^{N} a_k.$
def partial_sum(N):
total = 0.0
for k in range(1, N+1): # iterates through {1, 2, ..., N}
total += a(k) # Same as total = total + a(k)
return total
Now we will print some values:
for N in [1, 2, 3, 4, 5, 10, 100, 1000, 100000]:
ps = partial_sum(N)
print("S_{} = {:.5f}".format(N, ps))
We might guess that this is getting closer and closer to $1$. In fact, it is true that $1 = \sum_{k=0}^\infty a_k$.
Furthermore, you might notice a pattern in the sums computed above. It looks like $$S_1 = \frac{1}{2}, \quad S_2 = \frac{2}{3}, \quad S_3 = \frac{3}{4}, \quad S_4=\frac{4}{5}, \ldots.$$ So, we conjecture that generally $S_N = \frac{N}{N+1}$. This statement can be proved by induction.
Proposition 1. We have $S_N=\frac{N}{N+1}$ for all $N \geq 1$.
Proof: It is clearly true that $S_1 = a_1 = \frac{1}{2}$. Therefore the statement holds for $k=1$.
Now suppose the statement holds for $N=j$. We will prove that it holds when $N=j+1$. Since the statement holds for $N=j$, we know that $S_j = \frac{j}{j+1}$. Observe that by definition $$S_{j+1} = S_j + a_{j+1}.$$ Since $a_{j+1}=\frac{1}{(j+1)(j+2)}$ we have $$S_{j+1} = \frac{j}{j+1} + \frac{1}{(j+1)(j+2)} = \frac{j(j+2)}{(j+1)(j+2)}+\frac{1}{(j+1)(j+2)},$$ where in the second step we have produced a common denominator. Expanding and simplifying, we see $$S_{j+1} = \frac{j^2+2j+1}{(j+1)(j+2)} = \frac{(j+1)^2}{(j+1)(j+2)}=\frac{j+1}{j+2}.$$ This proves that $S_{j+1}=\frac{j+1}{j+2}$, completing the inductive step. By the principal of mathematical induction, $S_N=\frac{N}{N+1}$ for all $N \geq 1$, as desired. QED
From the proposition above, we can rigorously prove that the series converges to $1$.
Theorem 2. We have $1=\sum_{k=0}^\infty \frac{1}{k(k+1)}.$
Proof: By definition, the series is the limit of the partial sums. From Proposition 1, we know that the partial sums have the form $$S_N=\frac{N}{N+1} = 1-\frac{1}{N+1}.$$ Observe that $\lim_{N \to \infty} \frac{1}{N+1}=0$ since the denominator goes to infinity. Thus, $$\lim_{N \to \infty} S_N = \lim_{N \to \infty} 1-\frac{1}{N+1} = 1 - 0 = 1.$$ QED
Also as a consequence we have that the $N$th remainder satisfies $$R_N=\sum_{j=N+1}^\infty a_n = 1-S_N = \frac{1}{N+1}.$$
We will consider the following series $$\sum_{k=1}^\infty \frac{1}{k^2}.$$
The problem of computing the limit of this series is called the Basel problem. Wikipedia has a good article on the subject.
The following function uses Python floats to compute the partial sums $S'_N$ consisting of the sum of the first $N$ terms. Concretely, $$S'_N = \sum_{k=1}^{N} \frac{1}{k^2}.$$ (We will use primes (${}'$) to denote partial sums and remainders for this new series.)
def naive_basel_partial_sum(N):
total = 0.0
for k in range(1, N+1): # iterate through [1, 2, ..., N].
total += 1/k**2
return total
for N in [1, 2, 3, 4, 5, 10, 100, 1000, 100000]:
ps = naive_basel_partial_sum(N)
print("S_{} = {:.5f}".format(N, ps))
The first few digits of the correct answer are $1.64493$ so we are getting very close.
The $N$-th remainder is the limit of the series minus $S'_N$, or equivalently $$R'_N = \sum_{k=N+1}^{\infty} \frac{1}{k^2}.$$ We would like to control $R'_N$ by finding lower and upper bounds. This will allow us to control how well a partial sum approximates the limit.
Upper bound. Let $a_k=\frac{1}{(k+1)(k+2)}$, which is the terms in the series studied in the previous example. Observe that if $k \geq 2$, we have $$\frac{1}{k^2} < \frac{1}{k (k-1)}=a_{k-1}.$$ (This holds because $k^2>k (k-1)$.) Since this holds for any $k \geq 2$, we have that for $N \geq 1$, $$R'_N = \sum_{k=N+1}^\infty \frac{1}{k^2} < \sum_{k=N}^\infty a_k =R_{N-1} = \frac{1}{N}.$$ Here, the remainder $R_{N-1}$ comes from the previous example, and from our work there we get that $R_{N-1}=\frac{1}{N}$.
Lower bound. We also need a bound in the other direction. An obvious lower bound is $0<R'_N$, since all the terms in our sum are positive, but we can do better.
Observe that $$a_{k}=\frac{1}{k(k+1)} < \frac{1}{k^2}$$ for all $k \geq 1$. It follows that for all $N \geq 0$, $$R_{N}=\sum_{k=N+1}^\infty a_k < \sum_{k=N+1}^\infty \frac{1}{k^2}=R'_N.$$ Then we know $R_N=\frac{1}{N+1}$ so we get that $R'_N > \frac{1}{N+1}$.
Combined bounds. We have shown that the remainder satisfies $$\frac{1}{N+1} < R'_N = \sum_{k=N+1}^{\infty} \frac{1}{k^2} < \frac{1}{N}.$$
Remark. It is a common trick to control truncation error by comparing a series to a series we understand well.
Suppose we want to compute our series to within an interval of length less than some $\epsilon>0$. From the above, we know that $$\frac{1}{N+1} < R'_N < \frac{1}{N}.$$ The actual value of the series is $S'_N + R'_N$ so, we get that $$S'_N + \frac{1}{N+1} < S'_N + R'_N < S'_N + \frac{1}{N}.$$ This confines the value taken by the series to an interval of length $$\frac{1}{N} - \frac{1}{N+1} = \frac{1}{N(N+1)}.$$ So, we see that given $\epsilon$, we should choose $N$ such that $$\frac{1}{N(N+1)} < \epsilon.$$
It is somewhat painful to solve this expression for $N$, but observe that
$\frac{1}{N(N+1)}<\frac{1}{N^2}.$
So it suffices to arrange that
$$\frac{1}{N^2} \leq \epsilon.$$
Solving for $N$ we see that we should have
$$N \geq \frac{1}{\sqrt{\epsilon}}.$$
So, the smallest integer we can take is
$$N=\left\lceil \frac{1}{\sqrt{\epsilon}} \right\rceil,$$
where $\lceil \cdot \rceil$ denotes the ceiling function. The ceiling of an real number is the smallest integer greater than or equal to that number. This is implemented in Python as ceil
in the math
package.
Revisiting the discussion above, with the choice of $N==\left\lceil \frac{1}{\sqrt{\epsilon}} \right\rceil$, we know that the value taken by the series lies in an interval of length less than $\epsilon$ with endpoints $S'_N+\frac{1}{N+1}$ and $S'_N+\frac{1}{N}$. A sensible thing to use to approximate the limit is the average $$S'_N + \frac{1}{2}\left(\frac{1}{N+1}+\frac{1}{N}\right)= S'_N + \frac{2N+1}{N(N+1)},$$ which we know is within $\frac{\epsilon}{2}$ of the actual limit.
from math import ceil, sqrt
def basel_sum(epsilon):
"""
Return the basel sum to within epsilon/2.
"""
N = ceil(1/sqrt(epsilon))
total = 0.0
for k in range(1, N+1): # iterate through [1, 2, ..., N].
total += 1/k**2
# Now total stores the partial sum
return total + (2*N+1)/(2*N*(N+1))
To get the value of the series correct to the thousandths place, we can call:
basel_sum(1/1000)
To get the value of the series correct $7$th significant digit (the millionths place), we call:
basel_sum(1/10**6)
In the previous part, we controlled truncation error but not round off error. We can combine the above with interval arithmetic to get a rigorous interval containing our answer.
from mpmath import iv
The comment in the first few lines of the function explains what the function does.
def rigorous_basel_sum(prec, epsilon):
"""
Return the basel sum computed using interval arithmetic using
interval arithmetic with prec bits of precision and a truncation
error of less than epsilon>0.
"""
iv.prec = prec # set precision
# The following should be safe even without interval
# arithmetic.
N = ceil(1/sqrt(epsilon))
remainder_upper = 1 / iv.mpf(N)
remainder_lower = 1 / iv.mpf(N+1)
remainder = iv.mpf([remainder_lower, remainder_upper])
# Now remainder is an interval containing [1/N+1, 1/N]
total = iv.mpf(0)
for k in range(1, N+1): # iterate through [1, 2, ..., N].
total += 1/iv.mpf(k**2)
# Now total contains the Nth interval aritmetic partial sum
# To give the rigorous bound we add the partial sum and
# the remainder interval.
return total + remainder
x1 = rigorous_basel_sum(15, 0.5/10**3)
x1
So we know that the sum lies in the interval above. You an get this intervals width (as a degenerate interval) as follows:
x1.delta
So the interval is correct to approximately $\frac{3}{1000}$. As noted above, the calculation should narrow t an interval of size $\frac{0.5}{1000}$. The reason this is not true is round off error. About $87\%$ of the error is due to round off error. For this reason, we should be able to get a more accurate evaluation by increasing the precision. This will lower the round off error, while keeping the truncation error the same.
x2 = rigorous_basel_sum(30, 0.5/10**3)
x2
x2.delta
The contribution of the truncation error is going to be almost exactly $0.0005$, so the round off error is very small now.
Here is a bit better estimate:
x3 = rigorous_basel_sum(43, 0.5/10**8)
x3
x3.delta
Here again the $5 \times 10^{-9}$ is truncation error, while the remaining roughly $3 \times 10^{-9}$ is due to round off error. We can rigorously see from the above that the series evaluates to a real number whose decimal expansion begins either $$1.64493406 \quad \text{or} \quad 1.64493407.$$
In fact, the series sums to $\frac{\pi^2}{6}$. Using interval arithmetic, we see can compute the this rigorously as well.
iv.dps = 10
iv.pi**2 / 6
Thus the first few digits of $\frac{\pi^2}{6}$ are $1.6449340668$, which falls in the interval found above.