### Numerical integration

Integration ‘by formulas’ is much more difficult than differentiation,
and is sometimes impossible to do in elementary terms, so it becomes
reasonable to ask for *numerical approximations to definite integrals*. Since a *definite* integral is just a *number*, this is possible. By
contrast, *indefinite* integrals, being *functions* rather
than just numbers, are not easily described by ‘numerical
approximations’.

There are several related approaches, all of which use the idea that
a definite integral is related to *area*. Thus, each of these
approaches is essentially a way of approximating the area under a
curve. This isn't exactly right, because an integral counts area below
the $x$-axis as negative, but thinking of area is a reasonable heuristic.

An approximation is not very valuable unless there is an
*estimate for the error*, in other words, an idea of the *tolerance*.

Each of the approaches starts the same way: To approximate $\int_a^bf(x)\;dx$, break the interval $[a,b]$ into smaller subintervals $$[x_0,x_1],\;\;[x_1,x_2],\;\;\ldots,[x_{n-2},x_{n-1}],\;\;[x_{n-1},x_n]$$ each of the same length $$ \Delta x={b-a\over n}$$ and where $x_0=a$ and $x_n=b$.

**Trapezoidal rule**: This rule says that
$$\int_a^bf(x)\;dx \approx {\Delta x\over 2}
[f(x_0)+2f(x_1)+2f(x_2)+\ldots+2f(x_{n-2})+2f(x_{n-1})+f(x_n)]$$ Yes,
all the values have a factor of ‘2’ except the first and the
last. (This method approximates the area under the curve by *trapezoids*, one on each subinterval, whose slanted top edges join consecutive points on the curve.)
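
The formula translates almost directly into code. The following is a minimal sketch in Python, not a library routine; the function name `trapezoidal` and the test integrand $\sin x$ on $[0,\pi]$ (whose exact integral is $2$) are choices made here for illustration.

```python
import math

def trapezoidal(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    # Endpoints get weight 1, every interior point x_i = a + i*dx gets weight 2.
    total = f(a) + f(b)
    for i in range(1, n):
        total += 2 * f(a + i * dx)
    return dx / 2 * total

# Illustrative check: the exact value of the integral of sin on [0, pi] is 2.
trap_approx = trapezoidal(math.sin, 0.0, math.pi, 100)
print(trap_approx)
```

With $n=100$ the result should agree with $2$ to about three decimal places.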

**Midpoint rule**: Let
$\overline{x}_i={1\over 2}(x_{i-1}+x_i)$ be the midpoint of the
subinterval $[x_{i-1},x_i]$. Then the **midpoint rule** says that
$$\int_a^b f(x)\;dx \approx \Delta
x[f(\overline{x}_1)+\ldots+f(\overline{x}_n)]$$
(This method approximates the area under the curve by rectangles whose
height is the value of $f$ at the midpoint of each subinterval.)
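
As a rough sketch of the midpoint rule in Python: the $i$-th midpoint is computed as $a+(i-{1\over 2})\Delta x$, which is the average of the two endpoints of the $i$-th subinterval. The function name `midpoint` and the test integrand $x^2$ on $[0,1]$ (exact integral $1/3$) are assumptions made for this example.

```python
def midpoint(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        # Midpoint of the subinterval [a + i*dx, a + (i+1)*dx].
        total += f(a + (i + 0.5) * dx)
    return dx * total

# Illustrative check: the exact value of the integral of x^2 on [0, 1] is 1/3.
mid_approx = midpoint(lambda x: x * x, 0.0, 1.0, 10)
print(mid_approx)
```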

**Simpson's rule**: This rule says that
$$\int_a^bf(x)\;dx \approx {\Delta x\over 3}
[f(x_0)+4f(x_1)+2f(x_2)+4f(x_3)+\ldots+2f(x_{n-2})+4f(x_{n-1})+f(x_n)]$$
Yes, the first and last coefficients are ‘1’, while the ‘inner’
coefficients alternate ‘4’ and ‘2’. And $n$ has to be an *even*
integer for this to make sense. (This method approximates the curve by
pieces of parabolas).
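
A minimal Python sketch of Simpson's rule, under the same equal-subinterval setup: odd-indexed interior points get weight 4, even-indexed interior points get weight 2, and an odd $n$ is rejected. The function name `simpson` and the test integrand $x^3$ on $[0,2]$ (exact integral $4$) are chosen here for illustration.

```python
def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] with n equal subintervals (n even)."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("Simpson's rule needs a positive even number of subintervals")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior coefficients alternate 4, 2, 4, ..., 4.
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * dx)
    return dx / 3 * total

# Illustrative check: the exact value of the integral of x^3 on [0, 2] is 4.
simpson_approx = simpson(lambda x: x ** 3, 0.0, 2.0, 4)
print(simpson_approx)
```

Since the fourth derivative of a cubic is zero, the error estimate for Simpson's rule predicts no error at all on this example, apart from floating-point rounding.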

In general, the smaller the $\Delta x$ is, the better these
approximations are. We can be more precise: the error estimates for
the trapezoidal and midpoint rules depend upon the *second
derivative*: suppose that $|f''(x)|\le
M$ for some constant $M$, for all $a\le x\le b$. Then
$$\hbox{ error in trapezoidal rule }\le {M(b-a)^3\over 12n^2}$$
$$\hbox{ error in midpoint rule }\le {M(b-a)^3\over 24n^2}$$
The error estimate for Simpson's rule depends on the *fourth*
derivative: suppose that $|f^{(4)}(x)|\le
N$ for some constant $N$, for all $a\le x\le b$. Then
$$\hbox{ error in Simpson's rule }\le {N(b-a)^5\over 180n^4}$$
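
These estimates can be turned around to answer a practical question: how many subintervals guarantee a given tolerance? Solving each inequality for $n$ gives the sketch below. The helper names and the example ($f(x)=\sin x$ on $[0,\pi]$, so that $M=N=1$ works, with tolerance $10^{-6}$) are assumptions made for illustration.

```python
import math

def n_for_second_derivative_bound(M, a, b, tol, constant):
    """An n with M*(b-a)^3 / (constant * n^2) <= tol.

    Use constant=12 for the trapezoidal rule and constant=24 for the midpoint rule.
    """
    return max(1, math.ceil(math.sqrt(M * (b - a) ** 3 / (constant * tol))))

def n_for_simpson(N, a, b, tol):
    """An even n with N*(b-a)^5 / (180 * n^4) <= tol."""
    n = max(2, math.ceil((N * (b - a) ** 5 / (180 * tol)) ** 0.25))
    return n if n % 2 == 0 else n + 1  # Simpson's rule needs n even

# Example values (assumed): |sin''| <= 1 and |sin''''| <= 1 on [0, pi].
a, b, tol = 0.0, math.pi, 1e-6
n_trap = n_for_second_derivative_bound(1.0, a, b, tol, 12)
n_mid = n_for_second_derivative_bound(1.0, a, b, tol, 24)
n_simp = n_for_simpson(1.0, a, b, tol)
print(n_trap, n_mid, n_simp)
```

The counts come from the error estimates only, so they are sufficient but possibly larger than what is actually needed.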

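From these formulas, the error estimate for the midpoint rule is half of that for the trapezoidal rule, so it looks like the midpoint rule is the better of the two, although these are only upper bounds on the errors, not the errors themselves. And for high accuracy, using a large number $n$ of subintervals, it looks like Simpson's rule is the best, since its estimate shrinks like $1/n^4$ rather than $1/n^2$.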
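
One way to see this on a concrete case is to run all three rules on an integral whose value is known. The sketch below uses $\int_0^1 e^x\,dx=e-1$ with $n=10$; the function names and the choice of integrand are assumptions made for this illustration, and a single example does not prove the general claim.

```python
import math

def trapezoidal(f, a, b, n):
    dx = (b - a) / n
    inner = sum(f(a + i * dx) for i in range(1, n))
    return dx / 2 * (f(a) + 2 * inner + f(b))

def midpoint(f, a, b, n):
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

def simpson(f, a, b, n):
    if n % 2 != 0:
        raise ValueError("n must be even")
    dx = (b - a) / n
    odd = sum(f(a + i * dx) for i in range(1, n, 2))
    even = sum(f(a + i * dx) for i in range(2, n, 2))
    return dx / 3 * (f(a) + 4 * odd + 2 * even + f(b))

n = 10
exact = math.e - 1.0  # exact value of the integral of e^x over [0, 1]
errors = {
    rule.__name__: abs(rule(math.exp, 0.0, 1.0, n) - exact)
    for rule in (trapezoidal, midpoint, simpson)
}
for name, err in errors.items():
    print(f"{name:12s} error = {err:.2e}")
```

On this example the midpoint error comes out at roughly half the trapezoidal error, and the Simpson error is smaller by several orders of magnitude.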
