Math Insight

Numerical integration

 

As we start to see that integration ‘by formulas’ is a much more difficult thing than differentiation, and sometimes is impossible to do in elementary terms, it becomes reasonable to ask for numerical approximations to definite integrals. Since a definite integral is just a number, this is possible. By contrast, indefinite integrals, being functions rather than just numbers, are not easily described by ‘numerical approximations’.

There are several related approaches, all of which use the idea that a definite integral is related to area. Thus, each of these approaches is essentially a way of approximating the area under a curve. This isn't exactly right, because an integral counts area below the $x$-axis as negative, but thinking of area is a reasonable heuristic.

Of course, an approximation is not very valuable unless there is an estimate for the error, in other words, an idea of the tolerance.

Each of the approaches starts the same way: To approximate $\int_a^bf(x)\;dx$, break the interval $[a,b]$ into smaller subintervals $$[x_0,x_1],\;\;[x_1,x_2],\;\;\ldots,[x_{n-2},x_{n-1}],\;\;[x_{n-1},x_n]$$ each of the same length $$ \Delta x={b-a\over n}$$ and where $x_0=a$ and $x_n=b$.

Trapezoidal rule: This rule says that $$\int_a^bf(x)\;dx \approx {\Delta x\over 2} [f(x_0)+2f(x_1)+2f(x_2)+\ldots+2f(x_{n-2})+2f(x_{n-1})+f(x_n)]$$ Yes, all the values have a factor of ‘2’ except the first and the last. (This method approximates the area under the curve by trapezoids: on each subinterval, the top edge of the trapezoid is the straight line joining the points on the curve at the two endpoints.)
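The trapezoidal formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `trapezoid` and the test integrand $\sin x$ on $[0,\pi]$ (whose exact integral is 2) are choices made for this example, not part of the rule itself.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal rule on [a, b] with n subintervals of equal length."""
    dx = (b - a) / n
    # The endpoint values f(x_0) and f(x_n) get coefficient 1 ...
    total = f(a) + f(b)
    # ... and every interior value f(x_1), ..., f(x_{n-1}) gets coefficient 2.
    for i in range(1, n):
        total += 2 * f(a + i * dx)
    return dx / 2 * total

# Illustrative use: approximate the integral of sin(x) from 0 to pi.
approx = trapezoid(math.sin, 0.0, math.pi, 100)
print(approx, abs(approx - 2.0))
```

Since the rule joins sample points by straight lines, it should reproduce the integral of a linear function exactly (up to rounding), which makes a convenient sanity check.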

Midpoint rule: Let $\overline{x}_i={1\over 2}(x_{i-1}+x_i)$ be the midpoint of the subinterval $[x_{i-1},x_i]$. Then the midpoint rule says that $$\int_a^b f(x)\;dx \approx \Delta x[f(\overline{x}_1)+\ldots+f(\overline{x}_n)]$$ (This method approximates the area under the curve by rectangles, where the height of each rectangle is the value of $f$ at the midpoint of its subinterval.)
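A minimal Python sketch of the midpoint rule follows. The name `midpoint` and the test integrand $\sin x$ on $[0,\pi]$ (exact integral 2) are assumptions for illustration. The midpoint of the $i$-th subinterval is computed as $a+(i-{1\over 2})\Delta x$, which is the same as the average of its two endpoints.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint rule on [a, b] with n subintervals of equal length."""
    dx = (b - a) / n
    # The midpoint of [x_{i-1}, x_i] is a + (i - 1/2) * dx, for i = 1, ..., n.
    return dx * sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1))

# Illustrative use: approximate the integral of sin(x) from 0 to pi.
approx = midpoint(math.sin, 0.0, math.pi, 100)
print(approx, abs(approx - 2.0))
```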

Simpson's rule: This rule says that $$\int_a^bf(x)\;dx \approx {\Delta x\over 3} [f(x_0)+4f(x_1)+2f(x_2)+4f(x_3)+\ldots+2f(x_{n-2})+4f(x_{n-1})+f(x_n)]$$ Yes, the first and last coefficients are ‘1’, while the ‘inner’ coefficients alternate ‘4’ and ‘2’. And $n$ has to be an even integer for this to make sense. (This method approximates the curve by pieces of parabolas).
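The coefficient pattern 1, 4, 2, 4, ..., 2, 4, 1 is easy to get wrong by hand, so here is a short Python sketch of it. The name `simpson`, the choice to raise `ValueError` for odd $n$, and the test integrands are assumptions made for this example.

```python
import math

def simpson(f, a, b, n):
    """Simpson's rule on [a, b] with n subintervals of equal length (n even)."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    dx = (b - a) / n
    # First and last coefficients are 1.
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior points get 4, even-indexed interior points get 2.
        coefficient = 4 if i % 2 == 1 else 2
        total += coefficient * f(a + i * dx)
    return dx / 3 * total

# Illustrative use: approximate the integral of sin(x) from 0 to pi (exactly 2).
approx = simpson(math.sin, 0.0, math.pi, 10)
print(approx, abs(approx - 2.0))
```

Because the error bound for Simpson's rule involves the fourth derivative, the rule should give the exact answer (up to rounding) for any cubic polynomial, for example $\int_0^2 x^3\;dx=4$.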

In general, the smaller the $\Delta x$ is, the better these approximations are. We can be more precise: the error estimates for the trapezoidal and midpoint rules depend upon the second derivative: suppose that $|f''(x)|\le M$ for some constant $M$, for all $a\le x\le b$. Then $$\hbox{ error in trapezoidal rule }\le {M(b-a)^3\over 12n^2}$$ $$\hbox{ error in midpoint rule }\le {M(b-a)^3\over 24n^2}$$ The error estimate for Simpson's rule depends on the fourth derivative: suppose that $|f^{(4)}(x)|\le N$ for some constant $N$, for all $a\le x\le b$. Then $$\hbox{ error in Simpson's rule }\le {N(b-a)^5\over 180n^4}$$
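One practical use of these bounds is to decide how large $n$ must be to guarantee a given tolerance, by solving each inequality for $n$. The Python sketch below does this under stated assumptions: the function names, the tolerance $10^{-6}$, and the example $f(x)=\sin x$ on $[0,\pi]$ (where $M=1$ and $N=1$ both work, since every derivative of $\sin$ is bounded by 1) are all chosen for illustration.

```python
import math

def trapezoid_bound(M, a, b, n):
    """Bound on the trapezoidal-rule error, given |f''| <= M on [a, b]."""
    return M * (b - a) ** 3 / (12 * n ** 2)

def midpoint_bound(M, a, b, n):
    """Bound on the midpoint-rule error, given |f''| <= M on [a, b]."""
    return M * (b - a) ** 3 / (24 * n ** 2)

def simpson_bound(N, a, b, n):
    """Bound on the Simpson's-rule error, given |f''''| <= N on [a, b]."""
    return N * (b - a) ** 5 / (180 * n ** 4)

def smallest_n(bound, K, a, b, tol, step=1):
    """Smallest n (a multiple of `step`) for which bound(K, a, b, n) <= tol.

    A plain search is used rather than a closed-form root, so that rounding
    in a square or fourth root cannot give an n that is one too small.
    """
    n = step
    while bound(K, a, b, n) > tol:
        n += step
    return n

# Illustrative example: f(x) = sin(x) on [0, pi], tolerance 1e-6.
a, b, tol = 0.0, math.pi, 1e-6
n_trap = smallest_n(trapezoid_bound, 1.0, a, b, tol)
n_mid = smallest_n(midpoint_bound, 1.0, a, b, tol)
n_simp = smallest_n(simpson_bound, 1.0, a, b, tol, step=2)  # n must be even
print(n_trap, n_mid, n_simp)
```

These values of $n$ are sufficient to meet the tolerance, not necessarily the smallest that would work in practice, since the bounds are worst-case estimates.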

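From these formulas estimating the error, the bound for the midpoint rule is half the bound for the trapezoidal rule with the same $n$, so it looks like the midpoint rule is generally the better of the two. Keep in mind that these are worst-case bounds, not the actual errors. And for high accuracy, using a large number $n$ of subintervals, it looks like Simpson's rule is the best, since its bound shrinks like $1/n^4$ rather than $1/n^2$.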