### Video: Riemann sums and the definite integral

#### Video links

This video is found in the pages

#### Transcript of video

In this lecture, we discuss how to compute the total amount of change of a function during a period of time. We'll calculate this change using a definite integral, which is defined as a limit of something called a Riemann sum.

Imagine that you took a two hour walk. At the beginning, you were walking slowly, at 1 km/h. If $x(t)$ is your position in kilometers, then, at first, your position was changing at the rate $dx/dt = 1$. But you didn't stay at that speed; instead, you accelerated, increasing your speed by 3 km/h every hour. Hence, your position was changing at the rate $dx/dt = 1 + 3t$, where $t$ is the time in hours. By the end of the two hour walk, you had increased your speed by 6 km/h so that you were walking at a rapid clip of 7 km/h. The question we want to answer is: how far did you walk during those two hours?

Let's examine a plot of your speed, which we'll call your rate, as a function of time. Here we see how you steadily increased your rate of walking from 1 km/h to 7 km/h. To estimate how far you walked, let's do the following. Let's break the two hour walk into two intervals of 1 hour each. We'll call the length of these intervals $\Delta t$, so right now, $\Delta t=1$. What makes it hard to figure out the distance is that you continuously changed your velocity. If you had just walked at a constant rate, it would have been much easier. So, let's pretend that, during each period of length $\Delta t=1$ hour, you didn't change your speed. During the first interval, from hour 0 to hour 1, let's imagine you kept walking at your initial rate of 1 km/h. In this imaginary exercise, we don't have you change speed until the second interval, beginning at hour 1. During this second interval, you will again walk at a constant speed, but you'll walk at the constant speed from the beginning of this interval, which in this case is 4 km/h. For this calculation, we are approximating the green curve by the two blue horizontal line segments.

Calculating the distance traveled in each interval is simple. From hours 0 to 1, we assume you walked steadily at 1 km/h. Multiplying the rate 1 by the time interval $\Delta t$, which is 1, we get a distance of 1 km for the first interval. The total distance up to this point is 1 km. From hours 1 to 2, we assume you walked at 4 km/h. Multiplying this rate by $\Delta t$, which is 1, we get a distance of 4 km for the second interval. The total distance walked during the whole two hour walk is $1 + 4 = 5$ km.

Obviously, this is a poor approximation of the distance you walked. You actually walked faster than we assumed. You should get credit for more distance! You argue that it is unfair to use the slow initial rate and that we should instead use the rate at the end of each interval. Fair enough, let's try it that way, too. The first method was called a left-handed estimate, as we used the left point in each interval. Let's switch to a right-handed estimate, using the right point in each interval to give the approximate rate during that interval. With the right-handed estimate, we assume that you walked at 4 km/h during the first interval, since that was your ending rate at hour 1. During the second interval, we use the rate 7 km/h, as that is the rate at which you ended the walk. We get a pretty different result. Now, we calculate that you walked 4 km in the first hour and 7 km in the second hour, for a total of 11 km during the two hour walk. There is quite a difference between the two calculations: 5 km for the left-handed estimate and 11 km for the right-handed estimate.

I think we can agree that assuming your rate was constant during the 1 hour intervals was too crude an approximation. To improve accuracy, let's double the number of intervals to 4, with an interval length, $\Delta t$, of half an hour. If we use a left-handed estimate, we'll still be underestimating your walking distance, but not by as much. Now, we have to do four calculations, using the rates of 1, 2.5, 4, and 5.5 km/h in each of the four intervals. We multiply those four numbers by $\Delta t$, which is now 1/2, to get the distances walked in those four intervals: 0.5, 1.25, 2, and 2.75 km. Adding up those four numbers, we get the total walking distance of 6.5 km.

That's an improved estimate, though it is still underestimating the distance. You also insist we try the right-handed estimate, approximating the rates in the four intervals by your final walking speed in each interval, i.e., we should use the rates 2.5, 4, 5.5, and 7. Multiplying those numbers by one-half and summing them up gives us an estimate of 9.5 km. With four intervals, our left-handed estimate of 6.5 km and right-handed estimate of 9.5 km are closer together, but still not very close. We expect the actual distance you walked to be somewhere in between.

Let's refine our estimate further by increasing the number of intervals all the way up to 10, which brings $\Delta t$ down to 0.2. Now we have 10 different rates, each of which we must multiply by 0.2, and then add them up. When we do this for the left-handed estimate, we get 7.4 km for your walking distance. When we repeat the work for the right-handed estimate, we get 8.6 km. It seems clear that, as we increase the number of intervals, we get a better estimate of your walking distance.
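The hand calculations so far can be reproduced with a few lines of Python. This is only a minimal sketch: the names `rate` and `estimate_distance` are made up for illustration, and the rate $1 + 3t$ and the two hour duration are taken from the walking example.

```python
def rate(t):
    """Walking rate in km/h at hour t, from the example: dx/dt = 1 + 3t."""
    return 1 + 3 * t


def estimate_distance(n, use_right=False, total_hours=2.0):
    """Estimate the distance walked by holding the rate constant on n intervals.

    use_right=False samples the rate at the start of each interval
    (left-handed estimate); use_right=True samples it at the end
    (right-handed estimate).
    """
    dt = total_hours / n
    offset = 1 if use_right else 0
    # Interval number i + 1 runs from i*dt to (i + 1)*dt.
    return sum(rate((i + offset) * dt) * dt for i in range(n))


for n in (2, 4, 10):
    left = estimate_distance(n)
    right = estimate_distance(n, use_right=True)
    print(f"n={n}: left={left:.2f} km, right={right:.2f} km")
```

Running it should give the same pairs as above: 5 and 11 km for two intervals, 6.5 and 9.5 km for four, and 7.4 and 8.6 km for ten.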

It's time to derive a formula for the calculations we've been doing. Let's decrease the number of intervals back down to 4 and move everything up a little to give us some space to figure out the formula.

Let's begin with the left-handed estimate, which we'll call $I_l$. The $I$ is supposed to warn you that we are working toward the integral we mentioned at the beginning, and “l” is for left. The formula for $I_l$ shows that we took the rates 1, 2.5, 4, and 5.5, multiplied them by the current $\Delta t$ of 1/2, and added them all up to get the result of 6.5.

To write a more general formula, we need some notation. Let $f(t)$ be the function that gives your velocity at hour $t$: $f(t)=1+3t$. We also define the time points $t_0$, $t_1$, etc., shown at the bottom of the graph. In this case, they are just multiples of the time interval $\Delta t$.

For the left-handed estimate, we need to use the rate at the beginning of each time interval. For interval 1, we used $t_0$; for interval 2, we used $t_1$, etc. In general, for interval $i$, we should use $t_{i-1}$. The rate at this time is $f(t_{i-1})$. To get the distance traveled in this interval, we multiply the rate by the interval length $\Delta t$.

We rewrite the formula for $I_l$ using this notation. The rates at the beginning of each interval are $f(t_0)$, $f(t_1)$, $f(t_2)$, and $f(t_3)$. We multiply these by $\Delta t$ and add them up.

If we had 100 intervals, writing this out would take a lot of space. To save space, we will use summation notation, denoted by this capital Greek letter Sigma. This summation notation means that we are going to take all the intervals $i$ from 1 to 4, i.e., we will let $i$ be 1, 2, 3, and 4. For each value of $i$, we calculate $f(t_{i-1})\Delta t$, in other words, we calculate the four terms we have above. The first is for $i=1$, which means $t_{i-1}$ is $t_0$. The second is for $i=2$, etc.

With this summation notation, we can compactly write the left-handed estimate for any number of intervals $n$. The total distance estimated is the sum for $i$ equals 1 to $n$ of $f(t_{i-1})\Delta t$.

We can repeat all these calculations for the right-handed estimate. The only difference is that for interval $i$, we now use the right time point, $t_i$, and calculate the rate as $f(t_i)$. When $n=4$, we use the rates $f(t_1)$, $f(t_2)$, $f(t_3)$ and $f(t_4)$. The summation notation is exactly the same, except that we use $t_i$ rather than $t_{i-1}$.

Since these sums are so important, we give them a special name: Riemann sums. We have derived two different Riemann sums, the left Riemann sum and the right Riemann sum, which use the left endpoint or the right endpoint of each interval to compute the estimate.
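For reference, the two sums described in words above can be written out as follows. The name $I_r$ for the right-handed estimate is an assumption made here by analogy with $I_l$; the transcript only names $I_l$.

$$
I_l = \sum_{i=1}^{n} f(t_{i-1})\,\Delta t,
\qquad
I_r = \sum_{i=1}^{n} f(t_i)\,\Delta t .
$$

With $n=4$ and $\Delta t = 1/2$ in the walking example, these become $I_l = (1 + 2.5 + 4 + 5.5)\cdot\tfrac{1}{2} = 6.5$ and $I_r = (2.5 + 4 + 5.5 + 7)\cdot\tfrac{1}{2} = 9.5$.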

How large an $n$ should we use to estimate the distance you walked? Is $n=10$ enough? Should we use $n=50$? When $n=50$, the left Riemann sum gives 7.88 km and the right Riemann sum gives 8.12 km. They are getting quite close to each other. How about $n=100$? It seems our estimates are getting close to the value of 8 km, though even at $n=100$, the Riemann sums are still 0.06 km away from that number.
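To watch the two Riemann sums close in on each other as $n$ grows, one could run something like the sketch below. The helper name `left_and_right_sums` is hypothetical, and $n=1000$ is an extra value added only to show the trend.

```python
def rate(t):
    """Walking rate in km/h at hour t (from the example: 1 + 3t)."""
    return 1 + 3 * t


def left_and_right_sums(n, a=0.0, b=2.0):
    """Return (left, right) Riemann sums of rate over [a, b] with n intervals."""
    dt = (b - a) / n
    points = [a + i * dt for i in range(n + 1)]     # t_0, t_1, ..., t_n
    left = sum(rate(t) * dt for t in points[:-1])   # uses t_0 .. t_{n-1}
    right = sum(rate(t) * dt for t in points[1:])   # uses t_1 .. t_n
    return left, right


for n in (10, 50, 100, 1000):
    left, right = left_and_right_sums(n)
    gap = right - left
    print(f"n={n}: left={left:.3f} km, right={right:.3f} km, gap={gap:.3f} km")
```

For $n=50$ this reproduces 7.88 and 8.12 km, and for $n=100$ it gives 7.94 and 8.06 km, each 0.06 km away from 8.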

I think we shouldn't stop at $n=100$. We should let $n$ get even larger. Why hold back? Why not let $n$ go all the way to infinity?

When we let $n$ go all the way to infinity, or, more precisely, we take the limit as $n$ goes to infinity, we arrive at the definite integral. The definite integral, or Riemann integral, is defined as the limit of the Riemann sums as $n$ approaches infinity. The correct answer for the amount that you walked is the integral from 0 to 2 of $f(t)dt$. Notice that this integral can be defined from the left Riemann sum or the right Riemann sum. It doesn't matter which you choose; you'll get to the same integral. (More precisely, we say the integral exists when both limits arrive at the same number.) The definite integral is a single number. Here, we expect that the definite integral should be 8, though we won't show that here. For this example, we are integrating from 0 to 2; you walked for a total of 2 hours. The entire interval of integration has length two, and the length of each subinterval, $\Delta t$, is $2/n$, as we chopped up those two hours into $n$ pieces.
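In symbols, the definition just described for the walking example reads as follows, using the same $f$, $t_i$, and $\Delta t$ as before:

$$
\int_0^2 f(t)\,dt
= \lim_{n\to\infty} \sum_{i=1}^{n} f(t_{i-1})\,\Delta t
= \lim_{n\to\infty} \sum_{i=1}^{n} f(t_i)\,\Delta t,
\qquad \Delta t = \frac{2}{n}, \quad t_i = i\,\Delta t .
$$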

In general, we might take the integral of a function over an interval of $t$ from $a$ to $b$. In this case, the length of the interval is $b-a$, so when we chop it up into $n$ subintervals, the length of each of these subintervals, $\Delta t$, is $(b-a)/n$. To determine the endpoints $t_i$, we have to start at the left endpoint $a$ and add multiples of $\Delta t$, so $t_i = a + i \Delta t$. Let's see what this means with an example.

Let's estimate the integral from $-4$ to $-1$ of $x^2 dx$ using Riemann sums of 6 intervals. Notice I switched letters on you here. We're now using $x$ rather than $t$. But, it is all the same.

What is the length of each interval $\Delta x$? The lower endpoint $a$ is $-4$. The upper endpoint $b$ is $-1$. The length of each interval, $\Delta x$, must be $(b-a)/6$. Since $-1$ minus $-4$ is 3 (you get that, right?), $\Delta x$ is 3/6 or 1/2. To calculate the endpoints $x_i$, we start with $a=-4$ and add multiples of 1/2. We determine that $x_0$ is $-4$, $x_1$ is $-3.5$, etc. We are chopping up the interval of integration into the subintervals $[-4, -3.5]$, $[-3.5, -3]$, etc.
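The endpoint bookkeeping is easy to get wrong with negative numbers, so here is a small sketch that applies $x_i = a + i\,\Delta x$ directly. The variable names are chosen here for illustration.

```python
a, b, n = -4.0, -1.0, 6

dx = (b - a) / n                                # (-1 - (-4)) / 6 = 0.5
endpoints = [a + i * dx for i in range(n + 1)]  # x_0, x_1, ..., x_6
subintervals = list(zip(endpoints[:-1], endpoints[1:]))

print(dx)            # 0.5
print(endpoints)     # [-4.0, -3.5, -3.0, -2.5, -2.0, -1.5, -1.0]
print(subintervals)  # [(-4.0, -3.5), (-3.5, -3.0), ..., (-1.5, -1.0)]
```

Six subintervals need seven endpoints, $x_0$ through $x_6$, which is why the list is built with `range(n + 1)`.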

We start with the formula for the left Riemann sum. We need to take the sum from $i$ equals 1 to 6 of $f(x_{i-1})\Delta x$. The function we are integrating is just the squaring function, so we must add up $(x_{i-1})^2$, multiplied by $\Delta x$. In other words, we take $x_0^2$, $x_1^2$, all the way to $x_5^2$, and multiply these by $\Delta x$ before adding. Let's plug in the numbers. $\Delta x=0.5$, and the values of $x_0$ through $x_5$ are $-4$ through $-1.5$. When we carry out the calculation (a calculator is your friend), we get 24.875.

If we redo the calculation for the right Riemann sum, the only difference is that we use $x_{i}$ rather than $x_{i-1}$, i.e., we use the endpoints $x_1$ through $x_6$. Everything else remains the same. When we add these up, we get 17.375.

It's clear our left and right Riemann sums aren't too close together, so we should take more than 6 intervals to get a better estimate of the definite integral. But the value of the definite integral is probably somewhere between 17 and 25.
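Instead of a calculator, both sums can be checked with a general-purpose helper. This is a sketch, not a library routine: `riemann_sum` and its `rule` argument are names invented here, and the larger values of `n` are included only to show the two sums drawing together.

```python
def riemann_sum(f, a, b, n, rule="left"):
    """Left or right Riemann sum of f over [a, b] using n equal subintervals."""
    if rule not in ("left", "right"):
        raise ValueError("rule must be 'left' or 'right'")
    dx = (b - a) / n
    # Left rule samples x_0 .. x_{n-1}; right rule samples x_1 .. x_n.
    start = 0 if rule == "left" else 1
    return sum(f(a + i * dx) * dx for i in range(start, start + n))


def square(x):
    """The function being integrated in the example."""
    return x ** 2


for n in (6, 60, 600):
    left = riemann_sum(square, -4, -1, n, "left")
    right = riemann_sum(square, -4, -1, n, "right")
    print(f"n={n}: left={left:.3f}, right={right:.3f}")
```

With `n=6` this reproduces 24.875 and 17.375. Note that here the left sum is the larger one, because $x^2$ is decreasing on the interval from $-4$ to $-1$, the opposite of the walking example where the rate was increasing.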

In summary, we define a definite integral as the limit of a Riemann sum. The definite integral is just a single number. This is in stark contrast to the indefinite integral, which is a function plus a constant. I wonder why, if they are so different, we call them both integrals and use a similar notation for both of them. We'll have to save that question for another time.