Derivation of the directional derivative and the gradient
In the introduction to the directional derivative and the gradient, we illustrated the concepts behind the directional derivative. The main points were that, given a multivariable scalar-valued function $f : \R^n \to \R$,
- the directional derivative $D_{\vc{u}}f$ is a generalization of the partial derivative to the slope of $f$ in the direction of an arbitrary unit vector $\vc{u}$,
- the gradient $\nabla f$ is a vector that points in the direction of the greatest upward slope, and its length is the directional derivative in that direction, and
- the directional derivative is the dot product between the gradient and the unit vector: $D_{\vc{u}}f = \nabla f \cdot \vc{u}$.
This introduction is missing one important piece of information: what exactly is the gradient? How can we calculate it from $f$? It's actually pretty simple to calculate an expression for the gradient, if you can remember what it means for a function to be differentiable.
What does it mean for a function $f(\vc{x})$ to be differentiable at the point $\vc{x}=\vc{a}$? The function must be essentially linear near $\vc{a}$, i.e., there must be a linear approximation \begin{align*} L(\vc{x}) = f(\vc{a}) + Df(\vc{a})(\vc{x}-\vc{a}) \end{align*} that is very close to $f(\vc{x})$ for all $\vc{x}$ near $\vc{a}$. The definition of differentiability means that, for all directions emanating out of $\vc{a}$, $f(\vc{x})$ and $L(\vc{x})$ have the same slope at $\vc{a}$. We can therefore calculate the directional derivatives of $f$ at $\vc{a}$ using $L$ rather than $f$.
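As a concrete illustration of such a linear approximation, consider a function chosen here purely as an example (it is not part of the discussion above): $f(x,y) = x^2y$ with $\vc{a} = (1,2)$. Its matrix of partial derivatives is $Df(x,y) = \left[\, 2xy \quad x^2 \,\right]$, so $Df(1,2) = \left[\, 4 \quad 1 \,\right]$ and \begin{align*} L(x,y) &= f(1,2) + Df(1,2)\begin{bmatrix} x-1\\ y-2 \end{bmatrix}\\ &= 2 + 4(x-1) + (y-2). \end{align*} At the nearby point $(1.1, 2.1)$, we get $f(1.1,2.1) = 2.541$ while $L(1.1,2.1) = 2.5$, so the two values agree closely, and the agreement improves as the point moves closer to $\vc{a}$.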
Using the definition of the directional derivative, we can calculate the directional derivative of $f$ at $\vc{a}$ in the direction of $\vc{u}$. Since $L(\vc{a}) = f(\vc{a})$ and $L(\vc{a}+h\vc{u}) = f(\vc{a}) + Df(\vc{a})(h\vc{u})$, the difference in the numerator is $hDf(\vc{a})\vc{u}$: \begin{align*} D_{\vc{u}}f(\vc{a}) &= D_{\vc{u}}L(\vc{a}) = \lim_{h \to 0} \frac{L(\vc{a}+h\vc{u}) - L(\vc{a})}{h}\\ &= \lim_{h \to 0} \frac{hDf(\vc{a})\vc{u}}{h} = \lim_{h \to 0}~ Df(\vc{a})\vc{u} = Df(\vc{a})\vc{u}. \end{align*} Since $Df(\vc{a})$ is a $1 \times n$ row vector and $\vc{u}$ is an $n \times 1$ column vector, the matrix-vector product is a scalar. We can rewrite this product as a dot product between two vectors by rewriting the $1 \times n$ matrix of partial derivatives as a vector. We denote this vector by $\nabla f$ and call it the gradient; its components are the partial derivatives of $f$, i.e., $\nabla f = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)$. We obtain that the directional derivative is \begin{align*} D_{\vc{u}}f(\vc{a}) = \nabla f(\vc{a}) \cdot \vc{u} \end{align*} as promised.
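The formula $D_{\vc{u}}f(\vc{a}) = \nabla f(\vc{a}) \cdot \vc{u}$ can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function $f(x,y) = x^2y$, the point $(1,2)$, the unit vector $(3/5, 4/5)$, the step size `h`, and all names in the code are assumptions chosen for illustration. It compares the dot product with a difference quotient taken at a small fixed $h$, which only approximates the limit.

```python
def f(x, y):
    # Example function chosen for illustration (an assumption, not from the text).
    return x**2 * y

def grad_f(x, y):
    # Gradient of the example function, computed by hand: (2xy, x^2).
    return (2 * x * y, x**2)

def difference_quotient(func, a, u, h=1e-6):
    # (f(a + h u) - f(a)) / h for a small fixed h; this approximates
    # the limit in the definition of the directional derivative.
    ax, ay = a
    ux, uy = u
    return (func(ax + h * ux, ay + h * uy) - func(ax, ay)) / h

a = (1.0, 2.0)
u = (3 / 5, 4 / 5)  # unit vector: (3/5)^2 + (4/5)^2 = 1

gx, gy = grad_f(*a)
exact = gx * u[0] + gy * u[1]           # gradient dotted with u
approx = difference_quotient(f, a, u)   # numerical estimate of the limit

print("gradient dot u:     ", exact)
print("difference quotient:", approx)
```

By hand, $\nabla f(1,2) = (4, 1)$, so the dot product is $4 \cdot \tfrac{3}{5} + 1 \cdot \tfrac{4}{5} = 3.2$; the difference quotient should agree with this up to an error on the order of $h$.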