We already know that the derivative of a function represents the rate of change of its output in response to an infinitesimally small change in its input. However, when we work with a multivariable function, its input can change in more than one direction, since it is specified by more than one coordinate, and so can its output. In this topic, we will take a look at partial derivatives and how they can help us study change in multivariable functions, building up to the concept of the gradient.
Different perspectives
As we learned previously, the derivative of a function $f$ at a given point $x$ represents the slope of the line tangent to $f$ at that point, and it tells us how much $f$ is increasing or decreasing within a very small interval of width $h$ as $h \to 0$.
Now, let's consider a function $f: \mathbb{R}^2 \to \mathbb{R}$; for example:
Since $f$ now takes two independent variables, $x$ and $y$, as input, we must take into account how changes in each one contribute to the change in $f$.
Let's plot $f$ in $\mathbb{R}^3$ using Cartesian coordinates, with $z = f(x, y)$.
We can see that the rate of change of $f$ as $x$ increases depends not only on the value of $x$, but also on $y$.
For example, let's hold $y$ fixed at one value while $x$ increases:
And now, let's hold $y$ fixed at a different value while $x$ increases:
In the same manner, when increasing $y$ from any starting value, $f$ might either increase or decrease depending on the value of $x$.
For instance, let's hold $x$ fixed while $y$ increases:
This process of changing one variable while holding all the others constant results in what we call a partial derivative.
The rate of change of $f$ with respect to $x$ can be denoted as
$$\frac{\partial f}{\partial x},$$
while the rate of change of $f$ with respect to $y$ can be denoted as
$$\frac{\partial f}{\partial y}.$$
As with traditional one-dimensional derivatives, we have more than one notation:
$$\frac{\partial f}{\partial x} = f_x = \partial_x f$$
For example, let's define $f$ by
In order to determine the partial derivative of $f$ with respect to $x$, we need to treat $y$ as a constant. Then, we can differentiate our expression normally, as if it depended only on $x$:
Then, we have
Conversely, to determine the partial derivative of $f$ with respect to $y$, we must treat $x$ as a constant.
Then,
For the previous example, we have the partial derivatives
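To make the procedure concrete, here is a separate worked sketch. The function $g(x, y) = x^2 y + y^3$ is an assumption chosen purely for illustration; it is not one of the examples used above.

```latex
\[
g(x, y) = x^2 y + y^3
\]
% Treat y as a constant and differentiate with respect to x:
\[
\frac{\partial g}{\partial x} = 2xy
\]
% Treat x as a constant and differentiate with respect to y:
\[
\frac{\partial g}{\partial y} = x^2 + 3y^2
\]
```

Notice that the term $y^3$ vanishes when we differentiate with respect to $x$, because it behaves like a constant in that case.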
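More formally, given a function $f: \mathbb{R}^n \to \mathbb{R}$, we can define its partial derivative with respect to $x_i$ at the point $\mathbf{a} = (a_1, \dots, a_n)$ as:
$$\frac{\partial f}{\partial x_i}(\mathbf{a}) = \lim_{h \to 0} \frac{f(\mathbf{a} + h\,\mathbf{e}_i) - f(\mathbf{a})}{h}$$
Since $\mathbf{e}_i$ is the $i$-th standard basis vector, we have
$$\frac{\partial f}{\partial x_i}(\mathbf{a}) = \lim_{h \to 0} \frac{f(a_1, \dots, a_i + h, \dots, a_n) - f(a_1, \dots, a_i, \dots, a_n)}{h}$$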
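The limit in the formal definition can be approximated numerically by picking a small step instead of letting it go to zero. The sketch below is only an illustration: the helper `partial_derivative` and the example function are hypothetical names introduced here, and it uses a central difference (stepping both forward and backward along one coordinate) rather than the one-sided quotient of the definition, since that is usually more accurate.

```python
import math


def partial_derivative(f, point, i, h=1e-6):
    """Approximate the partial derivative of f with respect to coordinate i.

    The point is shifted by +h and -h along the i-th coordinate only,
    while every other coordinate is held constant.
    """
    forward = list(point)
    backward = list(point)
    forward[i] += h
    backward[i] -= h
    return (f(forward) - f(backward)) / (2 * h)


# Hypothetical example function: f(x, y) = x^2 * y + sin(y)
def f(p):
    x, y = p
    return x**2 * y + math.sin(y)


point = [1.0, 2.0]
df_dx = partial_derivative(f, point, 0)  # exact value: 2*x*y = 4
df_dy = partial_derivative(f, point, 1)  # exact value: x**2 + cos(y)
print(df_dx, df_dy)
```

The step size `h` is a trade-off: too large and the approximation is crude, too small and floating-point rounding starts to dominate.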
Keep in mind that the value of a partial derivative depends on the coordinate system that we choose to express our function in. For example, if we decided to use polar coordinates to describe the set $\mathbb{R}^2$, the partial derivatives with respect to $r$ and $\theta$ wouldn't be the same as the partial derivatives we get from the Cartesian coordinates $x$ and $y$. This makes sense because a change in radius length or angle isn't the same as a change along the $x$ or $y$ axis.
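As a small illustration of this dependence, take the function $g(x, y) = x^2 + y^2$ (an assumption chosen for this sketch) and rewrite it in polar coordinates using $x = r\cos\theta$ and $y = r\sin\theta$:

```latex
% Cartesian form and its partial derivatives
\[
g(x, y) = x^2 + y^2, \qquad
\frac{\partial g}{\partial x} = 2x, \qquad
\frac{\partial g}{\partial y} = 2y
\]
% Polar form of the same function and its partial derivatives
\[
g(r, \theta) = r^2, \qquad
\frac{\partial g}{\partial r} = 2r, \qquad
\frac{\partial g}{\partial \theta} = 0
\]
% Same point, (x, y) = (1, 1), i.e. (r, \theta) = (\sqrt{2}, \pi/4)
\[
\left(\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}\right) = (2, 2),
\qquad
\left(\frac{\partial g}{\partial r}, \frac{\partial g}{\partial \theta}\right) = (2\sqrt{2}, 0)
\]
```

The function and the point are the same in both cases; only the directions along which we measure change are different.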
Gradient
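Let's say $f: \mathbb{R}^n \to \mathbb{R}$.
Then, we have $n$ partial derivatives:
$$\frac{\partial f}{\partial x_1}, \; \frac{\partial f}{\partial x_2}, \; \dots, \; \frac{\partial f}{\partial x_n}$$
We can define a vector-valued function by placing each partial derivative inside a vector in its corresponding coordinate:
$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \right)$$
This vector field is called the gradient of $f$. It is written with the del operator $\nabla$, also known as nabla.
Sometimes, the del operator is defined separately as:
$$\nabla = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \dots, \frac{\partial}{\partial x_n} \right)$$
So, in Cartesian coordinates in $\mathbb{R}^3$ it would be:
$$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right)$$
And, since $f$ is a scalar-valued function, by applying nabla to it,
$$\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right)$$
we obtain its corresponding gradient.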
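Grouping the partial derivatives into one vector can also be sketched numerically. The code below is a minimal illustration, not a reference implementation: the helper `gradient` and the example function are hypothetical, and each component is approximated with a central difference along one coordinate.

```python
def gradient(f, point, h=1e-6):
    """Approximate the gradient of a scalar-valued f at `point`.

    Each component is the partial derivative along one coordinate,
    estimated with a central difference while the others stay fixed.
    """
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


# Hypothetical example: f(x, y, z) = x^2 + y*z, whose exact gradient is (2x, z, y)
def f(p):
    x, y, z = p
    return x**2 + y * z


g = gradient(f, [1.0, 2.0, 3.0])
print(g)  # approximately [2.0, 3.0, 2.0]
```

The result has one component per input variable, which is why the gradient of a function of three variables is a vector in three dimensions.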
Now, how can we express the partial derivatives if there is more than one output?
Higher order derivatives
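Just like with ordinary derivatives, we can also have higher-order partial derivatives. To obtain the second-order partial derivative of $f$ with respect to $x$, we just take the partial derivative with respect to $x$ of the first partial derivative of $f$ with respect to $x$:
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)$$
However, if we instead take the partial derivative with respect to $y$ of the partial derivative of $f$ with respect to $x$, we get the so-called mixed partial derivative:
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)$$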
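One neat fact about mixed partial derivatives is that, if $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, and $\frac{\partial^2 f}{\partial y\,\partial x}$ are continuous, then the following is true:
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$$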
For example, let's calculate the second-order derivatives of a function of two variables. We can start with the calculation of the first-order derivatives.
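As an illustrative sketch of this calculation, assume the function $g(x, y) = x^3 y^2$. This choice is hypothetical and made only for the example; it is not necessarily the function the text refers to.

```latex
% First-order partial derivatives
\[
\frac{\partial g}{\partial x} = 3x^2 y^2, \qquad
\frac{\partial g}{\partial y} = 2x^3 y
\]
% Second-order partial derivatives in a single variable
\[
\frac{\partial^2 g}{\partial x^2} = 6x y^2, \qquad
\frac{\partial^2 g}{\partial y^2} = 2x^3
\]
% Mixed partial derivatives, taken in both orders
\[
\frac{\partial^2 g}{\partial y\,\partial x}
  = \frac{\partial}{\partial y}\left(3x^2 y^2\right) = 6x^2 y, \qquad
\frac{\partial^2 g}{\partial x\,\partial y}
  = \frac{\partial}{\partial x}\left(2x^3 y\right) = 6x^2 y
\]
```

Both mixed partial derivatives agree, as expected for a polynomial, whose partial derivatives of every order are continuous.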
Conclusion
To sum up, in this topic we have learned:
- The partial derivative of a function with respect to any of its input variables is obtained by differentiating in one variable while all the others remain constant.
- The vector-valued gradient function of a scalar-valued function $f$ can be obtained by grouping all the partial derivatives of $f$ in a single vector.
- We can obtain higher-order partial derivatives by differentiating a partial derivative with respect to one of its arguments.