Partial derivatives

We already know that the derivative of a function represents the rate of change of its output corresponding to an infinitesimally small change of its input. However, when we work with a multivariable function, its input is specified by more than one coordinate, so it can change in more than one direction, as may its output. In this topic, we will take a look at partial derivatives and how they can help us study change in multivariable functions, building up to the concept of the gradient.

Different perspectives

As we learned previously, the derivative of a function $f(x)$ at a given point represents the slope of the line tangent to $f(x)$ at that point, and it tells us how much $f(x)$ is increasing or decreasing within a very small interval $(x - \Delta x, \; x + \Delta x)$ as $\Delta x \to 0$.

Now, let's consider a function $f:\mathbb R^2 \to \mathbb R$; for example:

$$f(x,y)=\frac{7xy}{e^{x^2 + y^2}}.$$

Since $f$ now takes two independent variables as input, we must take into account how changes in each one contribute to the change in $f$.

Let's plot $f(x,y)$ in $\mathbb R^3$ using Cartesian coordinates, setting $z=f(x,y)$.

Graph in Cartesian coordinates

We can see that the rate of change of $z$ as $x$ increases depends not only on the value of $x$, but also on $y$.

For example, let's say $y=1$ while $x$ increases:

Graph in Cartesian coordinates (2)

And now, let's say $y=-1$ while $x$ increases:

Graph in Cartesian coordinates (3)

In the same manner, when increasing $y$ from any $y_0$, $z$ might either increase or decrease depending on the value of $x$.

For instance, let's hold $x=-1$ while $y$ increases:

Partial derivative.

This process of changing one variable while holding all the others constant results in what we call a partial derivative.
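
As a quick numerical illustration of holding one variable fixed, the minimal Python sketch below estimates the slope of $z=f(x,y)$ in the $x$ direction at $x=0$ along the two slices $y=1$ and $y=-1$. The helper name `slope_in_x` and the step size `h` are illustrative assumptions, not part of the original text.

```python
import math

def f(x, y):
    """The example surface z = 7xy / e^(x^2 + y^2)."""
    return 7 * x * y / math.exp(x**2 + y**2)

def slope_in_x(func, x, y, h=1e-6):
    """Central-difference estimate of the rate of change in x with y held fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

slope_y_pos = slope_in_x(f, 0.0, 1.0)   # slice y = 1
slope_y_neg = slope_in_x(f, 0.0, -1.0)  # slice y = -1
print(slope_y_pos, slope_y_neg)
```

The two estimates have opposite signs: near $x=0$, $z$ rises with $x$ along $y=1$ and falls along $y=-1$, which matches what the plots suggest.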

The rate of change of $f(x,y)$ with respect to $x$ can be denoted as

$$\frac{\partial f}{\partial x}(x,y),$$

while the rate of change of $f(x,y)$ with respect to $y$ can be denoted as

$$\frac{\partial f}{\partial y}(x,y).$$

As with traditional one-dimensional derivatives, we have more than one notation:

$$\frac{\partial f}{\partial x}(x,y) \equiv f_x (x,y) \quad ; \quad \frac{\partial f}{\partial y}(x,y) \equiv f_y (x,y).$$

For example, let's define $g: \mathbb{R}^2 \to \mathbb{R}$ by

$$g(x,y) = x^3 + xy-y.$$

In order to determine the partial derivative of $g$ with respect to $x$, we need to treat $y$ as a constant. Writing $k$ in place of $y$ to stress that it is fixed, we can differentiate our expression normally, as if it only depended on $x$:

$$(x^3 + kx-k)' = (x^3)' + (kx)' - (k)' = 3x^2 + k - 0.$$

Then, we have

$$\frac{\partial g}{\partial x}(x,y) = g_x(x,y) = 3x^2 +y.$$

Similarly, to determine the partial derivative of $g$ with respect to $y$, we must treat $x$ as a constant.

Then,

$$\frac{\partial g}{\partial y}(x,y) = g_y(x,y) = x - 1.$$

For the function $f$ from the beginning of this section, we have the partial derivatives

$$f(x,y)=\frac{7xy}{e^{x^2 + y^2}} \implies \begin{cases} \dfrac{\partial f}{\partial x} = - \dfrac{7y(2x^2-1)}{e^{x^2 + y^2}}\\ \\ \dfrac{\partial f}{\partial y} = - \dfrac{7x(2y^2-1)}{e^{x^2 + y^2}} \end{cases}$$
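
One way to sanity-check formulas like these is to compare them with a numerical estimate. The sketch below does so for $g$ and $f$ at an arbitrarily chosen sample point; the helper `partial`, the sample point, and the step size are assumptions made for illustration only.

```python
import math

def partial(func, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative in coordinate i (0-based)."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def g(x, y):
    return x**3 + x * y - y

def f(x, y):
    return 7 * x * y / math.exp(x**2 + y**2)

x0, y0 = 0.5, -1.2  # arbitrary sample point

# Analytic formulas from the text, evaluated at the sample point.
g_x_formula = 3 * x0**2 + y0
g_y_formula = x0 - 1
f_x_formula = -7 * y0 * (2 * x0**2 - 1) / math.exp(x0**2 + y0**2)
f_y_formula = -7 * x0 * (2 * y0**2 - 1) / math.exp(x0**2 + y0**2)

# Numerical estimates: vary one coordinate, keep the other fixed.
g_x_numeric = partial(g, (x0, y0), 0)
g_y_numeric = partial(g, (x0, y0), 1)
f_x_numeric = partial(f, (x0, y0), 0)
f_y_numeric = partial(f, (x0, y0), 1)

print(g_x_formula, g_x_numeric)
print(g_y_formula, g_y_numeric)
print(f_x_formula, f_x_numeric)
print(f_y_formula, f_y_numeric)
```

Each pair of printed values should agree to several decimal places, since the central difference only approximates the limit.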

More formally, given a function $f: \mathbb R^n \to \mathbb R$, we can define its partial derivative with respect to $x_i$ at the point $\mathbf{x} \in \mathbb R^n$ as:

$$\frac{\partial f}{\partial x_i}(\mathbf{x})=\lim \limits_{\Delta x_i \to 0} \frac{f(\mathbf{x}+\Delta x_i \cdot \mathbf{e_i}) - f(\mathbf{x})}{\Delta x_i} \qquad ; \quad \mathbf{x}=(x_1,\dots,x_i,\dots,x_n).$$

Since $\mathbf{e_i}$ is the $i^{\text{th}}$ basis vector, we have

$$\frac{\partial f}{\partial x_i}(\mathbf{x})=\lim \limits_{\Delta x_i \to 0} \frac{f(x_1,\dots,x_i +\Delta x_i ,\dots,x_n) - f(x_1,\dots,x_i,\dots,x_n)}{\Delta x_i}.$$
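
To see the limit at work, the following sketch evaluates the difference quotient from the definition for smaller and smaller values of $\Delta x_i$. It reuses $g(x,y) = x^3 + xy - y$ at the point $(2, 3)$; the function name `difference_quotient`, the chosen point, and the list of step sizes are illustrative assumptions. Note that the code indexes coordinates from 0, so `i = 0` corresponds to $x_1$.

```python
def difference_quotient(func, x, i, dx):
    """(f(x + dx * e_i) - f(x)) / dx, where e_i is the i-th basis vector (0-based)."""
    e_i = [1.0 if j == i else 0.0 for j in range(len(x))]
    shifted = [x_j + dx * e_j for x_j, e_j in zip(x, e_i)]
    return (func(shifted) - func(x)) / dx

def g(v):
    x, y = v
    return x**3 + x * y - y

point = [2.0, 3.0]
steps = [1e-1, 1e-2, 1e-3, 1e-4]
quotients = [difference_quotient(g, point, 0, dx) for dx in steps]

for dx, q in zip(steps, quotients):
    print(f"dx = {dx:g}: quotient = {q:.6f}")
```

As the step shrinks, the quotients approach $g_x(2,3) = 3 \cdot 2^2 + 3 = 15$, the value given by the analytic formula.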

Keep in mind that the value of a partial derivative depends on the coordinate system that we choose to express our function in. For example, if we decided to use polar coordinates to describe the set $\mathbb{R}^2$, the partial derivatives with respect to $r$ and $\theta$ wouldn't be the same as the partial derivatives we get from the Cartesian coordinates $x$ and $y$. This makes sense because a change in radius length or angle isn't the same as a change along the $x$ or $y$ axis.
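
The sketch below makes this concrete for a hypothetical function $u(x,y) = xy$, which is not taken from the text above. Substituting $x = r\cos\theta$ and $y = r\sin\theta$ gives the same function in polar form, and the partial derivatives are then estimated numerically at the same geometric point in both coordinate systems.

```python
import math

def u_cartesian(x, y):
    return x * y

def u_polar(r, theta):
    # Same function, rewritten with x = r cos(theta), y = r sin(theta).
    return (r * math.cos(theta)) * (r * math.sin(theta))

def partial(func, a, b, i, h=1e-6):
    """Central-difference partial derivative in the first (i=0) or second (i=1) argument."""
    if i == 0:
        return (func(a + h, b) - func(a - h, b)) / (2 * h)
    return (func(a, b + h) - func(a, b - h)) / (2 * h)

# One point, described in both coordinate systems.
x0, y0 = 1.0, 1.0
r0, theta0 = math.hypot(x0, y0), math.atan2(y0, x0)

du_dx = partial(u_cartesian, x0, y0, 0)
du_dy = partial(u_cartesian, x0, y0, 1)
du_dr = partial(u_polar, r0, theta0, 0)
du_dtheta = partial(u_polar, r0, theta0, 1)

print(du_dx, du_dy)      # Cartesian partial derivatives
print(du_dr, du_dtheta)  # polar partial derivatives
```

At this point both Cartesian partial derivatives are close to $1$, while the polar ones are close to $\sqrt{2}$ and $0$: moving outward along the radius changes $u$, but a small rotation about the origin does not.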

Gradient

Let's say

$$f(x,y,z)=x^2y+yz+xyz^2.$$

Then, we have:

$$\begin{aligned} \frac{\partial f}{\partial x}(x,y,z) &= 2xy + yz^2 \\ \frac{\partial f}{\partial y}(x,y,z) &= x^2 + z + xz^2 \\ \frac{\partial f}{\partial z}(x,y,z) &= y + 2xyz \end{aligned}$$

We can define a vector-valued function by placing each partial derivative inside a vector in its corresponding coordinate:

$$\nabla f(x,y,z)=\begin{pmatrix} \frac{\partial f}{\partial x}(x,y,z)\\ \\ \frac{\partial f}{\partial y}(x,y,z)\\ \\ \frac{\partial f}{\partial z}(x,y,z) \end{pmatrix}$$

This vector field is called a gradient. It is represented with the del operator $\nabla$, also known as nabla.

Sometimes, the del operator is defined separately as:

$$\nabla= \left ( \frac{\partial}{\partial x_1},\dots,\frac{\partial}{\partial x_i},\dots,\frac{\partial}{\partial x_n} \right)$$

So, in Cartesian coordinates in $\mathbb R^3$ it would be:

$$\nabla= \left ( \frac{\partial}{\partial x},\frac{\partial}{\partial y},\frac{\partial}{\partial z} \right)$$

And, since $f$ is a scalar-valued function, by formally multiplying it by nabla, that is, by applying each component of the operator to $f$,

$$\nabla f(x,y,z) = \left ( \frac{\partial}{\partial x},\frac{\partial}{\partial y},\frac{\partial}{\partial z} \right) \cdot f(x,y,z) = \left ( \frac{\partial f}{\partial x},\frac{\partial f}{\partial y},\frac{\partial f}{\partial z} \right)$$

we obtain its corresponding gradient.
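
The gradient of $f(x,y,z)=x^2y+yz+xyz^2$ can also be assembled in code. The following sketch builds it from the three analytic partial derivatives and compares it with a numerical estimate; the names `grad_f` and `numerical_gradient`, and the evaluation point $(1, 2, 3)$, are assumptions chosen for illustration.

```python
def f(x, y, z):
    return x**2 * y + y * z + x * y * z**2

def grad_f(x, y, z):
    """Analytic gradient: one partial derivative per coordinate."""
    return (
        2 * x * y + y * z**2,     # df/dx
        x**2 + z + x * z**2,      # df/dy
        y + 2 * x * y * z,        # df/dz
    )

def numerical_gradient(func, point, h=1e-6):
    """Central-difference estimate of every partial derivative at the given point."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2 * h))
    return tuple(grad)

p = (1.0, 2.0, 3.0)
analytic = grad_f(*p)
numeric = numerical_gradient(f, p)
print(analytic)
print(numeric)
```

At $(1, 2, 3)$ the analytic gradient is $(22, 13, 14)$, and the numerical estimate should match it closely in every component.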

Now, how can we express the partial derivatives if there is more than one output?

Higher order derivatives

Just like with ordinary derivatives, we can also have higher order partial derivatives. To obtain the second-order partial derivative of $f$ with respect to $x$, we just take the partial derivative with respect to $x$ of the first partial derivative of $f$ with respect to $x$:

$$\frac {\partial ^{2}f}{\partial x^{2}} = \frac{\partial}{\partial x} \left ( \frac{\partial f}{\partial x} \right ) \equiv f_{xx}$$

However, if instead we take the partial derivative with respect to $y$ of the partial derivative with respect to $x$ of $f$, we get the so-called mixed partial derivative:

$$\frac {\partial ^{2}f}{\partial y \partial x} = \frac{\partial}{\partial y} \left ( \frac{\partial f}{\partial x} \right ) \equiv f_{xy}$$

One neat fact about mixed partial derivatives is that, if $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac {\partial ^{2}f}{\partial y \partial x}$ and $\dfrac {\partial ^{2}f}{\partial x \partial y}$ are continuous, then the following is true:

$$\frac {\partial ^{2}f}{\partial y \partial x} =\frac {\partial ^{2}f}{\partial x \partial y}$$

For example, let's calculate the second order derivatives for the function $f(x,y) = x^2y^2 -\sin(x) + \cos(y)$. We can start with the calculation of the first order derivatives.

$$\frac{\partial f}{\partial x} = 2xy^2 - \cos(x),$$

Notice that $\dfrac{\partial \cos(y)}{\partial x} = 0$, as $\cos(y)$ is constant with respect to the change of $x$.

$$\begin{aligned} \frac{\partial f}{\partial y} &= 2x^2y - \sin(y), \\ \frac{\partial^2 f}{\partial y \partial x} &= \frac{\partial}{\partial y }(2xy^2 - \cos(x)) = 4xy, \\ \frac{\partial^2 f}{\partial x \partial y} &= \frac{\partial}{\partial x}(2x^2y - \sin(y)) = 4xy, \\ \frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}(2xy^2 - \cos(x)) = 2y^2 + \sin(x), \\ \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}(2x^2y - \sin(y)) = 2x^2 - \cos(y). \end{aligned}$$
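
These second order results, including the equality of the two mixed derivatives, can be checked numerically by differentiating twice. The sketch below nests two central differences; the helpers `d_dx` and `d_dy`, the sample point, and the step size are assumptions for illustration, and the estimates are only approximate.

```python
import math

def f(x, y):
    return x**2 * y**2 - math.sin(x) + math.cos(y)

def d_dx(func, h=1e-4):
    """Return a function approximating the partial derivative of func in x."""
    return lambda x, y: (func(x + h, y) - func(x - h, y)) / (2 * h)

def d_dy(func, h=1e-4):
    """Return a function approximating the partial derivative of func in y."""
    return lambda x, y: (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 0.7, -0.4  # arbitrary sample point

mixed_x_then_y = d_dy(d_dx(f))(x0, y0)  # differentiate in x first, then in y
mixed_y_then_x = d_dx(d_dy(f))(x0, y0)  # differentiate in y first, then in x
second_xx = d_dx(d_dx(f))(x0, y0)
second_yy = d_dy(d_dy(f))(x0, y0)

print(mixed_x_then_y, mixed_y_then_x, 4 * x0 * y0)
print(second_xx, 2 * y0**2 + math.sin(x0))
print(second_yy, 2 * x0**2 - math.cos(y0))
```

Both mixed estimates should be close to $4xy$ at the sample point, and the other two should be close to the formulas $2y^2 + \sin(x)$ and $2x^2 - \cos(y)$.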

Conclusion

To sum up, in this topic we have learned:

  • The partial derivative of a function with respect to any of its input variables is obtained by differentiating in one variable while all the others remain constant.
  • The vector-valued gradient function $\nabla f$ of a scalar-valued function $f$ can be obtained by grouping all the partial derivatives of $f$ in a single vector.
  • We can obtain higher order partial derivatives by differentiating a partial derivative with respect to one of its arguments.