
Error visualization


If you've come to this topic, you probably know the basics of matplotlib and can create various plots. Are you looking for ways to make your plots even better? Then you're in the right place! In this topic, you'll learn how to show data errors on your plots to make them more informative. Let's start!

What is error visualization?

As you know, statistical data almost always contains errors. In statistics, an error is the difference between a measured value and the actual value. Plotting errors on your charts is a way to convey more information. A graphical representation of the error or uncertainty in a measurement is called an error bar. It's a line drawn through or from the data point, sometimes ending in a short perpendicular cap. The line's length shows how precise the measurement is. A short error bar means that the values are concentrated around the plotted point, so the measurement is more precise. A long error bar indicates that the values are spread out, so the plotted value is less certain.
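One common way to get the length of an error bar is to repeat a measurement several times and look at the spread of the results. The sketch below assumes that approach: the readings are made-up numbers, and using the sample standard deviation as the error is an assumption, not the only possible choice.

```python
import statistics

# Hypothetical repeated readings of the same quantity at three points
readings = {
    1: [9.8, 10.1, 10.1],
    2: [19.0, 21.0, 20.0],
    3: [29.5, 30.5, 30.0],
}

x = list(readings)
# The plotted value is the mean of the readings at each point
y = [statistics.mean(values) for values in readings.values()]
# The error bar length is the sample standard deviation (an assumption)
yerr = [statistics.stdev(values) for values in readings.values()]

for xi, yi, ei in zip(x, y, yerr):
    print(f"x={xi}: value={yi:.2f}, error=±{ei:.2f}")
```

The resulting x, y and yerr lists have one entry per point, which is the shape that the plotting functions in the next sections expect.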

Fortunately, matplotlib allows us to add error bars to various plots. They can be applied to:

  • Line charts;
  • Area charts;
  • Histograms;
  • Bar charts;
  • Scatterplots;
  • Dot plots.

It's not the full list; these are the plots where error bars are the most informative. Now, let's see how it works in practice!

Basic error bars

Luckily, there's a special function in matplotlib to plot error bars, so you won't need to write long lines of code. First of all, let's import the library:

import matplotlib.pyplot as plt

The function we need is plt.errorbar(). Here's the basic syntax of this function:

matplotlib.pyplot.errorbar(x, y, yerr=None, xerr=None, fmt='', ecolor=None, elinewidth=None,
capsize=None, barsabove=False, lolims=False, uplims=False, xlolims=False, xuplims=False,
errorevery=1, capthick=None, *, data=None, **kwargs)

As you remember, error bars can be added to many kinds of plots, but we'll start with the most basic one – the line chart.

The plt.errorbar() function has only two required positional arguments, x and y, which are the coordinates of the data points. However, that's not enough to create actual error bars. The resulting plot for plt.errorbar(x, y) will be the same as for plt.plot(x, y). So two more arguments come into play: yerr and xerr. They can take an int or a float that sets the size of the error along the Y- or X-axis; the bar extends that far on each side of the data point.

Let's introduce some data and look at an example:

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

plt.errorbar(x, y, xerr=0.5, yerr=0.8)

Here's the resulting plot:

A simple error bar plot with matplotlib
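A quick way to see the difference between a call with and without error values is to inspect what plt.errorbar() returns. This sketch relies on the returned container, whose lines attribute holds the data line, the cap lines, and the bar segments; the variable names plain and with_errors are illustrative.

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

# No xerr/yerr: only the data line is drawn, just like plt.plot(x, y)
plain = plt.errorbar(x, y)

# With xerr and yerr: horizontal and vertical bars are added to every point
with_errors = plt.errorbar(x, y, xerr=0.5, yerr=0.8)

# lines is a tuple: (data line, cap lines, bar line collections)
print(len(plain.lines[2]))        # number of bar collections without errors
print(len(with_errors.lines[2]))  # number of bar collections with both errors
```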

In case you don't need an error bar for every data point, you can introduce the optional argument errorevery that takes an int. For example, you can set it to 2:

plt.errorbar(x, y, xerr=0.5, yerr=0.8, errorevery=2)

Our plot will look like this:

An error bar plot with the optional argument errorevery
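In recent matplotlib versions, errorevery also accepts a (start, N) tuple, which draws bars on the points x[start::N]. The sketch below compares both forms by counting the drawn bar segments; treat the tuple form as version-dependent.

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

# Bars on every second point, starting from the first: indices 0, 2, 4
every_second = plt.errorbar(x, y, yerr=0.8, errorevery=2)

# Bars on every second point, starting from the second: indices 1, 3
shifted = plt.errorbar(x, y, yerr=0.8, errorevery=(1, 2))

# The third item of .lines holds the bar segments, one collection per axis
n_first = len(every_second.lines[2][0].get_segments())
n_shifted = len(shifted.lines[2][0].get_segments())
print(n_first, n_shifted)
```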

This is a classic example of error bars: balanced sides and the same values for every data point. Sometimes, you may need something different. How can you change them?

Changing the error bars

In some cases, your error bars show the distance to the minimum and maximum values, so their two sides have unequal length. Don't worry: xerr and yerr can also take a list. We need to create two separate lists, one for the errors to the left of (or below) each point and one for the errors to the right of (or above) it; they are named xerr_min and xerr_max in the example below. Both hold non-negative distances from the data point, not absolute coordinates. After this, pass a list of these two lists (named xerr in our example) to the xerr argument. The same can be done for yerr, too.

In our example, we also introduce several arguments that change the appearance of error bars. ecolor takes a str and changes the color of the bars. elinewidth takes an int or float and sets the line thickness. barsabove takes a bool and, if True, draws the error bars on top of the markers and the line instead of underneath them:

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

xerr_min = [0.2, 0.6, 0.3, 0.7, 0.5]
xerr_max = [0.7, 0.3, 0.8, 0.2, 0.1]

xerr = [xerr_min, xerr_max]

plt.errorbar(x, y, xerr=xerr, yerr=0.8, ecolor='green', elinewidth=2, barsabove=True)

Here's the result:

Unbalanced error bar plot
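If your data comes as minimum and maximum values rather than as error sizes, you need to convert them first, because xerr and yerr expect distances from the data point. A minimal sketch, with made-up y_min and y_max values:

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

# Hypothetical lowest and highest observed values for each point
y_min = [11, 12, 14, 16, 18]
y_max = [14, 14, 17, 19, 20]

# Convert absolute bounds into non-negative distances from y
lower = [yi - lo for yi, lo in zip(y, y_min)]
upper = [hi - yi for yi, hi in zip(y, y_max)]

# First row: distance below each point; second row: distance above it
container = plt.errorbar(x, y, yerr=[lower, upper], capsize=4)
```

Each vertical bar then runs from the corresponding y_min to y_max value.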

Sometimes, you need symmetric error bars whose size differs from one data point to another. The same feature can be used here: pass a single list with one value per point to xerr or yerr. In the following example, these lists are simply named xerr and yerr.

In this example, we also add caps to our bars. We can do it with the capsize argument, which takes an int or float and sets the length of the caps in points. We also use the capthick argument, which takes an int or float as well, to make the caps a bit thicker.

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]

xerr = [0.4, 0.6, 0.7, 0.8, 0.5]
yerr = [0.6, 0.5, 0.4, 0.8, 0.2]

plt.errorbar(x, y, xerr=xerr, yerr=yerr, ecolor='green', elinewidth=2, barsabove=True, capsize=5, capthick=2)

Let's have a look at the resulting plot:

A balanced error bar plot
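Since xerr and yerr accept any sequence with one value per point, the errors can also be computed from the data. The sketch below assumes a 5% relative error, an arbitrary figure chosen for illustration, and uses NumPy for the arithmetic.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([10, 14, 15, 18, 22])
y = np.array([12, 13, 16, 17, 19])

# Assumed 5% relative uncertainty: larger values get longer bars
yerr = 0.05 * y

container = plt.errorbar(x, y, yerr=yerr, ecolor='green', elinewidth=2,
                         capsize=5, capthick=2)
```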

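There are arguments to turn caps into arrows: lolims and uplims for the Y-axis and xlolims and xuplims for the X-axis. They take a bool and, if True, give you an arrow instead of a cap. They are meant for marking a value as a lower or upper limit, so lolims puts the arrow on the upper end of the bar and uplims on the lower end. Let's say we want to turn all the caps into arrows. Here's the code: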

plt.errorbar(x, y, xerr=xerr, yerr=yerr, ecolor='green', elinewidth=2, barsabove=True, capsize=5,
capthick=2, lolims=True, uplims=True, xlolims=True, xuplims=True)

Let's see what it looks like:

A balanced error bar plot with arrow caps
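Because these arguments also accept one boolean per point, you can flag only some measurements as limits. A sketch with an illustrative mask, where just the last point is treated as a lower limit:

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]
yerr = [0.6, 0.5, 0.4, 0.8, 0.2]

# Illustrative mask: only the last point is a lower limit (arrow points up)
is_lower_limit = [False, False, False, False, True]

container = plt.errorbar(x, y, yerr=yerr, fmt='o', capsize=5,
                         lolims=is_lower_limit)
```

The other four points keep their ordinary caps on both sides.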

Now that you know how to add error bars to a line chart and change them, let's take a look at other kinds of plots!

Error bars on other plots

As we have indicated above, error bars can be added to many kinds of plots. To get a dot plot, there's one argument that you need to add – fmt. It takes a str and has many possible values. You can find the full list in the official documentation (the Notes section). The most common options are fmt="o", which gives you circles as markers, and fmt="none", which plots error bars without any markers.

Here's an example where we use the x, y, xerr and yerr values from the previous code snippet:

plt.errorbar(x, y, xerr=xerr, yerr=yerr, ecolor='blue', elinewidth=2, barsabove=True, capsize=5,
capthick=2, fmt='o')

Let's see what we get:

Error bar on dot plot
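fmt uses the same format strings as plt.plot(), so a marker, a line style, and a color can be combined. The sketch below draws the same data with three illustrative values; the vertical offsets of 5 and 10 are only there to keep the series apart.

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]
yerr = [0.6, 0.5, 0.4, 0.8, 0.2]

# Circles only, no connecting line
dots = plt.errorbar(x, y, yerr=yerr, fmt='o', capsize=5)

# Red squares joined by a dashed line, shifted up by 5 for visibility
squares = plt.errorbar(x, [v + 5 for v in y], yerr=yerr, fmt='rs--', capsize=5)

# Error bars alone, shifted up by 10: no markers and no line
bars_only = plt.errorbar(x, [v + 10 for v in y], yerr=yerr, fmt='none', capsize=5)

# With fmt='none' the returned container has no data line at all
print(bars_only.lines[0])
```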

Let's move on to more complex plots. There's a common rule: first, create a plot, then add error bars. The rest of the process is the same as before.

Let's have a look at a bar chart. We have two options. The first one is as described above. First, we plot a bar chart with the plt.bar() function. Then, we call plt.errorbar() with the same values for x and y. Note that we use fmt="none" in our example. If we omit fmt, plt.errorbar() will also draw a line connecting the points, not just separate error bars. Here's the code:

years = [2015, 2016, 2017, 2018, 2019]
numbers = [15, 18, 19, 14, 15]

plt.bar(years, numbers)
plt.errorbar(years, numbers, yerr=1.2, ecolor="black", fmt="none", capsize=5)

There's another option – plt.bar() can take xerr, yerr, ecolor and capsize as optional arguments. The code looks like this:

plt.bar(years, numbers, yerr=1.2, ecolor='black', capsize=5)

It's up to you which option to prefer. The main difference is that you have more control over the parameters if you use plt.errorbar() separately.

No matter what we choose, our plot will look like this:

Error bar on bar plot
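If you prefer the single-call form but still want to tune the bars, plt.bar() also accepts an error_kw dictionary that is forwarded to the error bar drawing. A minimal sketch; the styling values are arbitrary:

```python
import matplotlib.pyplot as plt

years = [2015, 2016, 2017, 2018, 2019]
numbers = [15, 18, 19, 14, 15]

# error_kw carries extra error bar settings such as elinewidth and capthick
bars = plt.bar(years, numbers, yerr=1.2, ecolor='black', capsize=5,
               error_kw={'elinewidth': 2, 'capthick': 2})

# The returned container exposes both the bars and their error bars
print(len(bars.patches), bars.errorbar is not None)
```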

Error bars can also be added to scatterplots and area charts. In these cases there's only one option – first plt.scatter() or plt.fill_between(), then plt.errorbar(). Nothing new. Don't forget to try it!
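For a scatterplot, the two-step approach might look like the sketch below. The marker sizes and colors are made-up values, and zorder is set only so that the bars are drawn underneath the markers.

```python
import matplotlib.pyplot as plt

x = [10, 14, 15, 18, 22]
y = [12, 13, 16, 17, 19]
yerr = [0.6, 0.5, 0.4, 0.8, 0.2]
sizes = [40, 80, 120, 60, 100]  # illustrative marker sizes

# Step 1: the scatterplot itself
points = plt.scatter(x, y, s=sizes, c='tab:orange', zorder=2)

# Step 2: error bars only (fmt='none'), drawn below the markers
bars = plt.errorbar(x, y, yerr=yerr, fmt='none', ecolor='gray',
                    capsize=4, zorder=1)
```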

Histograms are a bit more complicated. There's no straightforward way to add error bars there. You can have a look at this Stack Overflow discussion if you need to plot a histogram with error bars.
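One possible workaround, sketched below, is to compute the bin counts with NumPy and then place error bars at the bin centers. Using the square root of each count as the error is an assumption that suits simple counting data, and the random sample is generated only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50, scale=10, size=500)  # made-up sample

# Count values per bin without drawing anything yet
counts, edges = np.histogram(data, bins=10)
centers = (edges[:-1] + edges[1:]) / 2
widths = np.diff(edges)

# Assumed error model: square root of the count in each bin
yerr = np.sqrt(counts)

# Draw the bins as bars, then add the error bars on top
plt.bar(centers, counts, width=widths, edgecolor='white')
plt.errorbar(centers, counts, yerr=yerr, fmt='none', ecolor='black', capsize=3)
```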

Conclusion

In this topic, we've covered the basics of error visualization with matplotlib. We've learned how to add error bars to line charts, dot charts, bar charts, scatter plots, and area charts. We've also learned how to change their form, color, and thickness. Let's go through the arguments of plt.errorbar():

  • x, y – the coordinates of the data points;
  • xerr, yerr – the error sizes; can be a float, an int, a list, or a list of two lists for bars with unequal sides;
  • fmt – sets the format of the markers and the line; for example, fmt="o" turns a line chart into a dot chart;
  • ecolor – changes the color of bars;
  • elinewidth – changes the thickness of bars;
  • capsize – adds caps to the bars and sets their length;
  • capthick – changes the thickness of caps;
  • lolims, uplims, xlolims, xuplims – turn caps into arrows;
  • barsabove – draws the bars either above or below the markers;
  • errorevery – draws error bars only on every n-th data point.

Now it's time to practice!

One thing we've not covered here is using plt.fill_between() to plot continuous errors that can be useful in some situations. You can check out the Python Data Science Handbook to learn to do that as well.
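As a rough idea of what that looks like, the sketch below shades a band of one assumed error width around a curve. The sine data and the constant error of 0.2 are invented for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)
err = 0.2  # assumed constant uncertainty

# The curve itself
plt.plot(x, y, color='tab:blue')

# A semi-transparent band from y - err to y + err
band = plt.fill_between(x, y - err, y + err, color='tab:blue', alpha=0.3)
```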