
Matplotlib scatter plot


A scatter plot visualizes how two variables relate to each other by drawing each observation as a point (marker). It is widely used because it is simple to build and to read.

In this topic, you’ll become familiar with creating basic scatter plots using matplotlib. In later sections, you’ll learn how to customize such plots.

Building scatter plots

Now, let's try to visualize and compare data, for example, transportation prices from Vienna to Budapest.

You can create a scatter plot by calling plt.scatter() with the two variables you wish to compare as its first two arguments. In this case, we want to show the relationship between the means of transportation and the price. The scatter() function also takes the s parameter, which specifies the marker size. In this example, we will also add a label for each axis and a title for the visualization:

import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]
plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")
plt.scatter(travel, price, s=100)
plt.show()

A scatterplot of travel costs between Vienna and Budapest in euros
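The s argument does not have to be a single number. As a minimal sketch, assuming you want the marker size to follow the price, you can pass one size per point; matplotlib interprets s as the marker area in points squared. Using the price itself as the size is an illustrative choice, not part of the original example.

```python
import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]

# One size per marker. Here the marker area (in points squared) simply
# equals the price; this scaling is an assumption made for illustration.
sizes = price

plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")
points = plt.scatter(travel, price, s=sizes)
```

Call plt.show() afterwards to display the figure. The most expensive option, the flight, should appear with the largest marker.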

In the next sections, we’ll start exploring more advanced uses of scatter().

Understanding the parameters

So far, you’ve used the x, y, and s parameters to create a scatter plot. Here’s a summary of the main input parameters of scatter():

Parameter | Description
x and y | The two variables whose relationship we want to show.
s | Defines the marker size.
c | Represents the marker color.
marker | Customizes the shape of the marker.
cmap | Selects the mapping between values and colors.
alpha | A float between 0 (fully transparent) and 1 (fully opaque) that sets the transparency of the markers.
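The alpha parameter is easiest to understand by trying it. The following is a minimal sketch that reuses the travel data; the values s=400 and alpha=0.5 are arbitrary choices made so that the transparency is easy to see.

```python
import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]

plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")

# alpha=0.5 draws every marker half transparent;
# the large size (s=400) only makes the effect easier to notice.
points = plt.scatter(travel, price, s=400, alpha=0.5)
```

Call plt.show() afterwards to display the figure. Transparency is most useful when many markers overlap, because denser regions then look darker.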

Now, let's customize the first example of a scatter plot using different parameters. We will keep the x and y values, in this case, the travel and price arrays.

Changing the colors of the markers

It is a good idea to show markers with different prices in different colors. In this case, the c parameter, which determines the colors, will depend on the price values. The code is shown below:

import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]
plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")
plt.scatter(travel, price, c=price, s=100)
plt.show()

Add color to the scatterplot markers

If the colors are not clear enough, you can choose a color map with another parameter, cmap, which defines how the numeric values passed to c are translated into colors. You can take a look at the list of all color maps available in matplotlib. One of them is called viridis, and we can specify it by passing cmap='viridis'. Note that viridis is also matplotlib's default color map, so here the argument only makes the choice explicit. Let's add it to our scatter function. We can also add a color bar to make the visualization clearer.

plt.scatter(travel, price, c=price, cmap='viridis', s=100)
plt.colorbar()

The result is the scatter plot shown below.

Add a color bar to a scatterplot

As you can see, the relationship between the marker color and the price is clear now.
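To see the color mapping change, you can try another color map. The following is a minimal, complete sketch that uses 'plasma', one of matplotlib's built-in color maps, and labels the color bar; the choice of 'plasma' and the label text are illustrative assumptions.

```python
import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]

plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")

# c=price gives each marker a numeric value; cmap decides which color
# each value is translated into.
points = plt.scatter(travel, price, c=price, cmap='plasma', s=100)

# Passing the scatter result tells the color bar which mapping to describe.
colorbar = plt.colorbar(points, label="Price in Euro")
```

Call plt.show() afterwards to display the figure.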

Customizing the marker

Markers don't have to be circles. The marker parameter controls their shape; by default, it is a circle ('o'). In this example, let's make it an x symbol. The function call looks like this:

plt.scatter(travel, price, c='orange', marker='x')

The resulting scatter plot is shown in the figure below.

Customize the scatterplot markers
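A single scatter() call uses one marker shape for all of its points. If you want a different shape for each means of transportation, one possible approach is to call scatter() once per point. This is a sketch under that assumption; the particular marker shapes and the legend are illustrative additions.

```python
import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]
markers = ['o', 's', '^', 'x']  # circle, square, triangle, x

plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")

# One scatter() call per point, each with its own marker shape.
collections = []
for name, cost, shape in zip(travel, price, markers):
    collections.append(
        plt.scatter([name], [cost], marker=shape, s=100, label=name)
    )

# The legend shows which marker belongs to which means of transportation.
legend = plt.legend()
```

Call plt.show() afterwards to display the figure.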

Scatter vs. plot functions

You can also create a scatter plot with another function from matplotlib.pyplot. The function plt.plot() is a general-purpose plotting function that creates various line or marker plots. In this case, we want to create a simple scatter plot using only the plot function. The format string "o" tells plot() to draw circle markers without a connecting line:

import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]
plt.title("Travel Costs in Euro: Vienna - Budapest")
plt.xlabel("Transportation")
plt.ylabel("Price in Euro")
plt.plot(travel, price, "o")
plt.show()

Make a scatterplot with matplotlib plot function

How do you choose between the plot() and scatter() functions? Here is a rule of thumb:

  • If you need a basic scatter plot in which all markers share the same size and color, use plt.plot(). It is usually faster, which matters for large datasets.

  • If you want to customize your scatter plot by using more advanced plotting features, such as a separate size or color for each marker, use plt.scatter().
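The difference behind this rule of thumb can be sketched in code. In the example below, plot() draws one line object whose markers all share a single size and color, while scatter() accepts one size and one color value per point. The concrete sizes and colors are illustrative assumptions.

```python
import matplotlib.pyplot as plt

travel = ['flight', 'car', 'train', 'taxi']
price = [142, 62, 36, 100]

# plot(): one Line2D object; markersize and color apply to every marker.
plt.figure()
lines = plt.plot(travel, price, "o", markersize=10, color="tab:blue")

# scatter(): size (s) and color (c) can differ from point to point.
plt.figure()
points = plt.scatter(travel, price, s=price, c=price)
```

Call plt.show() afterwards to display both figures.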

Read more on this topic in "SQL and Python: applying programming languages" on the Hyperskill Blog.

Conclusions

In this topic, we've discussed how to create and customize scatter plots using plt.scatter(). You’re ready to start practicing with your own datasets and examples. This function lets you explore your data and present your findings.

You can get the most out of visualization using plt.scatter() by learning more about all the features in matplotlib.
