
SciPy (pronounced "Sigh Pie") is a library and a Python-based ecosystem of open-source software for mathematics, science, and engineering. The library targets numerical problems that go beyond what more basic scientific libraries offer.

Installation

First things first, let's open the command line and install the package. Installation of SciPy is rather straightforward:

pip install scipy

Keep in mind that SciPy requires NumPy. NumPy is declared as a dependency, so the command above should install it automatically if it is missing.

Then, SciPy can be imported just like any other package:

import scipy

Note that import scipy does not import all the SciPy subpackages. We will discuss this in more detail later.
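A quick way to confirm that the installation worked is to print the installed versions. This is only a minimal check; the version numbers you see depend on your environment:

```python
import numpy as np
import scipy

# Both packages expose their version as a string attribute.
print("SciPy version:", scipy.__version__)
print("NumPy version:", np.__version__)
```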

Scientific Python Ecosystem

When people say "SciPy", they sometimes refer not to the individual library but to the entire ecosystem of scientific Python libraries. This ecosystem is large: its libraries cover most aspects of data science and scientific computing, and they are designed to work together.

Python Ecosystem

This ecosystem provides tools for data analysis (Pandas), data visualization (Matplotlib), solving algebraic equations symbolically (SymPy), manipulating matrices, and dealing with some higher mathematics (NumPy). However, in this topic, we focus on the SciPy library itself. SciPy is there to cover special cases in scientific computing when other packages are not enough.

NumPy vs SciPy

SciPy is often compared to NumPy because the two share some functionality. The difference between them is nicely described in the SciPy FAQ:

In an ideal world, NumPy would contain nothing but the array data type and the most basic operations: indexing, sorting, reshaping, basic elementwise functions, etc. All numerical code would reside in SciPy. However, one of NumPy's important goals is compatibility, so NumPy tries to retain all features supported by either of its predecessors. Thus, NumPy contains some linear algebra functions and Fourier transforms, even though these more properly belong in SciPy. In any case, SciPy contains more fully-featured versions of the linear algebra modules, as well as many other numerical algorithms. If you are doing scientific computing with Python, you should probably install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy.

The main idea is that SciPy is a layer built on top of NumPy. It uses NumPy arrays as its basic data structure and adds new functions, as well as more complete versions of some existing ones, on top of them.

When to use SciPy?

As we said before, SciPy and NumPy share some functionality. For instance, both have a linalg subpackage, and both can perform simple operations like matrix inversion. In NumPy it looks like this:

import numpy as np
from numpy.linalg import inv

A = np.array([[1., 3.], [3., 4.]])
print(inv(A))
# [[-0.8  0.6]
#  [ 0.6 -0.2]]

And with SciPy we can do exactly the same thing:

import numpy as np
from scipy.linalg import inv

A = np.array([[1., 3.], [3., 4.]])
print(inv(A))
# [[-0.8  0.6]
#  [ 0.6 -0.2]]

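In some cases, such as very large matrices, the SciPy version might even be faster. SciPy is always built with BLAS/LAPACK support, so the difference depends on how your NumPy was installed.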
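To convince yourself that the two implementations agree, you can compare their results directly. This is a small sketch; the aliases np_inv and sp_inv are names chosen here for illustration:

```python
import numpy as np
from numpy.linalg import inv as np_inv
from scipy.linalg import inv as sp_inv

A = np.array([[1., 3.], [3., 4.]])

# Both libraries should agree up to floating-point rounding.
same = np.allclose(np_inv(A), sp_inv(A))
print(same)

# Multiplying a matrix by its inverse should give the identity matrix.
is_identity = np.allclose(A @ sp_inv(A), np.eye(2))
print(is_identity)
```

Using np.allclose instead of == avoids false mismatches caused by tiny rounding differences.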

What is more, sometimes you may encounter problems where NumPy alone can't get the job done. An example would be the Hessenberg decomposition of a matrix A into a unitary matrix Q and a Hessenberg matrix H such that:

A = Q H Q^H

where Q^H is the Hermitian conjugate of Q. If you want to use NumPy alone, you'll have to implement an algorithm to find this decomposition on your own, while SciPy already has a built-in function specifically to solve this problem:

import numpy as np
from scipy.linalg import hessenberg

A = np.array([[2, 5, 8, 7], [5, 2, 2, 8], [7, 5, 6, 6], [5, 4, 4, 8]])
H, Q = hessenberg(A, calc_q=True)
print(H)
# [[  2.         -11.65843866   1.42005301   0.25349066]
#  [ -9.94987437  14.53535354  -5.31022304   2.43081618]
#  [  0.          -1.83299243   0.38969961  -0.51527034]
#  [  0.           0.          -3.83189513   1.07494686]]
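It can be reassuring to check that the returned matrices really satisfy A = Q H Q^H. The sketch below reuses the same matrix and tests the three defining properties numerically; the variable names for the checks are chosen here for illustration:

```python
import numpy as np
from scipy.linalg import hessenberg

A = np.array([[2, 5, 8, 7], [5, 2, 2, 8], [7, 5, 6, 6], [5, 4, 4, 8]])
H, Q = hessenberg(A, calc_q=True)

# Q is unitary: Q^H Q should be the identity matrix.
q_is_unitary = np.allclose(Q.conj().T @ Q, np.eye(4))

# H is upper Hessenberg: everything below the first subdiagonal is zero.
h_is_hessenberg = np.allclose(np.tril(H, k=-2), 0)

# Reassembling Q H Q^H should recover A up to rounding.
reconstructs_a = np.allclose(Q @ H @ Q.conj().T, A)

print(q_is_unitary, h_is_hessenberg, reconstructs_a)
```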

The same is true for other areas. For example, both libraries can integrate numerically, but NumPy only offers the trapezoid rule (the numpy.trapezoid function, called numpy.trapz before NumPy 2.0), while SciPy's integrate subpackage offers a whole range of routines, from Simpson's rule to general multiple integrals and solvers for initial value problems for systems of ordinary differential equations.
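As a rough illustration of the difference, the sketch below integrates sin(x) over [0, pi], whose exact value is 2, in three ways. The fallback to np.trapz is an assumption made here so that the snippet also runs on NumPy releases older than 2.0:

```python
import numpy as np
from scipy import integrate

# numpy.trapezoid exists in NumPy 2.0+; older releases call it numpy.trapz.
trapezoid = getattr(np, "trapezoid", None) or np.trapz

# Sample sin(x) at 11 points on [0, pi]; the exact integral is 2.
x = np.linspace(0, np.pi, 11)
y = np.sin(x)

# Trapezoid rule on the samples (NumPy).
trap_result = trapezoid(y, x=x)

# Simpson's rule on the same samples (SciPy).
simpson_result = integrate.simpson(y, x=x)

# Adaptive quadrature on the function itself (SciPy);
# quad returns the estimate and an upper bound on its error.
quad_result, quad_error = integrate.quad(np.sin, 0, np.pi)

print(trap_result, simpson_result, quad_result)
```

On the same samples, Simpson's rule should land closer to 2 than the trapezoid rule, and quad should be closer still because it evaluates the function adaptively instead of relying on a fixed grid.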

To sum up, NumPy provides a number of functions that can help with basic curve fitting, linear algebra, Fourier transforms, etc., while SciPy is the library that actually contains fully-featured versions of these functions along with many others. However, if all you need is some simple operations on arrays or some basic mathematical operations, NumPy should be sufficient and you don't need to use SciPy.

SciPy subpackages

Let's take a closer look at the SciPy library itself.
SciPy is organized into subpackages that cover a wide range of scientific computing domains:

  • For linear algebra, there are the linalg and sparse submodules;

  • For signal analysis, fft and signal are invaluable;

  • For integration and some other advanced calculus, there are the integrate and special submodules;

  • If you want to treat images as n-dimensional NumPy arrays, check the ndimage submodule;

  • For dealing with curves, there are the optimize and interpolate submodules;

  • All other SciPy submodules are even more specialized. You can find more about them in the documentation.
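To give a feel for one of these subpackages, here is a minimal sketch that fits a straight line to data with optimize.curve_fit. The model function line and the sample data are hypothetical and chosen only for illustration:

```python
import numpy as np
from scipy import optimize

# Hypothetical model for illustration: a straight line y = a * x + b.
def line(x, a, b):
    return a * x + b

x = np.array([0., 1., 2., 3., 4.])
y = 2.0 * x + 1.0  # noise-free samples of the line y = 2x + 1

# curve_fit returns the best-fit parameters and their covariance matrix.
params, covariance = optimize.curve_fit(line, x, y)
a, b = params
print(a, b)
```

Because the data lies exactly on a line, the fitted a and b should come out very close to 2 and 1. With real, noisy data they would only be estimates.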

Important note: SciPy subpackages should be imported explicitly before you use them. For example, to use functions from the integrate subpackage, you can import it in two ways:

  • from scipy import integrate

  • import scipy.integrate
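Both styles give you the same module object, so the choice is mostly about how you want to write the names. A short sketch that also shows a third option, importing a single function:

```python
import scipy.integrate
from scipy import integrate

# Both names refer to the same module object.
print(integrate is scipy.integrate)

# A single function can also be imported directly.
from scipy.integrate import quad
print(quad is integrate.quad)
```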

SciPy help and documentation

SciPy's subpackages are quite complex and contain a vast number of functions. To figure out how to use them, it's always a good idea to consult the official documentation. It is extensive and includes many usage examples.

Alternatively, we can use Python's help() function to get information about functions or packages:

from scipy import integrate

help(integrate)
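The help text for a whole subpackage is long. Calling help() on a single function, for example help(integrate.quad), is usually easier to read. The same text is stored in the function's docstring, so a sketch like the following prints only its first few lines; the cut-off of three lines is an arbitrary choice made here:

```python
from scipy import integrate

# The text shown by help() comes from the function's docstring.
doc = integrate.quad.__doc__

# Print only the first three non-empty lines as a quick summary.
first_lines = [line.strip() for line in doc.splitlines() if line.strip()][:3]
print("\n".join(first_lines))
```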

Summary

In this topic, we have learned that:

  • SciPy is an integral part of a larger scientific Python ecosystem;

  • SciPy is a powerful tool that complements other libraries;

  • It covers all major domains of data science and scientific computing;

  • The SciPy library has extensive documentation with examples of how to use its functions;

  • Use SciPy when NumPy is not powerful enough.
