SciPy (pronounced "Sigh Pie") is a library and a Python-based ecosystem of open-source software for mathematics, science, and engineering. It is aimed mainly at the harder numerical problems that more basic scientific libraries do not cover.
Installation
First things first, let's open the command line and install the package. Installation of SciPy is rather straightforward:
pip install scipy
Keep in mind that SciPy requires NumPy. If NumPy is missing, the command above should install it automatically as a dependency.
Then, SciPy can be imported just like any other package:
import scipy
Note that import scipy does not import all the SciPy subpackages. We will discuss this in more detail later.
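A quick way to confirm that the installation worked is to print the installed versions. This is only a minimal sketch; the version strings it prints depend on your environment.

```python
import numpy
import scipy

# Both packages expose their version as a plain string attribute.
print("NumPy version:", numpy.__version__)
print("SciPy version:", scipy.__version__)
```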
Scientific Python Ecosystem
When people say "SciPy", they sometimes refer not to the individual library but to the entire ecosystem of scientific Python libraries. This ecosystem is large: it contains libraries that cover most aspects of data science and scientific computing, and all of those libraries can be used in tandem with each other.
This ecosystem provides tools for data analysis (Pandas), data visualization (Matplotlib), solving algebraic equations symbolically (SymPy), manipulating matrices, and dealing with some higher mathematics (NumPy). However, in this topic, we focus on the SciPy library itself. SciPy is there to cover special cases in scientific computing when other packages are not enough.
NumPy vs SciPy
SciPy is quite often compared to NumPy, as the two share some functionality. The difference between them is nicely described in the SciPy FAQ:
In an ideal world, NumPy would contain nothing but the array data type and the most basic operations: indexing, sorting, reshaping, basic elementwise functions, etc. All numerical code would reside in SciPy. However, one of NumPy's important goals is compatibility, so NumPy tries to retain all features supported by either of its predecessors. Thus, NumPy contains some linear algebra functions and Fourier transforms, even though these more properly belong in SciPy. In any case, SciPy contains more fully-featured versions of the linear algebra modules, as well as many other numerical algorithms. If you are doing scientific computing with Python, you should probably install both NumPy and SciPy. Most new features belong in SciPy rather than NumPy.
The main idea is that SciPy is a layer built on top of NumPy. More specifically, SciPy depends on NumPy and its arrays, and adds more complete versions of some NumPy routines along with many new functions.
When to use SciPy?
As we said before, SciPy and NumPy share some functionality. Indeed, both SciPy and NumPy have a linalg subpackage, and both libraries can be used to perform simple operations like matrix inversion. For example, in NumPy it looks like this:
import numpy as np
from numpy.linalg import inv
A = np.array([[1., 3.], [3., 4.]])
print(inv(A))
# [[-0.8 0.6]
# [ 0.6 -0.2]]
And with SciPy we can do exactly the same thing:
import numpy as np
from scipy.linalg import inv
A = np.array([[1., 3.], [3., 4.]])
print(inv(A))
# [[-0.8 0.6]
# [ 0.6 -0.2]]
In some cases, such as extremely large matrices, SciPy might even be faster.
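To check that the two implementations agree, the results can be compared numerically. The sketch below reuses the matrix from the examples above; the aliases np_inv and sp_inv are introduced here only to keep the two functions apart.

```python
import numpy as np
from numpy.linalg import inv as np_inv
from scipy.linalg import inv as sp_inv

A = np.array([[1., 3.], [3., 4.]])

inv_numpy = np_inv(A)
inv_scipy = sp_inv(A)

# Compare with a tolerance instead of ==, because floating-point
# results from different routines may differ in the last digits.
same_result = np.allclose(inv_numpy, inv_scipy)

# Multiplying a matrix by its inverse should give the identity matrix.
is_identity = np.allclose(A @ inv_scipy, np.eye(2))

print(same_result, is_identity)
```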
What is more, sometimes you may encounter problems where NumPy alone can't get the job done. An example would be the Hessenberg decomposition of a matrix $A$ into a unitary matrix $Q$ and a Hessenberg matrix $H$ such that:
$$A = Q H Q^H$$
where $Q^H$ is the Hermitian conjugate of $Q$. If you want to use NumPy alone, you'll have to implement an algorithm to find this decomposition on your own, while SciPy already has a built-in function specifically to solve this problem:
import numpy as np
from scipy.linalg import hessenberg
A = np.array([[2, 5, 8, 7], [5, 2, 2, 8], [7, 5, 6, 6], [5, 4, 4, 8]])
H, Q = hessenberg(A, calc_q=True)
print(H)
# [[ 2. -11.65843866 1.42005301 0.25349066]
# [ -9.94987437 14.53535354 -5.31022304 2.43081618]
# [ 0. -1.83299243 0.38969961 -0.51527034]
# [ 0. 0. -3.83189513 1.07494686]]
The same is true for some other areas. For example, both SciPy and NumPy can integrate numerically, but NumPy offers only the trapezoid rule (the numpy.trapezoid function, named numpy.trapz before NumPy 2.0), while SciPy's integrate subpackage offers a whole range of integration routines, from Simpson's rule to general multiple integrals and initial value problems for systems of ordinary differential equations.
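The difference between the integration routines can be illustrated on an integral with a known value. The sketch below integrates sin(x) over [0, π], whose exact value is 2. It uses scipy.integrate.trapezoid, which applies the same trapezoid rule as the NumPy function, so that the example does not depend on the installed NumPy version. The grid of 11 sample points is an arbitrary choice made for illustration.

```python
import numpy as np
from scipy import integrate

# Sample sin(x) on a coarse grid; the exact integral over [0, pi] is 2.
x = np.linspace(0.0, np.pi, 11)
y = np.sin(x)

# Trapezoid rule on the sampled values.
trapezoid_estimate = integrate.trapezoid(y, x=x)

# Simpson's rule on the same samples.
simpson_estimate = integrate.simpson(y, x=x)

# Adaptive quadrature works on the function itself, not on samples,
# and also returns an estimate of the absolute error.
quad_estimate, quad_error = integrate.quad(np.sin, 0.0, np.pi)

print("trapezoid:", trapezoid_estimate)
print("simpson:  ", simpson_estimate)
print("quad:     ", quad_estimate)
```

On this grid, Simpson's rule should land closer to 2 than the trapezoid rule, and quad should be closer still.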
To sum up, NumPy provides a number of functions that can help with basic curve fitting, linear algebra, Fourier transforms, etc., while SciPy is the library that contains fully-featured versions of these functions along with many others. However, if all you need are some simple operations on arrays or some basic mathematical operations, NumPy should be sufficient and you don't need to use SciPy.
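Curve fitting is one place where this split shows up. As a rough sketch, numpy.polyfit fits polynomials only, while scipy.optimize.curve_fit fits an arbitrary model function. The exponential model, its parameters (2.5 and 1.3), and the noise-free data below are all made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic, noise-free data from y = 2.5 * exp(-1.3 * x).
x = np.linspace(0.0, 4.0, 50)
y = 2.5 * np.exp(-1.3 * x)

# NumPy: least-squares fit of a degree-2 polynomial.
poly_coeffs = np.polyfit(x, y, deg=2)

# SciPy: least-squares fit of a user-defined model.
def model(x, a, b):
    """Exponential decay with amplitude a and rate b."""
    return a * np.exp(-b * x)

# p0 is the initial guess for (a, b).
params, covariance = curve_fit(model, x, y, p0=(1.0, 1.0))

print("polynomial coefficients:", poly_coeffs)
print("fitted a, b:", params)
```

Because the data here contains no noise, the fitted a and b should come out very close to the values used to generate it.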
SciPy subpackages
Let's take a closer look at the SciPy library itself.
SciPy is organized into subpackages that cover a wide range of scientific computing domains.
For linear algebra, there are the linalg and sparse submodules;
For signal analysis, fft and signal are invaluable;
For integration and some other advanced calculus, there are the integrate and special submodules;
If you want to treat images as n-dimensional NumPy arrays, check the ndimage submodule;
For dealing with curves, there are the optimize and interpolate submodules.
All other SciPy submodules are even more specialized. You can find more about them in the documentation.
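To give a feel for a few of these submodules, here is a short sketch with one small call from each of linalg, special, optimize, and fft. The inputs are arbitrary values chosen so that the results are easy to check by hand.

```python
import numpy as np
from scipy import fft, linalg, optimize, special

# linalg: determinant of a 2x2 matrix (1*4 - 3*3 = -5).
det = linalg.det(np.array([[1., 3.], [3., 4.]]))

# special: the gamma function, where gamma(5) = 4! = 24.
gamma_5 = special.gamma(5.0)

# optimize: find the root of x**2 - 2 on the interval [0, 2],
# which is the square root of 2.
root = optimize.brentq(lambda x: x**2 - 2.0, 0.0, 2.0)

# fft: the discrete Fourier transform of a constant signal
# has all of its energy in the zero-frequency term.
spectrum = fft.fft(np.ones(4))

print(det, gamma_5, root, spectrum)
```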
Important note: SciPy subpackages need to be imported explicitly before you use them. For example, to use functions from the integrate subpackage, you can import it in two ways:
from scipy import integrate
import scipy.integrate
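Both import forms bind the same module object; they differ only in the name you use to reach it. A minimal sketch, with an arbitrary integrand (x squared on [0, 1]) chosen for illustration:

```python
import scipy.integrate        # the subpackage is reachable as scipy.integrate
from scipy import integrate   # the subpackage is reachable as integrate

# Both names refer to the very same module object.
same_module = integrate is scipy.integrate

# Either name can be used to call functions from the subpackage.
# The integral of x**2 over [0, 1] is 1/3.
area, abs_error = integrate.quad(lambda x: x**2, 0.0, 1.0)

print(same_module, area)
```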
SciPy help and documentation
SciPy's subpackages are quite complex and contain a vast number of functions. To figure out how to use them, it's always a good idea to consult the official documentation. It is quite extensive and includes many usage examples.
Alternatively, we can use Python's help() function to get information about functions or packages:
from scipy import integrate
help(integrate)
Summary
In this topic, we have learned that:
SciPy is an integral part of a larger scientific Python ecosystem;
SciPy is a powerful tool that complements other libraries;
It covers many major domains of data science and scientific computing;
The SciPy library has extensive documentation with many usage examples;
Use SciPy when NumPy is not powerful enough.