You already know the basics of Jupyter Notebook. Now imagine that your laptop lacks computing power, but you need to train a huge model. This is where Google Colab, an online service provided by Google, comes in handy. With Colab, you can write and execute programs, save and share your results, and get access to powerful computing resources right in your browser. In this topic, we will show the main advantages and disadvantages of the service and describe how to work with it.
Google Colab vs. Jupyter Notebook
Before we start working with Google Colab, let's compare its main features with those of Jupyter Notebook.
Google Colab | Jupyter Notebook |
Initiates online sessions to work with your team. | Doesn't allow instant teamwork. A researcher has to wait until another one finishes their part of the code and sends it for further editing. |
Uses Google computing power. | Uses the power of your computer only. |
Most of the ML libraries are on-board. | You need to install the libraries on your machine first. |
Every line of code can be saved on Google Drive. | Managing and navigating through various notebooks stored in different local folders can be hard. |
Google Colab has some minor cons, too: it cannot be run offline, and you can lose your work if you close the environment without downloading the results. Nevertheless, it still has a lot of benefits over Jupyter Notebook.
First steps
Here we go! This is how we can write our first program in Google Colab:
Sign in to your Google account.
Go to Google Colab.
If you are a new user, you are going to see the following page:
You can skip this page and proceed to create your own Colab notebook. To do it, press the File button in the upper-left corner and choose New Notebook.
If you have already visited Google Colab, you can choose one of the existing notebooks from the list. Have a look at the example below.
Choose the New Notebook button at the bottom of the page.
Hooray! You have created a new notebook!
Interface
Let's have a look at the interface of your environment. As you can see, it resembles the interface of Jupyter Notebook.
At the top of the page, you can see the name of your notebook. By default, it is named "UntitledXX.ipynb", where "XX" is the number of your notebook.
The File button allows you to do the main operations: save your notebook, download it as a .py or .ipynb file, save a copy on your Drive, and so on.
The Connect button connects you to Google servers. Once connected, you can start working with your code.
The highlighted buttons at the top allow you to modify a cell — move it up or down, comment, or delete it.
The Play button allows you to run your code. An error message will appear on the screen under the cell in case something is wrong.
The + Code and + Text buttons add a new cell for code or for text, respectively. You can also do it by pressing the buttons that can be found near the Connect button.
The Folder button provides access to files that are used by the notebook.
By default, there are no files, only one folder. By pressing the button that is highlighted in the picture above, you can upload your own datasets for further processing.
Now let's look at some code examples in Colab.
Programs in Colab
In this section, we will discuss several programs and show some features of Colab.
The simplest example. First of all, let's analyze an easy example. Imagine we want to calculate the average number of likes for three posts. You can see how similar it is to Jupyter Notebook.
We have created two cells. In the first one, we have defined three variables, and in the second one, we have calculated the average number of likes. After that, we can run the cells in the given order and get the results shown in the picture above.
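As a minimal sketch, the two cells described above might look like this. The variable names and the numbers of likes are assumptions chosen for illustration; they are not taken from the original screenshot.

```python
# Cell 1: the number of likes for three posts (hypothetical values)
likes_post_1 = 120
likes_post_2 = 85
likes_post_3 = 95

# Cell 2: the average number of likes
average_likes = (likes_post_1 + likes_post_2 + likes_post_3) / 3
print(average_likes)  # prints 100.0
```

Just like in Jupyter Notebook, the variables defined in the first cell stay available in the second one as long as the session is running.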
Uploading a file. Let's imagine we need to process a text file named names.txt. It contains the names of our friends and relatives, one per line. We want to print these names, so we need to upload the file first. You can do it in two different ways:
Use the buttons described in the previous section.
Use the following snippet.
    from google.colab import files
    uploaded = files.upload()

After that, you will see a button for uploading a file. You can press it and choose the file you need.
And then open it.
Now, we can carry out other operations and print all the names.
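Opening the uploaded file and printing the names could be sketched as follows. So that the snippet also runs outside Colab, it creates a small stand-in names.txt when the file is missing; the names in it are hypothetical sample data. In Colab, the file you uploaded is used instead.

```python
from pathlib import Path

# Stand-in for the uploaded file (hypothetical sample names).
# It is written only if names.txt does not exist yet, so a file
# uploaded in Colab is never overwritten.
if not Path("names.txt").exists():
    Path("names.txt").write_text("Alice\nBob\nCharlie\n", encoding="utf-8")

# Open the file and read one name per line, skipping empty lines.
with open("names.txt", encoding="utf-8") as f:
    names = [line.strip() for line in f if line.strip()]

# Print all the names.
for name in names:
    print(name)
```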
Downloading a file. Sometimes you need to download a file with data. Imagine you have a CSV file with information on passengers. There are two ways to download it.
Press the Folder button on the left side, choose your file, press the button with three dots, and download your file.
Use the following snippet to download the file.
    from google.colab import files
    files.download("passengers_flight_a3412.csv")
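To try the download on your own, you need a file in the environment first. The sketch below creates a tiny CSV file and then asks Colab to download it. The column names and rows are hypothetical, since the contents of the real file are not described here. The google.colab module exists only inside Colab, so the import is guarded and the snippet simply keeps the file on disk when run elsewhere.

```python
import csv

# Hypothetical sample rows; the real file's columns are not given in this topic.
rows = [
    {"name": "Alice", "seat": "12A"},
    {"name": "Bob", "seat": "14C"},
]

filename = "passengers_flight_a3412.csv"

# Write the rows to a CSV file in the working directory.
with open(filename, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "seat"])
    writer.writeheader()
    writer.writerows(rows)

try:
    from google.colab import files  # available only inside Colab
except ImportError:
    print(f"Not running in Colab; {filename} was saved locally.")
else:
    files.download(filename)  # asks the browser to download the file
```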
Working with libraries
Libraries are vital for every Python developer. Google Colab allows us to use various external libraries. Suppose we want to work with a pre-installed library. Let's use NumPy for this purpose.
The first cell is used for importing NumPy, and the second one allows us to create an array from a list. Of course, NumPy is not the only installed library. You can import other libraries like NLTK, Keras, SciPy, TensorFlow, etc. The installed libraries can be displayed by running pip list.
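The two NumPy cells described above might look roughly like this. The list values and variable names are assumptions made for illustration.

```python
# Cell 1: import the pre-installed library
import numpy as np

# Cell 2: create an array from a list (hypothetical values)
numbers = [1, 2, 3, 4, 5]
array = np.array(numbers)

print(array)         # the array built from the list
print(array.mean())  # the mean of its elements
```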
Use the following pip command to install a library:
    pip install ...

Let's try to install the pymorphy2 library, which is used for morphological analysis of Russian words.
The library is installed successfully. You can use it for your own experiments. Unfortunately, the libraries you install cannot be saved, so you have to install them on your virtual machine each time you start a new session in Colab.
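Since installed libraries do not survive a new session, it can be handy to check at the top of a notebook whether a library is available before you import it. Here is a minimal sketch that uses the standard importlib module; the helper name is_installed is hypothetical, not a part of Colab.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Report which of these libraries still need to be installed in this session.
for module_name in ("numpy", "pymorphy2"):
    if is_installed(module_name):
        print(f"{module_name} is available")
    else:
        print(f"{module_name} is missing, install it with pip first")
```

If the check reports that a library is missing, run the pip command in a cell and then import the library as usual.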
Conclusion
So, let's summarize what we learned about Google Colab:
You can run it in the cloud;
You can work in Google Colab with your team;
Most of the libraries for machine learning are already installed, so you can easily import them;
You can upload your files to work with them in the environment and download them afterward.
Of course, it is just the beginning. The "Welcome to Colaboratory" page contains more information about this environment. But for now, let's work on solving some practical tasks.
Read more on this topic in Jupyter Notebook — a complete how-to tutorial on Hyperskill Blog.