Working with missing values

Drop NaNs

In this task, we are working with a dataset containing information about students. The dataset has three columns: year represents a student's year of study, degree represents their degree program, and age represents their age.

You must perform the following steps:

  1. Load the data into a pandas DataFrame.

  2. Delete all the rows that contain missing values from the dataframe.

  3. Print the number of rows in the initial dataframe and in the modified dataframe (after deleting the rows with missing values).

  4. Copy the result (the number of rows in the initial and modified dataframes) to the answer section, ensuring that it follows the format shown in the example.
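The steps above can be sketched roughly as follows. This is a minimal illustration, not the reference solution: the CSV text is a shortened, hypothetical stand-in for the task's data file, the helper name count_rows_before_and_after is made up for this example, and len() is used here as one simple way to count rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the task's data file (a few rows only).
csv_text = """year,degree,age
3,M,18
NaN,B,18
2,M,21
1,NaN,28
"""


def count_rows_before_and_after(source):
    """Return (rows in the initial dataframe, rows left after dropping NaNs)."""
    # Step 1: load the data. read_csv parses the literal text "NaN" as missing.
    df = pd.read_csv(source)
    # Step 2: dropna() returns a new dataframe without rows that have any missing value.
    cleaned = df.dropna()
    # Step 3: count the rows of both dataframes.
    return len(df), len(cleaned)


initial, modified = count_rows_before_and_after(io.StringIO(csv_text))
print(initial, modified)  # 4 2
```

With a real file, the io.StringIO object would be replaced by the path to that file. Note that dropna() does not change the original dataframe unless its result is assigned back, which is why both counts remain available.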

Hint

The number of rows can be accessed via the .shape[...] attribute of the dataframe. Additionally, review the "How to deal with them?" section of the theory.

Sample Input 1:

year,degree,age
3,M,18
NaN,B,18
2,M,21
NaN,B,19
NaN,B,19
2,NaN,29
1,NaN,28
3,M,29
1,M,28
3,NaN,18
1,NaN,27
2,M,25
2,M,20
1,B,20
3,B,23

Sample Output 1:

15 8
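
To see where the two numbers in the sample output come from, the sample input can be inspected column by column. The sketch below assumes the sample is available as a string (the variable name sample is illustrative) and counts the missing values before any rows are dropped.

```python
import io

import pandas as pd

# The sample input from above, embedded as a string for illustration.
sample = """year,degree,age
3,M,18
NaN,B,18
2,M,21
NaN,B,19
NaN,B,19
2,NaN,29
1,NaN,28
3,M,29
1,M,28
3,NaN,18
1,NaN,27
2,M,25
2,M,20
1,B,20
3,B,23
"""

df = pd.read_csv(io.StringIO(sample))

# Number of missing values in each column, converted to plain ints.
missing_per_column = {col: int(n) for col, n in df.isna().sum().items()}
print(missing_per_column)  # {'year': 3, 'degree': 4, 'age': 0}

# Number of rows that contain at least one missing value.
rows_with_missing = int(df.isna().any(axis=1).sum())
print(rows_with_missing)  # 7

# Initial row count and the count that remains once those rows are removed.
print(len(df), len(df) - rows_with_missing)  # 15 8
```

No row in the sample is missing both year and degree, so the 3 + 4 = 7 incomplete rows are all distinct, leaving 15 - 7 = 8 complete rows.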
