Handling missing values

Replace with the mode

To solve this task, you need to perform the following series of steps:

1. Load the given dataset into your IDE using pandas.
2. Look at the dataset and find a categorical feature (a feature that can only take values from a fixed set) that contains missing values (NaNs).
3. Fill the NaNs with the mode (the most frequent value) of that categorical feature.
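
The first two steps can be sketched roughly as follows. This is only an illustration under a few assumptions: the real dataset file name is not given here, so the first rows of Sample Input 1 are read from an in-memory string instead, and "non-numeric column" is used as a simple heuristic for "categorical", which will not hold for every dataset.

```python
import io

import pandas as pd

# Assumption: a stand-in for the real file, holding the first 7 rows
# of Sample Input 1. With a real file you would pass its path to read_csv.
csv_text = """location,building_age,room_num
Jetburg,107,3
Hypercity,97,5
NaN,19,3
Jetburg,111,5
Skilltown,76,4
Hypercity,25,2
NaN,107,3
"""

# read_csv treats the literal string "NaN" as a missing value by default.
df = pd.read_csv(io.StringIO(csv_text))

# Number of missing values in each column.
missing_counts = df.isna().sum()

# Heuristic (an assumption, not a general rule): treat non-numeric
# columns as categorical and keep those that contain at least one NaN.
categorical_with_nans = [
    col
    for col in df.columns
    if not pd.api.types.is_numeric_dtype(df[col]) and df[col].isna().any()
]

print(missing_counts)
print(categorical_with_nans)
```

Calling `df[col].value_counts(dropna=False)` on a candidate column is another quick way to see both its set of values and how many entries are missing.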

Print the first 5 rows of the DataFrame and copy the result to the answer section. Use print(df.head()) to avoid any formatting issues.

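Tip: two methods are of particular interest in this case, .fillna(value, inplace=True) and .mode(). Note that .mode() returns a Series rather than a single value (several values can tie for most frequent), so select its first element, for example with [0], before passing it to .fillna().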
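
A minimal sketch of how these two methods might be combined is shown below. It is not the reference solution: the small DataFrame is built by hand from the first five rows of Sample Input 1, and the filled column is assigned back rather than modified with `inplace=True`, since in recent pandas versions calling `inplace=True` on a column selected from a DataFrame may raise a warning or leave the DataFrame unchanged.

```python
import numpy as np
import pandas as pd

# Assumption: a hand-built stand-in for the first five rows of Sample Input 1.
df = pd.DataFrame(
    {
        "location": ["Jetburg", "Hypercity", np.nan, "Jetburg", "Skilltown"],
        "building_age": [107, 97, 19, 111, 76],
        "room_num": [3, 5, 3, 5, 4],
    }
)

# .mode() ignores NaNs by default and returns a Series,
# so [0] picks the first (here the only) most frequent value.
most_frequent = df["location"].mode()[0]

# Fill the missing entries and assign the result back to the column.
df["location"] = df["location"].fillna(most_frequent)

print(df.head())
```

If several values are tied for most frequent, `.mode()` returns all of them in sorted order, so `[0]` picks the first one in that order.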

Sample Input 1:

location,building_age,room_num
Jetburg,107,3
Hypercity,97,5
NaN,19,3
Jetburg,111,5
Skilltown,76,4
Hypercity,25,2
NaN,107,3
Hypercity,126,5
Jetburg,79,4
NaN,92,4
Jetburg,121,3
NaN,104,3
Jetburg,108,4
Hypercity,135,2
Jetburg,57,2
Skilltown,6,5
Hypercity,92,2
NaN,42,4
NaN,134,4
NaN,25,4
NaN,62,2
Jetburg,26,5
Jetburg,93,2
Hypercity,53,5
Jetburg,63,2

Sample Output 1:

    location  building_age  room_num
0    Jetburg           107         3
1  Hypercity            97         5
2    Jetburg            19         3
3    Jetburg           111         5
4  Skilltown            76         4
