Suppose you have the full penguins dataset:
penguins_df.head()
Output:
+----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------+
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | MALE |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | FEMALE |
| 2 | Adelie | Torgersen | 40.3 | 18 | 195 | 3250 | FEMALE |
| 3 | Adelie | Torgersen | nan | nan | nan | nan | nan |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | FEMALE |
+----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------+
You would like to make a brief report about female penguins, basing your analysis only on the columns 'species', 'island', and 'sex'. Select all rows that contain female penguins and the columns 'species', 'island', and 'sex', then print the resulting DataFrame.
Tip: It's easier to use .loc here. Take a look at this code snippet from the theory to solve the task:
df.loc[df.birthday == '12.05.1979', 'last_name':'birthday':2]
Output:
+----+-------------+------------+
| | last_name | birthday |
|----+-------------+------------|
| 3 | Doe | 12.05.1979 |
+----+-------------+------------+
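The snippet above picks columns with a label slice. When the columns you need are not evenly spaced, .loc also accepts a boolean mask for the rows together with an explicit list of column labels. Below is a minimal sketch of that pattern; the `people` DataFrame and all of its values are hypothetical, invented only for illustration, and are not the data used in the theory.

```python
import pandas as pd

# Hypothetical data, made up for this illustration only
people = pd.DataFrame({
    "first_name": ["Ann", "Bob", "Cid", "John"],
    "last_name": ["Lee", "Ray", "Fox", "Doe"],
    "city": ["Oslo", "Rome", "Lima", "Bonn"],
    "birthday": ["01.02.1980", "23.11.1975", "12.05.1979", "12.05.1979"],
})

# Row selector: a boolean Series that is True where the condition holds
mask = people["birthday"] == "12.05.1979"

# Column selector: an explicit list of labels instead of a slice
selected = people.loc[mask, ["last_name", "birthday"]]

print(selected)
```

The same two-part shape, a row condition followed by a list of column names, should carry over to the penguins task. Note that the comparison is case-sensitive, so the string has to match the values exactly as they appear in the data.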