Imagine you have the full Penguins dataset. Its first five rows look like this:
penguins_df.head()
Output:
+----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------+
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | MALE |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | FEMALE |
| 2 | Adelie | Torgersen | 40.3 | 18 | 195 | 3250 | FEMALE |
| 3 | Adelie | Torgersen | nan | nan | nan | nan | nan |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | FEMALE |
+----+-----------+-----------+------------------+-----------------+---------------------+---------------+--------+
Select only the species, body_mass_g, and sex columns and store the result in the selecting_task variable. The DataFrame is already loaded as penguins_df, so you do not need to read it in yourself.
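One possible way to approach the task is sketched below. It assumes penguins_df is a pandas DataFrame, which the .head() call suggests but the task does not state outright. Because the real dataset is not available here, the sketch rebuilds a small stand-in from the five rows shown above; in the actual exercise that part is unnecessary, since penguins_df is already loaded.

```python
import numpy as np
import pandas as pd

# Stand-in for the preloaded penguins_df (an assumption for illustration),
# rebuilt from the five rows shown in the output above.
penguins_df = pd.DataFrame({
    "species": ["Adelie"] * 5,
    "island": ["Torgersen"] * 5,
    "bill_length_mm": [39.1, 39.5, 40.3, np.nan, 36.7],
    "bill_depth_mm": [18.7, 17.4, 18.0, np.nan, 19.3],
    "flipper_length_mm": [181, 186, 195, np.nan, 193],
    "body_mass_g": [3750, 3800, 3250, np.nan, 3450],
    "sex": ["MALE", "FEMALE", "FEMALE", np.nan, "FEMALE"],
})

# Indexing with a list of column names (note the double brackets)
# returns a new DataFrame containing only those columns, in that order.
selecting_task = penguins_df[["species", "body_mass_g", "sex"]]

print(selecting_task.head())
```

The double brackets matter: penguins_df["species"] with a single name returns a Series, while a list of names returns a DataFrame. Rows with missing values, such as row 3, are kept, because selecting columns does not filter rows.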