In this task, we'll work out how to decide whether the MAE score we obtained signals good or bad model performance. To start somewhere, we'll keep the threshold simple.
Consider the following dataset (the first value column is the ground truth label, and the second is the predicted label):
| | Ground truth | Predicted |
|---|---|---|
| 0 | 3.2 | 4 |
| 1 | 4.7 | 4.3 |
| 2 | 4 | 4.6 |
| 3 | 5.6 | 4.9 |
| 4 | 5.6 | 5.2 |
| 5 | 6.2 | 5.5 |
| 6 | 5.1 | 5.8 |
| 7 | 6.4 | 6.1 |
1) Calculate the mean absolute error (MAE) for this dataset (we will call this value the model MAE).
2) Then, calculate the median of the ground truth label.
3) Calculate the MAE for a case where all predictions are equal to the median of the ground truth (so in the new dataset every entry of the predicted column has the same value, the ground truth median). We'll call this value the baseline MAE.
4) Compare the model MAE and the baseline MAE. We'll say that the model MAE is good if it is less than or equal to the baseline MAE.
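
The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the expected answer function: the helper names `mae` and `is_good_mae` are hypothetical, and the input lists are toy values rather than the dataset from the task.

```python
from statistics import median


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def is_good_mae(y_true, y_pred):
    """Return (model MAE, baseline MAE, verdict) using a median baseline."""
    # Step 1: MAE of the actual predictions.
    model_mae = mae(y_true, y_pred)
    # Step 2: median of the ground truth labels.
    baseline_value = median(y_true)
    # Step 3: MAE when every prediction equals that median.
    baseline_mae = mae(y_true, [baseline_value] * len(y_true))
    # Step 4: the model MAE is "good" if it does not exceed the baseline MAE.
    return model_mae, baseline_mae, model_mae <= baseline_mae


# Toy values for illustration only (not the dataset from the task).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(is_good_mae(y_true, y_pred))  # (0.25, 1.0, True)
```

The median baseline is a natural reference point because a constant prediction equal to the median minimizes the MAE among all constant predictions, so a model that cannot beat it has learned nothing useful from the features.
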
Your answer function looks like this: