Description
On the scatterplot of rating vs. salary, you may have noticed that the relationship between the two variables does not look linear; it looks more like a polynomial function. Let's raise the rating to several powers and see whether that improves the score.
Objectives
- Read the data. For downloading the dataset, refer to Stage 1;
  - Load the data with pandas.read_csv;
- Make X a DataFrame with a predictor rating and y a series with a target salary;
- Raise the predictor to the power of 2;
- Split the predictors and target into training and test sets. Use the test_size=0.3 and random_state=100 parameters, which guarantee that the results will be as expected;
- Fit the linear model of salary on rating, make predictions, and calculate the MAPE;
- Repeat steps 2–5 (from building X and y through calculating the MAPE) for the powers of 3 and 4;
- Print the best MAPE obtained by fitting and running the models described above. The MAPE is a float number rounded to five decimal places.
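The steps above can be sketched roughly as follows. This is only one possible layout, not the reference solution: the helper names mape_for_power and best_mape are made up for illustration, and the sketch assumes a scikit-learn version that provides mean_absolute_percentage_error (0.24 or later). The column names rating and salary come from the task itself.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split


def mape_for_power(data: pd.DataFrame, power: int) -> float:
    """Fit salary on rating**power and return the MAPE on the test set.

    Hypothetical helper: the name and signature are assumptions.
    """
    X = data[["rating"]] ** power  # DataFrame with the transformed predictor
    y = data["salary"]             # Series with the target
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=100
    )
    model = LinearRegression().fit(X_train, y_train)
    predictions = model.predict(X_test)
    return float(mean_absolute_percentage_error(y_test, predictions))


def best_mape(data: pd.DataFrame, powers=(2, 3, 4)) -> float:
    """Return the lowest MAPE over the given powers, rounded to five places."""
    return round(min(mape_for_power(data, p) for p in powers), 5)
```

With the dataset from Stage 1 loaded through pandas.read_csv into a DataFrame named data, the final answer would then be printed with print(best_mape(data)).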
MAPE is a loss function, so the lower its value, the better the model performs.
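To make the metric concrete, here is a minimal pure-Python sketch of MAPE (mean absolute percentage error). It assumes the fraction convention that scikit-learn's mean_absolute_percentage_error uses (0.1 means 10%), and it assumes that no true value is zero. The function name mape is chosen for illustration only.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction.

    Illustrative helper; assumes equal-length, non-empty sequences
    and non-zero true values.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    # Average the relative size of each error with respect to the true value.
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


# Each prediction is off by 10% of its true value, so the MAPE is 0.1.
example = mape([100, 200], [110, 180])
```

A model whose predictions are closer to the true salaries in relative terms gets a smaller value, which is why the lowest of the three scores is the one to report.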
Example
Example 1: program output
1.23456