Salary Prediction. Stage 2/5

Linear regression with predictor transformation

Description

On the scatterplot of rating vs salary, you may have noticed that the relationship between the two variables does not look linear and instead resembles a polynomial function. Let's raise the rating to several powers and see whether that improves the score.

Objectives

  1. Read the data. To download the dataset, refer to Stage 1;
  2. Load the data with pandas.read_csv;
  3. Make X a DataFrame with a predictor rating and y a series with a target salary;
  4. Raise the predictor to the power of 2;
  5. Split the predictors and target into training and test sets. Use test_size=0.3 and random_state=100 parameters — they guarantee that the results will be as expected;
  6. Fit the linear model of salary on rating, make predictions and calculate the MAPE;
  7. Repeat steps 4–6 for the powers of 3 and 4;
  8. Print the best MAPE obtained by fitting and running the models described above. The MAPE is a float number rounded to five decimal places.
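
The objectives above could be sketched roughly as follows. This is only one possible layout, not the reference solution: the helper names `mape_for_power` and `best_mape` are made up for illustration, the column names `rating` and `salary` are assumed to match the dataset from Stage 1, and scikit-learn's `LinearRegression`, `train_test_split` and `mean_absolute_percentage_error` are assumed to be the intended tools.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split


def mape_for_power(data: pd.DataFrame, power: int) -> float:
    """Fit salary on rating**power and return the MAPE on the test set.

    Hypothetical helper; assumes columns named "rating" and "salary".
    """
    X = data[["rating"]] ** power  # DataFrame with the transformed predictor
    y = data["salary"]             # Series with the target
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=100
    )
    model = LinearRegression().fit(X_train, y_train)
    return mean_absolute_percentage_error(y_test, model.predict(X_test))


def best_mape(data: pd.DataFrame, powers=(2, 3, 4)) -> float:
    """Return the lowest MAPE over the given powers, rounded to five places."""
    return round(min(mape_for_power(data, p) for p in powers), 5)
```

With the dataset loaded through `pandas.read_csv`, the final answer would then be printed with something like `print(best_mape(data))`. The split is repeated inside the helper with the same `random_state`, so every power is evaluated on the same rows.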

MAPE is a loss function, so the lower its value, the better the model's performance.
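
To make the metric concrete, here is a minimal pure-Python sketch of the mean absolute percentage error. The function name `mape` and the sample numbers are illustrative only, and the sketch assumes that no true value is zero. It returns a fraction rather than a percentage, so 0.1 corresponds to 10%.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%).

    Illustrative helper; assumes every value in y_true is non-zero.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the relative size of each error with respect to the true value.
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up salaries: both predictions are off by 10% of the true value.
print(round(mape([100, 200], [110, 180]), 5))  # 0.1
```

A perfect model would give 0, and larger values mean the predictions are further from the true salaries in relative terms.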

Example

Example 1: program output

1.23456