Description
In the first stage, let's start with the simplest linear model: salary is the dependent variable and the player's rating is the only predictor. Your goal is to fit such a model, find its coefficients, and calculate the MAPE (mean absolute percentage error).
The scatterplot above shows the relationship between rating and salary and its linear approximation. The red line is the formula $\hat{y} = kx + b$, where $\hat{y}$ is the predicted player's salary, $x$ is the player's rating, $k$ is the slope of the linear regression model, and $b$ is its intercept. You need to find $k$ and $b$. After that, you also need to calculate the MAPE. You can do it with `sklearn.metrics.mean_absolute_percentage_error`.
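To make the metric concrete, here is a minimal sketch that computes the MAPE by hand and compares it with the scikit-learn function. The salary values are invented for illustration and do not come from the dataset. Note that scikit-learn returns the error as a fraction, not as a percentage, so 0.2 means 20%.

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Toy values invented for illustration; they are not taken from the dataset.
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])

# MAPE is the mean of |y_true - y_pred| / |y_true|.
manual_mape = np.mean(np.abs((y_true - y_pred) / y_true))

# The library function returns the same quantity as a fraction.
sklearn_mape = mean_absolute_percentage_error(y_true, y_pred)

print(manual_mape, sklearn_mape)
```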
Objectives
- We have automated the data download process in the .py file provided to you. However, if that is inconvenient, feel free to download the dataset on your own;
- Load the DataFrame using the `pandas.read_csv` method;
- Make `X` a DataFrame with a predictor `rating` and `y` a series with a target `salary`;
- Split predictor and target into training and test sets. Use `test_size=0.3` and `random_state=100` parameters; they guarantee that the results will be as expected;
- Fit the linear regression model with the following formula on the training data: $\hat{y} = kx + b$ (predicted salary $\hat{y}$, rating $x$, slope $k$, intercept $b$);
- Predict a salary with the fitted model on test data and calculate the MAPE;
- Print three float numbers separated by whitespace: the model intercept, the slope, and the MAPE, each rounded to five decimal places.
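The steps above can be sketched roughly as follows. This is one possible outline, not the reference solution: the helper name `fit_salary_model` is hypothetical, and the toy DataFrame is invented so that the snippet runs without the dataset. In the actual stage, the data would come from `pandas.read_csv` with whatever path the provided .py file uses.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split


def fit_salary_model(data: pd.DataFrame) -> tuple:
    """Fit salary on rating; return (intercept, slope, MAPE on the test set)."""
    X = data[["rating"]]  # DataFrame with the single predictor
    y = data["salary"]    # Series with the target

    # Parameters required by the task so that results are reproducible.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=100
    )

    model = LinearRegression()
    model.fit(X_train, y_train)

    # Evaluate on the held-out test data only.
    mape = mean_absolute_percentage_error(y_test, model.predict(X_test))
    return float(model.intercept_), float(model.coef_[0]), float(mape)


# Toy data invented for illustration (an exact line, salary = 1000 + 500 * rating).
# In the real stage, replace this with the DataFrame loaded via pandas.read_csv.
toy = pd.DataFrame({"rating": [float(r) for r in range(1, 11)]})
toy["salary"] = 1000.0 + 500.0 * toy["rating"]

intercept, slope, mape = fit_salary_model(toy)
print(round(intercept, 5), round(slope, 5), round(mape, 5))
```

Keeping the fitting logic in a function that accepts a DataFrame makes it easy to check on small synthetic data before running it on the real file.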
Example
Example 1: program output
123456.78901 987.65432 1.23456
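A line of this shape can be produced in more than one way, as the sketch below shows. The three values are placeholders invented for illustration, not results from the dataset. Printing a value passed through `round` drops trailing zeros, while the `.5f` format always shows five decimals; which of the two the checker expects is not stated here, so treat the choice as an assumption.

```python
# Placeholder values for illustration only; they are not computed from the dataset.
intercept, slope, mape = 123456.789012, 987.654321, 1.5

# Option 1: round each value, then join the results with spaces.
rounded_line = " ".join(str(round(v, 5)) for v in (intercept, slope, mape))

# Option 2: fixed-point formatting, which always shows five decimals.
fixed_line = f"{intercept:.5f} {slope:.5f} {mape:.5f}"

print(rounded_line)  # 123456.78901 987.65432 1.5
print(fixed_line)    # 123456.78901 987.65432 1.50000
```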