You are given the following starter code:
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error
data = fetch_california_housing()
X = data.data
Y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, Y, train_size=0.8, random_state=42)
pipeline = Pipeline([
    ("Scaling", StandardScaler()),
    ("Regression", SGDRegressor(random_state=42))
])

Fit the pipeline (by calling .fit() on pipeline) with the training data, make the predictions on X_test (by calling pipeline.predict()), and calculate the mean absolute error on the resulting predictions (with mean_absolute_error(y_test, y_pred)). After that, fit the pipeline without scaling, make the predictions on X_test, and calculate the mean absolute error again.
Observe the difference between the calculated MAE scores. Your answer can be one of the following options:
1. Both scores fall in the range and the absolute difference between them is less than .
2. One of the scores falls in the range, and the other in the range.
3. The two scores differ by a factor greater than one thousand, with the scaled score being significantly closer to than the un-scaled score.
4. One of the scores lies in the range, and the other is a value in the range.
Your answer should contain the correct option as an integer.
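
The two fits described above can be sketched roughly as follows. This is only one possible reading of the task: the helper name mae_with_and_without_scaling and its seed parameter are hypothetical (not part of the starter code), and "without scaling" is interpreted here as a pipeline that contains only the "Regression" step.

```python
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def mae_with_and_without_scaling(X, y, seed=42):
    """Hypothetical helper: return (MAE with scaling, MAE without scaling)."""
    # Same 80/20 split as in the starter code.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=0.8, random_state=seed
    )

    # Pipeline from the starter code: standardize features, then SGD regression.
    scaled = Pipeline([
        ("Scaling", StandardScaler()),
        ("Regression", SGDRegressor(random_state=seed)),
    ])
    scaled.fit(X_train, y_train)
    mae_scaled = mean_absolute_error(y_test, scaled.predict(X_test))

    # Assumption: "without scaling" means dropping the StandardScaler step.
    unscaled = Pipeline([
        ("Regression", SGDRegressor(random_state=seed)),
    ])
    unscaled.fit(X_train, y_train)
    mae_unscaled = mean_absolute_error(y_test, unscaled.predict(X_test))

    return mae_scaled, mae_unscaled
```

With the starter code's X and Y, calling mae_with_and_without_scaling(X, Y) returns the two scores to compare against the options. Stochastic gradient descent is sensitive to feature scale, so the two values are not expected to match.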