Additional feature scaling techniques in scikit-learn

Time series

Note that this exercise uses the following library versions:

scikit-learn==1.2.2
statsmodels==0.13.5
pandas==1.5.3

A power transformation can be applied in time series forecasting. Suppose you have the following starter code:

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error as mse
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv"
df = pd.read_csv(url, header=0, parse_dates=[0], index_col=0)

def fit_arima_model(dataframe):
    train, test = train_test_split(dataframe, test_size=0.25, shuffle=False, random_state=42)
    
    model = ARIMA(train, order=(1,0,3))
    model_fit = model.fit()
    
    return model_fit, train, test

def forecast(model_fit, start, end):
    return model_fit.predict(start=start, end=end)

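
# Fit the model first so that train and test are defined below
model_fit, train, test = fit_arima_model(df)

# Forecast over the index range covered by the test set
start = len(train)
end = len(train) + len(test) - 1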

Make predictions for the provided start and end values and calculate the mean squared error between the test set and the resulting predictions. After that, fit PowerTransformer on the train set and transform the test set. Make predictions for the specified start and end values again and calculate the mean squared error. The answer should contain the MSE scores for both predictions, rounded to two decimal places and separated by a space.
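
The pattern described above (fit the transformer on the train part only, reuse it on the test part, then compare the two MSE scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the solution to the exercise: the series is synthetic, and a constant forecast equal to the training mean (the hypothetical helper naive_mse) stands in for the ARIMA model.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer

# Synthetic, right-skewed series (an assumption; not the temperature data).
rng = np.random.default_rng(0)
series = rng.lognormal(mean=2.0, sigma=0.5, size=200).reshape(-1, 1)

# Chronological split without shuffling, as in the starter code.
train, test = train_test_split(series, test_size=0.25, shuffle=False)


def naive_mse(train_part, test_part):
    """MSE of a constant forecast equal to the training mean (placeholder model)."""
    forecast = np.full(len(test_part), train_part.mean())
    return mean_squared_error(test_part.ravel(), forecast)


# Score on the original scale.
mse_raw = naive_mse(train, test)

# Fit the transformer on the train part only, then reuse it for the test part.
pt = PowerTransformer()  # defaults: Yeo-Johnson method, standardize=True
train_pt = pt.fit_transform(train)
test_pt = pt.transform(test)

# Score on the transformed scale.
mse_pt = naive_mse(train_pt, test_pt)

# Two scores, two decimal places, separated by a space.
print(f"{mse_raw:.2f} {mse_pt:.2f}")
```

Note that the second score is measured on the transformed scale, so the two numbers are not directly comparable. If a comparison on the original scale is needed, pt.inverse_transform can map the transformed forecasts back before scoring.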
