Note that this task assumes scikit-learn version 1.2.2.
You have the following starter code:
from sklearn.decomposition import PCA
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.cluster import adjusted_rand_score
import numpy as np
X, y = load_wine(return_X_y=True)
In order to complete this task, perform the following steps:
- First, standardize the train set (here, the full feature matrix X from the starter code).
- After standardization, use PCA to reduce the train set to 2 components. You will need two versions of the train set, with and without PCA; standardization is applied to both.
- Fit AgglomerativeClustering() with 3 clusters on the non-PCA and PCA-reduced train sets. Make the predictions.
- Calculate the adjusted Rand scores for the non-PCA and PCA-reduced clusterings.
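As a side note on the last step, the adjusted Rand score compares two partitions and does not depend on how the clusters are numbered, so the cluster labels do not need to be matched to the class labels beforehand. A small illustration on made-up toy labels (these values are not from the wine data):

```python
from sklearn.metrics.cluster import adjusted_rand_score

# Toy ground-truth labels, invented purely for illustration.
true_labels = [0, 0, 1, 1]

# Same grouping as true_labels, but with the cluster ids swapped.
relabelled = [1, 1, 0, 0]

# A grouping that splits each true class across both clusters.
crossed = [0, 1, 0, 1]

score_relabelled = adjusted_rand_score(true_labels, relabelled)
score_crossed = adjusted_rand_score(true_labels, crossed)

print(score_relabelled)  # identical partitions give a score of 1.0
print(score_crossed)     # worse-than-chance agreement gives a negative score
```

Because of this invariance, the predictions of each clustering can be passed to adjusted_rand_score directly together with y.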
The answer should be the absolute difference between the two adjusted Rand scores, rounded to the third decimal place.
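The steps above can be sketched as follows. This is one possible reading of the task, not a reference solution: it assumes that "train set" means the full matrix X (the starter code makes no train/test split), that AgglomerativeClustering keeps its default settings apart from n_clusters, that the scores are computed against the true labels y, and that ordinary rounding to three decimals is intended. The variable names are illustrative.

```python
from sklearn.decomposition import PCA
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.cluster import adjusted_rand_score

X, y = load_wine(return_X_y=True)

# Step 1: standardize every feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# Step 2: project the standardized data onto its first 2 principal components.
X_pca = PCA(n_components=2).fit_transform(X_std)

# Step 3: cluster both versions into 3 groups (other settings left at defaults).
labels_std = AgglomerativeClustering(n_clusters=3).fit_predict(X_std)
labels_pca = AgglomerativeClustering(n_clusters=3).fit_predict(X_pca)

# Step 4: compare each clustering with the true classes (assumed reference).
ari_std = adjusted_rand_score(y, labels_std)
ari_pca = adjusted_rand_score(y, labels_pca)

# Final answer: absolute difference, rounded to 3 decimals (assumed rounding).
answer = round(abs(ari_std - ari_pca), 3)
print(answer)
```

A separate AgglomerativeClustering instance is used for each data version so that the two fits stay independent of each other.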