Description
In the previous stage, we provided you with the coef_ values. In this stage, you need to estimate the coef_ values yourself by running gradient descent on the Mean squared error cost function. Gradient descent is an optimization technique for finding the local minimum of a cost function using its first-order derivatives. To be precise, we're going to implement Stochastic gradient descent (SGD).
The Mean squared error cost function can be expressed as:

$$J(b_0, b_1, \dots) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

Where $i$ indexes the rows (observations), and:

$$\hat{y}_i = \frac{1}{1 + e^{-t_i}}, \qquad t_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots$$

$\hat{y}_i$ is the predicted probability value for the $i$-th row, while $y_i$ is its actual value. As usual, $x_{ij}$ is the value of the $i$-th row and the $j$-th column. In other words, it's the value of the $j$-th independent variable for the $i$-th observation. Weights are updated by their first-order derivatives in the training loop as follows (the constant factor from differentiating the square is absorbed into the learning rate):

$$b_j = b_j - l\_rate \cdot (\hat{y}_i - y_i) \cdot \hat{y}_i \cdot (1 - \hat{y}_i) \cdot x_{ij}$$

The bias $b_0$ can be updated by:

$$b_0 = b_0 - l\_rate \cdot (\hat{y}_i - y_i) \cdot \hat{y}_i \cdot (1 - \hat{y}_i)$$
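To make the update rule concrete, here is a worked example with made-up numbers (they are not taken from the project data): suppose $b_j = 0.5$, $l\_rate = 0.01$, $\hat{y}_i = 0.8$, $y_i = 1$, and $x_{ij} = 2$. Then

$$(\hat{y}_i - y_i) \cdot \hat{y}_i \cdot (1 - \hat{y}_i) \cdot x_{ij} = (-0.2) \cdot 0.8 \cdot 0.2 \cdot 2 = -0.064,$$

so $b_j \leftarrow 0.5 - 0.01 \cdot (-0.064) = 0.50064$. The weight increases slightly, nudging the prediction $\hat{y}_i$ toward the actual value of $1$.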
For learning purposes, we will use the entire training set to update the weights sequentially, one row at a time. The number of epochs n_epoch is the number of passes over the training set. The training loop is a nested for-loop over n_epoch and all the rows in the train set. If n_epoch = 10 and the number of rows in the training set is 100, the coefficients are updated 1000 times by the end of training:
# Training loop
for one_epoch in range(n_epoch):
    for i, row in enumerate(X_train):
        # update weight b0
        # update weight b1
        # update weight b2
        ...
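As an illustration, here is a minimal sketch of what the body of this loop might look like with two features plus a bias. The toy data, the variable names b0, b1, b2, and the standalone sigmoid helper are assumptions made for this sketch; in the project, these updates live inside the fit_mse method:

import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

# Toy data for illustration only: two standardized features, binary target
X_train = np.array([[0.3, -0.2], [1.1, 0.4], [-0.8, 0.9]])
y_train = np.array([1, 0, 1])

l_rate, n_epoch = 0.01, 100
b0 = b1 = b2 = 0.0  # start from zeros

for one_epoch in range(n_epoch):
    for i, row in enumerate(X_train):
        y_hat = sigmoid(b0 + b1 * row[0] + b2 * row[1])
        error = (y_hat - y_train[i]) * y_hat * (1 - y_hat)
        b0 -= l_rate * error            # bias update: no x term
        b1 -= l_rate * error * row[0]   # weight for the first feature
        b2 -= l_rate * error * row[1]   # weight for the second feature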
The initial values of the weights are insignificant, since they are optimized toward the values that minimize the cost function. So, you can randomize the weights or set them all to zero. The weight optimization process occurs inside the fit_mse method.
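For instance, a zero initialization with numpy might look like this (the placeholder train set shape is an assumption; with fit_intercept=True there is one extra coefficient for the bias):

import numpy as np

X_train = np.zeros((100, 3))             # placeholder: 100 rows, 3 features
coef_ = np.zeros(X_train.shape[1] + 1)   # one weight per feature, plus the bias b0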
If a particular weight is updated by large increments, it descends the quadratic curve erratically and may jump to the opposite side of the curve. In this case, we may miss the value of the weight that minimizes the loss function. The learning rate l_rate scales each update to a step size that allows for gradual descent along the curve with every iteration:
class CustomLogisticRegression:

    def __init__(self, fit_intercept=True, l_rate=0.01, n_epoch=100):
        self.fit_intercept = ...
        self.l_rate = ...
        self.n_epoch = ...

    def sigmoid(self, t):
        return ...

    def predict_proba(self, row, coef_):
        t = ...
        return self.sigmoid(t)

    def fit_mse(self, X_train, y_train):
        self.coef_ = ...  # initialized weights

        for _ in range(self.n_epoch):
            for i, row in enumerate(X_train):
                y_hat = self.predict_proba(row, self.coef_)
                # update all weights

    def predict(self, X_test, cut_off=0.5):
        ...
        for row in X_test:
            y_hat = self.predict_proba(row, self.coef_)
        return predictions  # predictions are binary values - 0 or 1

The predict method calculates the values of y_hat for each row in the test set and returns a numpy array that contains these values. Since we are solving a binary classification problem, the predicted values can only be 0 or 1. The return of predict depends on the cut-off point: the predict_proba probabilities that are less than the cut-off point are rounded to 0, while those that are equal to or greater than it are rounded to 1. Set the default cut-off value to 0.5. To determine the prediction accuracy of your model, use accuracy_score from sklearn.metrics.
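As a sanity check, the cut-off logic and the accuracy calculation might look like the sketch below; the probability values are made up for illustration:

import numpy as np
from sklearn.metrics import accuracy_score

probas = np.array([0.12, 0.65, 0.50, 0.91])  # hypothetical predict_proba outputs
y_test = np.array([0, 1, 0, 1])

cut_off = 0.5
predictions = (probas >= cut_off).astype(int)  # < cut_off -> 0, >= cut_off -> 1

print(predictions)                          # [0 1 1 1]
print(accuracy_score(y_test, predictions))  # 0.75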
Objectives
1. Implement the fit_mse method;
2. Implement the predict method;
3. Load the dataset. Select the following columns as independent variables: worst concave points, worst perimeter, worst radius. The target variable remains the same;
4. Standardize X;
5. Instantiate the CustomLogisticRegression class with the following attributes: lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=1000);
6. Fit the model with the training set from the previous stage (train_size=0.8 and random_state=43) using fit_mse;
7. Predict y_hat values for the test set;
8. Calculate the accuracy score for the test set;
9. Print the coef_ array and the accuracy score as a Python dictionary in the format shown in the Examples section (a possible outline of all these steps is sketched after this list).
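The outline below is a sketch, not a reference solution. It assumes the dataset is the Breast Cancer Wisconsin dataset from sklearn (as the column names suggest) and that CustomLogisticRegression is the class completed above:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Assumption: the "worst ..." columns come from sklearn's breast cancer data
data = load_breast_cancer(as_frame=True)
X = data.frame[['worst concave points', 'worst perimeter', 'worst radius']]
y = data.frame['target']

X = StandardScaler().fit_transform(X)  # standardize the independent variables

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=43)

lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=1000)
lr.fit_mse(X_train, y_train)   # assumes fit_mse is implemented as described
y_hat = lr.predict(X_test)     # binary predictions with the default cut_off=0.5

print({'coef_': list(lr.coef_), 'accuracy': accuracy_score(y_test, y_hat)})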
Examples
The training set in the examples below is the same as in the Objectives section. Only the test set and CustomLogisticRegression class attributes vary.
Example test set (features are standardized):
Standardized X_test and y_test data:

worst concave points | worst perimeter | worst radius | y
0.320904 | 0.230304 | -0.171560 | 1.0
-1.743529 | -0.954428 | -0.899849 | 1.0
1.014627 | 0.780857 | 0.773975 | 0.0
1.432990 | -0.132764 | -0.123973 | 0.0
Example 1: processing the CustomLogisticRegression class with the following attributes:

lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=100)

Output (a Python dict):

{'coef_': [ 0.7219814 , -2.06824488, -1.44659819, -1.52869155], 'accuracy': 0.75}

Example 2: processing the CustomLogisticRegression class with the following attributes:

lr = CustomLogisticRegression(fit_intercept=False, l_rate=0.01, n_epoch=100)

Output (a Python dict):

{'coef_': [-1.86289827, -1.60283708, -1.69204615], 'accuracy': 0.75}