Logistic Regression from Scratch. Stage 3/4

Log-Loss


Description

When applied to a sigmoid function, the Mean squared error cost function produces a non-convex curve with local and global minima. If a weight value starts close to a local minimum, gradient descent converges to that local (not the global) minimum. This is a serious limitation of the Mean squared error cost function in binary classification tasks. The Log-loss cost function helps to overcome this issue.

We can represent it in the following way:

J(b_0, b_1, \dots) = -\frac{1}{n} \sum_{i=1}^{n} \big[ y_i \cdot \ln(\hat{y_i}) + (1 - y_i) \cdot \ln(1 - \hat{y_i}) \big]
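To make the formula concrete, here is a minimal sketch of computing the mean Log-loss for a batch of predicted probabilities with NumPy; the function name and the clipping constant are illustrative assumptions, not part of the stage.

import numpy as np

def log_loss(y_true, y_pred):
    # Mean Log-loss (binary cross-entropy) for predicted probabilities
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 1e-15, 1 - 1e-15)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(log_loss([1, 0, 1], [0.9, 0.2, 0.7]))  # ~0.228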

where

\hat{y_i} = \frac{1}{1 + e^{-t}}; \quad t = b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots

In the previous stage, you implemented Stochastic gradient descent with the Mean squared error cost function and obtained the coef_ values. The procedure for applying Stochastic gradient descent to the Log-loss cost function is similar. The only differences are the first-order derivatives with which the weights are updated.
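A quick numeric check of the sigmoid and probability definitions above, written as standalone functions; the bias-first layout of coef_ and the sample numbers are assumptions that merely match the class skeleton shown later:

import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def predict_proba(row, coef_):
    # t = b0 + b1*x_i1 + b2*x_i2 + ..., with the bias stored in coef_[0]
    t = coef_[0] + np.dot(row, coef_[1:])
    return sigmoid(t)

row = np.array([0.32, 0.23, -0.17])          # one observation
coef_ = np.array([0.1, -0.3, -0.28, -0.27])  # [b0, b1, b2, b3]
print(predict_proba(row, coef_))             # ~0.50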

The bias b_0 is updated with:

b_0 = b_0 - \frac{l\_rate \cdot (\hat{y_i} - y_i)}{N}

while the coefficients b_j are updated with:

b_j = b_j - \frac{l\_rate \cdot (\hat{y_i} - y_i) \cdot x_{ij}}{N}

where i is the observation (row) index, j is the independent variable (column) index, and N is the number of rows in the training set.

As with the fit_mse method, fit_log_loss can also be fitted without a bias term (fit_intercept=False).
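Combining the two update rules, the fit_log_loss method of the CustomLogisticRegression class described below could look like the following sketch; the zero initialization of the weights and the NumPy array inputs are assumptions:

import numpy as np

def fit_log_loss(self, X_train, y_train):
    n_rows = X_train.shape[0]
    # One weight per feature, plus a bias term when fit_intercept is True
    self.coef_ = np.zeros(X_train.shape[1] + int(self.fit_intercept))

    for _ in range(self.n_epoch):
        for i, row in enumerate(X_train):
            y_hat = self.predict_proba(row, self.coef_)
            step = self.l_rate * (y_hat - y_train[i]) / n_rows
            if self.fit_intercept:
                self.coef_[0] -= step          # b0 update
                self.coef_[1:] -= step * row   # bj updates
            else:
                self.coef_ -= step * row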

The attributes and methods in the CustomLogisticRegression class are:

class CustomLogisticRegression:

    def __init__(self, fit_intercept=True, l_rate=0.01, n_epoch=100):
        self.fit_intercept = ...
        self.l_rate = ...
        self.n_epoch = ...

    def sigmoid(self, t):
        return ...

    def predict_proba(self, row, coef_):
        t = ...
        return self.sigmoid(t)

    def fit_mse(self, X_train, y_train):
        self.coef_ = ...  # initialized weights

        for _ in range(self.n_epoch):
            for i, row in enumerate(X_train):
                y_hat = self.predict_proba(row, self.coef_)
                # update all weights

    def fit_log_loss(self, X_train, y_train):
        ...  # stochastic gradient descent with the Log-loss update rules

    def predict(self, X_test, cut_off=0.5):
        ...
        for row in X_test:
            y_hat = self.predict_proba(row, self.coef_)
        return predictions # predictions are binary values — 0 or 1
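A minimal sketch of predict under the same assumptions, thresholding each probability at cut_off, might look like this:

import numpy as np

def predict(self, X_test, cut_off=0.5):
    # Probability at or above the cut-off maps to class 1, otherwise 0
    predictions = [int(self.predict_proba(row, self.coef_) >= cut_off)
                   for row in X_test]
    return np.array(predictions)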

Objectives

  1. Implement fit_log_loss;

  2. Load the dataset and select the same independent and target variables as in the previous stage;

  3. Standardize X;

  4. Instantiate the CustomLogisticRegression class with the following attributes:

    lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=1000)
  5. Fit the model with the training set from Stage 1 using fit_log_loss;

  6. Predict y_hat values for the test set;

  7. Calculate the accuracy score for the test set;

  8. Print the coef_ array and the accuracy score as a Python dictionary in the format shown in the Examples section (an end-to-end sketch follows this list).
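The sketch below walks through objectives 2-8. It assumes the previous stages used sklearn's breast cancer dataset with the three features from the example table, and that the split parameters (train_size, random_state) match your Stage 1 setup; adjust these details if yours differ.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Assumed feature selection from the previous stage
data = load_breast_cancer(as_frame=True).frame
X = data[['worst concave points', 'worst perimeter', 'worst radius']]
y = data['target']

# Standardize X: zero mean, unit variance per column
X = (X - X.mean()) / X.std()

# The split parameters are assumptions; reuse your Stage 1 split
X_train, X_test, y_train, y_test = train_test_split(
    X.values, y.values, train_size=0.8, random_state=43)

lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=1000)
lr.fit_log_loss(X_train, y_train)
y_pred = lr.predict(X_test)

print({'coef_': lr.coef_.tolist(), 'accuracy': accuracy_score(y_test, y_pred)})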

Examples

The training set remains the same as in the Objectives; only the test set and the CustomLogisticRegression class attributes change.

Example test set:

Standardized X_test and y_test data:

worst concave points    worst perimeter    worst radius      y
            0.320904           0.230304       -0.171560    1.0
           -1.743529          -0.954428       -0.899849    1.0
            1.014627           0.780857        0.773975    0.0
            1.432990          -0.132764       -0.123973    0.0


Example 1: running the CustomLogisticRegression class with the following attributes

lr = CustomLogisticRegression(fit_intercept=True, l_rate=0.01, n_epoch=100)

Output (a Python dict):

{'coef_': [ 0.10644459, -0.2961112 , -0.27592773, -0.27338684], 'accuracy': 0.75}

Example 2: running the CustomLogisticRegression class with the following attributes

lr = CustomLogisticRegression(fit_intercept=False, l_rate=0.01, n_epoch=100)

Output (a Python dict):

{'coef_': [-0.29627229, -0.27640283, -0.27384802], 'accuracy': 0.75}