logistic-regression – Make Me Engineer

How to find the importance of the features for a logistic regression model?

June 9, 2023 by Tarik

One of the simplest ways to get a feeling for the “influence” of a given feature in a linear classification model (logistic regression being one of those) is to consider the magnitude of its coefficient times the standard deviation of the corresponding feature in the data. Consider this example: import numpy as np from sklearn.linear_model import …
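The example in the excerpt is cut off, so here is a minimal sketch of the idea on synthetic data. The data, the random seed, and the names `influence` and `ranking` are assumptions made for illustration, not the original author's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration): three features on
# very different scales that contribute equally to the label.
rng = np.random.default_rng(0)
x1 = rng.normal(scale=1.0, size=500)
x2 = rng.normal(scale=4.0, size=500)
x3 = rng.normal(scale=0.5, size=500)
X = np.column_stack([x1, x2, x3])
y = (x1 + x2 + x3 + rng.normal(scale=0.2, size=500)) > 0

model = LogisticRegression()
model.fit(X, y)

# Rough influence: |coefficient| times the standard deviation of the feature.
influence = np.abs(model.coef_[0]) * X.std(axis=0)
ranking = np.argsort(influence)[::-1]  # most influential feature first

print("coefficients:", model.coef_[0])
print("influence:   ", influence)
print("ranking:     ", ranking)
```

The raw coefficients alone would hide the fact that the second feature varies far more than the others; scaling by the standard deviation puts the features on a comparable footing. Standardizing the features before fitting and reading the coefficients directly is a closely related alternative.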

sklearn Logistic Regression “ValueError: Found array with dim 3. Estimator expected <= 2.”

May 9, 2023 by Tarik

scikit-learn expects a 2D numeric array as the training data for the fit function. The dataset you are passing in is a 3D array, so you need to reshape it into a 2D array:

    nsamples, nx, ny = train_dataset.shape
    d2_train_dataset = train_dataset.reshape((nsamples, nx * ny))
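A self-contained sketch of that reshape follows. The array of zeros is a hypothetical stand-in for `train_dataset`, and the 28x28 sample size is an assumption chosen only for illustration.

```python
import numpy as np

# Hypothetical stand-in for train_dataset: 10 samples of 28x28 values.
train_dataset = np.zeros((10, 28, 28))

# Flatten each sample so the array becomes (n_samples, n_features).
nsamples, nx, ny = train_dataset.shape
d2_train_dataset = train_dataset.reshape((nsamples, nx * ny))

# Equivalent shorthand: let NumPy infer the second dimension.
d2_alt = train_dataset.reshape(nsamples, -1)

print(d2_train_dataset.shape)  # (10, 784)
```

The same reshape has to be applied to any data later passed to predict, since the estimator expects the same number of features it was fitted on.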

Getting a low ROC AUC score but a high accuracy

November 5, 2022 by Tarik

To start with, saying that an AUC of 0.583 is “lower” than a score* of 0.867 is exactly like comparing apples with oranges. [* I assume your score is mean accuracy, but this is not critical for this discussion – it could be anything else in principle.] In my experience at least, most ML …
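To see why the two numbers are not comparable, consider a small sketch in which accuracy is high while AUC sits at chance level. The imbalanced labels and the constant scores are assumptions invented for illustration; the excerpt does not say what the original data looked like.

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical imbalanced labels: 90 negatives, 10 positives.
y_true = [0] * 90 + [1] * 10

# A model that gives every sample the same low score carries no ranking
# information, and thresholding at 0.5 predicts the majority class everywhere.
scores = [0.1] * 100
y_pred = [int(s >= 0.5) for s in scores]

accuracy = accuracy_score(y_true, y_pred)  # fraction of correct hard labels
auc = roc_auc_score(y_true, scores)        # ranking quality of the raw scores

print(accuracy, auc)  # 0.9 0.5
```

Accuracy depends on one chosen threshold and on the class balance, whereas AUC summarizes how well the scores rank positives above negatives across all thresholds, so one can be high while the other is not.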

How to implement the Softmax function in Python

June 28, 2022 by Tarik

They’re both correct, but yours is preferred from the point of view of numerical stability. You start with

    e^(x - max(x)) / sum(e^(x - max(x)))

By using the fact that a^(b - c) = (a^b)/(a^c), we have

    = e^x / (e^max(x) * sum(e^x / e^max(x))) …
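The stability argument can be checked with a short NumPy sketch. The function name `softmax` and the test vectors are assumptions for illustration; the original question's code is not shown in this excerpt.

```python
import numpy as np

def softmax(x):
    """Softmax of a 1-D array, shifted by max(x) for numerical stability."""
    x = np.asarray(x, dtype=float)
    shifted = np.exp(x - np.max(x))  # largest exponent is 0, so no overflow
    return shifted / shifted.sum()

small = np.array([1.0, 2.0, 3.0])
naive = np.exp(small) / np.exp(small).sum()  # unshifted form, fine for small inputs

# np.exp(1000.0) overflows to inf, so the unshifted form would break here.
large = np.array([1000.0, 1001.0, 1002.0])

print(softmax(small))
print(softmax(large))
```

Both calls print the same probabilities, because subtracting a constant from every input cancels out in the ratio; only the shifted version survives the large inputs.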

How to choose cross-entropy loss in TensorFlow?

June 10, 2022 by Tarik

Preliminary facts: In a functional sense, the sigmoid is a special case of the softmax function when the number of classes equals 2. Both of them do the same operation: they transform the logits (see below) into probabilities. In simple binary classification, there’s no big difference between the two; however, in the case of multinomial classification, sigmoid allows …
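The two-class relationship can be verified numerically. The sketch below uses plain NumPy rather than TensorFlow, and the helper names and the logit value are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid of a scalar logit."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    """Softmax of a 1-D array of logits, shifted for numerical stability."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

z = 0.7  # arbitrary logit chosen for illustration

p_sigmoid = sigmoid(z)
# Two classes with the second logit fixed at 0:
# e^z / (e^z + e^0) = 1 / (1 + e^(-z))
p_softmax = softmax(np.array([z, 0.0]))[0]

print(p_sigmoid, p_softmax)
```

The two printed values agree up to floating-point error, which is the sense in which the sigmoid is softmax restricted to two classes.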