Cost function training target versus accuracy desired goal

How can we train a neural network so that it ends up maximizing classification accuracy? I’m asking for a way to get a continuous proxy function that’s closer to the accuracy. To start with, the loss function used today for classification tasks in (deep) neural nets was not invented with them, but it goes back … Read more
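A quick way to see why accuracy itself can’t be the training target, and why a smooth surrogate like cross-entropy is used instead. This is a minimal NumPy sketch with made-up numbers, not code from the answer above:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1])        # binary labels
p = np.array([0.6, 0.4, 0.9, 0.55])    # predicted probabilities

# Accuracy is a step function of p: nudging 0.55 to 0.56 changes nothing,
# so its gradient is zero almost everywhere -- useless for gradient descent.
accuracy = np.mean((p > 0.5) == y_true)

# Cross-entropy is smooth in p: every improvement in p lowers it,
# which is exactly the continuous proxy the question asks about.
cross_entropy = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(accuracy, cross_entropy)
```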

What is the difference between a sigmoid followed by the cross entropy and sigmoid_cross_entropy_with_logits in TensorFlow?

You’re confusing the cross-entropy for binary and multi-class problems. Multi-class cross-entropy: the formula that you use is correct, and it directly corresponds to tf.nn.softmax_cross_entropy_with_logits: -tf.reduce_sum(p * tf.log(q), axis=1), where p and q are expected to be probability distributions over N classes. In particular, N can be 2, as in the following example: p = tf.placeholder(tf.float32, shape=[None, … Read more
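For the binary case specifically, here is a small sketch of the two formulations being compared; it assumes TensorFlow 2.x eager mode (the excerpt’s tf.placeholder code is TF1-style), and the logits/labels are illustrative:

```python
import tensorflow as tf

logits = tf.constant([[2.0], [-1.0], [0.5]])
labels = tf.constant([[1.0], [0.0], [1.0]])

# Naive version: apply the sigmoid, then the binary cross-entropy formula.
# Numerically fragile once the sigmoid saturates (log of values near 0).
q = tf.sigmoid(logits)
manual = -(labels * tf.math.log(q) + (1 - labels) * tf.math.log(1 - q))

# Fused version: takes the raw logits and uses a numerically stable form.
fused = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

print(manual.numpy())
print(fused.numpy())  # identical up to floating-point error
```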

Under what parameters are SVC and LinearSVC in scikit-learn equivalent?

In the mathematical sense you need to set:
SVC(kernel="linear", **kwargs) # by default it uses the RBF kernel
and
LinearSVC(loss="hinge", **kwargs) # by default it uses squared hinge loss
Another element, which cannot be easily fixed, is increasing intercept_scaling in LinearSVC, as in this implementation the bias is regularized (which is not true in SVC nor should be … Read more
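Putting the two settings side by side, a sketch on a toy dataset; the make_classification data, the intercept_scaling value, and max_iter are my own illustrative choices:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=200, random_state=0)

# SVC defaults to the RBF kernel, so the linear kernel must be requested.
svc = SVC(kernel="linear").fit(X, y)

# LinearSVC defaults to squared hinge; plain hinge matches SVC's loss.
# intercept_scaling is raised because LinearSVC regularizes the bias term.
lsvc = LinearSVC(loss="hinge", intercept_scaling=1000, max_iter=10000).fit(X, y)

# The separating hyperplanes should be close, though not bit-identical.
print(svc.coef_[0, :3])
print(lsvc.coef_[0, :3])
```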

How to interpret loss and accuracy for a machine learning model [closed]

The lower the loss, the better a model (unless the model has over-fitted to the training data). The loss is calculated on training and validation, and its interpretation is how well the model is doing for these two sets. Unlike accuracy, loss is not a percentage. It is a summation of the errors made for … Read more
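A toy illustration of why the two metrics can move independently (the numbers are mine, not from the answer): the classifier that gets everything right can still have the higher loss.

```python
import numpy as np

y = np.array([1, 1, 0])
all_correct = np.array([0.95, 0.55, 0.45])   # 3/3 right, two only barely
one_wrong = np.array([0.95, 0.45, 0.05])     # 2/3 right, but very confident

def log_loss(y, p):
    # Cross-entropy: sums the magnitude of every error, not just its sign.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def accuracy(y, p):
    # Accuracy: only counts which side of the 0.5 threshold p lands on.
    return np.mean((p > 0.5) == y)

print(accuracy(y, all_correct), log_loss(y, all_correct))  # 1.00, ~0.42
print(accuracy(y, one_wrong), log_loss(y, one_wrong))      # 0.67, ~0.30
```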

Is deep learning bad at fitting simple non linear functions outside training scope (extrapolating)?

Is my analysis correct? Given my remarks in the comments that your network is certainly not deep, let’s accept that your analysis is indeed correct (after all, your model does seem to do a good job inside its training scope), in order to get to your 2nd question, which is the interesting one. If the … Read more
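A minimal reproduction of the phenomenon under discussion, using scikit-learn’s MLPRegressor as a stand-in network (my own setup, not the asker’s code): fit y = x² on [-5, 5], then query points outside that range.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Train only on the interval [-5, 5].
X_train = rng.uniform(-5, 5, size=(2000, 1))
y_train = X_train.ravel() ** 2

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# Inside the training scope the fit is usually decent; outside it, it tends
# to break down, since ReLU networks can only extrapolate piecewise-linearly.
for x in [2.0, 4.5, 7.0, 10.0]:
    print(x, x ** 2, model.predict([[x]])[0])
```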

Epoch vs Iteration when training neural networks [closed]

In the neural network terminology:
one epoch = one forward pass and one backward pass of all the training examples
batch size = the number of training examples in one forward/backward pass; the higher the batch size, the more memory space you’ll need
number of iterations = number of passes, each pass using [batch size] … Read more
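The arithmetic in code form, with illustrative numbers of my own (1000 training examples, batch size 500):

```python
training_examples = 1000
batch_size = 500

# Iterations needed to show the network every example once, i.e. one epoch.
iterations_per_epoch = training_examples // batch_size  # -> 2

epochs = 10
total_iterations = epochs * iterations_per_epoch        # -> 20
print(iterations_per_epoch, total_iterations)
```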

Order between using validation, training and test sets

The Wikipedia article is not wrong; according to my own experience, this is a frequent point of confusion among newcomers to ML. There are two separate ways of approaching the problem: either you use an explicit validation set to do hyperparameter search & tuning, or you use cross-validation. So, the standard point is that you … Read more
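A sketch of the first approach (an explicit validation set), with illustrative split ratios and scikit-learn’s iris data standing in for a real problem:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# First carve out the test set, then split the rest into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

# Tune the hyperparameter on the validation set...
best = max(
    (LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train) for C in [0.01, 0.1, 1, 10]),
    key=lambda m: accuracy_score(y_val, m.predict(X_val)),
)

# ...and touch the test set exactly once, at the very end.
print(accuracy_score(y_test, best.predict(X_test)))
```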

Caffe | solver.prototxt values setting strategy

In order to set these values in a meaningful manner, you need to have a few more bits of information regarding your data:
1. Training set size: the total number of training examples you have; let’s call this quantity T.
2. Training batch size: the number of training examples processed together in a single batch, … Read more
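A back-of-the-envelope helper for turning those quantities into solver values; T, the batch size, and the epoch count below are placeholder assumptions you would replace with your own:

```python
T = 50000          # total training examples (assumption)
batch_size = 256   # examples per training batch (assumption)
epochs = 30        # how many full passes over the data you want

iters_per_epoch = T // batch_size       # iterations to see the data once
max_iter = epochs * iters_per_epoch     # candidate for max_iter
test_interval = iters_per_epoch         # e.g. validate once per epoch

print(iters_per_epoch, max_iter, test_interval)
```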

Getting a low ROC AUC score but a high accuracy

To start with, saying that an AUC of 0.583 is “lower” than a score* of 0.867 is exactly like comparing apples with oranges. [* I assume your score is mean accuracy, but this is not critical for this discussion – it could be anything else in principle] According to my experience at least, most ML … Read more
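A toy demonstration of the apples-and-oranges point (my own synthetic numbers): with a 90/10 class imbalance, near-uninformative scores still yield high accuracy, while the AUC stays close to chance.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)  # 90/10 imbalance

# Scores barely related to the label: a tiny +0.02 bump for positives.
scores = rng.uniform(0, 0.45, size=100) + 0.02 * y
y_pred = (scores > 0.5).astype(int)  # hard predictions are almost all 0

print(accuracy_score(y, y_pred))   # high: majority-class guessing suffices
print(roc_auc_score(y, scores))    # near 0.5: the ranking is nearly useless
```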