How to choose cross-entropy loss in TensorFlow?

Preliminary facts: In a functional sense, the sigmoid is a special case of the softmax function when the number of classes equals 2. Both perform the same operation: they transform the logits (see below) into probabilities. In simple binary classification there’s no big difference between the two; however, in the case of multinomial classification, sigmoid allows …
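The claim that the sigmoid is the two-class case of softmax can be checked numerically. The following is a minimal sketch in plain Python, not TensorFlow code; the helper names `sigmoid` and `softmax` and the logit value are chosen here for illustration only. It relies on the identity that a softmax over the two logits `[x, 0]` gives the same probability for the first class as `sigmoid(x)`.

```python
import math

def sigmoid(x):
    """Logistic function: maps a single logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    """Softmax over a list of logits (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

x = 1.7  # arbitrary logit, chosen for illustration

p_sigmoid = sigmoid(x)
# Two-class softmax where the second logit is fixed at 0.
p_softmax = softmax([x, 0.0])[0]

print(abs(p_sigmoid - p_softmax) < 1e-12)  # True
```

Fixing the second logit at 0 loses nothing, because softmax only depends on the differences between logits.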

Keras input explanation: input_shape, units, batch_size, dim, etc

Units: The number of “neurons”, or “cells”, or whatever the layer has inside it. It’s a property of each layer, and yes, it’s related to the output shape (as we will see later). In your picture, except for the input layer, which is conceptually different from other layers, you have: Hidden layer 1: 4 units …
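To make the link between units and output shape concrete, here is a rough sketch of a fully connected layer in plain Python rather than Keras. The function `dense_forward`, the input size of 3, and the batch size of 2 are assumptions made for this illustration; only the 4 units come from the text above.

```python
import random

def dense_forward(batch, weights, biases):
    """Apply a fully connected layer: one output value per unit, per sample."""
    columns = list(zip(*weights))  # one weight column per unit
    return [
        [sum(x * w for x, w in zip(sample, col)) + b
         for col, b in zip(columns, biases)]
        for sample in batch
    ]

input_dim = 3   # hypothetical number of input features
units = 4       # as in "Hidden layer 1: 4 units"
batch_size = 2  # hypothetical number of samples

random.seed(0)
# The weight matrix has shape (input_dim, units); there is one bias per unit.
weights = [[random.uniform(-1, 1) for _ in range(units)] for _ in range(input_dim)]
biases = [0.0] * units
batch = [[random.uniform(-1, 1) for _ in range(input_dim)] for _ in range(batch_size)]

outputs = dense_forward(batch, weights, biases)
output_shape = (len(outputs), len(outputs[0]))
print(output_shape)  # (2, 4)
```

The last dimension of the output equals the number of units, whatever the input size is.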

Why binary_crossentropy and categorical_crossentropy give different performances for the same problem?

The reason for this apparent performance discrepancy between categorical and binary cross-entropy is what user xtof54 has already reported in his answer below, i.e. the accuracy computed with the Keras method evaluate is just plain wrong when using binary_crossentropy with more than 2 labels. I would like to elaborate more on this, demonstrate the …
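One way to see how such a discrepancy can arise is to compare two ways of scoring the same predictions. The sketch below is plain Python, not Keras code. It assumes the reported accuracy is an element-wise, thresholded accuracy when the loss is binary_crossentropy, and an argmax-based accuracy otherwise. The function names and the toy data are made up for this illustration.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose highest-scoring class matches the true class."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual outputs that match after thresholding."""
    matches = [
        int(t == (1 if p > threshold else 0))
        for t_row, p_row in zip(y_true, y_pred)
        for t, p in zip(t_row, p_row)
    ]
    return sum(matches) / len(matches)

# Toy 3-class problem with one-hot labels (made-up numbers).
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.60, 0.30, 0.10],   # correct class
          [0.40, 0.35, 0.25],   # wrong class
          [0.45, 0.30, 0.25]]   # wrong class

cat_acc = categorical_accuracy(y_true, y_pred)
bin_acc = binary_accuracy(y_true, y_pred)
print(round(cat_acc, 3), round(bin_acc, 3))  # 0.333 0.778
```

Only one of three samples is classified correctly, yet the element-wise score is much higher because every correctly predicted zero counts as a hit.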

What is the role of the bias in neural networks? [closed]

I think that biases are almost always helpful. In effect, a bias value allows you to shift the activation function to the left or right, which may be critical for successful learning. It might help to look at a simple example. Consider this 1-input, 1-output network that has no bias: The output of the network …
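The shifting effect of the bias can be sketched with a single sigmoid neuron. This is an illustrative example in plain Python; the function `neuron` and the specific weight and bias values are assumptions, not taken from the original answer. The neuron's output crosses 0.5 where `w * x + b == 0`, that is at `x = -b / w`, so changing `b` moves the curve along the x-axis without changing its steepness.

```python
import math

def neuron(x, w, b=0.0):
    """Single-input sigmoid neuron: sigmoid(w * x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

w = 1.0  # hypothetical weight

# Without a bias, the curve is always centred on x = 0.
no_bias_midpoint = neuron(0.0, w)

# With b = -2, the same output of 0.5 is reached at x = 2 (shifted right).
shifted_right = neuron(2.0, w, b=-2.0)

# With b = +2, it is reached at x = -2 (shifted left).
shifted_left = neuron(-2.0, w, b=2.0)

print(no_bias_midpoint, shifted_right, shifted_left)  # 0.5 0.5 0.5
```

Changing only `w` makes the curve steeper or flatter but leaves its midpoint at `x = 0`, which is why a weight alone cannot produce this shift.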