Logistic (Log) Loss Function

From GM-RKB
(Redirected from log loss function)

A Logistic (Log) Loss Function is a convex loss function that is defined as the negative log-likelihood of a logistic model.
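As a sketch of what this definition amounts to, the negative log-likelihood of a single example can be written out explicitly. The notation here is an assumption made for illustration and does not come from this entry: a label y in {0, 1}, a real-valued score f(x), the sigmoid \sigma, and natural logarithms.

```latex
% Assumed notation: y \in \{0,1\}, score f(x), \sigma(z) = 1/(1+e^{-z}), natural log.
\begin{align}
p(1 \mid x) &= \sigma(f(x)) = \frac{1}{1+e^{-f(x)}}, \\
\ell(y, f(x)) &= -\left[\, y \log p(1 \mid x) + (1-y)\log\bigl(1 - p(1 \mid x)\bigr) \right].
\end{align}
% Recoding the label as \tilde{y} = 2y - 1 \in \{-1,+1\} gives the margin form:
\begin{equation}
\ell(\tilde{y}, f(x)) = \log\left(1 + e^{-\tilde{y}\, f(x)}\right).
\end{equation}
```

The margin form differs from the \phi(v) quoted in the references below only by the constant factor \frac{1}{\log(2)}, which corresponds to measuring the loss in bits rather than nats.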



References

2021a

  • (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Loss_functions_for_classification#Logistic_loss Retrieved:2021-3-7.
    • The logistic loss function can be generated using (2) and Table-I as follows:

      \begin{align}
      \phi(v) &= C[f^{-1}(v)]+\left(1-f^{-1}(v)\right)\, C'\left[f^{-1}(v)\right] \\
      &= \frac{1}{\log(2)}\left[\frac{-e^v}{1+e^v}\log\frac{e^v}{1+e^v}-\left(1-\frac{e^v}{1+e^v}\right)\log\left(1-\frac{e^v}{1+e^v}\right)\right]+\left(1-\frac{e^v}{1+e^v}\right) \left[\frac{-1}{\log(2)}\log\left(\frac{\frac{e^v}{1+e^v}}{1-\frac{e^v}{1+e^v}}\right)\right] \\
      &= \frac{1}{\log(2)}\log(1+e^{-v}).
      \end{align}

      The logistic loss is convex and grows linearly for negative values, which makes it less sensitive to outliers. The logistic loss is used in the LogitBoost algorithm.

      The minimizer of I[f] for the logistic loss function can be directly found from equation (1) as:

      f^*_\text{Logistic}= \log\left(\frac{\eta}{1-\eta}\right)=\log\left(\frac{p(1\mid x)}{1-p(1\mid x)}\right).

      This function is undefined when p(1\mid x)=1 or p(1\mid x)=0 (tending toward ∞ and −∞ respectively), but predicts a smooth curve which grows when p(1\mid x) increases and equals 0 when p(1\mid x)=0.5.

      It's easy to check that the logistic loss and binary cross entropy loss (Log loss) are in fact the same (up to a multiplicative constant \frac{1}{\log(2)} ). The cross entropy loss is closely related to the Kullback–Leibler divergence between the empirical distribution and the predicted distribution. The cross entropy loss is ubiquitous in modern deep neural networks.
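The two claims quoted above (the log-odds minimizer, and the equivalence with binary cross entropy up to the factor \frac{1}{\log(2)}) can be checked numerically. The following is a minimal sketch, not part of the quoted source: all function names are hypothetical, labels are assumed to be in {-1, +1} for the margin form, and the minimizer is found with a plain ternary search, which is chosen here only for simplicity.

```python
import math

def logistic_loss(v):
    """Logistic loss in bits, log(1 + e^{-v}) / log(2), for the margin v = y * f(x)."""
    # Stable rewrite: log(1 + e^{-v}) = max(-v, 0) + log1p(e^{-|v|})
    return (max(-v, 0.0) + math.log1p(math.exp(-abs(v)))) / math.log(2)

def sigmoid(f):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-f))

def binary_cross_entropy(y01, p):
    """Cross entropy in nats for a label y01 in {0, 1} and predicted P(y=1) = p."""
    return -(y01 * math.log(p) + (1 - y01) * math.log(1 - p))

def expected_logistic_risk(f, eta):
    """Conditional risk eta * phi(f) + (1 - eta) * phi(-f), with eta = p(1 | x)."""
    return eta * logistic_loss(f) + (1 - eta) * logistic_loss(-f)

def argmin_convex(g, lo=-20.0, hi=20.0, iters=200):
    """Ternary search for the minimizer of a convex function g on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

# Check 1: cross entropy (nats) equals log(2) times the logistic loss (bits).
for y, f in [(+1, 0.7), (-1, 0.7), (+1, -2.5)]:
    bce = binary_cross_entropy((y + 1) // 2, sigmoid(f))
    assert abs(bce - math.log(2) * logistic_loss(y * f)) < 1e-12

# Check 2: the risk minimizer matches the log-odds log(eta / (1 - eta)).
eta = 0.8
f_star = argmin_convex(lambda f: expected_logistic_risk(f, eta))
assert abs(f_star - math.log(eta / (1 - eta))) < 1e-5
```

Both checks rely on the predicted probability being the sigmoid of the score; with a different link function the equivalence would not be expected to hold.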

2021b

2021c

2018a

2018b

2017a

2017b

from math import log

def log_loss(predicted, target):
    # Mean binary log loss (natural log) of predicted probabilities against 0/1 targets
    if len(predicted) != len(target):
        print('lengths not equal!')
        return None
    target = [float(x) for x in target]  # make sure all float values
    # clip predictions to [1e-15, 1-1e-15] so that log() never receives 0
    predicted = [min(max(x, 1e-15), 1 - 1e-15) for x in predicted]
    return -(1.0 / len(target)) * sum(
        t * log(p) + (1.0 - t) * log(1.0 - p)
        for p, t in zip(predicted, target))

if __name__ == '__main__':  # if you run at the command line as 'python utils.py'
    actual = [0, 1, 1, 1, 1, 0, 0, 1, 0, 1]
    pred = [0.24160452, 0.41107934, 0.37063768, 0.48732519, 0.88929869,
            0.60626423, 0.09678324, 0.38135864, 0.20463064, 0.21945892]
    print(log_loss(pred, actual))

2016

import numpy as np

# Note: mvmean and binarize_predictions are not defined in this excerpt;
# they are assumed to be helpers from the surrounding module.
def log_loss(solution, prediction, task = 'binary.classification'):
   """Log loss for binary and multiclass."""
   [sample_num, label_num] = solution.shape
   eps = 1e-15
   pred = np.copy(prediction) # beware: changes in prediction occur through this
   sol = np.copy(solution)
   if (task == 'multiclass.classification') and (label_num>1):
       # Make sure the lines add up to one for multi-class classification
       norma = np.sum(prediction, axis=1)
       for k in range(sample_num):
           pred[k,:] /= np.maximum(norma[k], eps)
       # Make sure there is a single label active per line for multi-class classification
       sol = binarize_predictions(solution, task='multiclass.classification')
       # For the base prediction, this solution is ridiculous in the multi-label case
   # Bounding of predictions to avoid log(0),1/0,...
   pred = np.minimum(1-eps, np.maximum(eps, pred))
   # Compute the log loss
   pos_class_log_loss = - mvmean(sol*np.log(pred), axis=0)
   if (task != 'multiclass.classification') or (label_num==1):
       # The multi-label case is a bunch of binary problems.
       # The second class is the negative class for each column.
       neg_class_log_loss = - mvmean((1-sol)*np.log(1-pred), axis=0)
       log_loss = pos_class_log_loss + neg_class_log_loss
       # Each column is an independent problem, so we average.
       # The probabilities in one line do not add up to one.
       # log_loss = mvmean(log_loss) 
       # print('binary {}'.format(log_loss))
       # In the multilabel case, the right thing is to AVERAGE, not sum
       # We return all the scores so we can normalize correctly later on
   else:
       # For the multiclass case the probabilities in one line add up to one.
       log_loss = pos_class_log_loss
       # We sum the contributions of the columns.
       log_loss = np.sum(log_loss) 
       #print('multiclass {}'.format(log_loss))
   return log_loss
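The function above depends on helpers that are not shown here. As a self-contained illustration of the two branches it distinguishes, the sketch below computes the per-column binary log loss and the multiclass log loss with NumPy only. The names binary_log_loss and multiclass_log_loss are hypothetical, and the sketch assumes that mvmean behaves like a plain mean along the given axis and that the solution matrix is already one-hot in the multiclass case.

```python
import numpy as np

def binary_log_loss(solution, prediction, eps=1e-15):
    """Per-column mean binary log loss; each column is an independent 0/1 problem."""
    sol = np.asarray(solution, dtype=float)
    # Clip probabilities away from 0 and 1 so that log() stays finite
    pred = np.clip(np.asarray(prediction, dtype=float), eps, 1 - eps)
    per_entry = -(sol * np.log(pred) + (1 - sol) * np.log(1 - pred))
    return per_entry.mean(axis=0)  # one score per column

def multiclass_log_loss(solution, prediction, eps=1e-15):
    """Mean over samples of -log(probability assigned to the true class)."""
    sol = np.asarray(solution, dtype=float)  # assumed one-hot rows
    pred = np.asarray(prediction, dtype=float)
    # Renormalize each row so the class probabilities add up to one
    pred = pred / np.maximum(pred.sum(axis=1, keepdims=True), eps)
    pred = np.clip(pred, eps, 1 - eps)
    return float(-(sol * np.log(pred)).sum(axis=1).mean())

# Small illustrative inputs (made up for this sketch)
y_multi = np.array([[1, 0, 0], [0, 1, 0]])
p_multi = np.array([[0.5, 0.25, 0.25], [0.1, 0.8, 0.1]])
print(multiclass_log_loss(y_multi, p_multi))

y_bin = np.array([[1], [0]])
p_bin = np.array([[0.9], [0.2]])
print(binary_log_loss(y_bin, p_bin))
```

Returning one score per column in the binary case mirrors the comment in the original code that the scores are kept separate so they can be averaged later.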

2015

2014