Mean squared logarithmic error

The mean squared logarithmic error (MSLE) is the expected value of the squared logarithmic error (or loss).

This metric is suited to targets that grow exponentially, such as population counts.
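To illustrate why a logarithmic error fits targets that span several orders of magnitude, the following sketch compares a 10% over-prediction at two scales. The population figures and the helper `squared_log_error` are hypothetical, chosen only for illustration, and the code uses nothing beyond the standard library.

```python
import math


def squared_log_error(y_true, y_pred):
    """Squared difference of log(1 + y) for a single sample (illustrative helper)."""
    return (math.log1p(y_true) - math.log1p(y_pred)) ** 2


# Hypothetical population counts: the same 10% over-prediction at two scales.
small_true, small_pred = 1_000.0, 1_100.0
large_true, large_pred = 1_000_000.0, 1_100_000.0

# Squared logarithmic errors: driven by the ratio, so nearly equal.
sle_small = squared_log_error(small_true, small_pred)
sle_large = squared_log_error(large_true, large_pred)

# Plain squared errors: driven by the absolute gap, so wildly different.
se_small = (small_true - small_pred) ** 2
se_large = (large_true - large_pred) ** 2

print("squared log errors:", sle_small, sle_large)
print("squared errors:    ", se_small, se_large)
```

The two squared logarithmic errors are almost identical, while the plain squared errors differ by a factor of about one million, so the large-scale sample would dominate an ordinary mean squared error.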

For the same absolute deviation, the metric penalizes under-estimates more heavily than over-estimates of the true value.
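The asymmetry can be checked with a small sketch. The true value and the two predictions below are assumptions picked for illustration: both predictions miss the target by 50 units, one from below and one from above.

```python
import math


def squared_log_error(y_true, y_pred):
    """Squared difference of log(1 + y) for a single sample (illustrative helper)."""
    return (math.log1p(y_true) - math.log1p(y_pred)) ** 2


y_true = 100.0  # hypothetical true value

under = squared_log_error(y_true, 50.0)   # prediction 50 units below
over = squared_log_error(y_true, 150.0)   # prediction 50 units above

print("under-estimate penalty:", under)
print("over-estimate penalty: ", over)
print("under > over:", under > over)
```

The under-estimate receives the larger penalty because the logarithm is steeper for small arguments, so the same absolute gap corresponds to a larger difference in log space when it lies below the true value.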

It is computed as:

\text{MSLE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}} - 1} ( \log_e(1 +y_i) - \log_e(1 + \hat{y}_i) )^2
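As a cross-check of the formula, here is a minimal pure-Python sketch. The function name `msle` is hypothetical, and rejecting negative inputs is a choice made for this sketch so that `log(1 + y)` is always defined. It is applied to the same `y_true` and `y_pred` values used in the scikit-learn example in this section.

```python
import math


def msle(y_true, y_pred):
    """Mean of (log(1 + y_i) - log(1 + y_hat_i))**2 over all samples (sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(v < 0 for v in list(y_true) + list(y_pred)):
        # Assumption of this sketch: only non-negative targets are accepted.
        raise ValueError("negative values are not supported")
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return sum(squared_log_errors) / len(squared_log_errors)


result = msle([3.0, 5.0, 2.5, 7.0], [2.5, 5.0, 4.0, 8.0])
print(result)
```

Up to floating-point rounding, the result should agree with the value returned by `mean_squared_log_error` for the same inputs.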

[1]:

from sklearn.metrics import mean_squared_log_error

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# 1/4 * ((log(1+3.0) - log(1+2.5))**2 +
#        (log(1+5.0) - log(1+5.0))**2 +
#        (log(1+2.5) - log(1+4.0))**2 +
#        (log(1+7.0) - log(1+8.0))**2 ) = 0.039730
#
mean_squared_log_error(
    # -------------------------------------------------------------------------
    # Ground truth (correct) target values.
    y_true=y_true,
    # -------------------------------------------------------------------------
    # Estimated target values.
    y_pred=y_pred,
    # -------------------------------------------------------------------------
    # Sample weights.
    sample_weight=None,
    # -------------------------------------------------------------------------
    # Defines how the scores of multiple outputs are aggregated.
    # * 'raw_values': Returns a full set of scores in case of multioutput input.
    # * 'uniform_average': Scores of all outputs are averaged with uniform
    #      weight.
    # * array-like: Weights used to average the scores of the outputs.
    multioutput="uniform_average",
    # -------------------------------------------------------------------------
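    # If True returns MSLE value, if False returns RMSLE value.
    # Note: `squared` was deprecated in scikit-learn 1.4 and removed in 1.6;
    # newer releases provide root_mean_squared_log_error for the RMSLE.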
    squared=True,
)

[1]:

0.03973012298459379
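The `sample_weight` argument is left at `None` in the call above. The sketch below shows one way to think about it, assuming the weighted score is the weighted average of the per-sample squared logarithmic errors. The helper `weighted_msle` and the weights are hypothetical and are not taken from the original example.

```python
import math


def weighted_msle(y_true, y_pred, sample_weight=None):
    """Weighted average of per-sample squared log errors (illustrative sketch)."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)  # uniform weights
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    weighted_sum = sum(w * e for w, e in zip(sample_weight, squared_log_errors))
    return weighted_sum / sum(sample_weight)


y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

unweighted = weighted_msle(y_true, y_pred)
# Hypothetical weights: the third sample counts twice as much as the others.
weighted = weighted_msle(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 1.0])

print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

With uniform weights the sketch reduces to the plain MSLE. Doubling the weight of the third sample, which has the largest squared log error, raises the score.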

[2]:

#
# sqrt(0.039730) = 0.199324
#
mean_squared_log_error(
    y_true,
    y_pred,
    squared=False,
)

[2]:

0.19932416558108

[3]:

y_true = [[0.5, 1.0], [1.0, 2.0], [7.0, 6.0]]
y_pred = [[0.5, 2.0], [1.0, 2.5], [8.0, 8.0]]

# y_true = [0.5, 1.0, 7.0]
# y_pred = [0.5, 1.0, 8.0]
# mean_squared_log_error(y_true, y_pred) = 0.004624
#
# y_true = [1.0, 2.0, 6.0]
# y_pred = [2.0, 2.5, 8.0]
# mean_squared_log_error(y_true, y_pred) = 0.083774

mean_squared_log_error(
    y_true,
    y_pred,
    multioutput="raw_values",
)

[3]:

array([0.00462428, 0.08377444])

[4]:

#
# 0.5 * (0.004624 + 0.083774) = 0.044199
#
mean_squared_log_error(
    y_true,
    y_pred,
)

[4]:

0.044199361889160516

[5]:

#
# 0.3 * 0.004624 + 0.7 * 0.083774 = 0.060029
#
mean_squared_log_error(
    y_true,
    y_pred,
    multioutput=[0.3, 0.7],
)

[5]:

0.06002939417970032
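The three multioutput results above can be reproduced by hand. The following pure-Python sketch computes one MSLE per output column and then combines the columns with uniform and with custom weights. The helper names `msle` and `column` are hypothetical.

```python
import math


def msle(y_true, y_pred):
    """Mean squared logarithmic error for a single output (sketch)."""
    errors = [(math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


def column(rows, j):
    """Extract column j from a list of rows."""
    return [row[j] for row in rows]


y_true = [[0.5, 1.0], [1.0, 2.0], [7.0, 6.0]]
y_pred = [[0.5, 2.0], [1.0, 2.5], [8.0, 8.0]]
n_outputs = len(y_true[0])

# One score per output column, analogous to multioutput="raw_values".
per_output = [msle(column(y_true, j), column(y_pred, j)) for j in range(n_outputs)]

# Plain average of the column scores, analogous to "uniform_average".
uniform = sum(per_output) / n_outputs

# Weighted average of the column scores, analogous to multioutput=[0.3, 0.7].
weights = [0.3, 0.7]
weighted = sum(w * s for w, s in zip(weights, per_output)) / sum(weights)

print(per_output)
print(uniform)
print(weighted)
```

Up to floating-point rounding, the three printed values should match the outputs of cells [3], [4] and [5].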