Mean squared logarithmic error

  • MSLE is the expected value of the squared logarithmic error (loss).

  • It suits targets that grow exponentially, such as population counts.

  • For the same absolute error, it penalizes underestimates more than overestimates.

  • It is computed as:

    \text{MSLE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}} - 1} ( \log_e(1 +y_i) - \log_e(1 + \hat{y}_i) )^2
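
The formula can be checked by hand with the standard library alone. The sketch below is a minimal illustration, not scikit-learn's implementation, and the helper name `msle` is hypothetical. It also illustrates the asymmetry noted in the list: for the same absolute error, an underestimate yields a larger loss than an overestimate.

```python
import math


def msle(y_true, y_pred):
    """Mean squared logarithmic error, following the formula above.

    Hypothetical helper for illustration; assumes every value is > -1
    so that log(1 + y) is defined.
    """
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return sum(squared_log_errors) / len(squared_log_errors)


# Same absolute error (5 units) around a true value of 10.
under = msle([10.0], [5.0])   # prediction below the true value
over = msle([10.0], [15.0])   # prediction above the true value

print(under, over)
print(under > over)  # the underestimate is penalized more
```
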

[1]:
from sklearn.metrics import mean_squared_log_error

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# 1/4 * ((log(1+3.0) - log(1+2.5))**2 +
#        (log(1+5.0) - log(1+5.0))**2 +
#        (log(1+2.5) - log(1+4.0))**2 +
#        (log(1+7.0) - log(1+8.0))**2 ) = 0.039730
#
mean_squared_log_error(
    # -------------------------------------------------------------------------
    # Ground truth (correct) target values.
    y_true=y_true,
    # -------------------------------------------------------------------------
    # Estimated target values.
    y_pred=y_pred,
    # -------------------------------------------------------------------------
    # Sample weights.
    sample_weight=None,
    # -------------------------------------------------------------------------
    # Defines how the scores of multiple outputs are aggregated.
    # * 'raw_values': Returns one score per output for multioutput input.
    # * 'uniform_average': Scores of all outputs are averaged with uniform
    #      weight.
    # * array-like: Weights used to average the scores of the outputs.
    multioutput="uniform_average",
    # -------------------------------------------------------------------------
    # If True returns MSLE value, if False returns RMSLE value.
    # Note: `squared` was deprecated in scikit-learn 1.4 and removed in 1.6.
    squared=True,
)
[1]:
0.03973012298459379
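
The `sample_weight` argument is left at `None` above. As a hedged sketch of what it does, the example below assumes it acts as the weights of a weighted average over the per-sample squared log errors, and checks that assumption against a manual NumPy computation. The weights themselves are hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
weights = [1.0, 1.0, 2.0, 1.0]  # hypothetical: the third sample counts double

weighted_msle = mean_squared_log_error(y_true, y_pred, sample_weight=weights)

# Manual check: weighted average of the per-sample squared log errors.
squared_log_errors = (np.log1p(y_true) - np.log1p(y_pred)) ** 2
manual = np.average(squared_log_errors, weights=weights)

print(weighted_msle, manual)
```

Because the third sample has the largest error, giving it more weight raises the score above the unweighted value.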
[2]:
#
# sqrt(0.039730) = 0.199324
#
mean_squared_log_error(
    y_true,
    y_pred,
    squared=False,
)
[2]:
0.19932416558108
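
Recent scikit-learn releases no longer accept `squared` (it was deprecated in 1.4 and removed in 1.6, where `root_mean_squared_log_error` takes its place). A version-independent sketch for single-output targets like the ones above, assuming only that `mean_squared_log_error` is available, takes the square root of the MSLE directly:

```python
import math

from sklearn.metrics import mean_squared_log_error

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# RMSLE is the square root of MSLE, so no `squared` argument is needed.
rmsle = math.sqrt(mean_squared_log_error(y_true, y_pred))
print(rmsle)
```
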
[3]:
y_true = [[0.5, 1.0], [1.0, 2.0], [7.0, 6.0]]
y_pred = [[0.5, 2.0], [1.0, 2.5], [8.0, 8.0]]

# First output (column 0):
# y_true = [0.5, 1.0, 7.0]
# y_pred = [0.5, 1.0, 8.0]
# mean_squared_log_error(y_true, y_pred) = 0.004624
#
# Second output (column 1):
# y_true = [1.0, 2.0, 6.0]
# y_pred = [2.0, 2.5, 8.0]
# mean_squared_log_error(y_true, y_pred) = 0.083774

mean_squared_log_error(
    y_true,
    y_pred,
    multioutput="raw_values",
)
[3]:
array([0.00462428, 0.08377444])
[4]:
#
# 0.5 * (0.004624 + 0.083774) = 0.044199
#
mean_squared_log_error(
    y_true,
    y_pred,
)
[4]:
0.044199361889160516
[5]:
#
# 0.3 * 0.004624 + 0.7 * 0.083774 = 0.060029
#
mean_squared_log_error(
    y_true,
    y_pred,
    multioutput=[0.3, 0.7],
)
[5]:
0.06002939417970032
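
The three `multioutput` modes used above can be reproduced with NumPy alone. This is a minimal sketch of the arithmetic in the cell comments, not scikit-learn's implementation, and the variable names are illustrative.

```python
import numpy as np

y_true = np.array([[0.5, 1.0], [1.0, 2.0], [7.0, 6.0]])
y_pred = np.array([[0.5, 2.0], [1.0, 2.5], [8.0, 8.0]])

# Per-output MSLE: average the squared log errors down each column (axis 0).
raw_values = np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2, axis=0)

# 'uniform_average' corresponds to a plain mean over the outputs.
uniform = raw_values.mean()

# An array-like `multioutput` corresponds to a weighted average of the outputs.
weighted = np.average(raw_values, weights=[0.3, 0.7])

print(raw_values, uniform, weighted)
```
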