QuantileRegressor

  • This regressor estimates the median or other quantiles of y conditional on X, whereas ordinary least squares (OLS) estimates the conditional mean.

  • This linear model predicts \hat{y}(w, X) = Xw for the q-th quantile, with q \in (0, 1).

  • The coefficients w are obtained by minimizing the following objective function:

\min_w \frac{1}{n_\text{samples}} \sum_i \text{PB}_q (y_i - X_iw) + \alpha * ||w||_1

  • PB_q is the pinball loss, defined as:

    \text{PB}_q(t) = q \max(t, 0) + (1 - q) \max(-t, 0) = \begin{cases} q t, & t > 0, \\ 0, & t = 0, \\ (q - 1) t, & t < 0 \end{cases}

  • The regressor is robust to outliers, because the loss grows linearly rather than quadratically with the residual.

  • It is used to forecast intervals rather than only the mean (a point prediction). In many cases, intervals are otherwise computed by assuming that the error follows a normal distribution with zero mean and constant variance.

  • It allows forecasting intervals when the errors have non-constant variance or do not follow a normal distribution.

[1]:
import numpy as np

n_samples, n_features = 10, 2
rng = np.random.RandomState(0)

y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
[2]:
from sklearn.linear_model import QuantileRegressor

estimator = QuantileRegressor(
    # -------------------------------------------------------------------------
    # The quantile that the model tries to predict. It must be strictly between
    # 0 and 1. If 0.5 (default), the model predicts the 50% quantile, i.e. the
    # median.
    quantile=0.5,
    # -------------------------------------------------------------------------
    # Regularization constant that multiplies the L1 penalty term.
    alpha=1.0,
    # -------------------------------------------------------------------------
    # Whether or not to fit the intercept.
    fit_intercept=True,
    # -------------------------------------------------------------------------
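    # Method used by scipy.optimize.linprog to solve the linear programming
    # formulation.
    # * 'highs-ds'
    # * 'highs-ipm'
    # * 'highs'
    # * 'interior-point'
    # * 'revised simplex'
    # Note: 'interior-point' is no longer available from SciPy 1.11 onward;
    # with a recent SciPy, use 'highs' instead.
    solver='interior-point',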
    # -------------------------------------------------------------------------
    # Additional parameters passed to scipy.optimize.linprog as options. If
    # None and if solver='interior-point', then {"lstsq": True} is passed to
    # scipy.optimize.linprog for the sake of stability.
    solver_options=None,
)
[3]:
estimator.fit(X, y)
[3]:
QuantileRegressor(solver='interior-point')
[4]:
estimator.coef_
[4]:
array([ 4.45409994e-14, -1.83417568e-13])
[5]:
estimator.intercept_
[5]:
0.7194722439589483