SVR: Support Vector Regression

For regression, the primal problem is defined as:

\min_{w,b, \zeta, \zeta^*} \;\;\; \frac{1}{2} w^T w \; + \; C \sum_{i=1}^n (\zeta_i + \zeta_i^*)

subject to:

\begin{split} y_i - \left( w^T \phi(x_i) + b \right) \le \; & \epsilon + \zeta_i \\ \\ \left( w^T \phi(x_i) + b \right) - y_i \le \; & \epsilon + \zeta_i^* \\ \\ \zeta_i, \zeta_i^* \ge & \; 0 \\ \\ i = & 1, ..., n \\ \end{split}

Here a sample is penalized only when its prediction deviates from the true target by more than \epsilon; the slack variables \zeta_i and \zeta_i^* absorb the excess deviation.

The dual problem is:

\min_{\alpha, \alpha^*} \;\;\; \frac{1}{2} (\alpha - \alpha^*)^T Q (\alpha - \alpha^*) \; + \; \epsilon e^T (\alpha + \alpha^*) - y^T (\alpha - \alpha^*)

where e is the vector of all ones and Q_{ij} = K(x_i, x_j) = \phi(x_i)^T \phi(x_j) is the kernel matrix.

subject to:

\begin{split} e^T (\alpha - \alpha^*) = \; & 0 \\ \\ 0 \le \alpha_i, \alpha_i^* \le & \; C \\ \\ i = & 1, ..., n \\ \end{split}

The prediction of the model is then computed as:

\sum_{i \in SV} (\alpha_i - \alpha_i^*) K(x_i, x) + b

where SV is the set of support vectors.

[1]:
import numpy as np

# toy regression problem: 10 samples with 5 features each
n_samples, n_features = 10, 5
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
[2]:
from sklearn.svm import SVR

svr = SVR(
    # --------------------------------------------------------------------------
    # Specifies the kernel type to be used in the algorithm. If none is given,
    # ‘rbf’ will be used.
    # * 'linear'
    # * 'poly'
    # * 'rbf'
    # * 'sigmoid'
    kernel='rbf',
    # --------------------------------------------------------------------------
    # Degree of the polynomial kernel function (‘poly’). Must be non-negative.
    # Ignored by all other kernels.
    degree=3,
    # ----------------------------------------------------------------------------
    # Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
    # * if gamma='scale' (default) is passed then it uses
    #   1 / (n_features * X.var()) as value of gamma,
    # * if ‘auto’, uses 1 / n_features
    # * if float, must be non-negative.
    gamma='scale',
    # ----------------------------------------------------------------------------
    # Independent term in kernel function. It is only significant in ‘poly’ and
    # ‘sigmoid’.
    coef0=0.0,
    # --------------------------------------------------------------------------
    # Tolerance for stopping criterion.
    tol=1e-3,
    # --------------------------------------------------------------------------
    # Regularization parameter. The strength of the regularization is inversely
    # proportional to C. Must be strictly positive. The penalty is a squared l2
    # penalty.
    C=1.0,
    # --------------------------------------------------------------------------
    # Epsilon in the epsilon-SVR model. It specifies the epsilon-tube within
    # which no penalty is associated in the training loss function with points
    # predicted within a distance epsilon from the actual value. Must be
    # non-negative.
    epsilon=0.1,
    # --------------------------------------------------------------------------
    # Hard limit on iterations within solver, or -1 for no limit.
    max_iter=-1,
)

svr.fit(X, y)
svr.predict(X)
[2]:
array([ 0.80693615,  0.5000337 ,  0.88964549,  1.75067062,  0.83376958,
        0.3181422 ,  0.94095627, -0.05129545,  0.27037618,  0.31066025])
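
By the KKT conditions, a training sample whose residual exceeds epsilon must become a support vector, so every sample absent from support_ lies inside the epsilon-tube. A minimal sanity check; the svr.tol slack added below is only an allowance for the approximate solver, not a strict bound:

residuals = np.abs(y - svr.predict(X))
non_sv = np.setdiff1d(np.arange(n_samples), svr.support_)
# non-support samples incur no penalty: their residual stays within epsilon
print(np.all(residuals[non_sv] <= svr.epsilon + svr.tol))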
[3]:
svr.dual_coef_
[3]:
array([[ 1.        , -0.35876841,  1.        ,  1.        , -1.        ,
        -0.78118084, -1.        ,  0.13994925]])
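
dual_coef_ holds the differences \alpha_i - \alpha_i^* for the support vectors; by the box constraint of the dual problem above, each coefficient lies in [-C, C]. A quick check:

# each alpha_i - alpha_i^* is bounded by C in absolute value
print(np.all(np.abs(svr.dual_coef_) <= svr.C))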
[4]:
svr.fit_status_
[4]:
0
[5]:
svr.intercept_
[5]:
array([0.81363927])
[6]:
svr.n_support_
[6]:
array([8], dtype=int32)
[7]:
svr.support_
[7]:
array([0, 1, 3, 4, 5, 7, 8, 9], dtype=int32)
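
support_ indexes the support vectors within the training set, n_support_ counts them, and dual_coef_ stores one coefficient per support vector, so all three sizes must agree:

# one index and one dual coefficient per support vector
print(len(svr.support_) == svr.n_support_[0] == svr.dual_coef_.shape[1])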
[8]:
svr.support_vectors_
[8]:
array([[ 0.14404357,  1.45427351,  0.76103773,  0.12167502,  0.44386323],
       [ 0.33367433,  1.49407907, -0.20515826,  0.3130677 , -0.85409574],
       [-1.45436567,  0.04575852, -0.18718385,  1.53277921,  1.46935877],
       [ 0.15494743,  0.37816252, -0.88778575, -1.98079647, -0.34791215],
       [ 0.15634897,  1.23029068,  1.20237985, -0.38732682, -0.30230275],
       [-0.4380743 , -1.25279536,  0.77749036, -1.61389785, -0.21274028],
       [-0.89546656,  0.3869025 , -0.51080514, -1.18063218, -0.02818223],
       [ 0.42833187,  0.06651722,  0.3024719 , -0.63432209, -0.36274117]])
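
The fitted attributes are enough to reproduce predict by hand via the prediction formula above. A minimal sketch, assuming the RBF kernel used here and gamma='scale', which resolves to 1 / (n_features * X.var()):

from sklearn.metrics.pairwise import rbf_kernel

# resolve gamma='scale' manually
gamma = 1.0 / (n_features * X.var())

# sum over support vectors of (alpha_i - alpha_i^*) K(x_i, x), plus the intercept b
K = rbf_kernel(svr.support_vectors_, X, gamma=gamma)
manual = svr.dual_coef_ @ K + svr.intercept_

# should print True: the manual kernel expansion matches svr.predict
print(np.allclose(manual.ravel(), svr.predict(X)))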