Nystroem
A kernel is a function that computes:
k(x, x_i) = \langle \phi(x), \phi(x_i) \rangle
where \langle \cdot, \cdot \rangle denotes the inner product between the feature vectors \phi(x) and \phi(x_i).
For large datasets it is preferable to replace the exact computation of a kernel with an approximation method.
Nystroem approximates a kernel by sampling a subset of the data, thereby avoiding the construction of an n \times n matrix, where n is the number of samples.
Approximation methods provide non-linear transformations of the input features; the resulting transformations can serve as the basis for linear models and other algorithms, for example in combination with an SGDClassifier estimator.
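As a sketch of that combination, a Nystroem map can be chained to an SGDClassifier in a Pipeline (the parameter values here are illustrative, not prescribed by the text):

```python
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

X, y = load_digits(n_class=9, return_X_y=True)
X = X / 16.0

# Approximate RBF features, then fit a linear model with SGD on them.
model = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.2, n_components=300, random_state=1),
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=1),
)
model.fit(X, y)
model.score(X, y)
```

The linear classifier never sees the original pixels, only the approximate kernel features, so training cost stays linear in the number of samples.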
[1]:
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.svm import LinearSVC
X, y = load_digits(
n_class=9,
return_X_y=True,
)
data = X / 16.0
linearSVC = LinearSVC()
nystroem = Nystroem(
# -------------------------------------------------------------------------
# Kernel map to be approximated.
kernel="rbf",
# -------------------------------------------------------------------------
# Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and
# sigmoid kernels.
gamma=0.2,
# -------------------------------------------------------------------------
# Zero coefficient for polynomial and sigmoid kernels. Ignored by other
# kernels.
coef0=None,
# -------------------------------------------------------------------------
# Degree of the polynomial kernel. Ignored by other kernels.
degree=None,
# -------------------------------------------------------------------------
# Number of features to construct. How many data points will be used to
# construct the mapping.
n_components=300,
# -------------------------------------------------------------------------
# Pseudo-random number generator to control the uniform sampling without
# replacement of n_components of the training data to construct the basis
# kernel.
random_state=1,
)
data_transformed = nystroem.fit_transform(data)
linearSVC.fit(data_transformed, y)
linearSVC.score(data_transformed, y)
[1]:
0.9987631416202845
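The quality of the approximation can be checked directly: inner products of the transformed features should approximate the exact RBF kernel. A quick sanity check, reusing the same (illustrative) gamma as above:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_digits(n_class=9, return_X_y=True)
data = X / 16.0

nystroem = Nystroem(kernel="rbf", gamma=0.2, n_components=300, random_state=1)
data_transformed = nystroem.fit_transform(data)

# Exact kernel matrix vs. its low-rank Nystroem reconstruction.
K_exact = rbf_kernel(data, gamma=0.2)
K_approx = data_transformed @ data_transformed.T
print(np.abs(K_exact - K_approx).mean())
```

The mean absolute deviation between the two matrices should be small when n_components captures most of the kernel's spectrum.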
[2]:
#
# Dimensions of the dataset
#
data.shape
[2]:
(1617, 64)
[3]:
#
# Dimensions of the transformed dataset
#
data_transformed.shape
[3]:
(1617, 300)
[4]:
#
# Subset of training points used to construct the feature map.
#
nystroem.components_
[4]:
array([[0. , 0. , 0.5625, ..., 0. , 0. , 0. ],
[0. , 0. , 0.4375, ..., 0.3125, 0. , 0. ],
[0. , 0. , 0.125 , ..., 0.0625, 0. , 0. ],
...,
[0. , 0. , 0.6875, ..., 0. , 0. , 0. ],
[0. , 0. , 0.75 , ..., 0.5 , 0. , 0. ],
[0. , 0. , 0. , ..., 0.4375, 0. , 0. ]])
[5]:
nystroem.components_.shape
[5]:
(300, 64)
[6]:
#
# Indices of components_ in the training set.
#
nystroem.component_indices_
[6]:
array([ 108, 1339, 258, ..., 1096, 235, 1061])
[7]:
nystroem.component_indices_.shape
[7]:
(1617,)
[8]:
#
# Patterns in the original data space
#
data[nystroem.component_indices_, :]
[8]:
array([[0. , 0. , 0.5625, ..., 0. , 0. , 0. ],
[0. , 0. , 0.4375, ..., 0.3125, 0. , 0. ],
[0. , 0. , 0.125 , ..., 0.0625, 0. , 0. ],
...,
[0. , 0. , 0.25 , ..., 0. , 0. , 0. ],
[0. , 0. , 0. , ..., 0. , 0. , 0. ],
[0. , 0.125 , 0.625 , ..., 0.1875, 0. , 0. ]])
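In this scikit-learn version, component_indices_ holds a permutation of all training indices, and its first n_components entries select the rows that become components_. A quick check, assuming that layout (the slice is a no-op in versions where component_indices_ already has length n_components):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem

X, y = load_digits(n_class=9, return_X_y=True)
data = X / 16.0

nystroem = Nystroem(kernel="rbf", gamma=0.2, n_components=300, random_state=1)
nystroem.fit(data)

# The basis points are the training rows picked by the first 300 indices.
rows = data[nystroem.component_indices_[:300], :]
print(np.allclose(rows, nystroem.components_))
```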
[9]:
data[nystroem.component_indices_, :].shape
[9]:
(1617, 64)
[10]:
#
# Normalization matrix needed for embedding.
#
nystroem.normalization_
[10]:
array([[ 1.74871777e+00, 8.91967740e-03, -2.67099689e-03, ...,
5.14032084e-03, -2.41497683e-03, -5.72252710e-03],
[ 8.91967740e-03, 2.42597573e+00, -2.37991804e-03, ...,
-4.56829956e-03, -3.13785752e-03, 1.45713393e-03],
[-2.67099689e-03, -2.37991804e-03, 2.03135146e+00, ...,
1.17305163e-02, -8.50655899e-04, 3.69518591e-02],
...,
[ 5.14032084e-03, -4.56829956e-03, 1.17305163e-02, ...,
1.78886081e+00, 1.08028490e-02, -2.16321887e-03],
[-2.41497683e-03, -3.13785752e-03, -8.50655899e-04, ...,
1.08028490e-02, 1.71934129e+00, -9.94733358e-03],
[-5.72252710e-03, 1.45713393e-03, 3.69518591e-02, ...,
-2.16321887e-03, -9.94733358e-03, 2.49165989e+00]])
[11]:
nystroem.normalization_.shape
[11]:
(300, 300)
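The normalization matrix is what turns raw kernel evaluations against the basis points into the final embedding: transform computes the kernel between each sample and components_, then multiplies by normalization_.T. A sketch reproducing that, assuming the RBF kernel and gamma used above:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_digits(n_class=9, return_X_y=True)
data = X / 16.0

nystroem = Nystroem(kernel="rbf", gamma=0.2, n_components=300, random_state=1)
data_transformed = nystroem.fit_transform(data)

# Kernel between every sample and the 300 basis points, then normalization.
K_nm = rbf_kernel(data, nystroem.components_, gamma=0.2)
manual = K_nm @ nystroem.normalization_.T
print(np.allclose(manual, data_transformed))
```

normalization_ is the inverse square root of the kernel matrix computed on the basis points, which is why the product above reproduces fit_transform's output.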