當前位置：首頁 > 编程语言 > python >内容正文

python

【Python-ML】非线性映射降维-KPCA方法

發布時間：2025/4/16 python 19 豆豆

生活随笔收集整理的這篇文章主要介紹了【Python-ML】非线性映射降维-KPCA方法小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

# -*- coding: utf-8 -*- ''' Created on 2018年1月18日 @author: Jason.F @summary: 特征抽取-KPCA方法，核主成分分析方法，RBF核實現 ''' import pandas as pd import numpy as np import matplotlib.pyplot as plt from scipy.spatial.distance import pdist,squareform from scipy import exp from scipy.linalg import eigh from sklearn.datasets import make_moons from sklearn.datasets import make_circles from sklearn.decomposition import PCA from matplotlib.ticker import FormatStrFormatter def rbf_kernel_pca(X,gama,n_components):'''RBF kernel PCA implementation.Parameters:X:{Numpy ndarray},shape=[n_samples,n_features]gama:float,Tuning parameter of the RBF kerneln_components:int,Number of principal components to returnReturns:X_pc:{Numpy ndarray},shape=[n_samples,n_features],Projected dataset'''#1：計算樣本對歐幾里得距離，并生成核矩陣#k(x,y)=exp(-gama *||x-y||^2)，x和y表示樣本，構建一個NXN的核矩陣，矩陣值是樣本間的歐氏距離值。#1.1:calculate pairwise squared Euclidean distances in the MXN dimensional dataset.sq_dists = pdist (X, 'sqeuclidean') #計算兩兩樣本間歐幾里得距離#1.2:convert pairwise distances into a square matrix.mat_sq_dists=squareform(sq_dists) #距離平方#1.3:compute the symmetric kernel matrix.K=exp(-gama * mat_sq_dists) #2:聚集核矩陣K'=K-L*K-K*L + L*K*L，其中L是一個nXn的矩陣(和核矩陣K的維數相同，所有的值都是1/n。#聚集核矩陣的必要性是：樣本經過標準化處理后，當在生成協方差矩陣并以非線性特征的組合替代點積時，所有特征的均值為0；但用低維點積計算時并沒有精確計算新的高維特征空間，也無法確定新特征空間的中心在零點。#center the kernel matrix.N=K.shape[0]one_n = np.ones((N,N))/N #NXN單位矩陣K=K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)#3：對聚集后的核矩陣求取特征值和特征向量#obtaining eigenpairs from the centered kernel matrix#numpy.eigh returns them in sorted order.eigvals,eigvecs = eigh(K)#4：選擇前k個特征值所對應的特征向量，和PCA不同，KPCA得到的K個特征，不是主成分軸，而是高維映射到低維后的低維特征數量#核化過程是低維映射到高維，pca是降維，經過核化后的維度已經不是原來的特征空間。#核化是低維映射到高維，但并不是在高維空間計算(非線性特征組合)而是在低維空間計算(點積)，做到這點關鍵是核函數，核函數通過兩個向量點積來度量向量間相似度，能在低維空間內近似計算出高維空間的非線性特征空間。#collect the top k eigenvectors (projected samples).X_pc = np.column_stack((eigvecs[:,-i] for i in range(1,n_components+1)))return X_pc#case1:分離半月形數據 #1.1：生成二維線性不可分數據 X,y=make_moons(n_samples=100,random_state=123) plt.scatter(X[y==0,0],X[y==0,1],color='red',marker='^',alpha=0.5) plt.scatter(X[y==1,0],X[y==1,1],color='blue',marker='o',alpha=0.5) plt.show() #1.2：PCA降維，映射到主成分，仍不能很好線性分類 sk_pca = PCA(n_components=2) X_spca=sk_pca.fit_transform(X) fig,ax = plt.subplots(nrows=1,ncols=2,figsize=(7,3)) ax[0].scatter(X_spca[y==0,0],X_spca[y==0,1],color='red',marker='^',alpha=0.5) ax[0].scatter(X_spca[y==1,0],X_spca[y==1,1],color='blue',marker='o',alpha=0.5) ax[1].scatter(X_spca[y==0,0],np.zeros((50,1))+0.02,color='red',marker='^',alpha=0.5) ax[1].scatter(X_spca[y==1,0],np.zeros((50,1))-0.02,color='blue',marker='^',alpha=0.5) ax[0].set_xlabel('PC1') ax[0].set_ylabel('PC2') ax[1].set_ylim([-1,1]) ax[1].set_yticks([]) ax[1].set_xlabel('PC1') plt.show() #1.3：利用基于RBF核的KPCA來實現線性可分 X_kpca=rbf_kernel_pca(X, gama=15, n_components=2) fig,ax = plt.subplots(nrows=1,ncols=2,figsize=(7,3)) ax[0].scatter(X_kpca[y==0,0],X_kpca[y==0,1],color='red',marker='^',alpha=0.5) ax[0].scatter(X_kpca[y==1,0],X_kpca[y==1,1],color='blue',marker='o',alpha=0.5) ax[1].scatter(X_kpca[y==0,0],np.zeros((50,1))+0.02,color='red',marker='^',alpha=0.5) ax[1].scatter(X_kpca[y==1,0],np.zeros((50,1))-0.02,color='blue',marker='^',alpha=0.5) ax[0].set_xlabel('PC1') ax[0].set_ylabel('PC2') ax[1].set_ylim([-1,1]) ax[1].set_yticks([]) ax[1].set_xlabel('PC1') ax[0].xaxis.set_major_formatter(FormatStrFormatter('%0.1f')) ax[1].xaxis.set_major_formatter(FormatStrFormatter('%0.1f')) plt.show()#case2：分離同心圓 #2.1：生成同心圓數據 X,y=make_circles(n_samples=1000,random_state=123,noise=0.1,factor=0.2) plt.scatter(X[y==0,0],X[y==0,1],color='red',marker='^',alpha=0.5) plt.scatter(X[y==1,0],X[y==1,1],color='blue',marker='o',alpha=0.5) plt.show() #2.2：標準PCA映射 sk_pca = PCA(n_components=2) X_spca=sk_pca.fit_transform(X) fig,ax = plt.subplots(nrows=1,ncols=2,figsize=(7,3)) ax[0].scatter(X_spca[y==0,0],X_spca[y==0,1],color='red',marker='^',alpha=0.5) ax[0].scatter(X_spca[y==1,0],X_spca[y==1,1],color='blue',marker='o',alpha=0.5) ax[1].scatter(X_spca[y==0,0],np.zeros((500,1))+0.02,color='red',marker='^',alpha=0.5) ax[1].scatter(X_spca[y==1,0],np.zeros((500,1))-0.02,color='blue',marker='^',alpha=0.5) ax[0].set_xlabel('PC1') ax[0].set_ylabel('PC2') ax[1].set_ylim([-1,1]) ax[1].set_yticks([]) ax[1].set_xlabel('PC1') plt.show() #2.3：RBF-KPCA映射 X_kpca=rbf_kernel_pca(X, gama=15, n_components=2) fig,ax = plt.subplots(nrows=1,ncols=2,figsize=(7,3)) ax[0].scatter(X_kpca[y==0,0],X_kpca[y==0,1],color='red',marker='^',alpha=0.5) ax[0].scatter(X_kpca[y==1,0],X_kpca[y==1,1],color='blue',marker='o',alpha=0.5) ax[1].scatter(X_kpca[y==0,0],np.zeros((500,1))+0.02,color='red',marker='^',alpha=0.5) ax[1].scatter(X_kpca[y==1,0],np.zeros((500,1))-0.02,color='blue',marker='^',alpha=0.5) ax[0].set_xlabel('PC1') ax[0].set_ylabel('PC2') ax[1].set_ylim([-1,1]) ax[1].set_yticks([]) ax[1].set_xlabel('PC1') ax[0].xaxis.set_major_formatter(FormatStrFormatter('%0.1f')) ax[1].xaxis.set_major_formatter(FormatStrFormatter('%0.1f')) plt.show()

case1結果：

case2結果：

總結

以上是生活随笔為你收集整理的【Python-ML】非线性映射降维-KPCA方法的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇：【Python-ML】SKlearn库特
下一篇：【Python-ML】非线性映射降维-K