

3.10 Program Example: Neural Network Design — Machine Learning Notes (Stanford, Andrew Ng)

Published: 2025/4/5

Neural Network Design

When designing the structure of a neural network, the following guidelines are usually followed:

  • The number of units in the input layer equals the number of sample features.
  • The number of units in the output layer equals the number of classes.
  • More units per hidden layer generally means higher classification accuracy, but also higher computational cost, so quality and performance have to be balanced.
  • By default there is no hidden layer (a perceptron); if there are several hidden layers, they should preferably all have the same number of units.

Accordingly, we design the neural network module as follows:

    • Use the sigmoid function as the activation function:

      $$g(z) = \frac{1}{1+e^{-z}}$$

      Its derivative, expressed in terms of the activation $a = g(z)$, is:

      $$g'(z) = g(z)\,(1 - g(z)) = a(1-a)$$
```python
def sigmoid(z):
    """sigmoid"""
    return 1 / (1 + np.exp(-z))

def sigmoidDerivative(a):
    """Derivative of sigmoid, in terms of the activation a = g(z)"""
    return np.multiply(a, (1 - a))
```
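As a quick sanity check (a standalone sketch, not part of the module itself), the analytic derivative can be compared against a central finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoidDerivative(a):
    return np.multiply(a, (1 - a))

z = np.array([-2.0, 0.0, 3.5])
a = sigmoid(z)
h = 1e-6
# central difference: (g(z+h) - g(z-h)) / 2h
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
analytic = sigmoidDerivative(a)
print(np.max(np.abs(numeric - analytic)))  # tiny, on the order of 1e-10
```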
    • Design a function that initializes the weight matrices:
```python
def initThetas(hiddenNum, unitNum, inputSize, classNum, epsilon):
    """Initialize the weight matrices

    Args:
        hiddenNum  number of hidden layers
        unitNum    number of units per hidden layer
        inputSize  size of the input layer
        classNum   number of classes
        epsilon    initialization range [-epsilon, epsilon]
    Returns:
        Thetas     list of weight matrices
    """
    hiddens = [unitNum for i in range(hiddenNum)]
    units = [inputSize] + hiddens + [classNum]
    Thetas = []
    for idx, unit in enumerate(units):
        if idx == len(units) - 1:
            break
        nextUnit = units[idx + 1]
        # one extra column for the bias term
        Theta = np.random.rand(nextUnit, unit + 1) * 2 * epsilon - epsilon
        Thetas.append(Theta)
    return Thetas
```
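For example (a standalone sketch repeating `initThetas` so it runs on its own), a network with 400 input features, one hidden layer of 25 units, and 10 classes gets weight matrices of shape (25, 401) and (10, 26), where the extra column in each is the bias term:

```python
import numpy as np

def initThetas(hiddenNum, unitNum, inputSize, classNum, epsilon):
    hiddens = [unitNum for _ in range(hiddenNum)]
    units = [inputSize] + hiddens + [classNum]
    Thetas = []
    for idx in range(len(units) - 1):
        # units[idx] + 1: one extra column for the bias term
        Theta = np.random.rand(units[idx + 1], units[idx] + 1) * 2 * epsilon - epsilon
        Thetas.append(Theta)
    return Thetas

Thetas = initThetas(hiddenNum=1, unitNum=25, inputSize=400, classNum=10, epsilon=1)
print([T.shape for T in Thetas])  # [(25, 401), (10, 26)]
```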
    • Define functions that unroll the parameters into a vector and restore them:
```python
def unroll(matrixes):
    """Unroll a list of matrices into a single vector"""
    vec = []
    for matrix in matrixes:
        vector = matrix.reshape(1, -1)[0]
        vec = np.concatenate((vec, vector))
    return vec

def roll(vector, shapes):
    """Restore a vector into a list of matrices of the given shapes"""
    matrixes = []
    begin = 0
    for shape in shapes:
        end = begin + shape[0] * shape[1]
        matrix = vector[begin:end].reshape(shape)
        begin = end
        matrixes.append(matrix)
    return matrixes
```
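A round trip through `unroll` and `roll` should reproduce the original matrices exactly. A self-contained check (repeating both functions so the snippet runs on its own):

```python
import numpy as np

def unroll(matrixes):
    vec = []
    for matrix in matrixes:
        vec = np.concatenate((vec, matrix.reshape(1, -1)[0]))
    return vec

def roll(vector, shapes):
    matrixes = []
    begin = 0
    for shape in shapes:
        end = begin + shape[0] * shape[1]
        matrixes.append(vector[begin:end].reshape(shape))
        begin = end
    return matrixes

A = np.arange(6).reshape(2, 3).astype(float)
B = np.arange(4).reshape(4, 1).astype(float)
vec = unroll([A, B])
restored = roll(vec, [A.shape, B.shape])
print(np.array_equal(restored[0], A) and np.array_equal(restored[1], B))  # True
```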
    • Define the gradient-checking procedure:
```python
def gradientCheck(Thetas, X, y, theLambda):
    """Gradient checking

    Args:
        Thetas     weight matrices
        X          samples
        y          labels
        theLambda  regularization parameter
    Returns:
        checked    whether the check passed
    """
    m, n = X.shape
    # forward propagation to compute the activations
    a = fp(Thetas, X)
    # back propagation to compute the gradients
    D = bp(Thetas, a, y, theLambda)
    # cost at the current parameters
    J = computeCost(Thetas, y, theLambda, a=a)
    DVec = unroll(D)
    # numerical gradient approximation
    epsilon = 1e-4
    gradApprox = np.zeros(DVec.shape)
    ThetaVec = unroll(Thetas)
    shapes = [Theta.shape for Theta in Thetas]
    for i, item in enumerate(ThetaVec):
        ThetaVec[i] = item - epsilon
        JMinus = computeCost(roll(ThetaVec, shapes), y, theLambda, X=X)
        ThetaVec[i] = item + epsilon
        JPlus = computeCost(roll(ThetaVec, shapes), y, theLambda, X=X)
        # restore the parameter before moving to the next one
        ThetaVec[i] = item
        gradApprox[i] = (JPlus - JMinus) / (2 * epsilon)
    # use the Euclidean distance to measure how close the two gradients are
    diff = np.linalg.norm(gradApprox - DVec)
    return diff < 1e-2
```
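The core of the check is the central-difference approximation $(J(\theta+\epsilon) - J(\theta-\epsilon)) / 2\epsilon$. A minimal standalone illustration on a toy cost $J(\theta) = \theta^2$, whose true gradient is $2\theta$:

```python
# toy cost function with a known gradient
def J(theta):
    return theta ** 2

theta = 3.0
epsilon = 1e-4
# central difference: should be very close to 2 * theta = 6
gradApprox = (J(theta + epsilon) - J(theta - epsilon)) / (2 * epsilon)
print(gradApprox)  # ~6.0
```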
    • Define the cost function:

      $$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\Big[y^{(i)}_k \log\big((h_\Theta(x^{(i)}))_k\big) + (1-y^{(i)}_k)\log\big(1-(h_\Theta(x^{(i)}))_k\big)\Big] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta^{(l)}_{j,i}\big)^2$$

      In matrix form, with $.*$ denoting the element-wise product:

      $$J(\Theta) = -\frac{1}{m}\sum\Big(Y^T .* \log\big(h_\Theta(X)\big) + (1-Y^T) .* \log\big(1-h_\Theta(X)\big)\Big) + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i,j}\big(\Theta^{(l)}_{j,i}\big)^2$$
```python
def computeCost(Thetas, y, theLambda, X=None, a=None):
    """Compute the cost

    Args:
        Thetas     list of weight matrices
        X          samples
        y          labels
        a          activations of each layer (computed from X if not given)
    Returns:
        J          cost
    """
    m = y.shape[0]
    if a is None:
        a = fp(Thetas, X)
    error = -np.sum(np.multiply(y.T, np.log(a[-1])) +
                    np.multiply((1 - y).T, np.log(1 - a[-1])))
    # regularization term: sum of squared weights, excluding the bias column
    reg = np.sum([np.sum(Theta[:, 1:] ** 2) for Theta in Thetas])
    return (1.0 / m) * error + (1.0 / (2 * m)) * theLambda * reg
```
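A standalone spot check (repeating `computeCost` so it runs on its own, with the activations passed in directly): one sample with one-hot label [1, 0] and output activations (0.8, 0.3) should cost $-(\log 0.8 + \log 0.7) \approx 0.5798$:

```python
import numpy as np

def computeCost(Thetas, y, theLambda, X=None, a=None):
    m = y.shape[0]
    error = -np.sum(np.multiply(y.T, np.log(a[-1])) +
                    np.multiply((1 - y).T, np.log(1 - a[-1])))
    reg = np.sum([np.sum(Theta[:, 1:] ** 2) for Theta in Thetas])
    return (1.0 / m) * error + (1.0 / (2 * m)) * theLambda * reg

y = np.array([[1.0, 0.0]])        # one sample, first of two classes
a_out = np.array([[0.8], [0.3]])  # output activations: units x samples
J = computeCost([], y, 0, a=[a_out])
print(round(J, 4))  # 0.5798
```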
    • Design the forward-propagation process (for a network with one hidden layer):

      $$a^{(1)} = x$$
      $$z^{(2)} = \Theta^{(1)}a^{(1)}, \quad a^{(2)} = g(z^{(2)})$$
      $$z^{(3)} = \Theta^{(2)}a^{(2)}, \quad a^{(3)} = g(z^{(3)})$$
      $$h_\Theta(x) = a^{(3)}$$
```python
def fp(Thetas, X):
    """Forward propagation

    Args:
        Thetas  weight matrices
        X       input samples
    Returns:
        a       list of activations, one entry per layer
    """
    layerNum = len(Thetas) + 1
    # activations of each layer, one column per sample
    a = [None] * layerNum
    for l in range(layerNum):
        if l == 0:
            a[l] = X.T
        else:
            z = np.dot(Thetas[l - 1], a[l - 1])
            a[l] = sigmoid(z)
        # every layer except the output gets a bias row
        if l != layerNum - 1:
            a[l] = np.concatenate((np.ones((1, a[l].shape[1])), a[l]))
    return a
```
    • Design the back-propagation process:

      $$\delta^{(L)} = a^{(L)} - y$$

      $$\delta^{(l)} = \big((\Theta^{(l)})^T\delta^{(l+1)}\big) .* g'(z^{(l)}), \quad l = 2, 3, \dots, L-1$$

      $$\Delta^{(l)} = \delta^{(l+1)}(a^{(l)})^T$$

      $$D^{(l)}_{i,j} = \begin{cases} \frac{1}{m}\big(\Delta^{(l)}_{i,j} + \lambda\Theta^{(l)}_{i,j}\big) & \text{if } j \ne 0 \\ \frac{1}{m}\Delta^{(l)}_{i,j} & \text{if } j = 0 \end{cases}$$

```python
def bp(Thetas, a, y, theLambda):
    """Back propagation

    Args:
        Thetas     weight matrices
        a          activations of each layer
        y          labels
        theLambda  regularization parameter
    Returns:
        D          weight gradients
    """
    m = y.shape[0]
    layerNum = len(Thetas) + 1
    d = [None] * layerNum
    # the input layer (l == 0) has no error term
    for l in range(layerNum - 1, 0, -1):
        if l == layerNum - 1:
            # output-layer error
            d[l] = a[l] - y.T
        else:
            # drop the bias row when propagating the error backwards
            d[l] = np.multiply(np.dot(Thetas[l][:, 1:].T, d[l + 1]),
                               sigmoidDerivative(a[l][1:, :]))
    delta = [np.dot(d[l + 1], a[l].T) for l in range(layerNum - 1)]
    D = [np.zeros(Theta.shape) for Theta in Thetas]
    for l in range(len(Thetas)):
        Theta = Thetas[l]
        # bias column: no regularization
        D[l][:, 0] = (1.0 / m) * delta[l][:, 0]
        # remaining weights: regularized
        D[l][:, 1:] = (1.0 / m) * (delta[l][:, 1:] + theLambda * Theta[:, 1:])
    return D
```
    • Once the gradients are available, design the weight-update step:

      $$\Theta^{(l)} = \Theta^{(l)} - \alpha D^{(l)}$$
```python
def updateThetas(m, Thetas, D, alpha, theLambda):
    """Update the weights

    Args:
        m          number of samples
        Thetas     weight matrices of each layer
        D          gradients
        alpha      learning rate
        theLambda  regularization parameter
    Returns:
        Thetas     updated weight matrices
    """
    for l in range(len(Thetas)):
        Thetas[l] = Thetas[l] - alpha * D[l]
    return Thetas
```
    • Putting these together, one gradient-descent step is:
      1. Forward-propagate to compute the activations of each layer.
      2. Back-propagate to compute the weight gradients.
      3. Update the weights.
```python
def gradientDescent(Thetas, X, y, alpha, theLambda):
    """One step of gradient descent

    Args:
        X          samples
        y          labels
        alpha      learning rate
        theLambda  regularization parameter
    Returns:
        J          cost
        Thetas     updated weight matrices
    """
    # number of samples and features
    m, n = X.shape
    # forward propagation to compute the activations
    a = fp(Thetas, X)
    # back propagation to compute the gradients
    D = bp(Thetas, a, y, theLambda)
    # cost at the current parameters
    J = computeCost(Thetas, y, theLambda, a=a)
    # update the weights
    Thetas = updateThetas(m, Thetas, D, alpha, theLambda)
    if np.isnan(J):
        J = np.inf
    return J, Thetas
```
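To see these steps in action, here is a self-contained toy run (not part of the module: a 0-hidden-layer network on 4 linearly separable samples, with the forward pass, backward pass, and update collapsed into a few lines), showing that repeated gradient-descent steps drive the cost down:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [0.0], [1.0], [1.0]])  # label equals the first feature
m, n = X.shape
Theta = np.random.rand(1, n + 1) * 0.02 - 0.01  # (classes, features + bias)

def cost_and_grad(Theta):
    a0 = np.concatenate((np.ones((1, m)), X.T))  # add the bias row
    a1 = sigmoid(np.dot(Theta, a0))              # forward pass
    J = -np.sum(y.T * np.log(a1) + (1 - y).T * np.log(1 - a1)) / m
    d = a1 - y.T                                 # output-layer error
    D = np.dot(d, a0.T) / m                      # gradient
    return J, D

alpha = 1.0
J0, _ = cost_and_grad(Theta)
for _ in range(50):
    J, D = cost_and_grad(Theta)
    Theta = Theta - alpha * D
Jend, _ = cost_and_grad(Theta)
print(Jend < J0)  # True: the cost decreases
```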
    • The training process for the whole network then works as follows:
      • The weight matrices are randomly initialized by default.
      • The default network is a perceptron with no hidden layer.
      • The default number of hidden units is 5.
      • The default learning rate is 1.
      • No regularization is applied by default.
      • The default error precision is 10^{-2}.
      • The default maximum number of iterations is 50.

    Before training, a gradient check is run to make sure the network computes its gradients correctly:

```python
def train(X, y, Thetas=None, hiddenNum=0, unitNum=5, epsilon=1, alpha=1,
          theLambda=0, precision=0.01, maxIters=50):
    """Train the network

    Args:
        X          training samples
        y          labels
        Thetas     initial weights; randomly initialized if None
        hiddenNum  number of hidden layers
        unitNum    number of units per hidden layer
        epsilon    initialization range [-epsilon, epsilon]
        alpha      learning rate
        theLambda  regularization parameter
        precision  error precision
        maxIters   maximum number of iterations
    """
    # number of samples and features
    m, n = X.shape
    # normalize the labels to one-hot form
    y = adjustLabels(y)
    classNum = y.shape[1]
    # initialize the weights if needed
    if Thetas is None:
        Thetas = initThetas(inputSize=n, hiddenNum=hiddenNum,
                            unitNum=unitNum, classNum=classNum,
                            epsilon=epsilon)
    # run a gradient check first
    print('Doing Gradient Checking....')
    checked = gradientCheck(Thetas, X, y, theLambda)
    if checked:
        for i in range(maxIters):
            error, Thetas = gradientDescent(Thetas, X, y, alpha=alpha,
                                            theLambda=theLambda)
            if error < precision:
                break
            if error == np.inf:
                break
        success = error < precision
        return {
            'error': error,
            'Thetas': Thetas,
            'iters': i,
            'success': success
        }
    else:
        print('Error: Gradient Checking Failed!!!')
        return {
            'error': None,
            'Thetas': None,
            'iters': 0,
            'success': False
        }
```

    The training result contains: (1) the network's prediction error, error; (2) the weight matrices of each layer, Thetas; (3) the number of iterations, iters; (4) whether training succeeded, success.

    • The prediction function:

```python
def predict(X, Thetas):
    """Prediction

    Args:
        X       samples
        Thetas  trained weight matrices
    Returns:
        a       output-layer activations
    """
    a = fp(Thetas, X)
    return a[-1]
```
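`predict` returns the output-layer activations, one column per sample. To turn them into class labels, take the argmax over each column (a standalone illustration with made-up activations):

```python
import numpy as np

# output activations: one row per class, one column per sample
a_out = np.array([[0.1, 0.8],
                  [0.7, 0.1],
                  [0.2, 0.1]])
labels = np.argmax(a_out, axis=0)  # most probable class per sample
print(labels)  # [1 0]
```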

    The complete module, neural_network/nn.py, collects the functions above into a single file (with `import numpy as np` at the top). The only function in it that has not been shown yet is adjustLabels, which normalizes the label set into one-hot form:

```python
def adjustLabels(y):
    """Normalize the label set

    Args:
        y          labels, one per row
    Returns:
        yAdjusted  labels in one-hot form
    """
    # make sure each class is identified by a logical indicator
    if y.shape[1] == 1:
        classes = set(np.ravel(y))
        classNum = len(classes)
        minClass = min(classes)
        if classNum > 2:
            # multi-class: one column per class
            yAdjusted = np.zeros((y.shape[0], classNum), np.float64)
            for row, label in enumerate(y):
                yAdjusted[row, label - minClass] = 1
        else:
            # binary: a single 0/1 column
            yAdjusted = np.zeros((y.shape[0], 1), np.float64)
            for row, label in enumerate(y):
                if label != minClass:
                    yAdjusted[row, 0] = 1.0
        return yAdjusted
    return y
```
