
Deep Learning Notes 1: Sparse Autoencoder

Published: 2025/4/16 pytorch 89 豆豆

I have started studying deep learning. Now that the goal is set, I have to keep pushing forward. Go for it! (2015.6.11)

Sparse Autoencoder

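1. Neural networks
Concept: given a set of training examples (x^{(i)}, y^{(i)}), a neural network provides a complex, nonlinear hypothesis h_{W,b}(x) with parameters W and b, which are fitted to the data.
Activation function:
f(z) = sigmoid(z) = 1 / (1 + exp(-z))
Derivative: f'(z) = f(z)(1 - f(z)). This identity is important because it is needed when minimizing the cost function.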
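The derivative identity follows from the definition of f in two steps. As a short derivation, using only the sigmoid defined above:

```latex
\begin{aligned}
f(z)  &= \frac{1}{1+e^{-z}} \\
f'(z) &= \frac{e^{-z}}{\left(1+e^{-z}\right)^{2}}
       = \frac{1}{1+e^{-z}} \cdot \frac{e^{-z}}{1+e^{-z}}
       = f(z)\,\bigl(1-f(z)\bigr)
\end{aligned}
```

The last step uses 1 - f(z) = e^{-z} / (1 + e^{-z}).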
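Model:
A simple neural network consists of only an input layer, one hidden layer, and an output layer. Each extra layer applies one more nonlinear transformation to the input, which builds up a complex target function h_{W,b}(x).
(https://img-blog.csdn.net/20150611084555088)
The output is computed from front to back (forward pass):
Z2=W1*data+b1
a2=f(Z2)
Z3=W2*a2+b2
a3=f(Z3)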
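The four lines above can be run on a whole batch at once if each training example is stored as one column of data. The following is a minimal sketch under assumed toy sizes (4 inputs, 3 hidden units, 5 examples, all chosen only for illustration); it uses repmat to copy each bias column once per example, the same approach the cost function later in these notes takes.

```matlab
% Forward pass on a toy batch; the sizes are arbitrary assumptions for illustration.
visibleSize = 4;   % number of input units (assumed)
hiddenSize  = 3;   % number of hidden units (assumed)
m           = 5;   % number of training examples (assumed)

sigmoid = @(z) 1 ./ (1 + exp(-z));

data = rand(visibleSize, m);                 % one example per column
W1 = 0.1 * randn(hiddenSize, visibleSize);
W2 = 0.1 * randn(visibleSize, hiddenSize);
b1 = zeros(hiddenSize, 1);
b2 = zeros(visibleSize, 1);

% repmat copies the bias column once per example so the matrix sizes agree
Z2 = W1 * data + repmat(b1, 1, m);
a2 = sigmoid(Z2);
Z3 = W2 * a2 + repmat(b2, 1, m);
a3 = sigmoid(Z3);

disp(size(a2));   % hiddenSize x m
disp(size(a3));   % visibleSize x m
```

Because the output layer is a sigmoid, every entry of a3 lies strictly between 0 and 1, so the inputs being reconstructed should be scaled into that range as well.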
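Cost function of the objective:
The first part is the direct error: the average error over the m inputs.
The second part is the weight penalty: the sum of squares of all elements of W. It shrinks the magnitude of the weights and helps prevent overfitting.
(https://img-blog.csdn.net/20150611085816673)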
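Written out, the two parts described above take the following form. This is a reconstruction in the notation of the UFLDL tutorial cited at the end, not a copy of the figure; λ is assumed to be the weight decay parameter that appears as lambda in the code, and for an autoencoder the target y^{(i)} equals the input x^{(i)}.

```latex
J(W,b) = \frac{1}{m}\sum_{i=1}^{m} \frac{1}{2}\left\| h_{W,b}\!\left(x^{(i)}\right) - y^{(i)} \right\|^{2}
       + \frac{\lambda}{2}\sum_{l=1}^{2}\sum_{i}\sum_{j}\left(W^{(l)}_{ji}\right)^{2}
```

The first sum is the direct error averaged over the m examples, and the second is the weight penalty over W^{(1)} and W^{(2)} (W1 and W2 in the code); the biases are not penalized.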
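To minimize the cost function we can use batch gradient descent, which determines the parameters W1, W2, b1, b2.
Steps: (1) Give W1, W2, b1, b2 initial values. The choice of initial values matters: a poor choice leads to poor results. The exercise initializes them like this: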

r = sqrt(6) / sqrt(hiddenSize+visibleSize+1);   % we'll choose weights uniformly from the interval [-r, r]
W1 = rand(hiddenSize, visibleSize) * 2 * r - r;
W2 = rand(visibleSize, hiddenSize) * 2 * r - r;
b1 = zeros(hiddenSize, 1);
b2 = zeros(visibleSize, 1);
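Since minFunc optimizes a single parameter vector, the four blocks are unrolled into one column vector and reshaped again inside the cost function. The sketch below shows that round trip; the sizes 64 and 25 come from the "probably 64" and "probably 25" comments in the cost function, and the names W1back, W2back, b1back and b2back are made up here for the comparison.

```matlab
visibleSize = 64;  % taken from the "probably 64" comment in the cost function
hiddenSize  = 25;  % taken from the "probably 25" comment

r  = sqrt(6) / sqrt(hiddenSize + visibleSize + 1);
W1 = rand(hiddenSize, visibleSize) * 2 * r - r;
W2 = rand(visibleSize, hiddenSize) * 2 * r - r;
b1 = zeros(hiddenSize, 1);
b2 = zeros(visibleSize, 1);

% Unroll everything into one column vector
theta = [W1(:); W2(:); b1(:); b2(:)];

% Recover the blocks with the same index ranges that sparseAutoencoderCost uses
nW = hiddenSize * visibleSize;
W1back = reshape(theta(1:nW), hiddenSize, visibleSize);
W2back = reshape(theta(nW+1:2*nW), visibleSize, hiddenSize);
b1back = theta(2*nW+1:2*nW+hiddenSize);
b2back = theta(2*nW+hiddenSize+1:end);

disp(numel(theta));  % 2*64*25 + 25 + 64 = 3289
```

The order used when unrolling (W1, W2, b1, b2) has to match the order assumed when reshaping, otherwise the weights are silently scrambled.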

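(2) Provide the cost function, i.e. the formula above.
(3) Provide the partial derivatives of the cost function with respect to W and b.

Note: what is needed here is the bracketed expression that contains the 1/m factor. The first term inside the brackets is simply the average, over all inputs, of the per-example derivative with respect to W and b.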
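The bracketed expression referred to here is most likely the one in the batch gradient descent update of the UFLDL tutorial, which the comments in the cost function also quote for W^{(1)}. Reconstructed under that assumption, with α as the learning rate:

```latex
\begin{aligned}
W^{(l)} &:= W^{(l)} - \alpha\left[\frac{1}{m}\,\Delta W^{(l)} + \lambda W^{(l)}\right] \\
b^{(l)} &:= b^{(l)} - \alpha\left[\frac{1}{m}\,\Delta b^{(l)}\right] \\
\Delta W^{(l)} &= \sum_{i=1}^{m} \nabla_{W^{(l)}} J\!\left(W,b;\,x^{(i)},y^{(i)}\right), \qquad
\Delta b^{(l)} = \sum_{i=1}^{m} \nabla_{b^{(l)}} J\!\left(W,b;\,x^{(i)},y^{(i)}\right)
\end{aligned}
```

The cost function only has to return the bracketed parts as the gradient; the optimizer decides the step.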
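2. Backpropagation
This part is crucial: it computes the derivative contributed by each input example. To differentiate with respect to W, first differentiate with respect to Z; the derivative with respect to Z is the residual (error term). The computation runs from back to front, somewhat like computing the critical path of a graph.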
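For the three-layer network in these notes, the residuals and per-example gradients can be written as follows. This is a sketch in the UFLDL notation, where a^{(1)} = x, ⊙ is the element-wise product, and W^{(1)}, W^{(2)} correspond to W1 and W2; it matches the d3 and d2 lines of the code before the sparsity term is added.

```latex
\begin{aligned}
\delta^{(3)} &= -\left(y - a^{(3)}\right) \odot f'\!\left(z^{(3)}\right) \\
\delta^{(2)} &= \left(\left(W^{(2)}\right)^{T}\delta^{(3)}\right) \odot f'\!\left(z^{(2)}\right) \\
\nabla_{W^{(l)}} J(W,b;x,y) &= \delta^{(l+1)}\left(a^{(l)}\right)^{T}, \qquad
\nabla_{b^{(l)}} J(W,b;x,y) = \delta^{(l+1)}
\end{aligned}
```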

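3. Sparse autoencoder
The cost function and the partial derivative with respect to W differ slightly from those of an ordinary neural network: the cost function gains an additional sparsity penalty.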
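Concretely, the extra term is a KL-divergence penalty on the average hidden activation. The equations below are reconstructed to agree with the code in these notes, where ρ is sparsityParam, \hat\rho is the variable rho, β is beta, s_2 is the number of hidden units, and J(W,b) is the ordinary cost from section 1 (direct error plus weight penalty):

```latex
\begin{aligned}
\hat\rho_j &= \frac{1}{m}\sum_{i=1}^{m} a^{(2)}_j\!\left(x^{(i)}\right) \\
J_{\text{sparse}}(W,b) &= J(W,b) + \beta\sum_{j=1}^{s_2}\left[\rho\log\frac{\rho}{\hat\rho_j} + (1-\rho)\log\frac{1-\rho}{1-\hat\rho_j}\right] \\
\delta^{(2)} &= \left(\left(W^{(2)}\right)^{T}\delta^{(3)} + \beta\left(-\frac{\rho}{\hat\rho} + \frac{1-\rho}{1-\hat\rho}\right)\right)\odot f'\!\left(z^{(2)}\right)
\end{aligned}
```

Only the hidden-layer residual changes; the output residual and the formulas that turn residuals into gradients stay the same.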

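Reflections: a sparse autoencoder seems to be a compressed representation of the input, with the front part encoding and the last layer decoding. I ran an experiment with two hidden layers, hoping the first would discover edges and the second corners, but the results were very poor! Looking through the tutorial again, I found that a later chapter is devoted to stacked autoencoders. Embarrassing~ Still, at least it suggests my thinking was heading in the right direction.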

Problems encountered while writing the code:
Summary of mistakes:

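Jcost=(0.5/m)*sum(sum((a3-data).^2));   %% Correct: sum((a3-data).^2). Writing sum(a3-data).^2 gave wrong results; it took a long time to find the cause, and it was a single pair of parentheses!
Jweight=0.5*(sum(sum(W1.^2))+sum(sum(W2.^2)));   %% I had written Jweight=0.5*sum(sum(W1.^2))+sum(sum(W2.^2)); again a missing pair of parentheses!!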
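Both slips run without any error message and just return a different number, which is why they are hard to spot. A small sketch with made-up values (E stands in for a3-data, and the tiny W1 and W2 are not real weights) shows how far apart the results are:

```matlab
% Toy residual matrix standing in for (a3 - data); the values are made up.
E = [1 -1; 2 -2];

right = sum(sum(E.^2));   % square every element, then add: 1 + 1 + 4 + 4
wrong = sum(sum(E).^2);   % column sums [3 -3] are formed first, then squared: 9 + 9

% The same kind of slip in the weight penalty: 0.5 must scale both sums.
W1 = [1 2];  W2 = [3 4];
rightPenalty = 0.5 * (sum(sum(W1.^2)) + sum(sum(W2.^2)));   % 0.5 * (5 + 25)
wrongPenalty = 0.5 * sum(sum(W1.^2)) + sum(sum(W2.^2));     % 2.5 + 25

fprintf('%g %g %g %g\n', right, wrong, rightPenalty, wrongPenalty);
```

In the first case positive and negative errors cancel inside the column sums before squaring, so the value no longer measures reconstruction error at all.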

Lessons learned:
Using minFunc

addpath minFunc/
options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost
                          % function. Generally, for minFunc to work, you
                          % need a function pointer with two outputs: the
                          % function value and the gradient. In our problem,
                          % sparseAutoencoderCost.m satisfies this.
options.maxIter = 400;    % Maximum number of iterations of L-BFGS to run
options.display = 'on';

[opttheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   visibleSize, hiddenSize, ...
                                   lambda, sparsityParam, ...
                                   beta, patches), ...
                              theta, options);
% A function that returns both the cost and its partial derivatives (sparseAutoencoderCost) must be supplied

sparseAutoencoderCost.m

function [cost,grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, ...
                                             lambda, sparsityParam, beta, data)

% visibleSize: the number of input units (probably 64)
% hiddenSize: the number of hidden units (probably 25)
% lambda: weight decay parameter
% sparsityParam: The desired average activation for the hidden units (denoted in the lecture
%                notes by the greek alphabet rho, which looks like a lower-case "p").
% beta: weight of sparsity penalty term
% data: Our 64x10000 matrix containing the training data. So, data(:,i) is the i-th training example.

% The input theta is a vector (because minFunc expects the parameters to be a vector).
% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes.

W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

% Cost and gradient variables (your code needs to compute these values).
% Here, we initialize them to zeros.
cost = 0;
W1grad = zeros(size(W1));
W2grad = zeros(size(W2));
b1grad = zeros(size(b1));
b2grad = zeros(size(b2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost/optimization objective J_sparse(W,b) for the Sparse Autoencoder,
%                and the corresponding gradients W1grad, W2grad, b1grad, b2grad.
%
% W1grad, W2grad, b1grad and b2grad should be computed using backpropagation.
% Note that W1grad has the same dimensions as W1, b1grad has the same dimensions
% as b1, etc.  Your code should set W1grad to be the partial derivative of J_sparse(W,b) with
% respect to W1.  I.e., W1grad(i,j) should be the partial derivative of J_sparse(W,b)
% with respect to the input parameter W1(i,j).  Thus, W1grad should be equal to the term
% [(1/m) \Delta W^{(1)} + \lambda W^{(1)}] in the last block of pseudo-code in Section 2.2
% of the lecture notes (and similarly for W2grad, b1grad, b2grad).
%
% Stated differently, if we were using batch gradient descent to optimize the parameters,
% the gradient descent update to W1 would be W1 := W1 - alpha * W1grad, and similarly for W2, b1, b2.
%

%%%%%%%%%%%%%%%%%%%% ych's code %%%%%%%%%%%%%%%%%%%%
Jcost = 0;    % direct error
Jweight = 0;  % weight penalty
Jsparse = 0;  % sparsity penalty
[n m] = size(data);  % m is the number of examples, n is the number of features per example

% Forward pass: compute the linear combination and the activation at every node
z2=W1*data+repmat(b1,1,m);
a2=sigmoid(z2);
z3=W2*a2+repmat(b2,1,m);
a3=sigmoid(z3);

Jcost=(0.5/m)*sum(sum((a3-data).^2));  %% Correct: sum((a3-data).^2). Writing sum(a3-data).^2 gave wrong results; the cause was a single pair of parentheses!

Jweight=0.5*(sum(sum(W1.^2))+sum(sum(W2.^2)));

rho=(1/m).*sum(a2,2);
Jsparse=sum(sparsityParam.*log(sparsityParam./rho)+(1-sparsityParam).*log((1-sparsityParam)./(1-rho)));

cost=Jcost+lambda*Jweight+beta*Jsparse;

d3=-(data-a3).*(sigmoid(z3).*(1-sigmoid(z3)));
sterm = beta*(-sparsityParam./rho+(1-sparsityParam)./(1-rho));
d2=(W2'*d3+repmat(sterm,1,m)).*(sigmoid(z2).*(1-sigmoid(z2)));

W1grad=(1/m).*(d2*data')+lambda.*W1;
W2grad=(1/m).*(d3*a2')+lambda.*W2;
b1grad=(1/m).*sum(d2,2);
b2grad=(1/m).*sum(d3,2);

%-------------------------------------------------------------------
% After computing the cost and gradient, we will convert the gradients back
% to a vector format (suitable for minFunc).  Specifically, we will unroll
% your gradient matrices into a vector.

grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];

end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
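Mistakes such as the misplaced parentheses are easier to catch if the analytic gradient is compared against a finite-difference estimate before running the optimizer. The following is a minimal sketch of such a check on a stand-in objective (a toy quadratic with a known gradient, not the autoencoder cost); A, theta and epsilon are made-up values. With the real cost, J would be a handle wrapping sparseAutoencoderCost in the same way the minFunc call does.

```matlab
% Stand-in objective J(theta) = 0.5 * theta' * A * theta, whose gradient is A * theta
% for symmetric A. A and theta are made-up values used only for this illustration.
A     = [2 0.5; 0.5 1];
theta = [0.3; -0.7];
J        = @(t) 0.5 * (t' * A * t);
analytic = A * theta;

% Centered finite differences, one coordinate at a time
epsilon = 1e-4;
numgrad = zeros(size(theta));
for k = 1:numel(theta)
    step       = zeros(size(theta));
    step(k)    = epsilon;
    numgrad(k) = (J(theta + step) - J(theta - step)) / (2 * epsilon);
end

% Relative difference between the two gradients; it should be very small
relDiff = norm(numgrad - analytic) / norm(numgrad + analytic);
disp(relDiff);
```

The check evaluates the cost twice per parameter, so it is only practical on a small network or a small subset of the data, and it should be switched off for the actual training run.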

References: http://ufldl.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B
http://www.cnblogs.com/tornadomeet/tag/Deep%20Learning/

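This is my first blog post, so the content is somewhat messy and the formatting is not standard. I treat it as my own study notes; corrections are welcome.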
