Artificial Neural Networks: Building Your Own Artificial Neural Network
In this article, I present a fully vectorized Python implementation of an artificial neural network, tested on several datasets, together with an implementation and detailed explanation of the Dropout and L2 regularization techniques.
A solid grasp of the basic workings of artificial neural networks, forward propagation and backpropagation, is strongly recommended before reading on.
This article is divided into the following 10 sections.
1. Introduction
Artificial neural networks are one of the simplest and most fundamental concepts in supervised deep learning. They can be used for several tasks, such as binary or multi-class classification, and they look easy to understand and implement. Yet while coding such a network, small issues crop up that cause large errors and force you to revisit concepts you had previously overlooked. In this article I therefore walk through an implementation of an artificial neural network that may save you the hours of work needed to code it correctly and to understand every concept involved.
2. Prerequisites
I assume you know what neural networks are and how they learn. If you are comfortable with Python and libraries such as NumPy, this will be easy to follow. You will also need a good grasp of linear algebra and calculus to follow the forward and backward propagation sections with ease. In addition, I highly recommend Andrew Ng's course videos on Coursera (https://www.coursera.org/ ; https://www.deeplearning.ai/ ).
3. Importing the Libraries
Now we can start coding the neural network. The first thing we need to do is import all the libraries required to implement it.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import warnings
import time
warnings.filterwarnings('ignore')
import os
import sys

We will use pandas to import and clean our dataset. NumPy is the most important library here, used for matrix algebra and the heavier numerical computations.
4. Activation Functions and Their Derivatives
Later in this article we will need activation functions to perform forward propagation. We will also need their derivatives during backpropagation.
So, let's write some activation functions.
def sigmoid(z):
    """ Returns the element-wise sigmoid function. """
    return 1./(1 + np.exp(-z))

def sigmoid_prime(z):
    """ Returns the derivative of the sigmoid function. """
    return sigmoid(z)*(1-sigmoid(z))

def ReLU(z):
    """ Returns the element-wise ReLU function. """
    return (z*(z > 0))

def ReLU_prime(z):
    """ Returns the derivative of the ReLU function. """
    return 1*(z>=0)

def lReLU(z):
    """ Returns the element-wise leaky ReLU function. """
    return np.maximum(z/100, z)

def lReLU_prime(z):
    """ Returns the derivative of the leaky ReLU function. """
    z = 1.0*(z>=0)
    z[z==0] = 1/100
    return z

def tanh(z):
    """ Returns the element-wise hyperbolic tangent function. """
    return np.tanh(z)

def tanh_prime(z):
    """ Returns the derivative of the tanh function. """
    return (1-tanh(z)**2)

# A dictionary of our activation functions
PHI = {'sigmoid':sigmoid, 'relu':ReLU, 'lrelu':lReLU, 'tanh':tanh}

# A dictionary containing the derivatives of our activation functions
PHI_PRIME = {'sigmoid':sigmoid_prime, 'relu':ReLU_prime,
             'lrelu':lReLU_prime, 'tanh':tanh_prime}

These are four of the most popular activation functions. First is the regular sigmoid activation function.
Then we have ReLU, the "rectified linear unit". We will mostly be using this activation function. Note that ReLU is not differentiable at 0, so we have to fix a convention for its derivative at that point (the implementation above returns 1 there).
We also have an extended version of ReLU called leaky ReLU. It works much like ReLU and can give better results on some datasets (though not necessarily all of them).
We also have the tanh (hyperbolic tangent) activation function. It is widely used as well and almost always works better than sigmoid.
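For reference, the functions above and their derivatives can be written as follows (a standard formulation, matching the code):

\[
\sigma(z) = \frac{1}{1+e^{-z}}, \qquad \sigma'(z) = \sigma(z)\,\big(1-\sigma(z)\big)
\]
\[
\mathrm{ReLU}(z) = \max(0, z), \qquad \mathrm{lReLU}(z) = \max\!\left(\tfrac{z}{100},\, z\right), \qquad \tanh'(z) = 1 - \tanh^2(z)
\]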
In addition, PHI and PHI_PRIME are Python dictionaries containing the activation functions and their derivatives, respectively.
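As a quick sanity check (a hypothetical snippet, assuming the PHI and PHI_PRIME dictionaries above are in scope), an activation can be looked up by name and applied element-wise:

import numpy as np

z = np.array([-2.0, 0.0, 3.0])
relu_out = PHI['relu'](z)          # element-wise ReLU of z
relu_grad = PHI_PRIME['relu'](z)   # element-wise derivative of ReLU at z
print(relu_out, relu_grad)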
5. Our Neural Network Class
In this section we will create and initialize our neural network class. First, let's decide which parameters to pass in during initialization. We need the layer sizes, the training data X and y, the activation function of each layer, a weight initialization method, a loss function, and optionally the weights and biases of a pretrained network; all of these are documented in the class docstring below.
With this in mind, let's start writing the neural network class:
class NeuralNet:
    """
    This is a class for making Artificial Neural Networks. L2 and Dropout are the
    default regularization methods implemented in this class. It takes the
    following parameters:

    1. layers : A python list containing the number of neurons in each layer
       (including the output layer). Eg - [64,32,16,16,1]
    2. X : Matrix of features with rows as features and columns as different examples.
    3. y : Numpy array containing the outputs of corresponding examples.
    4. ac_funcs : A python list containing the activation function of each layer.
       Eg - ['relu','relu','lrelu','tanh','sigmoid']
    5. init_method : Method used to initialize the weights of the network.
       Can be 'gaussian', 'random' or 'zeros'.
    6. loss_func : Loss function, 'b_ce' (binary cross entropy, default) or
       'c_ce' (categorical cross entropy).
    7. W : Weights of a pretrained neural network with the same architecture.
    8. B : Biases of a pretrained neural network with the same architecture.
    """

Now that we have a properly documented neural network class, we can go ahead and initialize the rest of the network's variables.
    def __init__(self, layers, X, y, ac_funcs, init_method='gaussian',
                 loss_func='b_ce', W=np.array([]), B=np.array([])):
        """ Initialize the network. """
        # Store the layers of the network
        self.layers = layers
        # Weights and biases of the network (initialized further below)
        self.W = None
        self.B = None
        # Store the number of examples in the dataset as m
        self.m = X.shape[1]
        # Store the full layer list (input layer included) as n
        self.n = [X.shape[0], *layers]
        # Save the dataset
        self.X = X
        # Save the corresponding outputs
        self.y = y
        # List to store the cost of the model calculated during training
        self.cost = []
        # Stores the accuracy obtained on the test set.
        self.acc = 0
        # Activation function of each layer
        self.ac_funcs = ac_funcs
        self.loss = loss_func
        # Initialize the weights by the chosen method if they are not provided.

We use 'self.m' to store the number of examples in the dataset. 'self.n' holds the number of neurons in each layer, including the input layer. 'self.ac_funcs' is a Python list with the activation function of each layer. 'self.cost' records the values of the cost function while we train the network, and 'self.acc' stores the accuracy measured on the dataset after training. Having initialized all the variables of the network, let's go on to initialize its weights and biases.
6. Initializing Weights and Biases
We know the weights cannot all be initialized to zero, because then every neuron computes the same hypothesis and the network never learns. So we need some way of breaking the symmetry so that our neural network can learn. We can draw random values from a Gaussian (normal) distribution: since this distribution has zero mean, the weights are centered around zero and stay small, and the network starts learning quickly and efficiently. We can use the np.random.randn() function to generate random values from a normal distribution.
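As a small illustration (hypothetical layer sizes, not taken from the article's dataset), this is what the Gaussian initialization produces for a network with 4 input features and layers [8, 1]:

import numpy as np

layers = [8, 1]            # hidden and output layer sizes
n = [4, *layers]           # prepend the number of input features
W = [np.random.randn(n[nl], n[nl-1]) for nl in range(1, len(n))]
B = [np.zeros((nl, 1), 'float32') for nl in layers]

print([w.shape for w in W])  # [(8, 4), (1, 8)]
print([b.shape for b in B])  # [(8, 1), (1, 1)]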
        # Initialize the weights by the chosen method if they are not provided.
        if len(W) and len(B):
            self.W = W
            self.B = B
        else:
            if init_method == 'gaussian':
                self.W = [np.random.randn(self.n[nl], self.n[nl-1])
                          for nl in range(1, len(self.n))]
                self.B = [np.zeros((nl, 1), 'float32') for nl in self.layers]
            elif init_method == 'random':
                self.W = [np.random.rand(self.n[nl], self.n[nl-1])
                          for nl in range(1, len(self.n))]
                self.B = [np.random.rand(nl, 1) for nl in self.layers]
            elif init_method == 'zeros':
                self.W = [np.zeros((self.n[nl], self.n[nl-1]), 'float32')
                          for nl in range(1, len(self.n))]
                self.B = [np.zeros((nl, 1), 'float32') for nl in self.layers]

We have initialized the weights to random values drawn from a normal distribution, and the biases to zero.
7. Forward Propagation
First, let's understand forward propagation without regularization.
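In vectorized form (a standard formulation, consistent with the _feedForward code below), the forward pass through layer l is:

\[
Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[l]} = f^{[l]}\!\left(Z^{[l]}\right), \qquad A^{[0]} = X
\]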
We use Z to denote the weighted input that each neuron receives from the previous layer. Once we have computed Z, we apply the activation function f to it to obtain the activation of every neuron in the layer. That is plain forward propagation. Dropout is a remarkable technique for improving the generalization and robustness of a neural network, so let's first understand Dropout regularization.
The essence of Dropout regularization
Dropout, as the name suggests, deactivates some of the neurons in the network and trains on the remaining ones.
To improve performance, we could train dozens or even hundreds of neural networks with different hyperparameter values, collect the outputs of all of them and average them to get the final result. This is computationally very expensive and impractical, so we need a way to achieve something similar in a cheaper, more efficient manner. Dropout regularization does much the same thing in a very cheap and simple way. In fact, Dropout is such a simple way of improving performance that it has attracted a great deal of attention in recent years and has become nearly ubiquitous in many deep learning models.
To implement Dropout, we will use the following approach:
We first draw a random value, Bernoulli-style, for every neuron; a neuron is kept unchanged if its draw clears the threshold, and regular forward propagation is then performed on the surviving neurons. Note that we do not apply dropout when predicting on a new dataset, i.e. at test time.
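Here is a minimal sketch of that idea on its own (with a hypothetical activation matrix `_a`; the class method below applies the same mask layer by layer):

import numpy as np

keep_prob = 0.8               # probability that a neuron is kept active
_a = np.random.randn(16, 32)  # hypothetical activations: 16 neurons, 32 examples

# One keep/drop decision per neuron, shared across all examples in the batch
mask = np.random.rand(_a.shape[0], 1) < keep_prob

# "Inverted dropout": rescale by keep_prob so the expected activation is unchanged
_a_dropped = (mask * _a) / keep_prob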
The code for implementing Dropout
We use keep_prob as the probability that a neuron in a layer survives: a neuron is kept only if its random draw falls below keep_prob. Say its value is 0.8. That means we deactivate 20% of the neurons in each layer and train the remaining 80%. Note that a fresh random set of neurons is dropped on every iteration; this helps the neurons learn features that generalize over larger datasets.
    def _feedForward(self, keep_prob):
        """ Forward pass with (inverted) dropout. """
        z = []
        a = []
        z.append(np.dot(self.W[0], self.X) + self.B[0])
        a.append(PHI[self.ac_funcs[0]](z[-1]))
        for l in range(1, len(self.layers)):
            z.append(np.dot(self.W[l], a[-1]) + self.B[l])
            _a = PHI[self.ac_funcs[l]](z[l])
            # Keep each neuron with probability keep_prob and rescale the survivors
            a.append(((np.random.rand(_a.shape[0], 1) < keep_prob)*_a)/keep_prob)
        return z, a

We first create the lists that will hold the values of Z and A. We append the linear value of the first layer to Z and the activation of the first layer's neurons to A. PHI is the Python dictionary of activation functions we wrote earlier. The Z and A values of all the remaining layers are computed the same way inside the for loop. Note that we do not apply dropout to the input layer. Finally, we return the computed values of Z and A.
8. The Cost Function
We will use the standard binary/categorical cross-entropy cost function.
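For the binary case with L2 regularization, the cost implemented below corresponds to:

\[
J = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log a^{(i)} + \big(1-y^{(i)}\big)\log\big(1-a^{(i)}\big)\Big] + \frac{\lambda}{2m}\sum_{l}\big\lVert W^{[l]}\big\rVert_F^{2}
\]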
    def _cost_func(self, a, _lambda):
        """ Binary cross entropy cost with L2 regularization. """
        return ((-1/self.m)*np.sum(np.nan_to_num(self.y*np.log(a) + (1-self.y)*np.log(1-a)))
                + (_lambda/(2*self.m))*np.sum([np.sum(i**2) for i in self.W]))

    def _cost_derivative(self, a):
        """ The derivative of the cost w.r.t. the output layer's z. """
        return (a - self.y)

We have written our cost function with L2 regularization. The lambda parameter is called the penalization parameter; it keeps the weights from growing too quickly and so helps the network generalize better. Here 'a' contains the activation values of the output layer. We also have the function _cost_derivative, which computes the derivative of the cost with respect to the pre-activation Z of the output layer (for a sigmoid output with cross-entropy loss this simplifies to a - y). We will use it shortly during backpropagation.
9. Backpropagation
Below are the formulas we need to perform backpropagation.
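In a standard vectorized form (written out here to match the code that follows), for layer l:

\[
\begin{aligned}
\delta^{[L]} &= A^{[L]} - y,\\
dW^{[l]} &= \tfrac{1}{m}\Big(\delta^{[l]} A^{[l-1]\,T} + \lambda\, W^{[l]}\Big), \qquad
db^{[l]} = \tfrac{1}{m}\textstyle\sum_{i}\delta^{[l](i)},\\
\delta^{[l-1]} &= \Big(W^{[l]\,T}\,\delta^{[l]}\Big)\odot f'^{[l-1]}\!\big(Z^{[l-1]}\big),\\
W^{[l]} &\leftarrow W^{[l]} - \alpha\, dW^{[l]}, \qquad
b^{[l]} \leftarrow b^{[l]} - \alpha\, db^{[l]}.
\end{aligned}
\]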
We will implement this on a deep neural network. The formulas above are fully vectorized. Once we understand them, we can go ahead and code them.
    def startTraining(self, epochs, alpha, _lambda, keep_prob=0.5, interval=100):
        """
        Start training the neural network. It takes the following parameters:

        1. epochs : Number of epochs for which you want to train the network.
        2. alpha : The learning rate of your network.
        3. _lambda : L2 regularization parameter, i.e. the penalization parameter.
        4. keep_prob : Dropout parameter, the probability of keeping a neuron active.
           Eg - 0.8 means 20% of the neurons are deactivated.
        5. interval : The interval (in epochs) between updates of cost and accuracy.
        """
        start = time.time()
        for i in range(epochs+1):
            z, a = self._feedForward(keep_prob)
            delta = self._cost_derivative(a[-1])
            for l in range(1, len(z)):
                delta_w = np.dot(delta, a[-l-1].T) + (_lambda)*self.W[-l]
                delta_b = np.sum(delta, axis=1, keepdims=True)
                # Propagate delta to the previous layer BEFORE updating this layer's weights
                delta = np.dot(self.W[-l].T, delta)*PHI_PRIME[self.ac_funcs[-l-1]](z[-l-1])
                self.W[-l] = self.W[-l] - (alpha/self.m)*delta_w
                self.B[-l] = self.B[-l] - (alpha/self.m)*delta_b
            # The first layer's gradients use the input X directly
            delta_w = np.dot(delta, self.X.T) + (_lambda)*self.W[0]
            delta_b = np.sum(delta, axis=1, keepdims=True)
            self.W[0] = self.W[0] - (alpha/self.m)*delta_w
            self.B[0] = self.B[0] - (alpha/self.m)*delta_b

We take epochs, alpha (the learning rate), _lambda, keep_prob and interval as parameters of the function that implements backpropagation.
We start with forward propagation. We then compute the derivative of the cost function as delta. For each layer, we compute delta_w and delta_b, which hold the derivatives of the cost function with respect to that layer's weights and biases, and then update delta, the weights and the biases according to the formulas above. After updating the weights and biases from the last layer back to the second layer, we update the weights and biases of the first layer. We repeat this for many iterations until the weights and biases converge.
Important note: a major mistake that can creep in here is updating delta after updating the weights and biases. Doing so computes the previous layer's delta with already-updated weights and can lead to severe vanishing/exploding gradient problems.
Most of our work is now done, but we still need a function that can predict results on new data. So, as our final step, we will write a function to predict labels for a new dataset.
10. Predicting Labels for a New Dataset
This step is very simple: we just perform forward propagation, without Dropout regularization. We do not apply Dropout at test time because we want all the neurons in every layer to contribute to the result, not just a random subset.
    def predict(self, X_test):
        """ Predict the labels for a new dataset. Returns probabilities. """
        a = PHI[self.ac_funcs[0]](np.dot(self.W[0], X_test) + self.B[0])
        for l in range(1, len(self.layers)):
            a = PHI[self.ac_funcs[l]](np.dot(self.W[l], a) + self.B[l])
        return a

We return the activation of the output layer as the result.
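Putting it together, a minimal usage sketch looks like this (X_train/X_test are assumed to have shape (features, examples) and y to hold 0/1 labels, as in the complete listing below):

# Build a network with two hidden layers and a sigmoid output
net = NeuralNet([32, 16, 1], X_train, y_train,
                ac_funcs=['relu', 'relu', 'sigmoid'])

# Train for 5000 epochs with alpha=0.01, lambda=0.2 and keep_prob=0.5
net.startTraining(5000, 0.01, 0.2, 0.5, 100)

# Predict probabilities on unseen data and threshold them into class labels
probs = net.predict(X_test)
labels = probs > 0.5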
The Complete Code
Below is the complete code for implementing the artificial neural network yourself. I have added some code to print the network's cost and accuracy during training; apart from that, everything is the same as above.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import warnings
import time
warnings.filterwarnings('ignore')
import os
import sys

# Importing our dataset
os.chdir("C:/Users/Hilak/Desktop/INTERESTS/Machine Learning A-Z Template Folder/Part 3 - Classification/Section 14 - Logistic Regression")
training_set = pd.read_csv("Social_Network_Ads.csv")

# Splitting our dataset into matrix of features and output values.
X = training_set.iloc[:, 1:4].values
y = training_set.iloc[:, 4].values

# Encoding our object features.
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
le_x = LabelEncoder()
X[:,0] = le_x.fit_transform(X[:,0])
ohe = OneHotEncoder(categorical_features = [0])
X = ohe.fit_transform(X).toarray()

# Performing Feature scaling
from sklearn.preprocessing import StandardScaler
ss = StandardScaler()
X[:,2:4] = ss.fit_transform(X[:, 2:4])

# Splitting the dataset into train and test set.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
X_train = X_train.T
X_test = X_test.T

# # Alternate dataset for test purposes. Not used in the example shown.
# os.chdir("C:/Users/Hilak/Desktop/INTERESTS/Machine Learning A-Z Template Folder/Part 8 - Deep Learning/Section 39 - Artificial Neural Networks (ANN)")
# dataset = pd.read_csv('Churn_Modelling.csv')
# X = dataset.iloc[:, 3:13].values
# y = dataset.iloc[:, 13].values
# # Encoding categorical data
# from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# labelencoder_X_1 = LabelEncoder()
# X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
# labelencoder_X_2 = LabelEncoder()
# X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
# onehotencoder = OneHotEncoder(categorical_features = [1])
# X = onehotencoder.fit_transform(X).toarray()
# X = X[:, 1:]
# # Splitting the dataset into the Training set and Test set
# from sklearn.model_selection import train_test_split
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
# X_test, X_CV, y_test, y_CV = train_test_split(X, y, test_size = 0.5)
# # Feature Scaling
# from sklearn.preprocessing import StandardScaler
# sc = StandardScaler()
# X_train = sc.fit_transform(X_train)
# X_test = sc.transform(X_test)
# X_train = X_train.T
# X_test = X_test.T
# X_CV = X_CV.T

def sigmoid(z):
    """ Returns the element-wise sigmoid function. """
    return 1./(1 + np.exp(-z))

def sigmoid_prime(z):
    """ Returns the derivative of the sigmoid function. """
    return sigmoid(z)*(1-sigmoid(z))

def ReLU(z):
    """ Returns the element-wise ReLU function. """
    return (z*(z > 0))

def ReLU_prime(z):
    """ Returns the derivative of the ReLU function. """
    return 1*(z>=0)

def lReLU(z):
    """ Returns the element-wise leaky ReLU function. """
    return np.maximum(z/100, z)

def lReLU_prime(z):
    """ Returns the derivative of the leaky ReLU function. """
    z = 1.0*(z>=0)
    z[z==0] = 1/100
    return z

def tanh(z):
    """ Returns the element-wise hyperbolic tangent function. """
    return np.tanh(z)

def tanh_prime(z):
    """ Returns the derivative of the tanh function. """
    return (1-tanh(z)**2)

# A dictionary of our activation functions
PHI = {'sigmoid':sigmoid, 'relu':ReLU, 'lrelu':lReLU, 'tanh':tanh}

# A dictionary containing the derivatives of our activation functions
PHI_PRIME = {'sigmoid':sigmoid_prime, 'relu':ReLU_prime,
             'lrelu':lReLU_prime, 'tanh':tanh_prime}

class NeuralNet:
    """
    This is a class for making Artificial Neural Networks. L2 and Dropout are the
    default regularization methods implemented in this class. It takes the
    following parameters:

    1. layers : A python list containing the number of neurons in each layer
       (including the output layer). Eg - [64,32,16,16,1]
    2. X : Matrix of features with rows as features and columns as different examples.
    3. y : Numpy array containing the outputs of corresponding examples.
    4. ac_funcs : A python list containing the activation function of each layer.
       Eg - ['relu','relu','lrelu','tanh','sigmoid']
    5. init_method : Method used to initialize the weights of the network.
       Can be 'gaussian', 'random' or 'zeros'.
    6. loss_func : Loss function, 'b_ce' (binary cross entropy, default) or
       'c_ce' (categorical cross entropy).
    7. W : Weights of a pretrained neural network with the same architecture.
    8. B : Biases of a pretrained neural network with the same architecture.
    """

    def __init__(self, layers, X, y, ac_funcs, init_method='gaussian',
                 loss_func='b_ce', W=np.array([]), B=np.array([])):
        """ Initialize the network. """
        # Store the layers of the network
        self.layers = layers
        # Weights and biases of the network (initialized below)
        self.W = None
        self.B = None
        # Store the number of examples in the dataset as m
        self.m = X.shape[1]
        # Store the full layer list (input layer included) as n
        self.n = [X.shape[0], *layers]
        # Save the dataset
        self.X = X
        # Save the corresponding outputs
        self.y = y
        # List to store the cost of the model calculated during training
        self.cost = []
        # Stores the accuracy obtained on the test set.
        self.acc = 0
        # Activation function of each layer
        self.ac_funcs = ac_funcs
        self.loss = loss_func
        # Initialize the weights by the chosen method if they are not provided.
        if len(W) and len(B):
            self.W = W
            self.B = B
        else:
            if init_method == 'gaussian':
                self.W = [np.random.randn(self.n[nl], self.n[nl-1])
                          for nl in range(1, len(self.n))]
                self.B = [np.zeros((nl, 1), 'float32') for nl in self.layers]
            elif init_method == 'random':
                self.W = [np.random.rand(self.n[nl], self.n[nl-1])
                          for nl in range(1, len(self.n))]
                self.B = [np.random.rand(nl, 1) for nl in self.layers]
            elif init_method == 'zeros':
                self.W = [np.zeros((self.n[nl], self.n[nl-1]), 'float32')
                          for nl in range(1, len(self.n))]
                self.B = [np.zeros((nl, 1), 'float32') for nl in self.layers]

    def startTraining(self, epochs, alpha, _lambda, keep_prob=0.5, interval=100):
        """
        Start training the neural network. It takes the following parameters:

        1. epochs : Number of epochs for which you want to train the network.
        2. alpha : The learning rate of your network.
        3. _lambda : L2 regularization parameter, i.e. the penalization parameter.
        4. keep_prob : Dropout parameter, the probability of keeping a neuron active.
           Eg - 0.8 means 20% of the neurons are deactivated.
        5. interval : The interval (in epochs) between updates of cost and accuracy.
        """
        start = time.time()
        for i in range(epochs+1):
            z, a = self._feedForward(keep_prob)
            delta = self._cost_derivative(a[-1])
            for l in range(1, len(z)):
                delta_w = np.dot(delta, a[-l-1].T) + (_lambda)*self.W[-l]
                delta_b = np.sum(delta, axis=1, keepdims=True)
                delta = np.dot(self.W[-l].T, delta)*PHI_PRIME[self.ac_funcs[-l-1]](z[-l-1])
                self.W[-l] = self.W[-l] - (alpha/self.m)*delta_w
                self.B[-l] = self.B[-l] - (alpha/self.m)*delta_b
            delta_w = np.dot(delta, self.X.T) + (_lambda)*self.W[0]
            delta_b = np.sum(delta, axis=1, keepdims=True)
            self.W[0] = self.W[0] - (alpha/self.m)*delta_w
            self.B[0] = self.B[0] - (alpha/self.m)*delta_b
            if not i % interval:
                aa = self.predict(self.X)
                if self.loss == 'b_ce':
                    aa = aa > 0.5
                    self.acc = sum(sum(aa == self.y)) / self.m
                    cost_val = self._cost_func(a[-1], _lambda)
                    self.cost.append(cost_val)
                elif self.loss == 'c_ce':
                    aa = np.argmax(aa, axis=0)
                    yy = np.argmax(self.y, axis=0)
                    self.acc = np.sum(aa == yy)/(self.m)
                    cost_val = self._cost_func(a[-1], _lambda)
                    self.cost.append(cost_val)
                sys.stdout.write(f'Epoch[{i}] : Cost = {cost_val:.2f} ; Acc = {(self.acc*100):.2f}% ; Time Taken = {(time.time()-start):.2f}s')
                print('')
        return None

    def predict(self, X_test):
        """ Predict the labels for a new dataset. Returns probabilities. """
        a = PHI[self.ac_funcs[0]](np.dot(self.W[0], X_test) + self.B[0])
        for l in range(1, len(self.layers)):
            a = PHI[self.ac_funcs[l]](np.dot(self.W[l], a) + self.B[l])
        return a

    def _feedForward(self, keep_prob):
        """ Forward pass with (inverted) dropout. """
        z = []
        a = []
        z.append(np.dot(self.W[0], self.X) + self.B[0])
        a.append(PHI[self.ac_funcs[0]](z[-1]))
        for l in range(1, len(self.layers)):
            z.append(np.dot(self.W[l], a[-1]) + self.B[l])
            _a = PHI[self.ac_funcs[l]](z[l])
            a.append(((np.random.rand(_a.shape[0], 1) < keep_prob)*_a)/keep_prob)
        return z, a

    def _cost_func(self, a, _lambda):
        """ Binary cross entropy cost with L2 regularization. """
        return ((-1/self.m)*np.sum(np.nan_to_num(self.y*np.log(a) + (1-self.y)*np.log(1-a)))
                + (_lambda/(2*self.m))*np.sum([np.sum(i**2) for i in self.W]))

    def _cost_derivative(self, a):
        """ The derivative of the cost w.r.t. the output layer's z. """
        return (a - self.y)

    @property
    def summary(self):
        return self.cost, self.acc, self.W, self.B

    def __repr__(self):
        return f''

# Initializing our neural network
neural_net_sigmoid = NeuralNet([32,16,1], X_train, y_train, ac_funcs=['relu','relu','sigmoid'])
# Starting the training of our network.
neural_net_sigmoid.startTraining(5000, 0.01, 0.2, 0.5, 100)
# Predicting on new dataset using our trained network.
preds = neural_net_sigmoid.predict(X_test)
preds = preds > 0.5
acc = (sum(sum(preds == y_test)) / y_test.size)*100
# Accuracy (metric of evaluation) obtained by the network.
print(f'Test set Accuracy ( r-t-s ) : {acc}%')
# Plotting our cost vs epochs relationship
sigmoid_summary = neural_net_sigmoid.summary
plt.plot(range(len(sigmoid_summary[0])), sigmoid_summary[0], label='Sigmoid Cost')
plt.title('Cost')
plt.show()

# Comparing our results with the library keras.
from keras.models import Sequential
from keras.layers import Dense
X_train, X_test = X_train.T, X_test.T
classifier = Sequential()
classifier.add(Dense(input_dim=4, units = 32, kernel_initializer="uniform