
Building a Convolutional Neural Network (CNN) Using Only NumPy (with Python Code)

Published: 2024/8/23 python 38 豆豆

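Abstract: Existing toolkits such as Caffe and TensorFlow already implement CNN models well, but they demand considerable hardware resources, which makes them a poor fit for beginners who want to practice and understand the details. This article therefore shows how to build a Convolutional Neural Network (CNN) using only NumPy. It implements the convolutional layer, the ReLU activation layer, and the max pooling layer, with simple code and detailed explanations.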

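Many ready-made machine learning and deep learning toolkits are available online, and in some cases calling a pre-built model directly (for example with Caffe or TensorFlow) is convenient and effective. However, these toolkits demand considerable hardware resources, which makes them a poor fit for beginners who want to practice and understand the details. To understand and master the underlying ideas, it is best to program them yourself. This article shows how to build a Convolutional Neural Network (CNN) with NumPy.
CNNs were proposed relatively early but only became popular in recent years, and they are arguably the most widely used networks in computer vision. Some toolkits already implement CNN models well: the library functions are fully built, and developers only need to call existing modules to assemble a model, which spares them the complexity of the implementation. In practice, though, this leaves developers unaware of the implementation details. Data scientists sometimes have to adjust such details to improve a model's performance, and the toolkit may not expose them. In that case the only solution is to program a similar model yourself, which gives you the highest level of control over the implementation and a better understanding of what the model does at each step.
This article implements a CNN using only NumPy and creates three layer modules: the convolutional layer (Conv), the ReLU activation function, and max pooling.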

1. Reading the input image

The following code reads an image that ships with the skimage Python library and converts it to grayscale:

    import skimage.color
    import skimage.data

    # Read the sample image bundled with scikit-image.
    img = skimage.data.chelsea()
    # Convert the image to grayscale.
    img = skimage.color.rgb2gray(img)
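For intuition about what the grayscale conversion does, here is a minimal NumPy-only sketch. It assumes the luma weights that scikit-image documents for rgb2gray (0.2125, 0.7154, 0.0721); the helper name rgb_to_gray and the tiny test image are hypothetical. Unlike rgb2gray, this sketch does not rescale integer images to floats in [0, 1].

```python
import numpy

# Luma weights documented for skimage.color.rgb2gray, in (R, G, B) order.
RGB_WEIGHTS = numpy.array([0.2125, 0.7154, 0.0721])

def rgb_to_gray(rgb):
    """Collapse an (H, W, 3) float RGB array into an (H, W) grayscale array."""
    # Matrix product over the last axis: a weighted sum of the three channels.
    return rgb @ RGB_WEIGHTS

# A hypothetical 2x2 RGB image with values in [0, 1]: red, green, blue, white.
tiny_rgb = numpy.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                        [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])
tiny_gray = rgb_to_gray(tiny_rgb)
print(tiny_gray.shape)
```

The result has one intensity value per pixel, which is why the filters in the next section can be plain 2-D arrays.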

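Reading the image is the first step; the steps that follow depend on the size of the input image. The image after conversion to grayscale is shown below:

2. Preparing the filters

The following code prepares the filter bank for the first convolutional layer (Layer 1, abbreviated l1 here and below):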

    import numpy

    l1_filter = numpy.zeros((2, 3, 3))

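A zero array is created according to the number of filters and the size of each filter. The code above creates two 3x3 filters. In the shape (2, 3, 3), the 2 is the number of filters (num_filters), the first 3 is the number of rows in each filter, and the second 3 is the number of columns. Because the input image is grayscale and is read as a 2-D matrix, each filter is a 2-D array with no depth. If the image were a color image with three channels (RGB), each filter would need the size (3, 3, 3), where the last 3 is the depth, and the code above would have to create an array of shape (2, 3, 3, 3).
The size of the filter bank is chosen by the user, but that choice says nothing about the values inside the filters, which are usually initialized randomly. The following values can be used to detect vertical and horizontal edges: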

    # Filter 0 responds to vertical edges.
    l1_filter[0, :, :] = numpy.array([[-1, 0, 1],
                                      [-1, 0, 1],
                                      [-1, 0, 1]])
    # Filter 1 responds to horizontal edges.
    l1_filter[1, :, :] = numpy.array([[1, 1, 1],
                                      [0, 0, 0],
                                      [-1, -1, -1]])
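To see why these two filters pick out vertical and horizontal edges, the sketch below applies each of them to two hypothetical 3x3 patches. The patch values and the helper name response are illustrative assumptions, not part of the original code.

```python
import numpy

vertical_edge = numpy.array([[-1, 0, 1],
                             [-1, 0, 1],
                             [-1, 0, 1]])
horizontal_edge = numpy.array([[1, 1, 1],
                               [0, 0, 0],
                               [-1, -1, -1]])

# Hypothetical patches: a dark-to-bright step from left to right,
# and a bright-to-dark step from top to bottom.
left_to_right = numpy.array([[0, 0, 1],
                             [0, 0, 1],
                             [0, 0, 1]])
top_to_bottom = numpy.array([[1, 1, 1],
                             [1, 1, 1],
                             [0, 0, 0]])

def response(patch, kernel):
    """Sum of the element-wise product: one output pixel of the conv layer."""
    return int(numpy.sum(patch * kernel))

v_on_lr = response(left_to_right, vertical_edge)    # 3: strong response
h_on_lr = response(left_to_right, horizontal_edge)  # 0: no response
v_on_tb = response(top_to_bottom, vertical_edge)    # 0: no response
h_on_tb = response(top_to_bottom, horizontal_edge)  # 3: strong response
```

Each filter gives a large value only where intensity changes along the direction it is sensitive to, and zero where the patch is constant along that direction.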

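3. Convolutional layer (Conv Layer)

With the filters prepared, the next step is to convolve the input image with them. The following code uses the conv function to convolve the input image with the filter bank:

    l1_feature_map = conv(img, l1_filter)

The conv function accepts just two arguments, the input image and the filter bank: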

    import sys

    import numpy

    def conv(img, conv_filter):
        # Check that the number of image channels matches the filter depth.
        if len(img.shape) > 2 or len(conv_filter.shape) > 3:
            if img.shape[-1] != conv_filter.shape[-1]:
                print("Error: Number of channels in both image and filter must match.")
                sys.exit()
        # Check that the filter is square.
        if conv_filter.shape[1] != conv_filter.shape[2]:
            print("Error: Filter must be a square matrix. I.e. number of rows and columns must match.")
            sys.exit()
        # Check that the filter size is odd.
        if conv_filter.shape[1] % 2 == 0:
            print("Error: Filter must have an odd size. I.e. number of rows and columns must be odd.")
            sys.exit()

        # An empty array to hold the feature maps produced by convolving the filter(s) with the image.
        feature_maps = numpy.zeros((img.shape[0] - conv_filter.shape[1] + 1,
                                    img.shape[1] - conv_filter.shape[1] + 1,
                                    conv_filter.shape[0]))

        # Convolve the image with each filter.
        for filter_num in range(conv_filter.shape[0]):
            print("Filter ", filter_num + 1)
            curr_filter = conv_filter[filter_num, :]  # Get a filter from the bank.
            # If the filter has multiple channels, convolve each channel with the
            # matching image channel and sum the results into a single feature map.
            if len(curr_filter.shape) > 2:
                conv_map = conv_(img[:, :, 0], curr_filter[:, :, 0])  # Running sum of the per-channel maps.
                for ch_num in range(1, curr_filter.shape[-1]):
                    conv_map = conv_map + conv_(img[:, :, ch_num],
                                                curr_filter[:, :, ch_num])
            else:  # The filter has a single channel.
                conv_map = conv_(img, curr_filter)
            feature_maps[:, :, filter_num] = conv_map  # Store the feature map for the current filter.
        return feature_maps  # Return all feature maps.

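The function first makes sure that the depth of each filter equals the number of image channels, using the code below. The outer if statement checks whether the image or the filter has a depth dimension; if so, the inner one checks whether their channel counts are equal and reports an error when they are not.

    # Check that the number of image channels matches the filter depth.
    if len(img.shape) > 2 or len(conv_filter.shape) > 3:
        if img.shape[-1] != conv_filter.shape[-1]:
            print("Error: Number of channels in both image and filter must match.")
            sys.exit()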

In addition, each filter must be square (equal numbers of rows and columns) and its size must be odd. The two if blocks below check these conditions; if either one fails, the program reports an error and exits.

    # Check that the filter is square.
    if conv_filter.shape[1] != conv_filter.shape[2]:
        print("Error: Filter must be a square matrix. I.e. number of rows and columns must match.")
        sys.exit()
    # Check that the filter size is odd.
    if conv_filter.shape[1] % 2 == 0:
        print("Error: Filter must have an odd size. I.e. number of rows and columns must be odd.")
        sys.exit()
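As an alternative to printing a message and calling sys.exit(), the same three checks could raise exceptions so that the caller decides how to react. This is only a sketch of that variation, not the article's implementation; validate_conv_inputs and is_valid are hypothetical names.

```python
import numpy

def validate_conv_inputs(img, conv_filter):
    """Raise ValueError if the image and filter bank are incompatible.

    Hypothetical helper that mirrors the three checks performed in conv.
    """
    if img.ndim > 2 or conv_filter.ndim > 3:
        if img.shape[-1] != conv_filter.shape[-1]:
            raise ValueError("Number of channels in image and filter must match.")
    if conv_filter.shape[1] != conv_filter.shape[2]:
        raise ValueError("Filter must be square.")
    if conv_filter.shape[1] % 2 == 0:
        raise ValueError("Filter must have an odd size.")

def is_valid(img, conv_filter):
    """Return True when the checks pass, False when any of them fails."""
    try:
        validate_conv_inputs(img, conv_filter)
        return True
    except ValueError:
        return False

gray = numpy.zeros((10, 10))
ok = is_valid(gray, numpy.zeros((2, 3, 3)))            # two 3x3 filters
even = is_valid(gray, numpy.zeros((2, 4, 4)))          # even-sized filter
not_square = is_valid(gray, numpy.zeros((2, 3, 5)))    # non-square filter
mismatch = is_valid(numpy.zeros((10, 10, 3)),
                    numpy.zeros((2, 3, 3, 2)))         # 3 channels vs depth 2
```

Raising exceptions keeps the function usable inside larger programs and tests, where exiting the interpreter on bad input is rarely what you want.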

Once all of these conditions hold, an array is initialized to hold the feature maps that the convolution will produce:

    # An empty array to hold the feature maps produced by convolving the filter(s) with the image.
    feature_maps = numpy.zeros((img.shape[0] - conv_filter.shape[1] + 1,
                                img.shape[1] - conv_filter.shape[1] + 1,
                                conv_filter.shape[0]))

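Because no stride or padding is specified, the stride defaults to 1 and there is no padding. The feature maps produced by the convolution therefore have the size (img_rows - filter_rows + 1, img_columns - filter_columns + 1, num_filters), that is, the size of the input image minus the size of the filter, plus 1. Note that each filter outputs one feature map.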
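The size rule can be checked with a few lines of arithmetic. The sketch below assumes a 300x451 grayscale input (an assumed size for illustration) and the two 3x3 filters of layer 1. The stride and padding parameters are a generalization added here for context; the article's code always uses stride 1 and no padding. The name conv_output_shape is hypothetical.

```python
def conv_output_shape(img_shape, filter_size, num_filters, stride=1, padding=0):
    """Shape of the feature maps for a square filter (hypothetical helper).

    With stride=1 and padding=0 this reduces to the rule in the text:
    input size - filter size + 1.
    """
    rows = (img_shape[0] + 2 * padding - filter_size) // stride + 1
    cols = (img_shape[1] + 2 * padding - filter_size) // stride + 1
    return (rows, cols, num_filters)

# Assumed 300x451 grayscale image, two 3x3 filters.
l1_shape = conv_output_shape((300, 451), filter_size=3, num_filters=2)
print(l1_shape)  # (298, 449, 2)
```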

    # Convolve the image with each filter.
    for filter_num in range(conv_filter.shape[0]):
        print("Filter ", filter_num + 1)
        curr_filter = conv_filter[filter_num, :]  # Get a filter from the bank.
        # If the filter has multiple channels, convolve each channel with the
        # matching image channel and sum the results into a single feature map.
        if len(curr_filter.shape) > 2:
            conv_map = conv_(img[:, :, 0], curr_filter[:, :, 0])  # Running sum of the per-channel maps.
            for ch_num in range(1, curr_filter.shape[-1]):
                conv_map = conv_map + conv_(img[:, :, ch_num],
                                            curr_filter[:, :, ch_num])
        else:  # The filter has a single channel.
            conv_map = conv_(img, curr_filter)
        feature_maps[:, :, filter_num] = conv_map  # Store the feature map for the current filter.

The outer loop iterates over the filters in the bank, and the following line selects the current filter for the steps that follow:

    curr_filter = conv_filter[filter_num, :]  # Get a filter from the bank.

If the input image has more than one channel, the filter must have the same number of channels; only then can the convolution proceed. Each filter channel is convolved with the matching image channel, and the per-channel results are summed to give the output feature map. The code below checks the number of channels; if there is only one, a single convolution completes the whole process:

    if len(curr_filter.shape) > 2:
        conv_map = conv_(img[:, :, 0], curr_filter[:, :, 0])  # Running sum of the per-channel maps.
        for ch_num in range(1, curr_filter.shape[-1]):
            conv_map = conv_map + conv_(img[:, :, ch_num],
                                        curr_filter[:, :, ch_num])
    else:  # The filter has a single channel.
        conv_map = conv_(img, curr_filter)
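The channel-by-channel sum is equivalent to one element-wise product over the whole 3-D region. The sketch below checks this for a single output position, using a hypothetical 3x3 region with two channels and random values.

```python
import numpy

rng = numpy.random.default_rng(0)
region = rng.random((3, 3, 2))  # hypothetical image region with 2 channels
filt = rng.random((3, 3, 2))    # hypothetical filter with matching depth

# Channel by channel, as conv does it: one 2-D result per channel, then a sum.
per_channel = sum(numpy.sum(region[:, :, ch] * filt[:, :, ch])
                  for ch in range(region.shape[-1]))

# All channels at once: a single 3-D element-wise product.
all_at_once = numpy.sum(region * filt)

same_value = bool(numpy.isclose(per_channel, all_at_once))
```

This is why a filter with depth d still yields a single 2-D feature map: the depth dimension is summed away.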

The conv_ function called inside conv is different from conv itself. conv accepts only the input image and the filter bank and does not perform the convolution; it just prepares each image/filter pair on which conv_ then performs the convolution. The implementation of conv_ is as follows:

    def conv_(img, conv_filter):
        filter_size = conv_filter.shape[0]
        # With stride 1 and no padding, each output dimension is input - filter + 1.
        result = numpy.zeros((img.shape[0] - filter_size + 1,
                              img.shape[1] - filter_size + 1))
        # Loop over every position where the filter fits entirely inside the image.
        for r in range(result.shape[0]):
            for c in range(result.shape[1]):
                # Get the current region to be multiplied with the filter.
                curr_region = img[r:r+filter_size, c:c+filter_size]
                # Element-wise multiplication between the current region and the filter.
                curr_result = curr_region * conv_filter
                conv_sum = numpy.sum(curr_result)  # Sum the result of the multiplication.
                result[r, c] = conv_sum  # Save the sum in the feature map.
        return result

At every step the filter is applied to an image region of the same size as the filter, which the following line extracts:

    curr_region = img[r:r+filter_size, c:c+filter_size]

The region and the filter are then multiplied element by element, and the products are summed to give a single output value:

    # Element-wise multiplication between the current region and the filter.
    curr_result = curr_region * conv_filter
    conv_sum = numpy.sum(curr_result)  # Sum the result of the multiplication.
    result[r, c] = conv_sum  # Save the sum in the feature map.
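The region-by-region loop can also be written without explicit Python loops. The sketch below is one possible vectorized variant, assuming NumPy 1.20 or later for numpy.lib.stride_tricks.sliding_window_view; conv_loop and conv_vectorized are hypothetical names, and conv_loop restates the loop approach (stride 1, no padding) only so the two can be compared. Note that, as is common in CNN code, the filter is not flipped, so strictly speaking this is cross-correlation.

```python
import numpy
from numpy.lib.stride_tricks import sliding_window_view

def conv_loop(img, conv_filter):
    """Reference version: explicit loops, stride 1, no padding."""
    k = conv_filter.shape[0]
    out = numpy.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = numpy.sum(img[r:r + k, c:c + k] * conv_filter)
    return out

def conv_vectorized(img, conv_filter):
    """Same computation using a view of all k x k windows at once."""
    # windows has shape (H - k + 1, W - k + 1, k, k).
    windows = sliding_window_view(img, conv_filter.shape)
    # Multiply each window by the filter and sum over the window axes.
    return numpy.einsum("ijkl,kl->ij", windows, conv_filter)

rng = numpy.random.default_rng(0)
img = rng.random((6, 7))    # hypothetical small grayscale image
filt = rng.random((3, 3))   # hypothetical 3x3 filter
loop_out = conv_loop(img, filt)
vec_out = conv_vectorized(img, filt)
same = bool(numpy.allclose(loop_out, vec_out))
```

The loop version is easier to follow; the vectorized one is usually much faster on real images because the work happens inside NumPy.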

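After the input image has been convolved with every filter, conv returns the feature maps. The figure below shows the feature maps returned by the conv layer (the l1 filter bank has shape (2, 3, 3), i.e. two 3x3 kernels, so two feature maps are produced):

[Figure: feature maps after convolution]

A convolutional layer is usually followed by an activation layer; this article uses the ReLU activation function.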

4. ReLU activation layer

The ReLU layer applies the ReLU activation function to every feature map output by the conv layer. It is called with the following line:

    l1_feature_map_relu = relu(l1_feature_map)

The ReLU activation function is implemented as follows:

    def relu(feature_map):
        # Prepare the output of the ReLU activation function.
        relu_out = numpy.zeros(feature_map.shape)
        for map_num in range(feature_map.shape[-1]):
            for r in numpy.arange(0, feature_map.shape[0]):
                for c in numpy.arange(0, feature_map.shape[1]):
                    # Keep positive values and replace everything else with 0.
                    relu_out[r, c, map_num] = numpy.max([feature_map[r, c, map_num], 0])
        return relu_out

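The idea behind ReLU is simple: each element of the feature map is compared with 0. If it is greater than 0, the original value is kept; otherwise it is set to 0. The output of the ReLU layer is shown in the figure below:

[Figure: output of the ReLU layer]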
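The element-by-element comparison with 0 can also be expressed with numpy.maximum, which broadcasts the scalar 0 against the whole array. The sketch below uses a small hypothetical feature map of shape (2, 2, 2) to illustrate the idea.

```python
import numpy

# Hypothetical feature maps of shape (rows, cols, num_maps) = (2, 2, 2).
feature_map = numpy.array([[[-1.5, 2.0], [0.0, -0.1]],
                           [[3.0, -4.0], [0.5, 1.0]]])

# Element-wise maximum of each value and 0: negatives become 0,
# positive values are kept unchanged.
relu_out = numpy.maximum(feature_map, 0)
```

The output has the same shape as the input, since ReLU changes values but never sizes.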

An activation layer is usually followed by a pooling layer; this article uses max pooling.

5. Max pooling layer

The output of the ReLU layer is the input of the max pooling layer. Max pooling is called with the following line:

    l1_feature_map_relu_pool = pooling(l1_feature_map_relu, 2, 2)

The max pooling function is implemented as follows:

    def pooling(feature_map, size=2, stride=2):
        # Prepare the output: one value for each window that fits entirely inside the map.
        pool_out = numpy.zeros(((feature_map.shape[0] - size) // stride + 1,
                                (feature_map.shape[1] - size) // stride + 1,
                                feature_map.shape[-1]))
        for map_num in range(feature_map.shape[-1]):
            r2 = 0
            for r in numpy.arange(0, feature_map.shape[0] - size + 1, stride):
                c2 = 0
                for c in numpy.arange(0, feature_map.shape[1] - size + 1, stride):
                    # Maximum of the current window in the current feature map only.
                    pool_out[r2, c2, map_num] = numpy.max(feature_map[r:r+size, c:c+size, map_num])
                    c2 = c2 + 1
                r2 = r2 + 1
        return pool_out

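The function accepts three arguments: the output of the ReLU layer, the size of the pooling window, and the stride. As before, it first creates an empty array to hold its output. The size of that array is determined by the size of the input feature maps, the window size, and the stride.

    pool_out = numpy.zeros(((feature_map.shape[0] - size) // stride + 1,
                            (feature_map.shape[1] - size) // stride + 1,
                            feature_map.shape[-1]))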
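As a sanity check on the output size, the sketch below assumes the usual rule that a window counts only if it fits entirely inside the feature map, which gives floor((n - size) / stride) + 1 windows along an axis of length n. It compares that formula with a direct count of window start offsets. Both helper names are hypothetical, and 298x449 is an assumed feature-map size used for illustration.

```python
def pool_output_size(n, size=2, stride=2):
    """Number of pooling windows along one axis of length n (hypothetical helper)."""
    return (n - size) // stride + 1

def count_window_starts(n, size=2, stride=2):
    """Count the start offsets 0, stride, 2*stride, ... whose window still fits."""
    return len(range(0, n - size + 1, stride))

# Assumed 298x449 feature map pooled with a 2x2 window and stride 2.
rows = pool_output_size(298)
cols = pool_output_size(449)
print(rows, cols)  # 149 224
```

The formula and the direct count agree, so an output array sized this way has exactly one slot per window visited by the loops.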

Max pooling is applied to each input feature map separately, returning the largest value in each window of that map:

    pool_out[r2, c2, map_num] = numpy.max(feature_map[r:r+size, c:c+size, map_num])
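A small worked example makes the windowing concrete. The sketch below pools a hypothetical 4x4 feature map with a single channel using a 2x2 window and stride 2; max_pool is an illustrative re-statement of the technique, not the article's exact function.

```python
import numpy

def max_pool(feature_map, size=2, stride=2):
    """Max pooling over each channel of a (rows, cols, channels) array."""
    out_rows = (feature_map.shape[0] - size) // stride + 1
    out_cols = (feature_map.shape[1] - size) // stride + 1
    pool_out = numpy.zeros((out_rows, out_cols, feature_map.shape[-1]))
    for map_num in range(feature_map.shape[-1]):
        for r2 in range(out_rows):
            for c2 in range(out_cols):
                r, c = r2 * stride, c2 * stride  # top-left corner of the window
                window = feature_map[r:r + size, c:c + size, map_num]
                pool_out[r2, c2, map_num] = numpy.max(window)
    return pool_out

# Hypothetical 4x4 feature map with one channel (shape (4, 4, 1)).
fm = numpy.array([[1, 3, 2, 0],
                  [4, 2, 1, 1],
                  [0, 1, 5, 6],
                  [2, 2, 7, 8]], dtype=float)[:, :, numpy.newaxis]

pooled = max_pool(fm)
print(pooled[:, :, 0])
# [[4. 2.]
#  [2. 8.]]
```

Each output value is the maximum of one non-overlapping 2x2 block, so both spatial dimensions are halved while the number of channels stays the same.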

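The output of the pooling layer is shown in the figure below. The images are displayed at the same size for easier viewing; in fact, with a 2x2 window and a stride of 2, each dimension of the pooled output is about half that of its input.

[Figure: output of the pooling layer]

6. Stacking layers

The sections above implement the basic layers of a CNN: conv, ReLU, and max pooling. They can now be stacked, as in the following code: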

    # Second conv layer
    l2_filter = numpy.random.rand(3, 5, 5, l1_feature_map_relu_pool.shape[-1])
    print("\n**Working with conv layer 2**")
    l2_feature_map = conv(l1_feature_map_relu_pool, l2_filter)
    print("\n**ReLU**")
    l2_feature_map_relu = relu(l2_feature_map)
    print("\n**Pooling**")
    l2_feature_map_relu_pool = pooling(l2_feature_map_relu, 2, 2)
    print("**End of conv layer 2**\n")

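As the code shows, l2 is the second convolutional layer. It uses three 5x5 kernels (filters), each with a depth equal to the number of feature maps output by the first layer. Convolving them with the output of the first layer produces three feature maps, which then go through the ReLU activation function and max pooling. The result of each operation is visualized in the figure below:

[Figure: visualization of the l2 layer's processing steps]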

    # Third conv layer
    l3_filter = numpy.random.rand(1, 7, 7, l2_feature_map_relu_pool.shape[-1])
    print("\n**Working with conv layer 3**")
    l3_feature_map = conv(l2_feature_map_relu_pool, l3_filter)
    print("\n**ReLU**")
    l3_feature_map_relu = relu(l3_feature_map)
    print("\n**Pooling**")
    l3_feature_map_relu_pool = pooling(l3_feature_map_relu, 2, 2)
    print("**End of conv layer 3**\n")

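As the code shows, l3 is the third convolutional layer. It uses a single 7x7 kernel (filter) whose depth equals the number of feature maps output by the second layer. Convolving it with the output of the second layer produces one feature map, which then goes through the ReLU activation function and max pooling. The result of each operation is visualized in the figure below:

[Figure: visualization of the l3 layer's processing steps]

The basic structure of a neural network is that the output of one layer is the input of the next: l2 receives the output of l1, and l3 receives the output of l2, as in the following code:

    l2_feature_map = conv(l1_feature_map_relu_pool, l2_filter)
    l3_feature_map = conv(l2_feature_map_relu_pool, l3_filter)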
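To see how sizes change as the layers are stacked, the sketch below traces only the array shapes through the three conv/pool stages. It assumes a 300x451 grayscale input (an assumed size for illustration), stride-1 convolution without padding, and 2x2 pooling with stride 2 using the floor((n - size) / stride) + 1 rule. ReLU is omitted because it does not change shapes. conv_shape and pool_shape are hypothetical helpers.

```python
def conv_shape(shape, filter_size, num_filters):
    """Output shape of a stride-1, no-padding convolution (hypothetical helper)."""
    return (shape[0] - filter_size + 1, shape[1] - filter_size + 1, num_filters)

def pool_shape(shape, size=2, stride=2):
    """Output shape of pooling, counting only windows that fit entirely."""
    return ((shape[0] - size) // stride + 1,
            (shape[1] - size) // stride + 1,
            shape[2])

shape = (300, 451)                  # assumed grayscale input size
layers = [(3, 2), (5, 3), (7, 1)]   # (filter size, number of filters) for l1, l2, l3

trace = []
for filter_size, num_filters in layers:
    shape = conv_shape(shape, filter_size, num_filters)
    trace.append(shape)             # after conv (and ReLU)
    shape = pool_shape(shape)
    trace.append(shape)             # after pooling

for step in trace:
    print(step)
```

Under these assumptions the trace is (298, 449, 2), (149, 224, 2), (145, 220, 3), (72, 110, 3), (66, 104, 1), (33, 52, 1), which shows how each stage shrinks the spatial size while the number of filters sets the depth.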

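7. Complete code

All of the code has been uploaded to GitHub. The visualization of each layer is done with the Matplotlib library.

This article was translated under the organization of the Alibaba Cloud Yunqi Community.
Original title: "Building Convolutional Neural Network using NumPy from Scratch"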
