Complete example: recognizing handwritten digit images with a CNN model trained on MNIST (images taken from the web)

Published: 2023/12/20


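1 How should a trained model be applied?

How can an MNIST model trained with a CNN be used to recognize handwritten digit images (taken from the web)?

This question troubled me for two days. Much of the code I found online puts training and inference in the same .py file, so the model is retrained every time it is used, which is clearly inefficient.

I decided to separate the training script from the prediction script, but it was not obvious how to write the prediction script. Most answers looked like this: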

saver = tf.train.Saver()  # define the saver
with tf.Session() as sess:
    sess.run(intt)
    # load the model
    saver.restore(sess, "./save/model.ckpt")

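This was not the answer I was looking for. For a loaded model to be useful it needs at least an input and an output, so I tried to pass parameters between the two .py files. The advice I got was:

from xxx import parameter  # get a parameter from xxx.py

But once I wrote it that way, the whole training script simply ran again, which was not what I wanted. The final image recognition step also raised an error and the program aborted.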
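The retraining on import described above follows from how Python imports work: importing a module executes all of its top-level code. A minimal sketch of the usual remedy is shown here, assuming a hypothetical training module whose training loop is wrapped in a function named `train` (neither the module nor the function name comes from the original code). Code under `if __name__ == "__main__":` runs only when the file is executed directly, not when it is imported.

```python
import importlib.util
import os
import tempfile
import textwrap

# Source of a hypothetical training module. The expensive work lives in
# train(), and the guard keeps it from running on import.
MODULE_SOURCE = textwrap.dedent('''
    calls = []

    def train():
        """Stand-in for the expensive training loop."""
        calls.append("train")

    if __name__ == "__main__":
        train()
''')


def import_from_source(source, name="train_model_demo"):
    """Write the source to a temporary file and import it as a module."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name + ".py")
        with open(path, "w") as f:
            f.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the module's top-level code
        return module


demo = import_from_source(MODULE_SOURCE)
print(demo.calls)  # [] because importing did not trigger training
```

Even with the guard in place, importing Python values does not bring the trained weights along. Those still have to be loaded from the checkpoint file.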

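Eventually I came across the following article:

Multi-layer neural network modeling and saving/restoring models
https://www.cnblogs.com/HuangYJ/p/11681357.html

In short, saver.restore() loads the model's parameters, not its structure. You first have to define a model with the same structure as the saved one. Only when the structures match do the variables line up, so that the values read from the checkpoint can be assigned to the variables waiting to be overwritten.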
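The "same structure" requirement can be illustrated with a toy analogy in plain Python. This is not TensorFlow's implementation; the `restore` function and the dictionaries below are hypothetical stand-ins, assuming only that saved values are matched to variables by name and shape.

```python
def restore(model_vars, checkpoint):
    """Overwrite each model variable with the saved value of the same name."""
    for name, current in model_vars.items():
        if name not in checkpoint:
            raise KeyError(f"variable {name!r} is missing from the checkpoint")
        saved = checkpoint[name]
        if len(saved) != len(current):
            raise ValueError(f"shape mismatch for {name!r}")
        model_vars[name] = list(saved)
    return model_vars


# Hypothetical checkpoint: values saved by the training script.
checkpoint = {"W_fc2": [0.5, -0.2, 0.1], "b_fc2": [0.1]}

# The same structure rebuilt in the prediction script, with dummy initial values.
same_structure = {"W_fc2": [0.0, 0.0, 0.0], "b_fc2": [0.0]}
restored = restore(same_structure, checkpoint)
print(restored["W_fc2"])  # [0.5, -0.2, 0.1]

# A different structure cannot be filled in from this checkpoint.
different_structure = {"W_fc2": [0.0, 0.0], "b_fc2": [0.0]}
try:
    restore(different_structure, checkpoint)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)  # True
```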

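2 The main graph of the training model

From the main graph figure (not reproduced here), the model has 10 structures that need to be defined:

input, image, conv_layer1, pooling_layer1, conv_layer2, pooling_layer2, fc_layer3, dropout, output_fc_layer4, softmax

Code for the 10 structures (the helper function definitions are omitted here):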

with tf.name_scope('input'):
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder('float', [None, 10])

with tf.name_scope('image'):
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    tf.summary.image('input_image', x_image, 8)

with tf.name_scope('conv_layer1'):
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

with tf.name_scope('pooling_layer1'):
    h_pool1 = max_pool_2x2(h_conv1)

with tf.name_scope('conv_layer2'):
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

with tf.name_scope('pooling_layer2'):
    h_pool2 = max_pool_2x2(h_conv2)

with tf.name_scope('fc_layer3'):
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

with tf.name_scope('dropout'):
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

with tf.name_scope('output_fc_layer4'):
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

with tf.name_scope('softmax'):
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
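The 7*7*64 in fc_layer3 follows from the layer shapes. The sketch below works through the arithmetic, assuming the convolutions use stride 1 and the pooling uses stride 2, both with 'SAME' padding, as in the helper functions of the full script in section 4. The helper name `same_padding_out` is made up for this illustration.

```python
import math


def same_padding_out(size, stride):
    """Spatial output size for padding='SAME': ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28                         # input image is 28 x 28
size = same_padding_out(size, 1)  # conv_layer1, stride 1    -> 28
size = same_padding_out(size, 2)  # pooling_layer1, stride 2 -> 14
size = same_padding_out(size, 1)  # conv_layer2, stride 1    -> 14
size = same_padding_out(size, 2)  # pooling_layer2, stride 2 -> 7

channels = 64                     # output channels of conv_layer2
flat_features = size * size * channels
print(size, flat_features)  # 7 3136
```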

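3 Constructing the model's input and output

Put another way, what we want from the trained model is this: we give it an image of a handwritten digit, and it recognizes the digit and prints it. The trained model itself, however, only performs numerical computation, so both the image input and the digit output have to be shaped to fit the model.

The model expects each image as a one-dimensional tensor (a vector). The image must be 28*28 pixels, 784 pixels in total, and the two-dimensional tensor (matrix) has to be flattened into a one-dimensional one. The following code does this: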

from PIL import Image

text = Image.open('./images/text3.png')  # load the image
data = list(text.getdata())  # flatten the pixels into one list
picture = [(255 - x) * 1.0 / 255.0 for x in data]  # picture is the input fed to the model
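The snippet above works only if the image is already a 28*28 single-channel (grayscale) image with 8-bit pixels. For an RGB image, `getdata()` yields tuples and `255 - x` fails. The inversion is needed because MNIST digits are bright strokes on a dark background, while a typical image from the web is dark ink on a light background. Below is a minimal sketch of the same normalization with explicit checks; the function name `to_model_input` is hypothetical, and the pixel list is synthetic so the example runs without an image file.

```python
def to_model_input(pixels, side=28):
    """Invert and scale 8-bit grayscale pixels to floats in [0, 1]."""
    if len(pixels) != side * side:
        raise ValueError(f"expected {side * side} pixels, got {len(pixels)}")
    if any(not isinstance(p, int) or not 0 <= p <= 255 for p in pixels):
        raise ValueError("pixels must be integers in 0..255 (single channel)")
    return [(255 - p) / 255.0 for p in pixels]


# Synthetic stand-in for list(image.getdata()): a white page with one black pixel.
# With Pillow, an arbitrary image could first be brought into this form with
# Image.open(path).convert("L").resize((28, 28)).
white_page = [255] * 784
white_page[0] = 0

picture = to_model_input(white_page)
print(picture[0], picture[1], len(picture))  # 1.0 0.0 784
```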

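The model's output is the result of the softmax function: an array of probabilities, one for each digit. We look for the highest probability; the digit it corresponds to is the model's prediction. The following code does this: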

# make the prediction
prediction = tf.argmax(y_conv, 1)  # index of the highest probability
predict_result = prediction.eval(feed_dict={x: [picture], keep_prob: 1.0}, session=sess)
print("The image you imported is:", predict_result[0])
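What softmax followed by argmax does can be seen in a small plain-Python sketch. The scores below are made up for illustration and are not output from the trained model; `softmax` and `argmax` here are simple stand-ins for `tf.nn.softmax` and `tf.argmax`.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up scores for the digits 0..9.
scores = [0.1, 0.2, 0.1, 3.0, 0.3, 0.1, 0.2, 0.1, 0.4, 0.1]
probs = softmax(scores)
digit = argmax(probs)
print(digit)  # 3
```

Because softmax preserves the ordering of the scores, the index of the largest probability is also the index of the largest raw score.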

4 Complete .py code for applying the model to recognition

from PIL import Image
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()


# --- set up the model parameters ---
def weight_variable(shape):
    # weight initializer
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    # bias initializer
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


with tf.name_scope('input'):
    x = tf.placeholder(tf.float32, [None, 784])
    y_ = tf.placeholder('float', [None, 10])

with tf.name_scope('image'):
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    tf.summary.image('input_image', x_image, 8)

with tf.name_scope('conv_layer1'):
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

with tf.name_scope('pooling_layer1'):
    h_pool1 = max_pool_2x2(h_conv1)

with tf.name_scope('conv_layer2'):
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

with tf.name_scope('pooling_layer2'):
    h_pool2 = max_pool_2x2(h_conv2)

with tf.name_scope('fc_layer3'):
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

with tf.name_scope('dropout'):
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

with tf.name_scope('output_fc_layer4'):
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

with tf.name_scope('softmax'):
    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# --- load the model and test it with the imported image ---
text = Image.open('./images/text2.png')  # load the image
data = list(text.getdata())
picture = [(255 - x) * 1.0 / 255.0 for x in data]

intt = tf.global_variables_initializer()
saver = tf.train.Saver()  # define the saver

with tf.Session() as sess:
    sess.run(intt)
    # load the model parameters
    saver.restore(sess, "./save/model.ckpt")
    # make the prediction
    prediction = tf.argmax(y_conv, 1)
    predict_result = prediction.eval(feed_dict={x: [picture], keep_prob: 1.0}, session=sess)
    print("The image you imported is:", predict_result[0])

Figure: text2.png (input image)

Figure: recognition result (run in Spyder)

Download link for the model and image:
https://download.csdn.net/download/weixin_42899627/12672965

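5 A tip for running the code

After each run, the kernel has to be restarted before the script can be run again; otherwise it raises an error. I have not investigated the exact cause, but a likely explanation is that re-running in the same kernel adds a second set of variables to TensorFlow's default graph under new auto-generated names, which no longer match the names stored in the checkpoint. Calling tf.reset_default_graph() before building the model may avoid the restart.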

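References:
1 [Python] CNN-based MNIST handwritten digit recognition - 東聃 - cnblogs
2 Training a model on MNIST with TensorFlow to recognize handwritten digits - qiuhlee - cnblogs
3 Multi-layer neural network modeling and saving/restoring models
4 TensorFlow in Practice (3), introduction to classification: MNIST handwritten digit recognition

The above is my personal understanding; corrections are welcome.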
