
TensorFlow2: Neural Network Training

Published: 2024/4/11


Contents

TensorFlow2 Neural Network Training
- Gradient Descent
- Backpropagation
- Training Visualization
- Additional Notes

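Gradient Descent

The gradient $\nabla f=\left(\frac{\partial f}{\partial x_{1}} ; \frac{\partial f}{\partial x_{2}} ; \ldots ; \frac{\partial f}{\partial x_{n}}\right)$ is the vector of partial derivatives of the function $f$ with respect to its variables $x_{1}, \ldots, x_{n}$. Its direction is the direction in which the function value increases fastest, and its norm is the rate of that increase. So by repeatedly moving the parameters a small step in the direction opposite to the gradient, the function value keeps decreasing and the parameters approach a minimum of the function (either a global or a local one).
$\theta_{t+1}=\theta_{t}-\alpha_{t} \nabla f\left(\theta_{t}\right)$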
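The parameter update above is called gradient descent. In practice the gradient is multiplied by a learning rate $\alpha_{t}$, usually smaller than 1, before it is applied. The norm of the gradient is often fairly large, so using it directly makes the function value oscillate and makes it hard to converge to a stable point (this is also why the learning rate should not be too large).
For a general function, however, gradient descent (GD) is not guaranteed to find the global optimum. It often converges to a local optimum and stops moving, even though that local optimum may be good enough in practice. This is determined by the properties of the function: for a convex function every local minimum is also a global minimum, which is why gradient descent behaves well on convex problems.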
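The effect of the learning rate can be seen on a toy problem. The sketch below is a plain-Python illustration, not code from the original post: the objective $f(\theta)=(\theta-3)^2$, the helper `gradient_descent`, and the two learning rates are all assumptions chosen for the demonstration.

```python
def gradient_descent(grad, theta0, lr, steps):
    """Apply theta <- theta - lr * grad(theta) for a fixed number of steps."""
    theta = theta0
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta


def grad_f(theta):
    """Gradient of the illustrative convex objective f(theta) = (theta - 3)^2."""
    return 2.0 * (theta - 3.0)


# A small learning rate shrinks the distance to the minimum on every step.
theta_small = gradient_descent(grad_f, theta0=0.0, lr=0.1, steps=100)
# A learning rate above 1 overshoots here, so the distance grows on every step.
theta_large = gradient_descent(grad_f, theta0=0.0, lr=1.1, steps=100)

print("lr=0.1:", theta_small)
print("lr=1.1:", theta_large)
```

With `lr=0.1` the distance to the minimum at 3 is multiplied by 0.8 on each step, while with `lr=1.1` it is multiplied by -1.2, so the iterate oscillates around the minimum with growing amplitude.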
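Deep learning frameworks such as TensorFlow and PyTorch support automatic differentiation. In TensorFlow2, wrapping the code to be differentiated in a `tf.GradientTape` context is enough for TensorFlow to record the operations and compute their gradients. By default, however, `tape.gradient(loss, [w1, w2, ...])` can be called only once per tape: the resources held by the tape, which can take up a lot of GPU memory, are released after the first call. To call it several times, create the tape with `tf.GradientTape(persistent=True)` (and remember to release it promptly, for example with `del tape`). TensorFlow2 also supports higher-order derivatives, which are obtained by nesting one tape inside another.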
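A minimal sketch of both features, assuming TensorFlow 2.x with eager execution; the variable `x` and the functions $x^2$ and $x^3$ are illustrative choices, not taken from the original post.

```python
import tensorflow as tf

x = tf.Variable(3.0)  # trainable variables are watched by the tape automatically

# A persistent tape can be queried more than once.
with tf.GradientTape(persistent=True) as tape:
    y = x * x        # dy/dx = 2x
    z = x * x * x    # dz/dx = 3x^2
dy_dx = tape.gradient(y, x)
dz_dx = tape.gradient(z, x)  # a second call is allowed because persistent=True
del tape                     # release the resources held by the tape

# Second-order derivative: nest one tape inside another.
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        z = x * x * x
    first = inner.gradient(z, x)   # 3x^2, itself recorded by the outer tape
second = outer.gradient(first, x)  # 6x

print(float(dy_dx), float(dz_dx), float(first), float(second))
```

Note that `inner.gradient` is called inside the `outer` context, so the first derivative is recorded and can be differentiated again.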

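Backpropagation

Backpropagation (BP) is the core algorithm for training deep neural networks, and it is built on the chain rule. The loss at the output layer is propagated backwards through the weights, layer by layer in the reverse order of the forward pass, until it reaches layer i (an iterative process that moves back one layer at a time), where the gradient of that layer is computed and used to update its parameters. For the detailed derivation, see the earlier blog post on BP neural networks.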
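The chain rule can be traced by hand on a very small network. The following plain-Python sketch assumes a hypothetical scalar network with one tanh hidden unit and a squared-error loss (none of this comes from the original post) and checks the hand-derived gradients against finite differences.

```python
import math


def forward(w1, w2, x, y):
    """Tiny two-layer scalar network: h = tanh(w1 * x), y_hat = w2 * h."""
    h = math.tanh(w1 * x)
    y_hat = w2 * h
    loss = (y_hat - y) ** 2
    return h, y_hat, loss


def backward(w1, w2, x, y):
    """Propagate the loss gradient backwards, one layer at a time."""
    h, y_hat, _ = forward(w1, w2, x, y)
    d_y_hat = 2.0 * (y_hat - y)      # d loss / d y_hat
    d_w2 = d_y_hat * h               # gradient of the output-layer weight
    d_h = d_y_hat * w2               # signal passed back to the hidden layer
    d_w1 = d_h * (1.0 - h ** 2) * x  # tanh'(a) = 1 - tanh(a)^2
    return d_w1, d_w2


def numeric_grad(w1, w2, x, y, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    g1 = (forward(w1 + eps, w2, x, y)[2] - forward(w1 - eps, w2, x, y)[2]) / (2 * eps)
    g2 = (forward(w1, w2 + eps, x, y)[2] - forward(w1, w2 - eps, x, y)[2]) / (2 * eps)
    return g1, g2


analytic = backward(0.5, -0.3, 1.5, 1.0)
numeric = numeric_grad(0.5, -0.3, 1.5, 1.0)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two results should agree to several decimal places. Automatic differentiation in a framework performs the same backward bookkeeping for every layer.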
In TensorFlow2 the layer of a classic BP neural network is wrapped as the fully connected layer `layers.Dense`, which takes care of the hidden-layer computation automatically. The code below builds a BP neural network from Dense layers and trains it to classify the Fashion-MNIST dataset.

```python
"""
Author: Zhou Chen
Date: 2019/10/15
Desc: About
"""
import tensorflow as tf
from tensorflow.keras import datasets, layers, optimizers, Sequential

(x, y), (x_test, y_test) = datasets.fashion_mnist.load_data()
print(x.shape, y.shape)


def preprocess(x, y):
    # Scale pixels to [0, 1] and cast labels to int32.
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


batch_size = 64
db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(10000).batch(batch_size)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.map(preprocess).batch(batch_size)  # evaluation needs no shuffling

model = Sequential([
    layers.Dense(256, activation=tf.nn.relu),  # [b, 784] => [b, 256]
    layers.Dense(128, activation=tf.nn.relu),  # [b, 256] => [b, 128]
    layers.Dense(64, activation=tf.nn.relu),   # [b, 128] => [b, 64]
    layers.Dense(32, activation=tf.nn.relu),   # [b, 64] => [b, 32]
    layers.Dense(10),                          # [b, 32] => [b, 10], raw logits
])
model.build(input_shape=(None, 28 * 28))
optimizer = optimizers.Adam(learning_rate=1e-3)


def main():
    for epoch in range(30):
        for step, (x, y) in enumerate(db):
            x = tf.reshape(x, [-1, 28 * 28])
            # forward
            with tf.GradientTape() as tape:
                logits = model(x)
                y_onehot = tf.one_hot(y, depth=10)
                # MSE is only logged for reference; training uses cross-entropy.
                loss_mse = tf.reduce_mean(tf.losses.MSE(y_onehot, logits))
                loss_ce = tf.reduce_mean(
                    tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True))
            grads = tape.gradient(loss_ce, model.trainable_variables)
            # backward
            optimizer.apply_gradients(zip(grads, model.trainable_variables))

            if step % 100 == 0:
                print(epoch, step, "loss:", float(loss_mse), float(loss_ce))

        # test: evaluate on the test set after each epoch
        total_correct, total_num = 0, 0
        for x, y in db_test:
            x = tf.reshape(x, [-1, 28 * 28])
            logits = model(x)
            prob = tf.nn.softmax(logits, axis=1)
            pred = tf.cast(tf.argmax(prob, axis=1), dtype=tf.int32)
            correct = tf.reduce_sum(tf.cast(tf.equal(pred, y), dtype=tf.int32))
            total_correct += int(correct)
            total_num += int(x.shape[0])
        acc = total_correct / total_num
        print("acc", acc)


if __name__ == '__main__':
    main()
```
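The last `Dense(10)` layer has no activation, so the model outputs raw logits, and `from_logits=True` tells the loss to apply softmax itself. The plain-Python sketch below illustrates that calculation for a single sample; the helper names and the three-class logits are hypothetical, and this is not how TensorFlow implements it internally.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, label):
    """-log(softmax(logits)[label]), computed via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


logits = [2.0, 0.5, -1.0]  # hypothetical raw outputs for a 3-class problem
label = 0                  # index of the true class

probs = softmax(logits)
loss = cross_entropy_from_logits(logits, label)
print("probs:", probs)
print("loss: ", loss)
```

Working from logits avoids taking the logarithm of a probability that has already been rounded to zero, which is one reason to leave the softmax out of the final layer.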

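Training Visualization

TensorFlow comes with a companion visualization toolkit, TensorBoard (installable with pip; recent versions of TensorFlow install it automatically). It is a web-based tool for conveniently monitoring the training process and the training data, and it reads the monitored data from a specified directory on the local disk. Using TensorBoard usually takes three steps: create a log directory, create a summary writer, and feed data to that writer.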
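Running `tensorboard --logdir logs` makes TensorBoard watch the configured log directory. Since no files have been written to it yet at this point, the dashboard has no data to show.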
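Taken in isolation, the three steps can be sketched as follows. This assumes TensorFlow 2.x in eager mode; the temporary directory, the tag `demo-loss`, and the logged values are placeholders invented for the illustration.

```python
import os
import tempfile

import tensorflow as tf

# Step 1: create a log directory (a temporary one here, purely for illustration).
log_dir = os.path.join(tempfile.mkdtemp(), "logs")

# Step 2: create a summary writer bound to that directory.
writer = tf.summary.create_file_writer(log_dir)

# Step 3: hand data to the writer; placeholder values stand in for a real loss.
with writer.as_default():
    for step in range(5):
        tf.summary.scalar("demo-loss", 1.0 / (step + 1), step=step)
writer.flush()

print(os.listdir(log_dir))  # lists the event file that TensorBoard reads
```

Pointing `tensorboard --logdir` at this directory would then show a `demo-loss` curve with five points.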
The last two steps are usually embedded in the training loop, as in the example below. (Note that TensorBoard displays multiple sample images one by one rather than combining them into a single picture. Combining them requires a helper of your own, which the code below provides with `image_grid` and `plot_to_image`.)

```python
"""
Author: Zhou Chen
Date: 2019/10/15
Desc: About
"""
import datetime
import io

import tensorflow as tf
from matplotlib import pyplot as plt
from tensorflow.keras import datasets, layers, optimizers, Sequential


def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y


def plot_to_image(figure):
    """Convert a matplotlib figure to a PNG image tensor with a batch dimension."""
    # Save the plot to a PNG in memory.
    buf = io.BytesIO()
    plt.savefig(buf, format='png')
    # Closing the figure prevents it from being displayed directly inside the notebook.
    plt.close(figure)
    buf.seek(0)
    # Convert PNG buffer to TF image
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    # Add the batch dimension
    image = tf.expand_dims(image, 0)
    return image


def image_grid(images):
    """Return a 5x5 grid of the MNIST images as a matplotlib figure."""
    # Create a figure to contain the plot.
    figure = plt.figure(figsize=(10, 10))
    for i in range(25):
        # Start next subplot.
        plt.subplot(5, 5, i + 1, title='name')
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        plt.imshow(images[i], cmap=plt.cm.binary)
    return figure


batchsz = 128
(x, y), (x_val, y_val) = datasets.mnist.load_data()
print('datasets:', x.shape, y.shape, x.min(), x.max())

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(60000).batch(batchsz).repeat(10)

ds_val = tf.data.Dataset.from_tensor_slices((x_val, y_val))
ds_val = ds_val.map(preprocess).batch(batchsz, drop_remainder=True)

network = Sequential([
    layers.Dense(256, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])
network.build(input_shape=(None, 28 * 28))
network.summary()

optimizer = optimizers.Adam(learning_rate=0.01)

# Each run writes to its own timestamped sub-directory of logs/.
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
log_dir = 'logs/' + current_time
summary_writer = tf.summary.create_file_writer(log_dir)

# get x from (x, y)
sample_img = next(iter(db))[0]
# get first image instance
sample_img = sample_img[0]
sample_img = tf.reshape(sample_img, [1, 28, 28, 1])
with summary_writer.as_default():
    tf.summary.image("Training sample:", sample_img, step=0)

for step, (x, y) in enumerate(db):
    with tf.GradientTape() as tape:
        # [b, 28, 28] => [b, 784]
        x = tf.reshape(x, (-1, 28 * 28))
        # [b, 784] => [b, 10]
        out = network(x)
        # [b] => [b, 10]
        y_onehot = tf.one_hot(y, depth=10)
        # [b]
        loss = tf.reduce_mean(
            tf.losses.categorical_crossentropy(y_onehot, out, from_logits=True))

    grads = tape.gradient(loss, network.trainable_variables)
    optimizer.apply_gradients(zip(grads, network.trainable_variables))

    if step % 100 == 0:
        print(step, 'loss:', float(loss))
        with summary_writer.as_default():
            tf.summary.scalar('train-loss', float(loss), step=step)

    # evaluate
    if step % 500 == 0:
        total, total_correct = 0., 0
        for _, (x, y) in enumerate(ds_val):
            # [b, 28, 28] => [b, 784]
            x = tf.reshape(x, (-1, 28 * 28))
            # [b, 784] => [b, 10]
            out = network(x)
            # [b, 10] => [b]
            pred = tf.argmax(out, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)
            # bool type
            correct = tf.equal(pred, y)
            # bool tensor => int tensor => numpy
            total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
            total += x.shape[0]

        print(step, 'Evaluate Acc:', total_correct / total)

        # Log the first 25 images of the last validation batch.
        val_images = x[:25]
        val_images = tf.reshape(val_images, [-1, 28, 28, 1])
        with summary_writer.as_default():
            tf.summary.scalar('test-acc', float(total_correct / total), step=step)
            # One image per entry ...
            tf.summary.image("val-onebyone-images:", val_images, max_outputs=25, step=step)
            # ... and the same images combined into a single 5x5 grid.
            val_images = tf.reshape(val_images, [-1, 28, 28])
            figure = image_grid(val_images)
            tf.summary.image('val-images:', plot_to_image(figure), step=step)
```
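Once training is running, TensorBoard displays the logged `train-loss` and `test-acc` curves together with the sample and validation images.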

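Additional Notes

This article gave a brief overview of the Dense layer in TensorFlow2's `layers` module, of backpropagation, and of training visualization.
This post is also published on my personal blog; feel free to browse the other articles there.
If you spot any mistakes, corrections are welcome.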
