Using GPU Acceleration in TensorFlow
Once tensorflow-gpu is installed, TensorFlow can run training on the GPU, which noticeably shortens training time.
(Note: with tensorflow-gpu installed, TensorFlow uses the GPU for training by default.)
I had already installed tensorflow-gpu in my Python environment; for the details, see:
TensorFlow installation
With the installation in place, let's take handwritten digit recognition implemented with a BP (backpropagation) neural network as the example project.
First, a quick recap of the BP principle: each training step runs a forward pass to compute the network's output, compares it with the label to get an error, and then propagates that error backwards through the layers to update the weights by gradient descent, as the sketch below illustrates.
Handwritten digit recognition with a BP neural network
# -*- coding: utf-8 -*-
"""Handwritten digit recognition with a BP neural network."""
# -------------------------------------------
# Parse the binary MNIST (idx) files with pure Python.
import numpy as np
import struct
import random
import os
import time

import tensorflow as tf
from sklearn.model_selection import train_test_split

# Expose GPU 0 to TensorFlow. (The original comment here claimed this forces
# the CPU; it does not. Set the value to "-1" to hide all GPUs and run on CPU.)
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

T1 = time.perf_counter()  # time.clock() was removed in Python 3.8


class LoadData(object):
    def __init__(self, file1, file2):
        self.file1 = file1
        self.file2 = file2

    # Load the image set.
    def loadImageSet(self):
        binfile = open(self.file1, 'rb')                # open the binary file
        buffers = binfile.read()                        # read it into a buffer
        head = struct.unpack_from('>IIII', buffers, 0)  # first 4 big-endian ints (header)
        offset = struct.calcsize('>IIII')               # offset where pixel data starts
        imgNum = head[1]                                # number of images
        width = head[2]                                 # 28 rows
        height = head[3]                                # 28 columns
        bits = imgNum * width * height                  # 60000*28*28 pixel values in total
        bitsString = '>' + str(bits) + 'B'              # fmt string, e.g. '>47040000B'
        imgs = struct.unpack_from(bitsString, buffers, offset)  # pixel data as a tuple
        binfile.close()
        imgs = np.reshape(imgs, [imgNum, width * height])
        return imgs, head

    # Load the label set.
    def loadLabelSet(self):
        binfile = open(self.file2, 'rb')
        buffers = binfile.read()
        head = struct.unpack_from('>II', buffers, 0)    # first 2 big-endian ints (header)
        offset = struct.calcsize('>II')                 # offset where labels start
        labelNum = head[1]                              # number of labels
        numString = '>' + str(labelNum) + 'B'
        labels = struct.unpack_from(numString, buffers, offset)
        binfile.close()
        labels = np.reshape(labels, [labelNum])         # 1-D array
        return labels, head

    # Expand each label into a 10-dimensional one-hot vector.
    def expand_lables(self):
        labels, head = self.loadLabelSet()
        expand_lables = []
        for label in labels:
            zero_vector = np.zeros((1, 10))
            zero_vector[0, label] = 1
            expand_lables.append(zero_vector)
        return expand_lables

    # Pair samples with labels: [[array(data), array(label)], ...]
    def loadData(self):
        imags, head = self.loadImageSet()
        expand_lables = self.expand_lables()
        data = []
        for i in range(imags.shape[0]):
            imags[i] = imags[i].reshape((1, 784))
            data.append([imags[i], expand_lables[i]])
        return data


file1 = r'train-images.idx3-ubyte'
file2 = r'train-labels.idx1-ubyte'
trainingData = LoadData(file1, file2)
training_data = trainingData.loadData()
file3 = r't10k-images.idx3-ubyte'
file4 = r't10k-labels.idx1-ubyte'
testData = LoadData(file3, file4)
test_data = testData.loadData()

X_train = [i[0] for i in training_data]
y_train = [i[1][0] for i in training_data]
X_test = [i[0] for i in test_data]
y_test = [i[1][0] for i in test_data]
X_train, X_validation, y_train, y_validation = train_test_split(
    X_train, y_train, test_size=0.1, random_state=7)
# print(np.array(X_test).shape)
# print(np.array(y_test).shape)
# print(np.array(X_train).shape)
# print(np.array(y_train).shape)

INUPUT_NODE = 784
OUTPUT_NODE = 10
LAYER1_NODE = 500
BATCH_SIZE = 200
LERANING_RATE_BASE = 0.005   # base learning rate
LERANING_RATE_DACAY = 0.99   # learning-rate decay factor
REGULARZATION_RATE = 0.01    # weight of the regularization term in the loss
TRAINING_STEPS = 30000
MOVING_AVERAGE_DECAY = 0.99  # moving-average decay rate


# Three-layer fully connected network; avg_class, if given, applies a moving
# average to the weights and biases.
def inference(input_tensor, avg_class, weights1, biases1, weights2, biases2):
    if not avg_class:
        layer1 = tf.nn.relu(tf.matmul(input_tensor, weights1) + biases1)
        # No softmax on the output layer (the loss op applies it).
        return tf.matmul(layer1, weights2) + biases2
    else:
        layer1 = tf.nn.relu(tf.matmul(input_tensor, avg_class.average(weights1))
                            + avg_class.average(biases1))
        return (tf.matmul(layer1, avg_class.average(weights2))
                + avg_class.average(biases2))


def train(X_train, X_validation, y_train, y_validation, X_test, y_test):
    x = tf.placeholder(tf.float32, [None, INUPUT_NODE], name="x-input")
    y_ = tf.placeholder(tf.float32, [None, OUTPUT_NODE], name="y-input")
    # Hidden layer.
    weights1 = tf.Variable(tf.truncated_normal([INUPUT_NODE, LAYER1_NODE], stddev=0.1))
    biases1 = tf.Variable(tf.constant(0.1, shape=[LAYER1_NODE]))
    # Output layer.
    weights2 = tf.Variable(tf.truncated_normal([LAYER1_NODE, OUTPUT_NODE], stddev=0.1))
    biases2 = tf.Variable(tf.constant(0.1, shape=[OUTPUT_NODE]))
    y = inference(x, None, weights1, biases1, weights2, biases2)
    global_step = tf.Variable(0, trainable=False)
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variable_averages_op = variable_averages.apply(tf.trainable_variables())
    average_y = inference(x, variable_averages, weights1, biases1, weights2, biases2)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=y, labels=tf.argmax(y_, 1))
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    # L2 regularization loss.
    regularizer = tf.contrib.layers.l2_regularizer(REGULARZATION_RATE)
    regularization = regularizer(weights1) + regularizer(weights2)
    loss = cross_entropy_mean + regularization
    # Exponentially decaying learning rate.
    learning_rate = tf.train.exponential_decay(LERANING_RATE_BASE,
                                               global_step,
                                               len(X_train) / BATCH_SIZE,
                                               LERANING_RATE_DACAY)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        loss, global_step=global_step)
    with tf.control_dependencies([train_step, variable_averages_op]):
        train_op = tf.no_op(name='train')
    correct_prediction = tf.equal(tf.argmax(average_y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        validation_feed = {x: X_validation, y_: y_validation}
        train_feed = {x: X_train, y_: y_train}
        test_feed = {x: X_test, y_: y_test}
        for i in range(TRAINING_STEPS):
            if i % 500 == 0:
                validate_acc = sess.run(accuracy, feed_dict=validation_feed)
                print("after %d training step(s), validation accuracy "
                      "using average model is %g" % (i, validate_acc))
            start = (i * BATCH_SIZE) % len(X_train)
            end = min(start + BATCH_SIZE, len(X_train))
            sess.run(train_op,
                     feed_dict={x: X_train[start:end], y_: y_train[start:end]})
            # print('loss:', sess.run(loss))
        test_acc = sess.run(accuracy, feed_dict=test_feed)
        print("after %d training step(s), test accuracy using "
              "average model is %g" % (TRAINING_STEPS, test_acc))


train(X_train, X_validation, y_train, y_validation, X_test, y_test)
T2 = time.perf_counter()
print('program run time: %s ms' % ((T2 - T1) * 1000))

GPU run result (screenshot):
CPU run result (screenshot):
Judging from these two runs, the GPU version finishes in roughly half the time of the CPU version.
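For reference, one common way to produce such a pair of timings (an assumption about methodology; the screenshots above may have been made differently) is to run the identical script twice and flip GPU visibility before the first session is created:

import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"    # run 1: GPU 0 visible
# os.environ["CUDA_VISIBLE_DEVICES"] = "-1" # run 2: no GPU -> CPU fallback

import tensorflow as tf

print(tf.test.is_gpu_available())  # True on the GPU run, False on the CPU run

TensorFlow only reads this variable when it initializes CUDA (at the first tf.Session), so it must be set before that point; setting it right after the imports, as the training script above does, is the safe choice.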
My graphics card is pretty underpowered; in tests others have posted, the gap between GPU and CPU is night and day. Still, even here there is a real speedup. Bye!