TensorFlow [Hands-On]: Face Recognition

Published: 2025/3/15

Preface

This article follows "Applying Convolutional Neural Networks to Face Recognition", with the code rewritten in TensorFlow and a few changes of my own. Treat it as a small demo: the earlier posts in this series covered the basics, and this one builds on them. The demo is simple, but I think it can still give you some ideas.

Source code: https://github.com/Salon-sai/learning-tensorflow/tree/master/lesson3

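Data

This is a small face database of 40 people, with 10 photos per person as samples. The photos are black and white, so each pixel is a single grayscale value in 0-255 rather than three RGB channels. All the faces are tiled into one large image, so we first have to cut that image into individual faces. The whole image is 1190 × 942 and holds 20 × 20 photos, so each photo is roughly (1190 / 20) × (942 / 20) ≈ 57 × 47 pixels (approximate, because there is spacing between the photos). 57 × 47 is the tile size the loading code uses.

[Figure: the olivettifaces database]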

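Design approach

Once we have the individual sample images, we group them into classes: one class per person, 10 sample images per class. I take the first eight images of each class as training samples, the ninth as the validation sample, and the tenth as the test sample. Each image's label is a one-hot vector.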
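The split and the one-hot labels can be sketched with plain NumPy indexing. This is only an illustration under the layout described above (photos stored person by person, ten in a row); the names `person`, `position`, and the `*_idx` arrays are made up for this sketch and do not appear in the original code.

```python
import numpy as np

N_PEOPLE, PER_PERSON = 40, 10

# One-hot labels: photo k belongs to person k // 10.
person = np.arange(N_PEOPLE * PER_PERSON) // PER_PERSON
labels = np.eye(N_PEOPLE)[person]

# Position of each photo inside its person's block of ten (0..9).
position = np.arange(N_PEOPLE * PER_PERSON) % PER_PERSON
train_idx = np.where(position < 8)[0]    # first eight photos of each person
valid_idx = np.where(position == 8)[0]   # ninth photo
test_idx = np.where(position == 9)[0]    # tenth photo

print(len(train_idx), len(valid_idx), len(test_idx))  # 320 40 40
```

Indexing a face matrix with `train_idx`, `valid_idx`, and `test_idx` would give the same 320 / 40 / 40 split that the loading function builds with explicit loops.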

For the model I chose a convolutional neural network (CNN) with two convolutional layers, one fully connected layer, and a final softmax layer for classification. Taking the cross-entropy between the softmax predictions and the true labels gives the loss function.
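To make the loss concrete, here is a small NumPy sketch of softmax followed by cross-entropy, averaged over a batch. It is not the TensorFlow implementation, just the same formula written out; the function name and the toy inputs are assumptions for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    # Subtract the row maximum first so that exp() cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(one_hot_labels * log_probs).sum(axis=1).mean()

logits = np.array([[2.0, 0.0, 0.0],    # fairly confident in class 0
                   [0.0, 0.0, 0.0]])   # completely undecided
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
loss = softmax_cross_entropy(logits, labels)
print(loss)
```

The undecided row contributes log(3), the loss of a uniform guess over three classes, while the confident and correct row contributes much less.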

Code (partial) and explanation

Loading the dataset

import numpy as np
from PIL import Image

def load_data(dataset_path):
    img = Image.open(dataset_path)
    # The image is a 20 x 20 grid of faces: 40 people, 10 sample photos each.
    # Scale the grayscale values (the original code divides by 64).
    img_ndarray = np.asarray(img, dtype='float64') / 64
    # Face data matrix: one row per face, each a flattened 57 * 47 pixel matrix
    faces = np.empty((400, 57 * 47))
    for row in range(20):
        for column in range(20):
            faces[20 * row + column] = np.ndarray.flatten(
                img_ndarray[row * 57: (row + 1) * 57, column * 47: (column + 1) * 47])

    # One-hot labels: ten consecutive faces belong to the same person
    label = np.zeros((400, 40))
    for i in range(40):
        label[i * 10: (i + 1) * 10, i] = 1

    # Split the data into training, validation and test sets
    train_data = np.empty((320, 57 * 47))
    train_label = np.zeros((320, 40))
    valid_data = np.empty((40, 57 * 47))
    valid_label = np.zeros((40, 40))
    test_data = np.empty((40, 57 * 47))
    test_label = np.zeros((40, 40))

    for i in range(40):
        train_data[i * 8: i * 8 + 8] = faces[i * 10: i * 10 + 8]
        train_label[i * 8: i * 8 + 8] = label[i * 10: i * 10 + 8]
        valid_data[i] = faces[i * 10 + 8]
        valid_label[i] = label[i * 10 + 8]
        test_data[i] = faces[i * 10 + 9]
        test_label[i] = label[i * 10 + 9]

    return [(train_data, train_label), (valid_data, valid_label), (test_data, test_label)]

This follows the approach described earlier: read the image, cut it into 57 × 47 tiles to get the faces data matrix, and give each photo its label. Finally, split the data into a training set, a validation set, and a test set.
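The tile cutting can also be done without explicit loops. The sketch below is an alternative, not the author's code: it reshapes the grid into (row, height, column, width) blocks and reorders the axes. Since the real image file is not available here, it runs on a synthetic grid in which every pixel stores the index of the tile it belongs to; `slice_faces` is a hypothetical helper name.

```python
import numpy as np

ROWS, COLS = 20, 20          # faces per column / per row of the big image
FACE_H, FACE_W = 57, 47      # tile size used by load_data

def slice_faces(grid):
    """Cut a (ROWS*FACE_H, COLS*FACE_W) array into ROWS*COLS flattened tiles."""
    tiles = grid[:ROWS * FACE_H, :COLS * FACE_W]
    tiles = tiles.reshape(ROWS, FACE_H, COLS, FACE_W)
    tiles = tiles.transpose(0, 2, 1, 3)            # -> (row, column, h, w)
    return tiles.reshape(ROWS * COLS, FACE_H * FACE_W)

# Synthetic stand-in for the image: each pixel holds its tile index.
grid = np.zeros((ROWS * FACE_H, COLS * FACE_W))
for r in range(ROWS):
    for c in range(COLS):
        grid[r * FACE_H:(r + 1) * FACE_H, c * FACE_W:(c + 1) * FACE_W] = 20 * r + c

faces = slice_faces(grid)
print(faces.shape)  # (400, 2679)
```

Row k of the result is tile number k in row-major order, which is the same ordering as `faces[20 * row + column]` in the loop version.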

Model network

1. Convolutional layer:

import tensorflow as tf  # TensorFlow 1.x API

def convolutional_layer(data, kernel_size, bias_size, pooling_size):
    # Variables are created in whatever variable scope the caller has opened
    kernel = tf.get_variable("conv", kernel_size, initializer=tf.random_normal_initializer())
    bias = tf.get_variable('bias', bias_size, initializer=tf.random_normal_initializer())

    # Convolution + bias, then ReLU
    conv = tf.nn.conv2d(data, kernel, strides=[1, 1, 1, 1], padding='SAME')
    linear_output = tf.nn.relu(tf.add(conv, bias))

    # Max pooling; the stride equals the pooling window size
    pooling = tf.nn.max_pool(linear_output, ksize=pooling_size, strides=pooling_size, padding="SAME")
    return pooling

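I defined a function for the convolutional layer: each call under a different variable scope creates one convolutional layer together with its parameters. Familiar and handy, isn't it? kernel and bias are the convolution kernel and the offset (a convolution is just a linear operation applied to local regions). Last comes the max_pool layer, which keeps only the largest value in each pooling window to represent that window; the intuition is that a larger activation means a stronger response to the feature, and you can also think of it as a kind of denoising. After the pooling layer, the image has gone through one (convolution + pooling) step.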
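As a rough illustration of what 2 × 2 max pooling with stride 2 does, here is a NumPy sketch on a single 2-D array. It mimics SAME padding by padding odd edges on the bottom and right with negative infinity, so the padding never wins the maximum; this is my understanding of how the padded positions behave, not a copy of TensorFlow's implementation, and `max_pool_2x2_same` is a made-up name.

```python
import numpy as np

def max_pool_2x2_same(x):
    """2x2 max pooling with stride 2 on a 2-D array, padding odd edges."""
    h, w = x.shape
    out_h, out_w = -(-h // 2), -(-w // 2)          # ceil(h / 2), ceil(w / 2)
    padded = np.full((out_h * 2, out_w * 2), -np.inf)
    padded[:h, :w] = x
    # Group pixels into 2x2 windows and take the maximum of each window.
    return padded.reshape(out_h, 2, out_w, 2).max(axis=(1, 3))

x = np.array([[1., 5., 2.],
              [3., 4., 0.],
              [7., 6., 9.]])
pooled = max_pool_2x2_same(x)
print(pooled)
# [[5. 2.]
#  [7. 9.]]
```

Applied to a 57 × 47 input, the same function returns a 29 × 24 array, which matches the first-layer shape noted in the network code.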

2. The linear part of the fully connected layer and the classification layer:

def linear_layer(data, weights_size, biases_size):
    # Note: the variable name "weigths" is spelled this way in the original code
    weights = tf.get_variable("weigths", weights_size, initializer=tf.random_normal_initializer())
    biases = tf.get_variable("biases", biases_size, initializer=tf.random_normal_initializer())
    # data * weights + biases
    return tf.add(tf.matmul(data, weights), biases)

I trust everyone recognizes this as a very simple linear operation, so I will not go into it.

3. The whole network:

def convolutional_neural_network(data):
    # The number of output neurons equals the number of classes
    n_ouput_layer = 40

    kernel_shape1 = [5, 5, 1, 32]
    kernel_shape2 = [5, 5, 32, 64]
    full_conn_w_shape = [15 * 12 * 64, 1024]
    out_w_shape = [1024, n_ouput_layer]

    bias_shape1 = [32]
    bias_shape2 = [64]
    full_conn_b_shape = [1024]
    out_b_shape = [n_ouput_layer]

    data = tf.reshape(data, [-1, 57, 47, 1])

    # After the first convolutional layer the tensor shape is [batch, 29, 24, 32]
    with tf.variable_scope("conv_layer1") as layer1:
        layer1_output = convolutional_layer(
            data=data,
            kernel_size=kernel_shape1,
            bias_size=bias_shape1,
            pooling_size=[1, 2, 2, 1])

    # After the second convolutional layer the tensor shape is [batch, 15, 12, 64]
    with tf.variable_scope("conv_layer2") as layer2:
        layer2_output = convolutional_layer(
            data=layer1_output,
            kernel_size=kernel_shape2,
            bias_size=bias_shape2,
            pooling_size=[1, 2, 2, 1])

    with tf.variable_scope("full_connection") as full_layer3:
        # Flatten the convolutional output into a 2-D tensor [batch, 15 * 12 * 64]
        layer2_output_flatten = tf.contrib.layers.flatten(layer2_output)
        layer3_output = tf.nn.relu(
            linear_layer(
                data=layer2_output_flatten,
                weights_size=full_conn_w_shape,
                biases_size=full_conn_b_shape))
        # layer3_output = tf.nn.dropout(layer3_output, 0.8)

    with tf.variable_scope("output") as output_layer4:
        output = linear_layer(
            data=layer3_output,
            weights_size=out_w_shape,
            biases_size=out_b_shape)

    return output

A picture is worth a thousand words:

[Figure: model design]
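The shapes and sizes in the network can be checked with a few lines of arithmetic. The sketch below assumes that pooling with SAME padding and stride 2 yields ceil(n / 2) outputs per dimension and that the stride-1 convolutions keep height and width unchanged; the helper name `same_pool_out` is invented for this example.

```python
import math

def same_pool_out(size, stride=2):
    """Output length of a pooling layer with SAME padding and the given stride."""
    return math.ceil(size / stride)

h, w = 57, 47
shapes = []
for channels in (32, 64):                 # the two conv + pooling layers
    h, w = same_pool_out(h), same_pool_out(w)
    shapes.append((h, w, channels))

flat = h * w * 64                         # input width of the fully connected layer
params = ((5 * 5 * 1 * 32 + 32)           # conv_layer1: kernel + bias
          + (5 * 5 * 32 * 64 + 64)        # conv_layer2
          + (flat * 1024 + 1024)          # full_connection
          + (1024 * 40 + 40))             # output
print(shapes, flat, params)
# [(29, 24, 32), (15, 12, 64)] 11520 11890600
```

Almost all of the roughly 11.9 million parameters sit in the fully connected layer, which is one reason training on a CPU is slow even for such a small dataset.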

Training:

For training I use softmax_cross_entropy_with_logits as the loss function and AdamOptimizer as the optimization algorithm (learning rate 0.01: my laptop is six years old and has no GPU, so all I could do was turn the rate up a little and hope it converges faster [crying-face.jpg]).

import os

# X, Y (placeholders), batch_size and train_set_x / train_set_y are defined elsewhere in the full source
predict = convolutional_neural_network(X)
cost_func = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predict, labels=Y))
optimizer = tf.train.AdamOptimizer(1e-2).minimize(cost_func)

# Used to save the best model found during training
saver = tf.train.Saver()
model_dir = './model'
model_path = model_dir + '/best.ckpt'

with tf.Session() as session:
    # If no saved model exists, train the model parameters
    if not os.path.exists(model_path + ".index"):
        session.run(tf.global_variables_initializer())
        best_loss = float('Inf')
        for epoch in range(20):
            epoch_loss = 0
            # Floor division: range() needs an integer under Python 3
            for i in range(np.shape(train_set_x)[0] // batch_size):
                x = train_set_x[i * batch_size: (i + 1) * batch_size]
                y = train_set_y[i * batch_size: (i + 1) * batch_size]
                _, cost = session.run([optimizer, cost_func], feed_dict={X: x, Y: y})
                epoch_loss += cost

            print(epoch, ' : ', epoch_loss)
            # Keep only the checkpoint with the lowest epoch loss so far
            if best_loss > epoch_loss:
                best_loss = epoch_loss
                if not os.path.exists(model_dir):
                    os.mkdir(model_dir)
                    print("create the directory: %s" % model_dir)
                save_path = saver.save(session, model_path)
                print("Model saved in file: %s" % save_path)

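To stress it again: my laptop is very old and I am just a student, so I set the number of epochs to 20. Fortunately the loss still fell to its minimum of 0 within those 20 epochs. (I assumed the loss behaves roughly like a convex function; strictly speaking a CNN's loss is not convex, so convergence this quick is not guaranteed in general.) If you have money to spare, feel free to sponsor me: I am broke and I want a GPU. Finally, a Saver stores the model. After what training did to my laptop, I cannot let a trained model simply vanish from memory. I always keep the best model, meaning the one with the lowest loss.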
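The training loop walks through the training set in consecutive mini-batches, and under Python 3 the number of batches has to be an integer, so floor division is the safe choice. Here is a minimal sketch of that slicing; the helper `batch_slices` and the batch size of 40 are assumptions for illustration, since the article does not show the value of batch_size.

```python
def batch_slices(n_samples, batch_size):
    """Yield (start, stop) for each full mini-batch; a trailing partial batch is dropped."""
    for i in range(n_samples // batch_size):
        yield i * batch_size, (i + 1) * batch_size

# 320 training samples with an assumed batch size of 40
slices = list(batch_slices(320, 40))
print(len(slices), slices[0], slices[-1])  # 8 (0, 40) (280, 320)
```

Note that when the sample count is not a multiple of the batch size, this scheme silently skips the leftover samples in every epoch.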

Validation and testing

    # Continues inside the `with tf.Session() as session:` block.
    # Restore the saved model, then validate and test
    saver.restore(session, model_path)
    correct = tf.equal(tf.argmax(predict, 1), tf.argmax(Y, 1))
    valid_accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
    print('valid set accuracy: ', valid_accuracy.eval({X: valid_set_x, Y: valid_set_y}))

    test_pred = tf.argmax(predict, 1).eval({X: test_set_x})
    test_true = np.argmax(test_set_y, 1)
    test_correct = correct.eval({X: test_set_x, Y: test_set_y})
    # Indices of the test samples that were classified incorrectly
    incorrect_index = [i for i in range(np.shape(test_correct)[0]) if not test_correct[i]]
    for i in incorrect_index:
        print('picture person is %i, but mis-predicted as person %i' % (test_true[i], test_pred[i]))
    plot(incorrect_index, "olivettifaces.gif")

The last step restores the model, computes the accuracy on the validation set, finds which test samples were classified incorrectly, and marks them.
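The accuracy computation is an argmax comparison followed by a mean. The NumPy sketch below mirrors that logic on a toy batch of three samples; the function name and the numbers are invented for illustration and are not results from the model.

```python
import numpy as np

def accuracy_and_errors(logits, one_hot_labels):
    """Return (accuracy, indices of misclassified samples)."""
    pred = np.argmax(logits, axis=1)           # predicted class per sample
    true = np.argmax(one_hot_labels, axis=1)   # true class per sample
    correct = pred == true
    return correct.mean(), np.where(~correct)[0]

logits = np.array([[0.1, 2.0, 0.3],
                   [1.5, 0.2, 0.1],
                   [0.0, 0.4, 0.3]])
labels = np.eye(3)[[1, 0, 2]]
acc, wrong = accuracy_and_errors(logits, labels)
print(acc, wrong)
```

Here the third sample is predicted as class 1 while its label is class 2, so two of three predictions are correct and index 2 is reported as the error.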

Plotting the misclassified test samples

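import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import matplotlib.patches as patches

def plot(error_index, dataset_path):
    img = mpimg.imread(dataset_path)
    plt.imshow(img)
    currentAxis = plt.gca()
    for index in error_index:
        # Test photos are the tenth photo of each person, so they sit in
        # column 9 or column 19 of the grid, two people per row
        row = index // 2
        column = index % 2
        currentAxis.add_patch(
            patches.Rectangle(
                xy=(47 * 9 if column == 0 else 47 * 19, row * 57),
                width=47,
                height=57,
                linewidth=1,
                edgecolor='r',
                facecolor='none'))
    plt.savefig("result.png")
    plt.show()

[Figure: console output]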

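I was not kidding: I really did run this on the CPU only, and I really am broke.

Finally, the test set: five samples are misclassified, which also shows in the console output. This image marks the faces that were classified incorrectly.

[Figure: misclassified faces]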

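Summary

This is a hands-on post that also introduces how to use convolution and pooling in TensorFlow. I would call the difficulty moderate: the dataset is small and the model is simple, so you should be able to follow it.

Have you ever felt this way? When we first get into machine learning, we usually start with well-known public datasets such as MNIST and CIFAR-10 to get going quickly and reproduce other people's results, yet it all feels too simple and somehow unreal. These datasets are too "perfect" (clean inputs, balanced classes, test sets with essentially the same distribution, and plenty of ready-made reference models), and running models on them alone is far from enough to become a real data scientist. In practice you will almost never meet data like this (real data tends to have incomplete inputs, severely imbalanced classes, test sets whose distribution differs or even keeps shifting, and hardly any papers to refer to), which often leaves people who have just started working flustered and at a loss.

Quoted from Zhihu: "Get into the Kaggle Top 1% in Minutes"

That is exactly why we need hands-on practice of all kinds. Whether it is an idea from a paper or a competition, we should do our best to implement it, rather than saying "I understand this model" and then doing nothing. I hope everyone keeps a humble attitude toward learning and works at it seriously.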

Author: Salon_sai
Link: http://www.jianshu.com/p/3e5ddc44aa56
Source: Jianshu

Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.
