
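04. Convolutional Neural Networks W4. Special Applications: Face Recognition and Neural Style Transfer (Assignments: Happy House Face Recognition + Image Style Transfer)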

Published: 2024/7/5

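Contents

  • Assignment 1: Happy House - Face Recognition
    • 0. Naive face verification
    • 1. Encoding face images
      • 1.1 Using a ConvNet to compute encodings
      • 1.2 The triplet loss
    • 2. Loading the trained model
    • 3. Applying the model
      • 3.1 Face verification
      • 3.2 Face recognition
  • Assignment 2: Neural Style Transfer
    • 1. Problem statement
    • 2. Transfer learning
    • 3. Neural style transfer
      • 3.1 Computing the content cost
      • 3.2 Computing the style cost
        • 3.2.1 Style matrix
        • 3.2.2 Style cost
        • 3.2.3 Style weights
      • 3.3 Total cost
    • 4. Solving the optimization problem
    • 5. Testing with your own images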

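Quiz: see the reference blog post

Notes: W4. Special Applications: Face Recognition and Neural Style Transfer

Assignment 1: Happy House - Face Recognition

Background: the Happy House problem from the 04 W2 assignment (Keras tutorial + ResNets).

Many of the ideas in this assignment come from FaceNet (https://arxiv.org/pdf/1503.03832.pdf).

FaceNet learns a neural network that encodes a face image into a vector of 128 numbers. By comparing two such vectors, you can decide whether two pictures show the same person.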

  • Import the packages
from keras.models import Sequential
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.models import Model
from keras.layers.normalization import BatchNormalization
from keras.layers.pooling import MaxPooling2D, AveragePooling2D
from keras.layers.merge import Concatenate
from keras.layers.core import Lambda, Flatten, Dense
from keras.initializers import glorot_uniform
from keras.engine.topology import Layer
from keras import backend as K
K.set_image_data_format('channels_first')  # data format: channels first, (m, n_C, n_H, n_W)
import cv2
import os
import numpy as np
from numpy import genfromtxt
import pandas as pd
import tensorflow as tf
from fr_utils import *
from inception_blocks_v2 import *

%matplotlib inline
%load_ext autoreload
%autoreload 2

np.set_printoptions(threshold=np.inf)

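0. Naive face verification

Given two face photos, the simplest approach is to compare them pixel by pixel and declare them the same person if the distance is below some threshold.

Of course, this algorithm performs very poorly, because pixel values change dramatically with lighting, face orientation, and even small shifts in head position.

Instead, we can learn an encoding $f(\text{img})$ and compare the encodings of two pictures, which gives a far more accurate judgment of whether they belong to the same person.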
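To make the weakness of pixel comparison concrete, here is a minimal NumPy sketch. It is not part of the assignment: `pixel_distance` is a hypothetical helper, and the "face" is just a synthetic random array standing in for a 96x96 RGB photo.

```python
import numpy as np

def pixel_distance(img_a, img_b):
    """L2 distance between two images, compared pixel by pixel."""
    diff = img_a.astype(np.float64) - img_b.astype(np.float64)
    return float(np.linalg.norm(diff))

rng = np.random.default_rng(0)
# Synthetic stand-in for a face photo. Values stay below 200 so that
# brightening by 40 cannot overflow the uint8 range.
face = rng.integers(0, 200, size=(96, 96, 3), dtype=np.uint8)
brighter = face + np.uint8(40)  # the same "face" under brighter lighting

d_identical = pixel_distance(face, face)
d_brighter = pixel_distance(face, brighter)
print(d_identical, d_brighter)
```

The identical copy gives a distance of 0, while the uniformly brightened copy of the very same image gives a distance of roughly 6651 (40 times the square root of the number of pixel values). A threshold tuned on one lighting condition therefore says little about another, which is why the rest of the assignment compares learned encodings instead.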

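1. Encoding face images

1.1 Using a ConvNet to compute encodings

The exercise uses pretrained weights; the network architecture follows the Inception model.

For the Inception model, see the reference blog post.

Some key points:

  • The network takes 96x96 images with 3 channels, with shape (m, n_C, n_H, n_W) = (m, 3, 96, 96)
  • The network outputs the image encodings as a matrix of shape (m, 128)

Define the model:

FRmodel = faceRecoModel(input_shape=(3, 96, 96))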

1.2 The triplet loss

The triplet loss tries to "push" the encodings of two images of the same person (Anchor & Positive) closer together, while "pulling" the encodings of two images of different people (Anchor & Negative) further apart.

$$\mathcal{J} = \sum_{i=1}^{m} \Big[ \underbrace{\| f(A^{(i)}) - f(P^{(i)}) \|_2^2}_{\text{(1)}} - \underbrace{\| f(A^{(i)}) - f(N^{(i)}) \|_2^2}_{\text{(2)}} + \alpha \Big]_+$$

$(A^{(i)}, P^{(i)}, N^{(i)})$ denotes the $i$-th training triplet, and $[z]_+$ denotes $\max(z, 0)$. $\alpha$ is the margin, commonly set to 0.2.

Useful functions: tf.reduce_sum(), tf.square(), tf.subtract(), tf.add(), tf.maximum()

# GRADED FUNCTION: triplet_loss
def triplet_loss(y_true, y_pred, alpha = 0.2):
    """
    Implementation of the triplet loss as defined by formula (3)

    Arguments:
    y_true -- true labels, required when you define a loss in Keras, you don't need it in this function.
    y_pred -- python list containing three objects:
            anchor -- the encodings for the anchor images, of shape (None, 128)
            positive -- the encodings for the positive images, of shape (None, 128)
            negative -- the encodings for the negative images, of shape (None, 128)

    Returns:
    loss -- real number, value of the loss
    """
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]

    ### START CODE HERE ### (≈ 4 lines)
    # Step 1: Compute the (encoding) distance between the anchor and the positive,
    # you will need to sum over axis=-1
    pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), axis=-1)
    # Step 2: Compute the (encoding) distance between the anchor and the negative,
    # you will need to sum over axis=-1
    neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), axis=-1)
    # Step 3: subtract the two previous distances and add alpha.
    basic_loss = tf.subtract(pos_dist, neg_dist) + alpha
    # Step 4: Take the maximum of basic_loss and 0.0. Sum over the training examples.
    loss = tf.reduce_sum(tf.maximum(basic_loss, 0))
    ### END CODE HERE ###

    return loss
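As a sanity check on the formula, the same four steps can be reproduced in plain NumPy on a toy batch. This is only a sketch: `triplet_loss_np` is a hypothetical name, and the 2-dimensional "encodings" are made up for illustration (the real ones have 128 dimensions).

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, alpha=0.2):
    """NumPy version of the triplet loss: sum over examples of
    max(||A - P||^2 - ||A - N||^2 + alpha, 0)."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)
    basic_loss = pos_dist - neg_dist + alpha
    return float(np.sum(np.maximum(basic_loss, 0.0)))

# Two toy triplets with made-up 2-D encodings.
anchor   = np.array([[0.0, 0.0], [1.0, 0.0]])
positive = np.array([[0.1, 0.0], [1.0, 0.5]])
negative = np.array([[1.0, 0.0], [1.0, 0.4]])

loss = triplet_loss_np(anchor, positive, negative)
print(loss)
```

In the first triplet the negative is far away (0.01 - 1.0 + 0.2 is negative), so it contributes nothing. In the second triplet the negative is closer to the anchor than the positive (0.25 - 0.16 + 0.2 = 0.29), so the whole loss, about 0.29, comes from that one "hard" triplet.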

2. Loading the trained model

FaceNet has already been trained with the triplet loss, so we simply load the trained model.

FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])
load_weights_from_FaceNet(FRmodel)

3. Applying the model

You don't want just anyone to enter the Happy House; only people on the list are allowed in. You swipe an ID card so that the system can read your name.

3.1 Face verification

Build a database of encoding vectors, one for each person allowed in, using the img_to_encoding(image_path, model) function, which takes an image and runs a forward pass.

  • Create the database (a dictionary) mapping each name to an encoding vector
database = {}
database["danielle"] = img_to_encoding("images/danielle.png", FRmodel)
database["younes"] = img_to_encoding("images/younes.jpg", FRmodel)
database["tian"] = img_to_encoding("images/tian.jpg", FRmodel)
database["andrew"] = img_to_encoding("images/andrew.jpg", FRmodel)
database["kian"] = img_to_encoding("images/kian.jpg", FRmodel)
database["dan"] = img_to_encoding("images/dan.jpg", FRmodel)
database["sebastiano"] = img_to_encoding("images/sebastiano.jpg", FRmodel)
database["bertrand"] = img_to_encoding("images/bertrand.jpg", FRmodel)
database["kevin"] = img_to_encoding("images/kevin.jpg", FRmodel)
database["felix"] = img_to_encoding("images/felix.jpg", FRmodel)
database["benoit"] = img_to_encoding("images/benoit.jpg", FRmodel)
database["arnaud"] = img_to_encoding("images/arnaud.jpg", FRmodel)
  • Verify: compute the distance between the image encoding and the encoding stored in the database; open the door if it is < 0.7
# GRADED FUNCTION: verify
def verify(image_path, identity, database, model):
    """
    Function that verifies if the person on the "image_path" image is "identity".

    Arguments:
    image_path -- path to an image
    identity -- string, name of the person you'd like to verify the identity. Has to be a resident of the Happy house.
    database -- python dictionary mapping names of allowed people's names (strings) to their encodings (vectors).
    model -- your Inception model instance in Keras

    Returns:
    dist -- distance between the image_path and the image of "identity" in the database.
    door_open -- True, if the door should open. False otherwise.
    """
    ### START CODE HERE ###
    # Step 1: Compute the encoding for the image. Use img_to_encoding() see example above. (≈ 1 line)
    encoding = img_to_encoding(image_path, model)
    # Step 2: Compute distance with identity's image (≈ 1 line)
    dist = np.linalg.norm(database[identity] - encoding)
    # Step 3: Open the door if dist < 0.7, else don't open (≈ 3 lines)
    if dist < 0.7:
        print("It's " + str(identity) + ", welcome home!")
        door_open = True
    else:
        print("It's not " + str(identity) + ", please go away")
        door_open = False
    ### END CODE HERE ###

    return dist, door_open

verify("images/camera_0.jpg", "younes", database, FRmodel)

Output:

It's younes, welcome home!
(0.67100716, True)

verify("images/camera_2.jpg", "kian", database, FRmodel)

Output:

It's not kian, please go away
(0.85800135, False)

Let's also try a few photos of James:

database["james"] = img_to_encoding("images/james.png", FRmodel)

verify("images/james_no.png", "james", database, FRmodel)
It's not james, please go away
(0.84896624, False)  # correct

verify("images/james_yes.png", "james", database, FRmodel)
It's james, welcome home!
(0.57764035, True)  # correct

verify("images/james_yes1.png", "james", database, FRmodel)
It's not james, please go away
(0.87970865, False)  # wrong

3.2 Face recognition

If you lose your card, though, you can no longer get in. So we turn the system into a recognition system: an authorized person only has to walk up to the camera and the door opens for them (no card needed).

# GRADED FUNCTION: who_is_it
def who_is_it(image_path, database, model):
    """
    Implements face recognition for the happy house by finding who is the person on the image_path image.

    Arguments:
    image_path -- path to an image
    database -- database containing image encodings along with the name of the person on the image
    model -- your Inception model instance in Keras

    Returns:
    min_dist -- the minimum distance between image_path encoding and the encodings from the database
    identity -- string, the name prediction for the person on image_path
    """
    ### START CODE HERE ###
    ## Step 1: Compute the target "encoding" for the image. Use img_to_encoding() see example above. ## (≈ 1 line)
    encoding = img_to_encoding(image_path, model)

    ## Step 2: Find the closest encoding ##
    # Initialize "min_dist" to a large value (np.inf here) (≈1 line)
    min_dist = np.inf

    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():
        # Compute L2 distance between the target "encoding" and the current "db_enc" from the database. (≈ 1 line)
        dist = np.linalg.norm(encoding - db_enc)
        # If this distance is less than the min_dist, then set min_dist to dist, and identity to name. (≈ 3 lines)
        if dist < min_dist:
            min_dist = dist
            identity = name
    ### END CODE HERE ###

    if min_dist > 0.7:
        print("Not in the database.")
    else:
        print ("it's " + str(identity) + ", the distance is " + str(min_dist))

    return min_dist, identity

who_is_it("images/camera_0.jpg", database, FRmodel)
it's younes, the distance is 0.67100716
(0.67100716, 'younes')

who_is_it("images/james_yes.png", database, FRmodel)
it's james, the distance is 0.57764035
(0.57764035, 'james')

who_is_it("images/james_yes1.png", database, FRmodel)
it's andrew, the distance is 0.66093665
(0.66093665, 'andrew')

What??? James looks a lot like Prof. Ng? Haha

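You now know how a state-of-the-art face recognition system works.

There are a few ways to improve the algorithm further:

  • Put more photos of each person (under different lighting conditions, taken on different days, and so on) into the database. Then, given a new image, compare the new face against all of that person's photos. This improves accuracy.

  • Crop the images so that they contain only the face, without the "border" region around it. Removing the irrelevant pixels around the face makes the algorithm more robust.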
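The first suggestion can be sketched as follows. This is a hypothetical variant rather than code from the assignment: `who_is_it_multi` is a made-up name, the database is assumed to map each name to a list of encodings, and the 2-D vectors stand in for real 128-D encodings.

```python
import numpy as np

def who_is_it_multi(encoding, database, threshold=0.7):
    """Nearest-neighbour lookup where each person has several encodings.
    A person's distance is the smallest distance to any of their photos."""
    best_name, best_dist = None, np.inf
    for name, encodings in database.items():
        dist = min(np.linalg.norm(encoding - e) for e in encodings)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, float(best_dist)  # nobody is close enough
    return best_name, float(best_dist)

# Toy 2-D encodings (made up for illustration).
database = {
    "alice": [np.array([0.0, 0.0]), np.array([0.6, 0.0])],  # two photos
    "bob":   [np.array([0.0, 2.0])],
}
query = np.array([0.9, 0.0])

name, dist = who_is_it_multi(query, database)
single_photo = {"alice": database["alice"][:1], "bob": database["bob"]}
name_single, dist_single = who_is_it_multi(query, single_photo)
print(name, dist, name_single, dist_single)
```

With both photos of "alice" the query is matched at a distance of about 0.3, whereas with only her first photo the distance is 0.9 and the query is rejected. Taking the minimum over a person's photos is one possible design choice; averaging the distances would be another.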

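Assignment 2: Neural Style Transfer

Most algorithms optimize a cost function to obtain a set of parameter values. In neural style transfer, you optimize a cost function to obtain pixel values!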
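The idea of optimizing pixels rather than weights can be shown with a deliberately tiny NumPy sketch. It is not the NST cost: the cost below is just a squared difference to a made-up 4x4 target, and plain gradient descent is used where the assignment later uses Adam.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.random((4, 4))     # made-up "image" we want to match
generated = rng.random((4, 4))  # the image being optimized, starts as noise

def cost(g):
    """Squared difference between the generated image and the target."""
    return float(np.sum((g - target) ** 2))

initial_cost = cost(generated)
learning_rate = 0.1
for _ in range(100):
    grad = 2.0 * (generated - target)  # dJ/dG, taken w.r.t. the pixels
    generated -= learning_rate * grad  # update the pixels; no weights exist

final_cost = cost(generated)
print(initial_cost, final_cost)
```

There are no network parameters anywhere in this loop. The only thing that changes is the image itself, which is exactly the role `model['input']` plays in the optimization later in this assignment.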

Import the packages

import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow as tf

%matplotlib inline

1. Problem statement

In this example, you will generate an image of the Louvre museum in Paris (content image C), blended with a painting by Claude Monet, a leader of the Impressionist movement (style image S).

2. Transfer learning

Neural Style Transfer (NST) builds on top of a previously trained convolutional network. The idea of taking a network trained on one task and applying it to a new task is called transfer learning.

Following the original NST paper (https://arxiv.org/abs/1508.06576), we use the VGG network, specifically VGG-19, the 19-layer version. This model has already been trained on the very large ImageNet database, so it has learned to recognize a variety of low-level features (in the shallow layers) and high-level features (in the deep layers).

  • Load the model
model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
print(model)

{'input': <tf.Variable 'Variable:0' shape=(1, 300, 400, 3) dtype=float32_ref>,
 'conv1_1': <tf.Tensor 'Relu:0' shape=(1, 300, 400, 64) dtype=float32>,
 'conv1_2': <tf.Tensor 'Relu_1:0' shape=(1, 300, 400, 64) dtype=float32>,
 'avgpool1': <tf.Tensor 'AvgPool:0' shape=(1, 150, 200, 64) dtype=float32>,
 'conv2_1': <tf.Tensor 'Relu_2:0' shape=(1, 150, 200, 128) dtype=float32>,
 'conv2_2': <tf.Tensor 'Relu_3:0' shape=(1, 150, 200, 128) dtype=float32>,
 'avgpool2': <tf.Tensor 'AvgPool_1:0' shape=(1, 75, 100, 128) dtype=float32>,
 'conv3_1': <tf.Tensor 'Relu_4:0' shape=(1, 75, 100, 256) dtype=float32>,
 'conv3_2': <tf.Tensor 'Relu_5:0' shape=(1, 75, 100, 256) dtype=float32>,
 'conv3_3': <tf.Tensor 'Relu_6:0' shape=(1, 75, 100, 256) dtype=float32>,
 'conv3_4': <tf.Tensor 'Relu_7:0' shape=(1, 75, 100, 256) dtype=float32>,
 'avgpool3': <tf.Tensor 'AvgPool_2:0' shape=(1, 38, 50, 256) dtype=float32>,
 'conv4_1': <tf.Tensor 'Relu_8:0' shape=(1, 38, 50, 512) dtype=float32>,
 'conv4_2': <tf.Tensor 'Relu_9:0' shape=(1, 38, 50, 512) dtype=float32>,
 'conv4_3': <tf.Tensor 'Relu_10:0' shape=(1, 38, 50, 512) dtype=float32>,
 'conv4_4': <tf.Tensor 'Relu_11:0' shape=(1, 38, 50, 512) dtype=float32>,
 'avgpool4': <tf.Tensor 'AvgPool_3:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_1': <tf.Tensor 'Relu_12:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_2': <tf.Tensor 'Relu_13:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_3': <tf.Tensor 'Relu_14:0' shape=(1, 19, 25, 512) dtype=float32>,
 'conv5_4': <tf.Tensor 'Relu_15:0' shape=(1, 19, 25, 512) dtype=float32>,
 'avgpool5': <tf.Tensor 'AvgPool_4:0' shape=(1, 10, 13, 512) dtype=float32>}

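The model is stored in a Python dictionary, where each variable name is a key and the corresponding value is a tensor holding that variable's value. To run an image through the network, you just feed the image to the model. In TensorFlow you can do this with the tf.assign function: model["input"].assign(image). The assign operation itself has to be run in a session, i.e. sess.run(model["input"].assign(image)).

To get the activations of a particular layer, use: sess.run(model["conv4_2"])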

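3. Neural style transfer

3.1 Computing the content cost

Read the content image C

import imageio
content_image = imageio.imread("images/louvre.jpg")
imshow(content_image)


The earlier (shallower) layers of a ConvNet tend to detect lower-level features such as edges and simple textures.
The later (deeper) layers tend to detect higher-level features such as more complex textures and object classes.

We want the "generated" image G to have content similar to the input image C. In practice, you get the most visually pleasing results if you choose a layer in the middle of the network (neither too shallow nor too deep). You can try different layers and see how the results change.

$$J_{content}(C,G) = \frac{1}{4 \times n_H \times n_W \times n_C} \sum_{\text{all entries}} (a^{(C)} - a^{(G)})^2$$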

# GRADED FUNCTION: compute_content_cost
def compute_content_cost(a_C, a_G):
    """
    Computes the content cost

    Arguments:
    a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image C
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image G

    Returns:
    J_content -- scalar that you compute using equation 1 above.
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape a_C and a_G (≈2 lines)
    a_C_unrolled = tf.reshape(a_C, [-1, n_C])
    a_G_unrolled = tf.reshape(a_G, [-1, n_C])

    # compute the cost with tensorflow (≈1 line)
    J_content = tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled))) / (4 * n_H * n_W * n_C)
    ### END CODE HERE ###

    return J_content
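The normalization constant is easy to check by hand on a tiny example. The following NumPy sketch (with a hypothetical `content_cost_np` helper and made-up activations of shape (1, 2, 2, 3)) mirrors the formula without needing a TensorFlow session.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost: sum of squared differences over all entries,
    divided by 4 * n_H * n_W * n_C."""
    _, n_H, n_W, n_C = a_G.shape
    return float(np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C))

# Made-up activations: every entry differs by exactly 1.
a_C = np.ones((1, 2, 2, 3))
a_G = np.zeros((1, 2, 2, 3))

J_content_toy = content_cost_np(a_C, a_G)
print(J_content_toy)
```

Here there are 2 * 2 * 3 = 12 entries, each contributing a squared difference of 1, so the sum is 12 and the cost is 12 / (4 * 12) = 0.25. Note that the reshape in the TensorFlow version does not change the result, since the sum runs over all entries anyway.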

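3.2 Computing the style cost

style_image = imageio.imread("images/monet_800600.jpg")
imshow(style_image)

3.2.1 Style matrix

The style matrix is also called the Gram matrix. Its entries are $G_{ij} = v_i^T v_j = \text{np.dot}(v_i, v_j)$. $G_{ij}$ measures how similar $v_i$ is to $v_j$: if they are highly similar, we expect a large dot product.

In NST, the style matrix can be computed by multiplying the "unrolled" filter matrix $A$, of shape $(n_C, n_H \times n_W)$, by its transpose: $G = A A^T$.

The resulting matrix is $n_C \times n_C$, where $n_C$ is the number of filters. $G_{ij}$ measures how similar the activations of filter $i$ are to the activations of filter $j$.

An important property of the Gram matrix is that the diagonal element $G_{ii}$ measures how active filter $i$ is. For example, suppose filter $i$ detects vertical textures in the image. Then $G_{ii}$ measures how common vertical textures are in the image as a whole: if $G_{ii}$ is large, the image has a lot of vertical texture.

By capturing how prevalent different types of features are ($G_{ii}$) and how often different features occur together ($G_{ij}$), the style matrix $G$ measures the style of an image.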

# GRADED FUNCTION: gram_matrix
def gram_matrix(A):
    """
    Argument:
    A -- matrix of shape (n_C, n_H*n_W)

    Returns:
    GA -- Gram matrix of A, of shape (n_C, n_C)
    """
    ### START CODE HERE ### (≈1 line)
    GA = tf.matmul(A, tf.transpose(A))
    ### END CODE HERE ###

    return GA
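A small NumPy example may help with reading the entries of a Gram matrix. The matrix `A_toy` below is made up: three "filters" (rows) unrolled over four spatial positions (columns), where the first two filters fire at the same positions and the third fires elsewhere.

```python
import numpy as np

# Rows are filters, columns are unrolled spatial positions (made-up values).
A_toy = np.array([
    [1.0, 0.0, 1.0, 0.0],  # filter 0
    [1.0, 0.0, 1.0, 0.0],  # filter 1: identical to filter 0
    [0.0, 1.0, 0.0, 1.0],  # filter 2: active where the others are not
])

G_toy = A_toy @ A_toy.T  # same computation as tf.matmul(A, tf.transpose(A))
print(G_toy)
```

The result is [[2, 2, 0], [2, 2, 0], [0, 0, 2]]. The off-diagonal entry for filters 0 and 1 is large because they are active together, the entries pairing filter 2 with the others are 0 because they never fire at the same position, and each diagonal entry (2) reflects how active that filter is overall.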

3.2.2 Style cost

$$J_{style}^{[l]}(S,G) = \frac{1}{4 \times n_C^2 \times (n_H \times n_W)^2} \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} \left( G^{(S)}_{ij} - G^{(G)}_{ij} \right)^2$$

# GRADED FUNCTION: compute_layer_style_cost
def compute_layer_style_cost(a_S, a_G):
    """
    Arguments:
    a_S -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image S
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing style of the image G

    Returns:
    J_style_layer -- tensor representing a scalar value, style cost defined above by equation (2)
    """
    ### START CODE HERE ###
    # Retrieve dimensions from a_G (≈1 line)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll the activations to shape (n_H*n_W, n_C); they are transposed
    # to (n_C, n_H*n_W) when passed to gram_matrix below (≈2 lines)
    a_S = tf.reshape(a_S, [-1, n_C])
    a_G = tf.reshape(a_G, [-1, n_C])

    # Computing gram_matrices for both images S and G (≈2 lines)
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))

    # Computing the loss (≈1 line)
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * n_C**2 * (n_H * n_W)**2)
    ### END CODE HERE ###

    return J_style_layer
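To see what this cost does and does not care about, here is a NumPy sketch of the same computation. `layer_style_cost_np` is a hypothetical helper and all activations are made up. The second half shuffles the spatial positions of an activation map to illustrate that the Gram matrix ignores where features occur.

```python
import numpy as np

def layer_style_cost_np(a_S, a_G):
    """Style cost of one layer, computed from the two Gram matrices."""
    _, n_H, n_W, n_C = a_G.shape
    S = a_S.reshape(-1, n_C).T  # unroll, then transpose to (n_C, n_H*n_W)
    G = a_G.reshape(-1, n_C).T
    GS, GG = S @ S.T, G @ G.T
    return float(np.sum((GS - GG) ** 2) / (4 * n_C**2 * (n_H * n_W) ** 2))

# Hand-checkable case: all-ones versus all-zeros activations.
J_style_toy = layer_style_cost_np(np.ones((1, 2, 2, 2)), np.zeros((1, 2, 2, 2)))

# Shuffle the spatial positions of a random activation map.
rng = np.random.default_rng(0)
a = rng.random((1, 3, 3, 2))
shuffled = a.reshape(-1, 2)[rng.permutation(9)].reshape(1, 3, 3, 2)
J_shuffled = layer_style_cost_np(a, shuffled)
print(J_style_toy, J_shuffled)
```

In the first case each Gram entry is 4, so the sum of squares is 4 * 16 = 64 and the cost is 64 / (4 * 4 * 16) = 0.25. In the second case the cost is zero up to floating-point rounding, even though the two activation maps look different position by position. That is the sense in which the Gram matrix captures "style" (which features co-occur) rather than content (where they are).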

3.2.3 Style weights

Assign a weight to the style cost of each layer. You can change the weights and see how the result changes.

# weight coefficients
STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]

$$J_{style}(S,G) = \sum_{l} \lambda^{[l]} J^{[l]}_{style}(S,G)$$

def compute_style_cost(model, STYLE_LAYERS):
    """
    Computes the overall style cost from several chosen layers

    Arguments:
    model -- our tensorflow model
    STYLE_LAYERS -- A python list containing:
                        - the names of the layers we would like to extract style from
                        - a coefficient for each of them

    Returns:
    J_style -- tensor representing a scalar value, style cost defined above by equation (2)
    """
    # initialize the overall style cost
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:

        # Select the output tensor of the currently selected layer
        out = model[layer_name]

        # Set a_S to be the hidden layer activation from the layer we have selected,
        # by running the session on out
        a_S = sess.run(out)

        # Set a_G to be the hidden layer activation from same layer. Here, a_G references model[layer_name]
        # and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
        # when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
        a_G = out

        # Compute style_cost for the current layer
        J_style_layer = compute_layer_style_cost(a_S, a_G)

        # Add coeff * J_style_layer of this layer to overall style cost
        J_style += coeff * J_style_layer

    return J_style

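Note: inside the loop, a_G has not been evaluated yet. It is evaluated and updated later, when the TF graph is run.

The style of an image can be represented by the Gram matrix of the activations of one hidden layer.

To get better results, we combine the style of several layers. This differs from the content cost, which uses only a single layer near the middle of the network.

3.3 Total cost

$$J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$$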

# GRADED FUNCTION: total_cost
def total_cost(J_content, J_style, alpha = 10, beta = 40):
    """
    Computes the total cost function

    Arguments:
    J_content -- content cost coded above
    J_style -- style cost coded above
    alpha -- hyperparameter weighting the importance of the content cost
    beta -- hyperparameter weighting the importance of the style cost

    Returns:
    J -- total cost as defined by the formula above.
    """
    ### START CODE HERE ### (≈1 line)
    J = alpha * J_content + beta * J_style
    ### END CODE HERE ###

    return J
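The way the layer weights and the two hyperparameters combine can be traced with plain numbers. In the sketch below the per-layer style costs and the content cost are made-up values chosen only for illustration; the layer names, the 0.2 coefficients, and alpha = 10, beta = 40 are the ones used in this post.

```python
STYLE_LAYERS_TOY = [
    ("conv1_1", 0.2),
    ("conv2_1", 0.2),
    ("conv3_1", 0.2),
    ("conv4_1", 0.2),
    ("conv5_1", 0.2),
]

# Made-up per-layer style costs and content cost, for illustration only.
layer_costs = {"conv1_1": 5.0, "conv2_1": 4.0, "conv3_1": 3.0,
               "conv4_1": 2.0, "conv5_1": 1.0}
J_content_example = 2.0

# Weighted sum over layers: J_style = sum_l lambda[l] * J_style[l]
J_style_example = sum(coeff * layer_costs[name]
                      for name, coeff in STYLE_LAYERS_TOY)

# Total cost: J = alpha * J_content + beta * J_style
alpha, beta = 10, 40
J_total_example = alpha * J_content_example + beta * J_style_example
print(J_style_example, J_total_example)
```

With equal weights of 0.2 the style cost is simply the mean of the five layer costs (about 3.0), and the total is about 10 * 2.0 + 40 * 3.0 = 140. Raising beta relative to alpha pushes the optimizer to favour style over content.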

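4. Solving the optimization problem

Steps:

  • Create an Interactive Session (which simplifies the code compared with a regular Session)
  • Load the content image
  • Load the style image
  • Randomly initialize the image to be generated
  • Load the VGG-19 model
  • Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content cost
    • Run the style image through the VGG-19 model and compute the style cost
    • Compute the total cost
    • Define the optimizer and the learning rate
  • Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step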

  • Create the interactive session
# Reset the graph
tf.reset_default_graph()

# Start interactive session
sess = tf.InteractiveSession()
  • Load the content image
content_image = imageio.imread("images/louvre_small.jpg")
content_image = reshape_and_normalize_image(content_image)
  • Load the style image
style_image = imageio.imread("images/monet.jpg")
style_image = reshape_and_normalize_image(style_image)
  • Generate a noisy image. To speed things up, the noise is added on top of the content image
generated_image = generate_noise_image(content_image)
imshow(generated_image[0])

  • Load the VGG-19 model
model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
  • Compute the content cost using the conv4_2 layer
# Assign the content image to be the input of the VGG model.
sess.run(model['input'].assign(content_image))

# Select the output tensor of layer conv4_2
out = model['conv4_2']

# Set a_C to be the hidden layer activation from the layer we have selected
a_C = sess.run(out)

# Set a_G to be the hidden layer activation from same layer. Here, a_G references model['conv4_2']
# and isn't evaluated yet. Later in the code, we'll assign the image G as the model input, so that
# when we run the session, this will be the activations drawn from the appropriate layer, with G as input.
a_G = out

# Compute the content cost
J_content = compute_content_cost(a_C, a_G)
  • Style cost
# Assign the input of the model to be the "style" image
sess.run(model['input'].assign(style_image))

# Compute the style cost
J_style = compute_style_cost(model, STYLE_LAYERS)
  • Total cost
### START CODE HERE ### (1 line)
J = total_cost(J_content, J_style, alpha=10, beta=40)
### END CODE HERE ###
  • Define the optimizer
# define optimizer (1 line)
optimizer = tf.train.AdamOptimizer(learning_rate=2.0)

# define train_step (1 line)
train_step = optimizer.minimize(J)
  • The complete model
def model_nn(sess, input_image, num_iterations = 200):

    # Initialize global variables (you need to run the session on the initializer)
    ### START CODE HERE ### (1 line)
    sess.run(tf.global_variables_initializer())
    ### END CODE HERE ###

    # Run the noisy input image (initial generated image) through the model. Use assign().
    ### START CODE HERE ### (1 line)
    sess.run(model['input'].assign(input_image))
    ### END CODE HERE ###

    # Cost histories, named so they do not shadow total_cost() or the builtin iter()
    total_costs = []
    content_costs = []
    style_costs = []
    iterations = []

    for i in range(num_iterations):

        # Run the session on the train_step to minimize the total cost
        ### START CODE HERE ### (1 line)
        sess.run(train_step)
        ### END CODE HERE ###

        # Compute the generated image by running the session on the current model['input']
        ### START CODE HERE ### (1 line)
        generated_image = sess.run(model['input'])
        ### END CODE HERE ###

        # Record the costs at every iteration
        Jt, Jc, Js = sess.run([J, J_content, J_style])
        total_costs.append(Jt)
        content_costs.append(Jc)
        style_costs.append(Js)
        iterations.append(i)

        # Print every 20 iterations.
        if i % 20 == 0:
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            # save current generated image in the "/output" directory
            save_image("output/" + str(i) + ".png", generated_image)

    # save last generated image
    save_image('output/generated_image.jpg', generated_image)

    # plot cost
    plt.figure()
    plt.plot(iterations, total_costs, 'r-', label='total')
    plt.plot(iterations, content_costs, 'g-', label='content')
    plt.plot(iterations, style_costs, 'b-', label='style')
    plt.legend()
    plt.xlabel('Iteration')
    plt.ylabel('Cost')

    return generated_image
  • Run the model
model_nn(sess, generated_image, num_iterations=300)

5. Testing with your own images

Content image (400x300):

Style image (400x300):



If any links are broken, please see the original post.
Original post: https://michael.blog.csdn.net/article/details/108803515

Author's CSDN blog: https://michael.blog.csdn.net/

Author's WeChat public account: Michael阿明
