Building Deep Autoencoders with Keras and TensorFlow
In this tutorial, we will explore how to build and train deep autoencoders using Keras and TensorFlow.
The primary reason I decided to write this tutorial is that most of the tutorials out there, including the official Keras and TensorFlow ones, use the MNIST data for training. I have been asked numerous times to show how to train autoencoders on our own images, which may be large in number.
I will keep this tutorial brief and will not go into the details of how an autoencoder works. A basic knowledge of autoencoders is therefore a prerequisite for understanding the code presented here (needless to say, you must know how to program in Python with Keras and TensorFlow).
Autoencoders
Autoencoders are unsupervised neural networks that learn to reconstruct their input. Denoising images is one use of autoencoders, and it is particularly useful for OCR. Autoencoders are also used for image compression.
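As a quick illustration of the denoising use case, here is a hedged sketch: it corrupts a batch of clean images with Gaussian noise and trains the model to map the noisy copies back to the originals. The autoencoder model, the clean_images array, and the noise level of 0.1 are illustrative assumptions, not part of this tutorial's code.
import numpy as np
# Illustrative denoising setup: `autoencoder` is assumed to be a compiled
# Keras model and `clean_images` a float array scaled to [0, 1].
noise_factor = 0.1  # assumed noise level; tune for your data
noisy_images = clean_images + noise_factor * np.random.normal(size=clean_images.shape)
noisy_images = np.clip(noisy_images, 0.0, 1.0)
# Train to reconstruct the clean images from their noisy copies.
autoencoder.fit(noisy_images, clean_images, epochs=10, batch_size=8)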
As shown in Figure 1, an autoencoder consists of two networks: an encoder and a decoder.
Both the encoder and the decoder are convolutional neural networks; the difference is that the encoder's dimensions shrink with each layer, while the decoder's dimensions grow with each layer until the output layer, where the dimensions match those of the original image.
Training Autoencoders
We will use our own images for training and testing the autoencoders. For the purposes of this tutorial, we will use a dataset that contains scanned images of restaurant receipts. The dataset is freely available from https://expressexpense.com/large-receipt-image-dataset-SRD.zip under the MIT License.
Although this dataset does not contain a large number of images, we will write code that works for both small and large datasets.
The code below is divided into four parts.
I will use Google Colaboratory (https://colab.research.google.com/) to execute the code, but you can use your favorite IDE to write and run it. The code below works on both CPUs and GPUs; I will use a GPU-based machine to speed up the training. Google Colab offers a free GPU-based virtual machine for education and learning.
If you use a Jupyter notebook, the steps below will look very similar.
First we create a notebook project, for example AE Demo.
Before we start on the actual code, let's import all the dependencies we need for the project. Here is the list of imports we will need.
# Import the necessary packages
import tensorflow as tf
from google.colab.patches import cv2_imshow
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
from tensorflow.keras.optimizers import Adam
import numpy as np
Listing 1.1: Import the necessary packages.
Data Preparation
Our receipt images are in a directory. We will use the ImageDataGenerator class provided by the Keras API to create training and validation iterators, as shown in Listing 1.2 below.
training_img_dir = "inputs"
height = 1000
width = 500
channel = 1
batch_size = 8
datagen = tf.keras.preprocessing.image.ImageDataGenerator(validation_split=0.2, rescale=1. / 255.)
train_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode='grayscale',
    class_mode='input',
    batch_size=batch_size,
    subset='training')  # set as training data
val_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode='grayscale',
    class_mode='input',
    batch_size=batch_size,
    subset='validation')  # set as validation data
Listing 1.2: Image input preparation. Load images in batches from a directory.
Important notes about Listing 1.2:
- class_mode='input' makes the generator return each image as its own target, which is exactly what an autoencoder expects.
- validation_split=0.2 reserves 20% of the images for validation; the subset argument selects which split each iterator draws from.
- All other parameters are self-explanatory.
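To sanity-check the iterators, the short snippet below (an illustrative check, not part of the original code) pulls one batch and inspects it; with class_mode='input', each batch is an (input, target) pair of identical image arrays.
# Pull one batch from the training iterator and inspect it.
batch_x, batch_y = next(train_it)
print(batch_x.shape)                     # e.g. (8, 1000, 500, 1): batch, height, width, channel
print(np.array_equal(batch_x, batch_y))  # True: each image is its own target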
Configure Autoencoder Neural Networks
As shown in Listing 1.3 below, we have created an AutoencoderBuilder class that provides a build_ae() function. The function takes the following arguments:
- height of the input images,
- width of the input images,
- depth (or the number of channels) of the input images,
- filters, as a tuple with a default of (32, 64),
- latentDim, which represents the dimension of the latent vector.
class AutoencoderBuilder:
    @staticmethod
    def build_ae(height, width, depth, filters=(32, 64), latentDim=16):
        # Initialize the input shape and the channel dimension.
        inputShape = (height, width, depth)
        chanDim = -1
        # Define the input to the encoder.
        inputs = Input(shape=inputShape)
        x = inputs
        # Loop over the filters.
        for f in filters:
            # Build the network with Conv2D, LeakyReLU, and BatchNormalization.
            x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)
        # Flatten the network and then construct the latent vector.
        volumeSize = K.int_shape(x)
        x = Flatten()(x)
        latent = Dense(latentDim)(x)
        # Build the encoder model.
        encoder = Model(inputs, latent, name="encoder")
        # Now build the decoder model, which takes the output of the encoder as its input.
        latentInputs = Input(shape=(latentDim,))
        x = Dense(np.prod(volumeSize[1:]))(latentInputs)
        x = Reshape((volumeSize[1], volumeSize[2], volumeSize[3]))(x)
        # Loop over the filters again, but in reverse order.
        for f in filters[::-1]:
            # In the decoder, apply Conv2DTranspose with LeakyReLU and BatchNormalization.
            x = Conv2DTranspose(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)
        # Recover the original depth of the image with a single Conv2DTranspose layer.
        x = Conv2DTranspose(depth, (3, 3), padding="same")(x)
        outputs = Activation("sigmoid")(x)
        # Now build the decoder model.
        decoder = Model(latentInputs, outputs, name="decoder")
        # Finally, the autoencoder is the encoder + decoder.
        autoencoder = Model(inputs, decoder(encoder(inputs)), name="autoencoder")
        # Return a tuple of the encoder, decoder, and autoencoder models.
        return (encoder, decoder, autoencoder)
Listing 1.3: Builder class to create autoencoder networks.
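A minimal usage sketch, assuming the height, width, and channel values defined in Listing 1.2, builds the three models and prints their layer summaries:
# Build the models for our 1000x500 grayscale receipts and inspect them.
(encoder, decoder, autoencoder) = AutoencoderBuilder.build_ae(height, width, channel)
encoder.summary()
decoder.summary()
autoencoder.summary()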
Training Autoencoders
The code in Listing 1.4 below starts the autoencoder training.
import os

# Initialize the number of epochs to train for and the batch size.
EPOCHS = 300
BATCHES = 8  # matches the batch_size used by the iterators in Listing 1.2
MODEL_OUT_DIR = "ae_model_dir"
os.makedirs(MODEL_OUT_DIR, exist_ok=True)  # make sure the output directory exists
# Construct our convolutional autoencoder.
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = AutoencoderBuilder().build_ae(height, width, channel)
opt = Adam(learning_rate=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)
# Train the convolutional autoencoder. The batch size is determined by the
# iterators created in Listing 1.2, so fit() must not be given a batch_size here.
history = autoencoder.fit(
    train_it,
    validation_data=val_it,
    epochs=EPOCHS)
autoencoder.save(MODEL_OUT_DIR + "/ae_model.h5")
Listing 1.4: Training the autoencoder model.
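Since 300 epochs is a long run, you may want to checkpoint the best weights and stop early once the validation loss plateaus. The sketch below is an optional refinement using the standard Keras ModelCheckpoint and EarlyStopping callbacks; the file name and patience value are illustrative choices, not part of the original tutorial.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Optional: keep the best weights and stop when val_loss stops improving.
callbacks = [
    ModelCheckpoint(MODEL_OUT_DIR + "/ae_best.h5", monitor="val_loss", save_best_only=True),
    EarlyStopping(monitor="val_loss", patience=20, restore_best_weights=True),
]
history = autoencoder.fit(
    train_it,
    validation_data=val_it,
    epochs=EPOCHS,
    callbacks=callbacks)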
Visualizing the Training Metrics
Listing 1.5 shows how to plot the training and validation loss per epoch. Figure 1.2 shows a sample output of Listing 1.5.
# Set the matplotlib backend so figures can be saved in the background.
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
# Construct a plot that displays the training history.
N = np.arange(0, EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, history.history["loss"], label="train_loss")
plt.plot(N, history.history["val_loss"], label="val_loss")
plt.title("Training and Validation Loss")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend(loc="lower left")
# plt.savefig(plot)
plt.show(block=True)
Listing 1.5: Display a plot of training and validation loss vs. epochs.
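If you want to keep the figure (for example when running headless), the commented-out savefig call in Listing 1.5 can be replaced with something like the following; the file name and options are illustrative:
# Save the training-history plot to disk before (or instead of) showing it.
plt.savefig(MODEL_OUT_DIR + "/training_loss.png", dpi=150, bbox_inches="tight")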
Figure 1.2: Plot of loss vs. epoch
Make Predictions
Now that we have a trained autoencoder model, we will use it to make predictions. Listing 1.6 shows how to load the model from the directory where it was saved. We use the predict() function and pass in the validation image iterator that we created earlier. Ideally, we would have a separate image set for prediction and testing.
Here is the code to run the prediction and display the results.
from google.colab.patches import cv2_imshow

# Use the convolutional autoencoder to make predictions on the
# validation images, then display the predicted images.
print("[INFO] making predictions...")
autoencoder_model = tf.keras.models.load_model(MODEL_OUT_DIR + "/ae_model.h5")
decoded = autoencoder_model.predict(val_it)
examples = 10
# Loop over a few samples to display the predicted images.
for i in range(0, examples):
    predicted = (decoded[i] * 255).astype("uint8")
    cv2_imshow(predicted)
Listing 1.6: Code to predict and display the images.
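For a quick visual quality check, it helps to display each reconstruction next to its input. The hedged sketch below pulls one validation batch and stacks the original and predicted images side by side; the hstack layout is an illustrative choice, not part of the original code.
# Compare originals and reconstructions side by side.
batch_x, _ = next(val_it)
recon = autoencoder_model.predict(batch_x)
for i in range(len(batch_x)):
    original = (batch_x[i] * 255).astype("uint8")
    predicted = (recon[i] * 255).astype("uint8")
    cv2_imshow(np.hstack([original, predicted]))  # original left, reconstruction right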
In the code listing above, I used the cv2_imshow function, which is specific to Google Colab. If you are using Jupyter or any other IDE, you can simply import the cv2 package and use the cv2.imshow() function to display the images.
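For example, outside Colab the display loop might look like this sketch, using standard OpenCV calls (the window title is arbitrary):
import cv2

# Display the predicted images with plain OpenCV (outside Colab).
for i in range(0, examples):
    predicted = (decoded[i] * 255).astype("uint8")
    cv2.imshow("predicted", predicted)
    cv2.waitKey(0)  # wait for a key press before showing the next image
cv2.destroyAllWindows()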
Conclusion
In this tutorial, we built autoencoder models using our own images. We also explored how to save the model, load the saved model, and make predictions. Finally, we displayed the predicted images.
Translated from: https://medium.com/building-deep-autoencoder-with-keras-and-tensorflo/building-deep-autoencoders-with-keras-and-tensorflow-a97a53049e4d