Keras Models
Introduction
The previous article covered data-related operations centered on Keras. Once the data is ready, the next step is to feed it to a model, which is one of the most important and most interesting parts of deep learning. This article walks through the model-related operations in Keras, including model definition, parameter initialization, and saving and loading models.
Model Building
Before getting into model definition, let us first look at the two ways of building models that already existed in older versions of Keras: the Sequential container and the Functional API.
Sequential API
keras.Sequential is a container that wraps tensor operations. "Tensor operations" here means the ready-made classes in keras.layers or anything that subclasses keras.Model (the former, for example, a convolution layer; the latter, for example, a ResNet block module).
The container takes a list as its argument to build the model; the list holds the tensor operations described above, and the call returns a Sequential object. Sequential is in fact a subclass of keras.Model, so it has the full functionality of a model. For example, the simple convolutional classifier below; its source code and output are as follows.
```python
import tensorflow.keras as keras

model = keras.Sequential([
    keras.layers.Input(batch_input_shape=(None, 224, 224, 3)),
    keras.layers.Conv2D(16, (3, 3), strides=(2, 2), padding='same'),
    keras.layers.Flatten(),
    keras.layers.Dense(1000)
])
print(model.summary())
```

```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 112, 112, 16)      448
_________________________________________________________________
flatten (Flatten)            (None, 200704)            0
_________________________________________________________________
dense (Dense)                (None, 1000)              200705000
=================================================================
Total params: 200,705,448
Trainable params: 200,705,448
Non-trainable params: 0
_________________________________________________________________
```

Functional API
The other way to stack tensor operations is the Functional API. Compared with the one-directional stacking of Sequential, the Functional API is better suited to building large models. Its basic pattern is layer(tensor): each call applies one tensor transformation to the previous tensor, and keras.Model then wraps the resulting graph into the final model. The code and its output are shown below; note that the output matches the Sequential version above.
```python
import tensorflow.keras as keras

inputs = keras.layers.Input(batch_input_shape=(None, 224, 224, 3))
x = keras.layers.Conv2D(16, (3, 3), strides=(2, 2), padding='same')(inputs)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense(1000)(x)
model = keras.Model(inputs=inputs, outputs=x)
print(model.summary())
```

```
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0
_________________________________________________________________
conv2d (Conv2D)              (None, 112, 112, 16)      448
_________________________________________________________________
flatten (Flatten)            (None, 200704)            0
_________________________________________________________________
dense (Dense)                (None, 1000)              200705000
=================================================================
Total params: 200,705,448
Trainable params: 200,705,448
Non-trainable params: 0
_________________________________________________________________
```

Subclassing API
The two approaches above are the model-building tools native Keras provides; each has its use cases, so pick whichever fits. Since Keras joined the TensorFlow family, newer versions also offer a PyTorch-style way of building models called the Subclassing API. The idea is to subclass keras.Model and define the forward pass yourself.
For the later training machinery to accept a custom model, the key requirements in Keras are the following (worth spelling out explicitly):
- The custom model exists as a class that inherits from tf.keras.Model and declares the components it needs in its __init__ method;
- it implements a call method that defines the model's forward computation;
- when necessary, it implements compute_output_shape to compute the size of the model's output (a minimal sketch follows this list).
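As a minimal sketch of the third point, compute_output_shape might look like the following; the class MySimpleModel and its single Dense layer are hypothetical, used only for illustration and not part of the example later in this section.

```python
import tensorflow as tf
import tensorflow.keras as keras


class MySimpleModel(keras.Model):
    """Hypothetical model used only to illustrate compute_output_shape."""

    def __init__(self, num_classes=10):
        super(MySimpleModel, self).__init__()
        self.num_classes = num_classes
        self.dense = keras.layers.Dense(num_classes)

    def call(self, inputs, training=None, mask=None):
        return self.dense(inputs)

    def compute_output_shape(self, input_shape):
        # Only the last dimension changes: (batch, features) -> (batch, num_classes)
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.num_classes
        return tf.TensorShape(shape)
```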
With these three requirements you can build very flexible and complex custom models; the price is that it becomes easier to make mistakes. In most cases the Functional API covers the design requirements, while the Subclassing API is better suited to fine-grained tensor manipulation together with eager execution.
After defining the model you need to instantiate it, and before using it you must call model.build(input_shape) to give the model its input size, so that it can work out how tensor shapes change through the network.
Below is the same model as in the previous two approaches, built with the Subclassing API.
```python
import tensorflow.keras as keras
import tensorflow as tf


class MyModel(keras.Model):
    def __init__(self, num_classes=1000):
        super(MyModel, self).__init__()
        self.num_classes = num_classes
        self.conv1 = keras.layers.Conv2D(16, (3, 3), (2, 2), padding='same', activation='relu')
        self.flatten = keras.layers.Flatten()
        self.classifier = keras.layers.Dense(self.num_classes, activation='softmax')

    def call(self, inputs, training=None, mask=None):
        x = self.conv1(inputs)
        x = self.flatten(x)
        x = self.classifier(x)
        return x


model = MyModel(1000)
model.build(input_shape=(32, 224, 224, 3))
print(model.summary())

model.compile(optimizer=keras.optimizers.Adam(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])  # monitor accuracy during training
```

Model Saving and Loading
The previous section already used common prebuilt tensor operations such as convolutions; the full set of ready-made layers can be found in keras.layers, and their arguments are straightforward, so they are not listed here. Any model you have built (with whichever of the approaches above) can be saved through the methods provided by keras.Model, either as a whole trained model or as its weights only. In general, to keep overhead down, saving just the weights is recommended.
Any keras.Model object, or an object of one of its subclasses, can be stored with model.save(), and keras.models.load_model() loads a model back from disk; by default the stored file uses the TensorFlow SavedModel format. Note in particular that models built with the Subclassing API have to be saved with the TensorFlow saving method rather than HDF5.
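A minimal sketch of whole-model saving, assuming model is, say, the Functional model built above and 'my_model' is just a placeholder path:

```python
import tensorflow.keras as keras

# Save architecture, weights, and optimizer state in one go.
# Without an .h5 suffix this defaults to the TensorFlow SavedModel format;
# 'my_model' is only a placeholder directory name for this sketch.
model.save('my_model')

# Later (even in another process), rebuild the model straight from disk.
restored = keras.models.load_model('my_model')
```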
The approach used most often, and the focus here, is saving the weights. Calling model.save_weights() on a model saves its weights, and calling model.load_weights() on a freshly built model loads them back from the file. Both the TF Checkpoint format and the HDF5 format are supported; HDF5 is recommended here because Keras supports it well. For the TensorFlow-side operations, see my related articles.
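If you want to choose the weight format explicitly, save_weights takes a save_format argument; a small sketch follows (the file names are just placeholders):

```python
# TF Checkpoint format (the default unless the path ends in .h5)
model.save_weights('weights_ckpt', save_format='tf')

# HDF5 format, which this article recommends for a pure Keras workflow
model.save_weights('weights.h5', save_format='h5')
```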
A complete example is shown below.
```python
# Define the model
model = MyModel(1000)
model.build(input_shape=(32, 224, 224, 3))
model.compile(optimizer=keras.optimizers.Adam(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model (omitted)

# Save the model weights
model.save_weights('model.h5')
del model

# Load the model weights
model = MyModel(1000)
model.build(input_shape=(32, 224, 224, 3))
model.load_weights('model.h5')
```

Saving a model is usually done for deployment or for transfer learning; I will come back to these topics in later articles, see my column if you are interested.
ResNet in Practice
This section builds a ResNet with Keras, focusing on the design of the residual blocks. The full code, written with the Functional API, is shown below.
```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Input, ZeroPadding2D, AveragePooling2D, Dense
from tensorflow.keras.layers import add


def Conv2D_BN(x, filters, kernel_size, strides=(1, 1), padding='same', name=None):
    if name:
        bn_name = name + '_bn'
        conv_name = name + '_conv'
    else:
        bn_name = None
        conv_name = None
    x = Conv2D(filters, kernel_size, strides=strides, padding=padding, activation='relu', name=conv_name)(x)
    x = BatchNormalization(name=bn_name)(x)
    return x


def identity_block(input_tensor, filters, kernel_size, strides=(1, 1), is_conv_shortcuts=False):
    """
    :param input_tensor:
    :param filters:
    :param kernel_size:
    :param strides:
    :param is_conv_shortcuts: identity connection or projection connection
    :return:
    """
    x = Conv2D_BN(input_tensor, filters, kernel_size, strides=strides, padding='same')
    x = Conv2D_BN(x, filters, kernel_size, padding='same')
    if is_conv_shortcuts:
        shortcut = Conv2D_BN(input_tensor, filters, kernel_size, strides=strides, padding='same')
        x = add([x, shortcut])
    else:
        x = add([x, input_tensor])
    return x


def bottleneck_block(input_tensor, filters=(64, 64, 256), strides=(1, 1), is_conv_shortcuts=False):
    """
    :param input_tensor:
    :param filters:
    :param strides:
    :param is_conv_shortcuts: identity connection or projection connection
    :return:
    """
    filters_1, filters_2, filters_3 = filters
    x = Conv2D_BN(input_tensor, filters=filters_1, kernel_size=(1, 1), strides=strides, padding='same')
    x = Conv2D_BN(x, filters=filters_2, kernel_size=(3, 3))
    x = Conv2D_BN(x, filters=filters_3, kernel_size=(1, 1))
    if is_conv_shortcuts:
        short_cut = Conv2D_BN(input_tensor, filters=filters_3, kernel_size=(1, 1), strides=strides)
        x = add([x, short_cut])
    else:
        x = add([x, input_tensor])
    return x


def ResNet34(input_shape=(224, 224, 3), n_classes=1000):
    """
    :param input_shape:
    :param n_classes:
    :return:
    """
    input_layer = Input(shape=input_shape)
    x = ZeroPadding2D((3, 3))(input_layer)
    # block1
    x = Conv2D_BN(x, filters=64, kernel_size=(7, 7), strides=(2, 2), padding='valid')
    x = MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)
    # block2
    x = identity_block(x, filters=64, kernel_size=(3, 3))
    x = identity_block(x, filters=64, kernel_size=(3, 3))
    x = identity_block(x, filters=64, kernel_size=(3, 3))
    # block3
    x = identity_block(x, filters=128, kernel_size=(3, 3), strides=(2, 2), is_conv_shortcuts=True)
    x = identity_block(x, filters=128, kernel_size=(3, 3))
    x = identity_block(x, filters=128, kernel_size=(3, 3))
    x = identity_block(x, filters=128, kernel_size=(3, 3))
    # block4
    x = identity_block(x, filters=256, kernel_size=(3, 3), strides=(2, 2), is_conv_shortcuts=True)
    x = identity_block(x, filters=256, kernel_size=(3, 3))
    x = identity_block(x, filters=256, kernel_size=(3, 3))
    x = identity_block(x, filters=256, kernel_size=(3, 3))
    x = identity_block(x, filters=256, kernel_size=(3, 3))
    x = identity_block(x, filters=256, kernel_size=(3, 3))
    # block5
    x = identity_block(x, filters=512, kernel_size=(3, 3), strides=(2, 2), is_conv_shortcuts=True)
    x = identity_block(x, filters=512, kernel_size=(3, 3))
    x = identity_block(x, filters=512, kernel_size=(3, 3))
    x = AveragePooling2D(pool_size=(7, 7))(x)
    x = Flatten()(x)
    x = Dense(n_classes, activation='softmax')(x)
    model = Model(inputs=input_layer, outputs=x)
    return model


def ResNet50(input_shape=(224, 224, 3), n_classes=1000):
    """
    :param input_shape:
    :param n_classes:
    :return:
    """
    input_layer = Input(shape=input_shape)
    x = ZeroPadding2D((3, 3))(input_layer)
    # block1
    x = Conv2D_BN(x, filters=64, kernel_size=(7, 7), strides=(2, 2), padding='valid')
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)
    # block2
    x = bottleneck_block(x, filters=(64, 64, 256), strides=(1, 1), is_conv_shortcuts=True)
    x = bottleneck_block(x, filters=(64, 64, 256))
    x = bottleneck_block(x, filters=(64, 64, 256))
    # block3
    x = bottleneck_block(x, filters=(128, 128, 512), strides=(2, 2), is_conv_shortcuts=True)
    x = bottleneck_block(x, filters=(128, 128, 512))
    x = bottleneck_block(x, filters=(128, 128, 512))
    x = bottleneck_block(x, filters=(128, 128, 512))
    # block4
    x = bottleneck_block(x, filters=(256, 256, 1024), strides=(2, 2), is_conv_shortcuts=True)
    x = bottleneck_block(x, filters=(256, 256, 1024))
    x = bottleneck_block(x, filters=(256, 256, 1024))
    x = bottleneck_block(x, filters=(256, 256, 1024))
    x = bottleneck_block(x, filters=(256, 256, 1024))
    x = bottleneck_block(x, filters=(256, 256, 1024))
    # block5
    x = bottleneck_block(x, filters=(512, 512, 2048), strides=(2, 2), is_conv_shortcuts=True)
    x = bottleneck_block(x, filters=(512, 512, 2048))
    x = bottleneck_block(x, filters=(512, 512, 2048))
    x = AveragePooling2D(pool_size=(7, 7))(x)
    x = Flatten()(x)
    x = Dense(n_classes, activation='softmax')(x)
    model = Model(inputs=input_layer, outputs=x)
    return model


if __name__ == '__main__':
    resnet34 = ResNet34((224, 224, 3), n_classes=101)
    resnet50 = ResNet50((224, 224, 3), n_classes=101)
    print(resnet34.summary())
    print(resnet50.summary())
```

Additional Notes
This article introduced the three API styles for building deep models in Keras and covered saving and loading model weights. The complete code can be found on my GitHub; stars and forks are welcome.