Pokémon Image Classification with TensorFlow (Deep Learning)
Contents
1. Dataset Overview
2. Data Preprocessing
3. Building the Convolutional Neural Network
4. Model Training
5. Prediction
6. Analysis and Optimization
1. Dataset Overview
The Pokémon dataset contains 1,168 images across five classes: bulbasaur (234), charmander (238), mewtwo (239), pikachu (234), and squirtle (223).
2. Data Preprocessing
pokemon.py reads the image paths in batches, builds a (path, label) pair for each image based on its class directory, and shuffles the order.
import os, glob
import random, csv
import tensorflow as tf


def load_csv(root, filename, name2label):
    # root: dataset root directory
    # filename: name of the csv file
    # name2label: mapping from class name to integer label
    if not os.path.exists(os.path.join(root, filename)):
        images = []
        for name in name2label.keys():
            images += glob.glob(os.path.join(root, name, '*.png'))
            images += glob.glob(os.path.join(root, name, '*.jpg'))
            images += glob.glob(os.path.join(root, name, '*.jpeg'))
        print(len(images), images)
        random.shuffle(images)
        with open(os.path.join(root, filename), mode='w', newline='') as f:
            writer = csv.writer(f)
            for img in images:
                name = img.split(os.sep)[-2]
                label = name2label[name]
                writer.writerow([img, label])
            print('written into csv file:', filename)

    images, labels = [], []
    with open(os.path.join(root, filename)) as f:
        reader = csv.reader(f)
        for row in reader:
            img, label = row
            label = int(label)
            images.append(img)
            labels.append(label)
    assert len(images) == len(labels)
    return images, labels


def load_pokemon(root, mode='train'):
    # build the class-name encoding table, e.g. "squirtle": 0
    name2label = {}
    for name in sorted(os.listdir(os.path.join(root))):
        if not os.path.isdir(os.path.join(root, name)):
            continue
        # assign each class the next integer code
        name2label[name] = len(name2label.keys())

    # read the label information: [file1, file2, ...], [3, 1, ...]
    images, labels = load_csv(root, 'images.csv', name2label)

    if mode == 'train':  # first 60%
        images = images[:int(0.6 * len(images))]
        labels = labels[:int(0.6 * len(labels))]
    elif mode == 'val':  # 20%: 60% -> 80%
        images = images[int(0.6 * len(images)):int(0.8 * len(images))]
        labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))]
    else:  # 20%: 80% -> 100%
        images = images[int(0.8 * len(images)):]
        labels = labels[int(0.8 * len(labels)):]
    return images, labels, name2label


img_mean = tf.constant([0.485, 0.456, 0.406])
img_std = tf.constant([0.229, 0.224, 0.225])


def normalize(x, mean=img_mean, std=img_std):
    x = (x - mean) / std
    return x


def denormalize(x, mean=img_mean, std=img_std):
    x = x * std + mean
    return x


def main():
    images, labels, table = load_pokemon('pokemon', 'train')
    print('images', len(images), images)
    print('labels', len(labels), labels)
    print(table)


if __name__ == '__main__':
    main()

3. Building the Convolutional Neural Network
A simple convolutional neural network is built with keras.Sequential.
from tensorflow import keras
from tensorflow.keras import layers

network = keras.Sequential([
    layers.Conv2D(16, 5, 3),
    layers.MaxPool2D(3, 3),
    layers.ReLU(),
    layers.Conv2D(64, 5, 3),
    layers.MaxPool2D(2, 2),
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(64),
    layers.ReLU(),
    layers.Dense(5)
])

4. Model Training
1. Load the training data. The batch size depends on available RAM or GPU memory.
batchsz = 256
images, labels, table = load_pokemon('pokemon', mode='train')
db_train = tf.data.Dataset.from_tensor_slices((images, labels))
db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)
2 & 3. Load the validation and test data (see the sketch below).
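The corresponding snippet is not reproduced here; a minimal sketch that mirrors the training pipeline (assuming the same preprocess function and batchsz, and that load_pokemon's mode='val' / mode='test' arguments select the 20% splits defined above) might look like:

# Hedged sketch, not from the original post: validation and test pipelines
# built the same way as db_train, but without shuffling.
images_val, labels_val, table = load_pokemon('pokemon', mode='val')
db_val = tf.data.Dataset.from_tensor_slices((images_val, labels_val))
db_val = db_val.map(preprocess).batch(batchsz)

images_test, labels_test, table = load_pokemon('pokemon', mode='test')
db_test = tf.data.Dataset.from_tensor_slices((images_test, labels_test))
db_test = db_test.map(preprocess).batch(batchsz)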
4. Data preprocessing
def preprocess(x, y):
    # x: image file path, y: integer class label
    x = tf.io.read_file(x)
    x = tf.image.decode_jpeg(x, channels=3)
    x = tf.image.resize(x, [244, 244])
    x = tf.image.random_flip_left_right(x)
    x = tf.image.random_crop(x, [224, 224, 3])
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = normalize(x)

    y = tf.convert_to_tensor(y)
    y = tf.one_hot(y, depth=5)
    return x, y

5. Train the model. The loss is categorical cross-entropy, and early stopping is used to limit overfitting.
from tensorflow.keras import losses, optimizers
from tensorflow.keras.callbacks import EarlyStopping

network.build(input_shape=(4, 224, 224, 3))
network.summary()

early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.001,
    patience=5
)

network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
network.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
            callbacks=[early_stopping])
network.evaluate(db_test)

Model structure:
Model: "sequential"
_________________________________________________________________
Layer (type) ????????????????Output Shape ?????????????Param # ??
=================================================================
conv2d (Conv2D) ?????????????multiple ?????????????????1216 ?????
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple ?????????????????0 ????????
_________________________________________________________________
re_lu (ReLU) ????????????????multiple ?????????????????0 ????????
_________________________________________________________________
conv2d_1 (Conv2D) ???????????multiple ?????????????????25664 ????
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple ?????????????????0 ????????
_________________________________________________________________
re_lu_1 (ReLU) ??????????????multiple ?????????????????0 ????????
_________________________________________________________________
flatten (Flatten) ???????????multiple ?????????????????0 ????????
_________________________________________________________________
dense (Dense) ???????????????multiple ?????????????????36928 ????
_________________________________________________________________
re_lu_2 (ReLU) ??????????????multiple ?????????????????0 ????????
_________________________________________________________________
dense_1 (Dense) ?????????????multiple ?????????????????325 ??????
=================================================================
Total params: 64,133
Trainable params: 64,133
Non-trainable params: 0
Training results:
Epoch 16/100
1/3 [=========>....................] - ETA: 6s - loss: 0.1232 - accuracy: 0.9805
2/3 [===================>..........] - ETA: 3s - loss: 0.1455 - accuracy: 0.9785
3/3 [==============================] - 11s 4s/step - loss: 0.1241 - accuracy: 0.9793 - val_loss: 0.3912 - val_accuracy: 0.8798
1/3 [=========>....................] - ETA: 2s - loss: 0.4005 - accuracy: 0.8700
2/3 [===================>..........] - ETA: 1s - loss: 0.4779 - accuracy: 0.8450
3/3 [==============================] - 3s 899ms/step - loss: 0.4673 - accuracy: 0.8504
6. Save the model
network.save('model.h5')
5. Prediction
1. Read and preprocess the image
def preprocess(img):
    img = tf.io.read_file(img)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [244, 244])
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_crop(img, [224, 224, 3])
    img = tf.cast(img, dtype=tf.float32) / 255.
    return img


img = '3.jpg'
x = preprocess(img)
x = tf.reshape(x, [1, 224, 224, 3])

2. Load the trained model
network = tf.keras.models.load_model('model.h5')
3. Predict the class and its probability; softmax converts the output logits into a probability for each class.
import numpy as np

logits = network.predict(x)
prob = tf.nn.softmax(logits, axis=1)
print(prob)
max_prob_index = np.argmax(prob, axis=-1)[0]
prob = prob.numpy()
max_prob = prob[0][max_prob_index]
max_index = np.argmax(logits, axis=-1)[0]
name = ['Bulbasaur', 'Charmander', 'Mewtwo', 'Pikachu', 'Squirtle']
print(name[max_index] + ':' + str(max_prob))

Test image:
Prediction result:
tf.Tensor([[0.02942971 0.29606345 0.02201815 0.57856214 0.07392654]], shape=(1, 5), dtype=float32)
0.57856214
Pikachu
6. Analysis and Optimization
Judging from the training and prediction results, the model reaches roughly 98% accuracy on the training set but only a little over 80% on the validation and test sets, so despite early stopping there is clear overfitting. In the prediction example, an unmistakable Pikachu image is classified correctly but with a probability of only 0.578, so the model is still far from a good fit.
1. Dataset and model architecture
To keep training fast, a fairly shallow convolutional network was used, and with so little training data (only about a thousand images in total) it is hard to fit the task well. Accuracy could be improved by collecting more data or by training a deeper network, as sketched below.
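For illustration only (this is not from the original post), a somewhat deeper network in the same keras.Sequential style might look like the following; the layer sizes are arbitrary assumptions and would need tuning:

# Hypothetical deeper network, for illustration only.
deeper_network = keras.Sequential([
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Conv2D(128, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),  # dropout also helps against the overfitting noted above
    layers.Dense(5)
])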
2. Hyperparameter tuning
Better results may also be reached by adjusting the parameters of each layer, changing the learning rate, or switching to a different optimizer; one possible variation is sketched below.
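As one hypothetical example (not from the original post), the optimizer and learning rate can be swapped at compile time:

# Hypothetical alternative: SGD with momentum and a smaller learning rate.
network.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])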
3. Transfer learning
For small-sample problems like this one, transfer learning is a good option: take one of TensorFlow's built-in models together with its weights pre-trained on the corresponding public dataset, freeze the pre-trained layers, replace the final fully connected classification layer with a custom output layer, and train it on your own samples. This usually yields a much better fit; a sketch follows.
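A minimal sketch of this idea, assuming a ResNet50 base from tf.keras.applications pre-trained on ImageNet (the specific backbone and learning rate are assumptions, not the author's choices):

# Hypothetical transfer-learning setup, for illustration only.
base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional base

transfer_net = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5)  # new output layer for the 5 Pokemon classes
])
transfer_net.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                     loss=losses.CategoricalCrossentropy(from_logits=True),
                     metrics=['accuracy'])
transfer_net.fit(db_train, validation_data=db_val, epochs=10)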
Dataset and full source code:
Link: https://pan.baidu.com/s/1s__J2FkaGNsisTG7UAbMiQ
Extraction code: curk