Pokémon Image Classification with TensorFlow (Deep Learning)
Contents
1. Dataset Overview
2. Data Preprocessing
3. Building the Convolutional Neural Network
4. Model Training
5. Prediction
6. Analysis and Optimization
1. Dataset Overview
The Pokémon dataset contains 1,168 images across five classes: bulbasaur (234), charmander (238), mewtwo (239), pikachu (234), and squirtle (223).
2. Data Preprocessing
pokemon.py reads the image paths in batches, builds a (path, label) pair for each image based on its class directory, and shuffles the order.
import os, glob
import random, csv
import tensorflow as tf


def load_csv(root, filename, name2label):
    # root: dataset root directory
    # filename: name of the csv file
    # name2label: mapping from class name to integer label
    if not os.path.exists(os.path.join(root, filename)):
        images = []
        for name in name2label.keys():
            images += glob.glob(os.path.join(root, name, '*.png'))
            images += glob.glob(os.path.join(root, name, '*.jpg'))
            images += glob.glob(os.path.join(root, name, '*.jpeg'))
        print(len(images), images)
        random.shuffle(images)
        with open(os.path.join(root, filename), mode='w', newline='') as f:
            writer = csv.writer(f)
            for img in images:
                name = img.split(os.sep)[-2]
                label = name2label[name]
                writer.writerow([img, label])
            print('written into csv file:', filename)

    images, labels = [], []
    with open(os.path.join(root, filename)) as f:
        reader = csv.reader(f)
        for row in reader:
            img, label = row
            label = int(label)
            images.append(img)
            labels.append(label)
    assert len(images) == len(labels)
    return images, labels


def load_pokemon(root, mode='train'):
    # build the class-name encoding table, e.g. "squirtle": 0
    name2label = {}
    for name in sorted(os.listdir(os.path.join(root))):
        if not os.path.isdir(os.path.join(root, name)):
            continue
        # assign each class the next integer code
        name2label[name] = len(name2label.keys())

    # read the label information: [file1, file2, ...], [3, 1, ...]
    images, labels = load_csv(root, 'images.csv', name2label)

    if mode == 'train':  # first 60%
        images = images[:int(0.6 * len(images))]
        labels = labels[:int(0.6 * len(labels))]
    elif mode == 'val':  # 20%: 60% -> 80%
        images = images[int(0.6 * len(images)):int(0.8 * len(images))]
        labels = labels[int(0.6 * len(labels)):int(0.8 * len(labels))]
    else:  # 20%: 80% -> 100%
        images = images[int(0.8 * len(images)):]
        labels = labels[int(0.8 * len(labels)):]
    return images, labels, name2label


img_mean = tf.constant([0.485, 0.456, 0.406])
img_std = tf.constant([0.229, 0.224, 0.225])


def normalize(x, mean=img_mean, std=img_std):
    x = (x - mean) / std
    return x


def denormalize(x, mean=img_mean, std=img_std):
    x = x * std + mean
    return x


def main():
    images, labels, table = load_pokemon('pokemon', 'train')
    print('images', len(images), images)
    print('labels', len(labels), labels)
    print(table)


if __name__ == '__main__':
    main()

3. Building the Convolutional Neural Network
A simple convolutional neural network is built with keras.Sequential.
from tensorflow import keras
from tensorflow.keras import layers

network = keras.Sequential([
    layers.Conv2D(16, 5, 3),
    layers.MaxPool2D(3, 3),
    layers.ReLU(),
    layers.Conv2D(64, 5, 3),
    layers.MaxPool2D(2, 2),
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(64),
    layers.ReLU(),
    layers.Dense(5)
])

4. Model Training
1. Load the training data. The batch size depends on available RAM or GPU memory.
batchsz = 256
images, labels, table = load_pokemon('pokemon', mode='train')
db_train = tf.data.Dataset.from_tensor_slices((images, labels))
db_train = db_train.shuffle(1000).map(preprocess).batch(batchsz)
2 & 3. Load the validation and test data (see the sketch below).
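The corresponding snippet is not reproduced here; a minimal sketch that mirrors the training pipeline (assuming the same preprocess function and batchsz, and that load_pokemon's mode='val' / mode='test' arguments select the 20% splits defined above) might look like:

# Hedged sketch, not from the original post: validation and test pipelines
# built the same way as db_train, but without shuffling.
images_val, labels_val, table = load_pokemon('pokemon', mode='val')
db_val = tf.data.Dataset.from_tensor_slices((images_val, labels_val))
db_val = db_val.map(preprocess).batch(batchsz)

images_test, labels_test, table = load_pokemon('pokemon', mode='test')
db_test = tf.data.Dataset.from_tensor_slices((images_test, labels_test))
db_test = db_test.map(preprocess).batch(batchsz)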
4. Data preprocessing
def preprocess(x, y):
    # x: image file path, y: integer class label
    x = tf.io.read_file(x)
    x = tf.image.decode_jpeg(x, channels=3)
    x = tf.image.resize(x, [244, 244])
    x = tf.image.random_flip_left_right(x)
    x = tf.image.random_crop(x, [224, 224, 3])
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = normalize(x)

    y = tf.convert_to_tensor(y)
    y = tf.one_hot(y, depth=5)
    return x, y

5. Train the model. The loss is categorical cross-entropy, and early stopping is used to limit overfitting.
from tensorflow.keras import losses, optimizers
from tensorflow.keras.callbacks import EarlyStopping

network.build(input_shape=(4, 224, 224, 3))
network.summary()

early_stopping = EarlyStopping(
    monitor='val_accuracy',
    min_delta=0.001,
    patience=5
)

network.compile(optimizer=optimizers.Adam(learning_rate=1e-3),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
network.fit(db_train, validation_data=db_val, validation_freq=1, epochs=100,
            callbacks=[early_stopping])
network.evaluate(db_test)

Model structure:
Model: "sequential"
_________________________________________________________________
Layer (type) ????????????????Output Shape ?????????????Param # ??
=================================================================
conv2d (Conv2D) ?????????????multiple ?????????????????1216 ?????
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple ?????????????????0 ????????
_________________________________________________________________
re_lu (ReLU) ????????????????multiple ?????????????????0 ????????
_________________________________________________________________
conv2d_1 (Conv2D) ???????????multiple ?????????????????25664 ????
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple ?????????????????0 ????????
_________________________________________________________________
re_lu_1 (ReLU) ??????????????multiple ?????????????????0 ????????
_________________________________________________________________
flatten (Flatten) ???????????multiple ?????????????????0 ????????
_________________________________________________________________
dense (Dense) ???????????????multiple ?????????????????36928 ????
_________________________________________________________________
re_lu_2 (ReLU) ??????????????multiple ?????????????????0 ????????
_________________________________________________________________
dense_1 (Dense) ?????????????multiple ?????????????????325 ??????
=================================================================
Total params: 64,133
Trainable params: 64,133
Non-trainable params: 0
Training results:
Epoch 16/100
1/3 [=========>....................] - ETA: 6s - loss: 0.1232 - accuracy: 0.9805
2/3 [===================>..........] - ETA: 3s - loss: 0.1455 - accuracy: 0.9785
3/3 [==============================] - 11s 4s/step - loss: 0.1241 - accuracy: 0.9793 - val_loss: 0.3912 - val_accuracy: 0.8798
1/3 [=========>....................] - ETA: 2s - loss: 0.4005 - accuracy: 0.8700
2/3 [===================>..........] - ETA: 1s - loss: 0.4779 - accuracy: 0.8450
3/3 [==============================] - 3s 899ms/step - loss: 0.4673 - accuracy: 0.8504
6. Save the model
network.save('model.h5')
5. Prediction
1. Read and preprocess the image
def preprocess(img):
    img = tf.io.read_file(img)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [244, 244])
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_crop(img, [224, 224, 3])
    img = tf.cast(img, dtype=tf.float32) / 255.
    return img


img = '3.jpg'
x = preprocess(img)
x = tf.reshape(x, [1, 224, 224, 3])

2. Load the trained model
network = tf.keras.models.load_model('model.h5')
3. Predict the class and its probability; softmax converts the output logits into a probability for each class.
import numpy as np

logits = network.predict(x)
prob = tf.nn.softmax(logits, axis=1)
print(prob)
max_prob_index = np.argmax(prob, axis=-1)[0]
prob = prob.numpy()
max_prob = prob[0][max_prob_index]
max_index = np.argmax(logits, axis=-1)[0]
name = ['Bulbasaur', 'Charmander', 'Mewtwo', 'Pikachu', 'Squirtle']
print(name[max_index] + ':' + str(max_prob))

Test image:
Prediction result:
tf.Tensor([[0.02942971 0.29606345 0.02201815 0.57856214 0.07392654]], shape=(1, 5), dtype=float32)
0.57856214
Pikachu
6. Analysis and Optimization
Judging from the training and prediction results, the model reaches roughly 98% accuracy on the training set but only a little over 80% on the validation and test sets, so despite early stopping there is clear overfitting. In the prediction example, an unmistakable Pikachu image is classified correctly but with a probability of only 0.578, so the model is still far from a good fit.
1. Dataset and model architecture
To keep training fast, a fairly shallow convolutional network was used, and with so little training data (only about a thousand images in total) it is hard to fit the task well. Accuracy could be improved by collecting more data or by training a deeper network, as sketched below.
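For illustration only (this is not from the original post), a somewhat deeper network in the same keras.Sequential style might look like the following; the layer sizes are arbitrary assumptions and would need tuning:

# Hypothetical deeper network, for illustration only.
deeper_network = keras.Sequential([
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Conv2D(128, 3, padding='same', activation='relu'),
    layers.MaxPool2D(2, 2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),  # dropout also helps against the overfitting noted above
    layers.Dense(5)
])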
2. Hyperparameter tuning
Better results may also be reached by adjusting the parameters of each layer, changing the learning rate, or switching to a different optimizer; one possible variation is sketched below.
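As one hypothetical example (not from the original post), the optimizer and learning rate can be swapped at compile time:

# Hypothetical alternative: SGD with momentum and a smaller learning rate.
network.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])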
3. Transfer learning
For small-sample problems like this one, transfer learning is a good option: take one of TensorFlow's built-in models together with its weights pre-trained on the corresponding public dataset, freeze the pre-trained layers, replace the final fully connected classification layer with a custom output layer, and train it on your own samples. This usually yields a much better fit; a sketch follows.
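A minimal sketch of this idea, assuming a ResNet50 base from tf.keras.applications pre-trained on ImageNet (the specific backbone and learning rate are assumptions, not the author's choices):

# Hypothetical transfer-learning setup, for illustration only.
base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional base

transfer_net = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5)  # new output layer for the 5 Pokemon classes
])
transfer_net.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                     loss=losses.CategoricalCrossentropy(from_logits=True),
                     metrics=['accuracy'])
transfer_net.fit(db_train, validation_data=db_val, epochs=10)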
Dataset and full source code:
Link: https://pan.baidu.com/s/1s__J2FkaGNsisTG7UAbMiQ
Extraction code: curk