當前位置：首頁 > 人文社科 > 生活经验 >内容正文

生活经验

CV算法复现（分类算法5/6）：ResNet（2015年微软亚洲研究院）

發布時間：2023/11/27 生活经验 23 豆豆

生活随笔收集整理的這篇文章主要介紹了 CV算法复现（分类算法5/6）：ResNet（2015年微软亚洲研究院）小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

致謝：霹靂吧啦Wz：https://space.bilibili.com/18161609

1 本次要點

1.1 Python庫語法

1.2 深度學習理論

2 網絡簡介

2.1 歷史意義

2.2 網絡亮點

2.3 網絡結構

3 代碼結構

3.1 model.py

3.2 train.py

3.3 predict.py

1 本次要點

1.1 深度學習理論

BN層：使feature map（指一批圖的，而不是一張圖）滿足均值為0，方差為1的分布。
- 注意1：如果要使用BN層，則batch size應該盡可能大，因為這樣更接近全體數據集的均值和方差，而batchsize如果為1，可能還不如不用。
- 注意2：BN建議放在卷積層和激活層（如Relu）之間，且卷積不要使用偏置bias，因為有也會在BN計算時抵消掉。
- 詳細可見：（霹靂吧啦Wz）Batch Normalization詳解以及pytorch實驗：https://blog.csdn.net/qq_37541097/article/details/104434557
遷移學習：如果使用別人的預訓練模型，一定要知道別人的預處理方式。
- 常見的遷移學習方式：
  1. 載入權重后，訓練所有參數
  2. 載入權重后，只訓練最后幾層參數
  3. 載入權重后，在原網絡基礎上再添加一層全連接層，僅訓練最后一個全連接層
- 遷移學習效果：花分類任務，重頭開始訓練，幾十個epoch后才89%，但使用ImageNet預訓練模型，一個epoch后就90.9%。
殘差結構：（特征圖做加法）

2 網絡簡介

2.1 歷史意義

ResNet在2015年由微軟實驗室提出，斬獲當年ImageNet競賽中分類任務第一名，目標檢測第一名。獲得COCO數據集中目標檢測第一名，圖像分割第一名。

2.2 網絡亮點

提出residual 殘差模塊（加法運算，而不是通道拼接，所以C、H、W維度都要一致）
使用Batch Normalization 加速訓練( 丟棄dropout)
超深的網絡結構( 突破1000 層)

2.3 網絡結構

3 代碼結構

model.py
train.py
predict.py

3.1 model.py

import torch.nn as nn
import torch#18和34層殘差結構（具備實線殘差結構功能和虛線殘差結構功能）
class BasicBlock(nn.Module):expansion = 1 #對應殘差結構中卷積核個數有沒有發生變化。1是1倍意思，即都一樣。# downsample對應虛線的殘差結構def __init__(self, in_channel, out_channel, stride=1, downsample=None):super(BasicBlock, self).__init__()self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,kernel_size=3, stride=stride, padding=1, bias=False)#注意之前的卷積層不要biasself.bn1 = nn.BatchNorm2d(out_channel)self.relu = nn.ReLU()self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,kernel_size=3, stride=1, padding=1, bias=False)self.bn2 = nn.BatchNorm2d(out_channel)self.downsample = downsampledef forward(self, x):identity = xif self.downsample is not None:identity = self.downsample(x)out = self.conv1(x)out = self.bn1(out)out = self.relu(out)out = self.conv2(out)out = self.bn2(out)out += identityout = self.relu(out)return out#50層及以上的殘差結構（具備實線殘差結構功能和虛線殘差結構功能）
class Bottleneck(nn.Module):expansion = 4def __init__(self, in_channel, out_channel, stride=1, downsample=None):super(Bottleneck, self).__init__()self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,kernel_size=1, stride=1, bias=False)  # squeeze channelsself.bn1 = nn.BatchNorm2d(out_channel)# -----------------------------------------self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,kernel_size=3, stride=stride, bias=False, padding=1)self.bn2 = nn.BatchNorm2d(out_channel)# -----------------------------------------self.conv3 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel*self.expansion,kernel_size=1, stride=1, bias=False)  # unsqueeze channelsself.bn3 = nn.BatchNorm2d(out_channel*self.expansion)self.relu = nn.ReLU(inplace=True)self.downsample = downsampledef forward(self, x):identity = xif self.downsample is not None:identity = self.downsample(x)out = self.conv1(x)out = self.bn1(out)out = self.relu(out)out = self.conv2(out)out = self.bn2(out)out = self.relu(out)out = self.conv3(out)out = self.bn3(out)out += identityout = self.relu(out)return out# block為 BasicBlock(nn.Module)或Bottleneck(nn.Module)
# blocks_num：列表參數,代表殘差結構的個數，如[3,4,6,3]、[2,2,2,2]
# include_top=True方便在resnet上搭建更復雜的結構。默認就是True
class ResNet(nn.Module):def __init__(self, block, blocks_num, num_classes=1000, include_top=True):super(ResNet, self).__init__()self.include_top = include_topself.in_channel = 64self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,padding=3, bias=False)self.bn1 = nn.BatchNorm2d(self.in_channel)self.relu = nn.ReLU(inplace=True)self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)self.layer1 = self._make_layer(block, 64, blocks_num[0])self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)if self.include_top:#采用自適應平均池化，不管輸入是什么維度，輸出的HW都將是1*1self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output size = (1, 1)self.fc = nn.Linear(512 * block.expansion, num_classes)for m in self.modules():if isinstance(m, nn.Conv2d):nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')# 搭建殘差結構的函數def _make_layer(self, block, channel, block_num, stride=1):downsample = Noneif stride != 1 or self.in_channel != channel * block.expansion:downsample = nn.Sequential(nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),nn.BatchNorm2d(channel * block.expansion))layers = []layers.append(block(self.in_channel, channel, downsample=downsample, stride=stride))self.in_channel = channel * block.expansionfor _ in range(1, block_num):layers.append(block(self.in_channel, channel))return nn.Sequential(*layers) # 將list轉為非關鍵字參數傳入def forward(self, x):x = self.conv1(x)x = self.bn1(x)x = self.relu(x)x = self.maxpool(x)x = self.layer1(x)x = self.layer2(x)x = self.layer3(x)x = self.layer4(x)if self.include_top:x = self.avgpool(x)x = torch.flatten(x, 1)x = self.fc(x)return xdef resnet34(num_classes=1000, include_top=True):return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)def resnet101(num_classes=1000, include_top=True):return ResNet(Bottleneck, [3, 4, 23, 3], num_classes=num_classes, include_top=include_top)

3.2 train.py

import torch
import torch.nn as nn
from torchvision import transforms, datasets
import json
import matplotlib.pyplot as plt
import os
import torch.optim as optim
from model import resnet34, resnet101#import torchvision.models.resnet #導入pytorch框架自帶的網絡結構。本作者的修改于其版本。def main():device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")print("using {} device.".format(device))data_transform = {"train": transforms.Compose([transforms.RandomResizedCrop(224),transforms.RandomHorizontalFlip(),transforms.ToTensor(),transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),"val": transforms.Compose([transforms.Resize(256), #將圖像最小的邊縮放到256.transforms.CenterCrop(224),transforms.ToTensor(),transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root pathimage_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set pathassert os.path.exists(image_path), "{} path does not exist.".format(image_path)train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),transform=data_transform["train"])train_num = len(train_dataset)# {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}flower_list = train_dataset.class_to_idxcla_dict = dict((val, key) for key, val in flower_list.items())# write dict into json filejson_str = json.dumps(cla_dict, indent=4)with open('class_indices.json', 'w') as json_file:json_file.write(json_str)batch_size = 16nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workersprint('Using {} dataloader workers every process'.format(nw))train_loader = torch.utils.data.DataLoader(train_dataset,batch_size=batch_size, shuffle=True,num_workers=nw)validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),transform=data_transform["val"])val_num = len(validate_dataset)validate_loader = torch.utils.data.DataLoader(validate_dataset,batch_size=batch_size, shuffle=False,num_workers=nw)print("using {} images for training, {} images fot validation.".format(train_num,val_num))net = resnet34() #實例化網絡，注意：此時沒有傳入參數，默認是1000分類# load pretrain weights# download url: https://download.pytorch.org/models/resnet34-333f7ec4.pthmodel_weight_path = "./resnet34-pre.pth"assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)missing_keys, unexpected_keys = net.load_state_dict(torch.load(model_weight_path), strict=False)# for param in net.parameters():#     param.requires_grad = False# change fc layer structurein_channel = net.fc.in_featuresnet.fc = nn.Linear(in_channel, 5) #由于花分類是5類，所以重新賦值（默認1000分類）net.to(device)loss_function = nn.CrossEntropyLoss()optimizer = optim.Adam(net.parameters(), lr=0.0001)best_acc = 0.0save_path = './resNet34.pth'for epoch in range(3):# trainnet.train()running_loss = 0.0for step, data in enumerate(train_loader, start=0):images, labels = dataoptimizer.zero_grad()logits = net(images.to(device))loss = loss_function(logits, labels.to(device))loss.backward()optimizer.step()# print statisticsrunning_loss += loss.item()# print train processrate = (step+1)/len(train_loader)a = "*" * int(rate * 50)b = "." * int((1 - rate) * 50)print("\rtrain loss: {:^3.0f}%[{}->{}]{:.4f}".format(int(rate*100), a, b, loss), end="")print()# validatenet.eval()acc = 0.0  # accumulate accurate number / epochwith torch.no_grad():for val_data in validate_loader:val_images, val_labels = val_dataoutputs = net(val_images.to(device))  # eval model only have last output layer# loss = loss_function(outputs, test_labels)predict_y = torch.max(outputs, dim=1)[1]acc += (predict_y == val_labels.to(device)).sum().item()val_accurate = acc / val_numif val_accurate > best_acc:best_acc = val_accuratetorch.save(net.state_dict(), save_path)print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f' %(epoch + 1, running_loss / step, val_accurate))print('Finished Training')if __name__ == '__main__':main()

輸出：

3.3 predict.py

import torch
from model import resnet34
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
import jsondevice = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")data_transform = transforms.Compose([transforms.Resize(256),transforms.CenterCrop(224),transforms.ToTensor(),transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])# load image
img = Image.open("../tulip.jpg")
plt.imshow(img)
# [N, C, H, W]
img = data_transform(img)
# expand batch dimension
img = torch.unsqueeze(img, dim=0)# read class_indict
try:json_file = open('./class_indices.json', 'r')class_indict = json.load(json_file)
except Exception as e:print(e)exit(-1)# create model
model = resnet34(num_classes=5)
# load model weights
model_weight_path = "./resNet34.pth"
model.load_state_dict(torch.load(model_weight_path, map_location=device))
model.eval()
with torch.no_grad():# predict classoutput = torch.squeeze(model(img))predict = torch.softmax(output, dim=0)predict_cla = torch.argmax(predict).numpy()
print(class_indict[str(predict_cla)], predict[predict_cla].numpy())
plt.show()

輸出：

總結

以上是生活随笔為你收集整理的CV算法复现（分类算法5/6）：ResNet（2015年微软亚洲研究院）的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。

上一篇： CV算法复现（分类算法4/6）：Goog
下一篇： CV算法复现（分类算法6/6）：Mobi

生活经验

CV算法复现（分类算法5/6）：ResNet（2015年 微软亚洲研究院）

致謝：霹靂吧啦Wz：https://space.bilibili.com/18161609

1 本次要點

1.1 深度學習理論

2 網絡簡介

2.1 歷史意義

2.2 網絡亮點

2.3 網絡結構

3 代碼結構

3.1 model.py

3.2 train.py

3.3 predict.py

總結

CV算法复现（分类算法5/6）：ResNet（2015年微软亚洲研究院）