

Reproducing the MicroNet Paper: Improving Image Recognition with Extremely Low FLOPs

Published: 2023/12/20

Abstract

MicroNet aims to address the severe performance degradation that occurs at extremely low computational cost (e.g., 5M FLOPs on ImageNet classification). The study finds that two factors, sparse connectivity and dynamic activation functions, are effective at improving accuracy. The former avoids a significant reduction in network width, while the latter mitigates the adverse effects of reduced network depth. Technically, the paper proposes Micro-Factorized Convolution, which factorizes a convolution matrix into low-rank matrices to integrate sparse connectivity into convolution. It also proposes a new dynamic activation function, named Dynamic Shift-Max, which improves non-linearity by taking the maximum over multiple dynamic fusions between an input feature map and its circular channel shifts. Building on these two new operators, the paper obtains a family of networks named MicroNet, which achieves significant performance gains over the state of the art in the low-FLOP regime. For example, under a 12M FLOPs constraint, MicroNet achieves 59.4% top-1 accuracy on ImageNet classification, outperforming MobileNetV3 by 9.6%.

1. MicroNet

1.1 Overall Architecture

The paper proposes MicroNet, a lightweight network for extremely low-computation scenarios, built around two core ideas: Micro-Factorized Convolution and Dynamic Shift-Max. Micro-Factorized Convolution uses a low-rank approximation to decompose the original convolution into several smaller convolutions, preserving input-output connectivity while reducing the number of connections. Dynamic Shift-Max increases node connectivity and non-linearity through dynamic fusion of features across groups, compensating for the performance loss caused by the reduced network depth.
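To make the parameter saving concrete, here is a small back-of-the-envelope sketch (my own illustration, not code from the paper): factorizing a k×k depthwise kernel into a k×1 kernel followed by a 1×k kernel, as the `DepthSpatialSepConv` module below does, cuts per-channel spatial parameters from k² to 2k.

```python
def depthwise_params(channels, k):
    # standard depthwise conv: one k*k kernel per channel
    return channels * k * k

def spatial_sep_params(channels, k):
    # spatially separated depthwise conv: one k*1 plus one 1*k kernel per channel
    return channels * k + channels * k

c, k = 64, 5
print(depthwise_params(c, k))    # 1600
print(spatial_sep_params(c, k))  # 640
```

For a 5×5 kernel this is already a 2.5× reduction, and the saving grows with kernel size.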

2. Code Reproduction

2.1 Micro-Factorized Convolution

Micro-Factorized Convolution aims to balance the number of channels against node connectivity. Here, the connectivity of a layer is defined as the number of paths per output node, where a path connects one input node to one output node.
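As a hedged numeric sketch of this definition (the function names and numbers are mine, not the paper's notation): a dense 1×1 convolution gives each output node C_in paths; a single group convolution cuts this to C_in/G; but two stacked group convolutions with a channel shuffle in between can recover full connectivity at a fraction of the parameters.

```python
def pointwise_paths(c_in):
    # dense 1x1 conv: every input channel feeds every output node
    return c_in

def group_paths(c_in, groups):
    # group conv: each output node only sees its own group of inputs
    return c_in // groups

def factorized_paths(c_in, c_mid, g1, g2):
    # two stacked 1x1 group convs with a channel shuffle in between:
    # each output reaches c_mid // g2 intermediate nodes, each of which
    # reaches c_in // g1 input nodes
    return (c_mid // g2) * (c_in // g1)

print(pointwise_paths(64))             # 64
print(group_paths(64, 4))              # 16
print(factorized_paths(64, 16, 4, 4))  # 64 -- full connectivity restored
```

With the example numbers, the factorized pair matches the dense layer's 64 paths per output node while each individual conv remains sparse.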

```python
import paddle
import paddle.nn as nn


class MaxGroupPooling(nn.Layer):
    def __init__(self, channel_per_group=2):
        super().__init__()
        self.channel_per_group = channel_per_group

    def forward(self, x):
        if self.channel_per_group == 1:
            return x
        b, c, h, w = x.shape
        # group channels, then take the max over each group
        y = x.reshape([b, c // self.channel_per_group, -1, h, w])
        # note: paddle.max returns a single Tensor (unlike torch.max,
        # which also returns indices), so there is nothing to unpack
        out = paddle.max(y, axis=2)
        return out


class GroupConv(nn.Layer):
    def __init__(self, inp, oup, groups=2):
        super().__init__()
        self.inp = inp
        self.oup = oup
        self.groups = groups
        self.conv = nn.Sequential(
            nn.Conv2D(inp, oup, 1, groups=self.groups[0], bias_attr=False),
            nn.BatchNorm2D(oup))

    def forward(self, x):
        return self.conv(x)


class ChannelShuffle(nn.Layer):
    def __init__(self, groups):
        super().__init__()
        self.groups = groups

    def forward(self, x):
        b, c, h, w = x.shape
        channels_per_group = c // self.groups
        # reshape -> transpose -> reshape back: interleaves the groups
        x = x.reshape([b, self.groups, channels_per_group, h, w])
        x = x.transpose([0, 2, 1, 3, 4])
        out = x.reshape([b, c, h, w])
        return out


class SpatialSepConvSF(nn.Layer):
    def __init__(self, inp, oups, kernel_size, stride):
        super().__init__()
        oup1, oup2 = oups
        self.conv = nn.Sequential(
            nn.Conv2D(inp, oup1, (kernel_size, 1), (stride, 1),
                      (kernel_size // 2, 0), groups=1, bias_attr=False),
            nn.BatchNorm2D(oup1),
            nn.Conv2D(oup1, oup1 * oup2, (1, kernel_size), (1, stride),
                      (0, kernel_size // 2), groups=oup1, bias_attr=False),
            nn.BatchNorm2D(oup1 * oup2),
            ChannelShuffle(oup1),
        )

    def forward(self, x):
        return self.conv(x)


class DepthSpatialSepConv(nn.Layer):
    def __init__(self, inp, expand, kernel_size, stride):
        super().__init__()
        exp1, exp2 = expand
        hidden_dim = inp * exp1
        oup = inp * exp1 * exp2
        self.conv = nn.Sequential(
            # k x 1 depthwise conv
            nn.Conv2D(inp, inp * exp1, (kernel_size, 1), (stride, 1),
                      (kernel_size // 2, 0), groups=inp, bias_attr=False),
            nn.BatchNorm2D(inp * exp1),
            # 1 x k depthwise conv
            nn.Conv2D(hidden_dim, oup, (1, kernel_size), (1, stride),
                      (0, kernel_size // 2), groups=hidden_dim,
                      bias_attr=False),
            nn.BatchNorm2D(oup))

    def forward(self, x):
        return self.conv(x)
```
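To see what the `ChannelShuffle` layer above does without a Paddle runtime, here is a NumPy sketch of the same reshape-transpose-reshape trick (my own illustration, not repository code):

```python
import numpy as np

def channel_shuffle(x, groups):
    # same reshape -> transpose -> reshape trick as the ChannelShuffle layer
    b, c, h, w = x.shape
    x = x.reshape(b, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(b, c, h, w)

# 6 channels in 2 groups: [0 1 2 | 3 4 5] becomes interleaved [0 3 1 4 2 5]
x = np.arange(6).reshape(1, 6, 1, 1).astype(np.float32)
print(channel_shuffle(x, 2).reshape(-1))  # [0. 3. 1. 4. 2. 5.]
```

The interleaving is what lets the second group convolution see channels from every group of the first, which is exactly the connectivity argument above.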

2.2 Dynamic Shift-Max

Dynamic Shift-Max is a new dynamic non-linearity that strengthens the connections between the groups created by micro-factorization. It complements Micro-Factorized Convolution, which focuses on connections within each group.
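Before the full Paddle implementation below, here is a tiny NumPy sketch of the core operation (my own illustration with fixed coefficients; in the real layer the a/b coefficients come from a squeeze-and-excitation-style fc branch): each output channel is the maximum over K dynamic fusions of the input and its group-wise circular shift.

```python
import numpy as np

def dynamic_shift_max(x, a, b, group_size):
    # x: (C,) feature vector; x_shift: channels rotated by one group
    # out[c] = max_k ( a[k] * x[c] + b[k] * x[(c + group_size) % C] )
    x_shift = np.roll(x, -group_size)
    candidates = [ak * x + bk * x_shift for ak, bk in zip(a, b)]
    return np.maximum.reduce(candidates)

x = np.array([1.0, 2.0, 3.0, 4.0])
# two fusions (K=2) with illustrative fixed coefficients
out = dynamic_shift_max(x, a=[1.0, 0.5], b=[0.0, 0.5], group_size=2)
print(out)  # [2. 3. 3. 4.]
```

With `a=[1.0, 0.0]` and `b=[0.0, 0.0]` this degenerates to the identity on the first branch, which is why the layer can be initialized close to a plain linear path.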

```python
class DYShiftMax(nn.Layer):
    def __init__(self,
                 inp,
                 oup,
                 reduction=4,
                 act_max=1.0,
                 act_relu=True,
                 init_a=[0.0, 0.0],
                 init_b=[0.0, 0.0],
                 relu_before_pool=False,
                 g=None,
                 expansion=False):
        super().__init__()
        self.oup = oup
        self.act_max = act_max * 2
        self.act_relu = act_relu
        self.avg_pool = nn.Sequential(
            nn.ReLU() if relu_before_pool else nn.Identity(),
            nn.AdaptiveAvgPool2D(1))
        self.exp = 4 if act_relu else 2
        self.init_a = init_a
        self.init_b = init_b

        # determine squeeze; _make_divisible is a helper defined elsewhere
        # in the repository
        squeeze = _make_divisible(inp // reduction, 4)
        if squeeze < 4:
            squeeze = 4

        self.fc = nn.Sequential(
            nn.Linear(inp, squeeze),
            nn.ReLU(),
            nn.Linear(squeeze, oup * self.exp),
            nn.Hardsigmoid())

        if g is None:
            g = 1
        self.g = g[1]
        if self.g != 1 and expansion:
            self.g = inp // self.g

        self.gc = inp // self.g
        # build the circular channel-shift index: rotate by one group,
        # then rotate by one channel within each group
        index = paddle.to_tensor(list(range(inp))).reshape([1, inp, 1, 1])
        index = index.reshape([1, self.g, self.gc, 1, 1])
        indexgs = paddle.split(index, [1, self.g - 1], axis=1)
        indexgs = paddle.concat((indexgs[1], indexgs[0]), axis=1)
        indexs = paddle.split(indexgs, [1, self.gc - 1], axis=2)
        indexs = paddle.concat((indexs[1], indexs[0]), axis=2)
        self.index = indexs.reshape([inp]).astype(paddle.int64)
        self.expansion = expansion

    def forward(self, x):
        x_in = x
        x_out = x

        # squeeze-and-excitation-style branch producing the coefficients
        b, c, _, _ = x_in.shape
        y = self.avg_pool(x_in).reshape([b, c])
        y = self.fc(y).reshape([b, self.oup * self.exp, 1, 1])
        y = (y - 0.5) * self.act_max

        # x2: channel-shifted copy of the input
        x2 = paddle.index_select(x_out, self.index, axis=1)

        if self.exp == 4:
            a1, b1, a2, b2 = paddle.split(y, 4, axis=1)
            a1 = a1 + self.init_a[0]
            a2 = a2 + self.init_a[1]
            b1 = b1 + self.init_b[0]
            b2 = b2 + self.init_b[1]
            z1 = x_out * a1 + x2 * b1
            z2 = x_out * a2 + x2 * b2
            out = paddle.maximum(z1, z2)
        elif self.exp == 2:
            a1, b1 = paddle.split(y, 2, axis=1)
            a1 = a1 + self.init_a[0]
            b1 = b1 + self.init_b[0]
            out = x_out * a1 + x2 * b1

        return out


class DYMicroBlock(nn.Layer):
    def __init__(self,
                 inp,
                 oup,
                 kernel_size=3,
                 stride=1,
                 ch_exp=(2, 2),
                 ch_per_group=4,
                 groups_1x1=(1, 1),
                 dy=[0, 0, 0],
                 ratio=1.0,
                 activation_cfg=None):
        super().__init__()

        self.identity = stride == 1 and inp == oup

        y1, y2, y3 = dy
        act_max = activation_cfg["act_max"]
        act_reduction = activation_cfg["reduction"] * ratio
        init_a = activation_cfg["init_a"]
        init_b = activation_cfg["init_b"]
        init_ab3 = activation_cfg["init_ab3"]

        t1 = ch_exp
        gs1 = ch_per_group
        hidden_fft, g1, g2 = groups_1x1
        hidden_dim1 = inp * t1[0]
        hidden_dim2 = inp * t1[0] * t1[1]

        # ChannelShuffle2 (a shuffle variant) is defined elsewhere in the
        # repository
        if gs1[0] == 0:
            self.layers = nn.Sequential(
                DepthSpatialSepConv(inp, t1, kernel_size, stride),
                DYShiftMax(
                    hidden_dim2,
                    hidden_dim2,
                    act_max=act_max,
                    act_relu=(y2 == 2),
                    init_a=init_a,
                    reduction=act_reduction,
                    init_b=init_b,
                    g=gs1,
                    expansion=False) if y2 > 0 else nn.ReLU6(),
                ChannelShuffle(gs1[1]),
                ChannelShuffle2(hidden_dim2 // 2)
                if y2 != 0 else nn.Identity(),
                GroupConv(hidden_dim2, oup, (g1, g2)),
                DYShiftMax(
                    oup,
                    oup,
                    act_max=act_max,
                    act_relu=False,
                    init_a=[init_ab3[0], 0.0],
                    reduction=act_reduction // 2,
                    init_b=[init_ab3[1], 0.0],
                    g=(g1, g2),
                    expansion=False) if y3 > 0 else nn.Identity(),
                ChannelShuffle(g2),
                ChannelShuffle2(oup // 2)
                if oup % 2 == 0 and y3 != 0 else nn.Identity(),
            )
        elif g2 == 0:
            self.layers = nn.Sequential(
                GroupConv(inp, hidden_dim2, gs1),
                DYShiftMax(
                    hidden_dim2,
                    hidden_dim2,
                    act_max=act_max,
                    act_relu=False,
                    init_a=[init_ab3[0], 0.0],
                    reduction=act_reduction,
                    init_b=[init_ab3[1], 0.0],
                    g=gs1,
                    expansion=False) if y3 > 0 else nn.Identity(),
            )
        else:
            self.layers = nn.Sequential(
                GroupConv(inp, hidden_dim2, gs1),
                DYShiftMax(
                    hidden_dim2,
                    hidden_dim2,
                    act_max=act_max,
                    act_relu=(y1 == 2),
                    init_a=init_a,
                    reduction=act_reduction,
                    init_b=init_b,
                    g=gs1,
                    expansion=False) if y1 > 0 else nn.ReLU6(),
                ChannelShuffle(gs1[1]),
                DepthSpatialSepConv(hidden_dim2, (1, 1), kernel_size, stride),
                nn.Identity(),
                DYShiftMax(
                    hidden_dim2,
                    hidden_dim2,
                    act_max=act_max,
                    act_relu=(y2 == 2),
                    init_a=init_a,
                    reduction=act_reduction,
                    init_b=init_b,
                    g=gs1,
                    expansion=True) if y2 > 0 else nn.ReLU6(),
                ChannelShuffle2(hidden_dim2 // 4)
                if y1 != 0 and y2 != 0 else nn.Identity()
                if y1 == 0 and y2 == 0 else ChannelShuffle2(hidden_dim2 // 2),
                GroupConv(hidden_dim2, oup, (g1, g2)),
                DYShiftMax(
                    oup,
                    oup,
                    act_max=act_max,
                    act_relu=False,
                    init_a=[init_ab3[0], 0.0],
                    reduction=act_reduction // 2
                    if oup < hidden_dim2 else act_reduction,
                    init_b=[init_ab3[1], 0.0],
                    g=(g1, g2),
                    expansion=False) if y3 > 0 else nn.Identity(),
                ChannelShuffle(g2),
                ChannelShuffle2(oup // 2) if y3 != 0 else nn.Identity(),
            )

    def forward(self, x):
        out = self.layers(x)
        if self.identity:
            out = out + x
        return out
```

3. Dataset and Reproduced Accuracy

3.1 Dataset

The ImageNet project is a large visual database for visual object recognition research, with more than 14 million manually annotated images. ImageNet-1k is the subset of ImageNet containing 1000 classes, with 1,281,167 training images and 50,000 validation images. Since 2010, the ImageNet project has held an annual image classification competition, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which uses ImageNet-1k as its dataset. ImageNet-1k has become one of the most important datasets driving progress in computer vision; many models for downstream vision tasks are initialized from weights pretrained on it.

Dataset       Train size   Test size   Classes   Notes
ImageNet1k    1.2M         50k         1000

3.2 Reproduced Accuracy

Model         Epochs   Top-1 acc (reference)   Top-1 acc (reproduced)   Weights                 Training log
micronet_m0   600      46.6                    46.4                     m1_epoch_594.pdparams   m0_train.log
micronet_m3   600      62.5                    62.8                     m3_epoch_591.pdparams   m3_train.log

The weights and training logs can be downloaded from Baidu Netdisk, or found at work/best_model.pdparams.

4. Environment Setup

4.1 Install PaddlePaddle

```shell
# Install the GPU version of Paddle
pip install paddlepaddle-gpu==2.3.2
```

For more installation options, see the Paddle installation guide.

4.2 Download the Code

```
%cd /home/aistudio/
# !git clone https://github.com/flytocc/PaddleClas.git
# !cd PaddleClas
# !git checkout -b micronet_PR
!unzip PaddleClas-micronet_PR.zip
%cd /home/aistudio/PaddleClas-micronet_PR
!pip install -r requirements.txt
```

5. Getting Started

5.1 Model Inference

```
%cd /home/aistudio/PaddleClas-micronet_PR
%run tools/infer.py \
    -c ./ppcls/configs/ImageNet/MicroNet/micronet_m3.yaml \
    -o Infer.infer_imgs=./deploy/images/ImageNet/ILSVRC2012_val_00020010.jpeg \
    -o Global.pretrained_model=/home/aistudio/work/best_model
```

The final output is:

```
[{'class_ids': [178, 209, 211, 208, 236], 'scores': [0.99474, 0.00512, 8e-05, 3e-05, 2e-05], 'file_name': './deploy/images/ImageNet/ILSVRC2012_val_00020010.jpeg', 'label_names': ['Weimaraner', 'Chesapeake Bay retriever', 'vizsla, Hungarian pointer', 'Labrador retriever', 'Doberman, Doberman pinscher']}]
```

This means the predicted class is Weimaraner, with class ID 178 and confidence 0.99474.

5.2 Model Training

  • Multi-GPU training on a single machine

```shell
python -m paddle.distributed.launch --gpus=0,1,2,3 \
    tools/train.py \
    -c ./ppcls/configs/ImageNet/MicroNet/micronet_m3.yaml
```

Part of the training log is shown below.

```
[2022/08/31 04:13:11] ppcls INFO: [Train][Epoch 302/600][Iter: 1550/2503]lr(LinearWarmup): 0.10046482, top1: 0.48098, top5: 0.72183, CELoss: 2.36589, loss: 2.36589, batch_cost: 0.27864s, reader_cost: 0.01763, ips: 459.37528 samples/s, eta: 2 days, 9:48:20
[2022/08/31 04:13:14] ppcls INFO: [Train][Epoch 302/600][Iter: 1560/2503]lr(LinearWarmup): 0.10046271, top1: 0.48091, top5: 0.72176, CELoss: 2.36646, loss: 2.36646, batch_cost: 0.27873s, reader_cost: 0.01755, ips: 459.22941 samples/s, eta: 2 days, 9:49:24
```

5.3 Model Evaluation

```shell
python -m paddle.distributed.launch --gpus=0,1,2,3 \
    tools/eval.py \
    -c ./ppcls/configs/ImageNet/MicroNet/micronet_m3.yaml \
    -o Global.pretrained_model=$TRAINED_MODEL
```

6. License

This project is released under the MIT License.

7. References

  • MicroNet: Improving Image Recognition with Extremely Low FLOPs: https://arxiv.org/abs/2108.05894
  • micronet: https://github.com/liyunsheng13/micronet
```
@article{li2021micronet,
  title={MicroNet: Improving Image Recognition with Extremely Low FLOPs},
  author={Li, Yunsheng and Chen, Yinpeng and Dai, Xiyang and Chen, Dongdong and Liu, Mengchen and Yuan, Lu and Liu, Zicheng and Zhang, Lei and Vasconcelos, Nuno},
  journal={arXiv preprint arXiv:2108.05894},
  year={2021}
}
```

    Note: this article is a repost; see the original project link.
