

DL — DeepLab v2: Introduction to the Algorithm (Paper Overview), Architecture Details, and Example Applications: A Detailed Illustrated Guide

Published: 2025/3/21


Contents

Introduction to DeepLab v2 (paper overview)

0. Experimental results

1. DeepLab v2 improvements

DeepLab v2 architecture details

DeepLab v2 example applications




Introduction to DeepLab v2 (paper overview)

DeepLab v2 is an incremental revision of DeepLab v1. The changes are modest; the main one is multi-scale feature extraction, which yields better segmentation results.

Abstract
In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state-of-the-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
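To make the abstract's claim concrete — that atrous convolution enlarges the field of view without adding parameters or computation per output — here is a minimal 1-D sketch in plain NumPy. This is only an illustration (the paper applies the 2-D analogue inside a DCNN); the kernel and input values are made up for demonstration:

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """1-D atrous (dilated) convolution: the k taps of w are spaced
    `rate` samples apart, giving a field of view of (k-1)*rate + 1."""
    k = len(w)
    span = (k - 1) * rate + 1          # effective field of view
    out = np.zeros(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out

x = np.arange(10, dtype=float)
w = np.array([1.0, 1.0, 1.0])          # always 3 parameters, whatever the rate
print(atrous_conv1d(x, w, rate=1))     # field of view 3
print(atrous_conv1d(x, w, rate=2))     # field of view 5, same 3 weights
```

With rate = 1 this is an ordinary convolution; with rate = 2 each output still uses only three multiply-adds, but the taps cover a span of five input samples.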
Conclusion
Our proposed "DeepLab" system re-purposes networks trained on image classification to the task of semantic segmentation by applying the 'atrous convolution' with upsampled filters for dense feature extraction. We further extend it to atrous spatial pyramid pooling, which encodes objects as well as image context at multiple scales. To produce semantically accurate predictions and detailed segmentation maps along object boundaries, we also combine ideas from deep convolutional neural networks and fully-connected conditional random fields. Our experimental results show that the proposed method significantly advances the state-of-the-art in several challenging datasets, including the PASCAL VOC 2012 semantic image segmentation benchmark, PASCAL-Context, PASCAL-Person-Part, and Cityscapes datasets.

Paper
Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille.
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume 40, Issue 4, April 2018; first published online in 2017).
https://arxiv.org/abs/1606.00915


0. Experimental results

1. Effect of ASPP on PASCAL VOC 2012 val set performance (mean IOU) for the VGG-16 based DeepLab model.

LargeFOV: single branch, r = 12.
ASPP-S: four branches, r = {2, 4, 8, 12}.
ASPP-L: four branches, r = {6, 12, 18, 24}.
Combining multiple scales with large fields of view markedly improves segmentation quality.
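The ASPP-L configuration above can be sketched as parallel atrous branches over the same input, fused into one output. The following 1-D NumPy toy shows only that parallel-branch structure with rates {6, 12, 18, 24}; the real module uses 3x3 2-D atrous convolutions on a DCNN feature map, with learned per-branch weights:

```python
import numpy as np

def atrous_conv1d_same(x, w, rate):
    """'Same'-padded 1-D atrous convolution (zero padding at the borders)."""
    k = len(w)
    pad = (k - 1) * rate // 2
    xp = np.pad(x, pad)
    return np.array([sum(w[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def aspp_1d(x, branch_weights, rates=(6, 12, 18, 24)):
    """ASPP-style fusion: one atrous branch per sampling rate,
    all applied to the same input and fused by summation."""
    return sum(atrous_conv1d_same(x, w, r)
               for w, r in zip(branch_weights, rates))

x = np.ones(64)                           # a flat toy "feature map"
weights = [np.ones(3) for _ in range(4)]  # each branch: 3 weights, its own rate
y = aspp_1d(x, weights)                   # same length as the input
```

Each branch sees the same features at a different effective field of view, which is how ASPP captures context at multiple scales in a single forward pass.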

2. PASCAL VOC 2012 val results: input images and DeepLab results before/after the CRF.

3. Qualitative segmentation results with ASPP compared to the baseline LargeFOV model.
The ASPP-L model, with multiple large fields of view, successfully captures objects and image context at multiple scales.

4. Performance on the PASCAL VOC 2012 test set.
The system runs at 8 FPS on an NVidia Titan X GPU; fully connected CRF inference takes about 0.5 s per image on average. Runtime is essentially the same as DeepLab v1, but it reaches 79.7% mIOU on PASCAL VOC 2012.
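The 79.7% figure is a mean intersection-over-union (mIOU), the standard PASCAL VOC segmentation metric: per-class IOU averaged over classes. A minimal sketch of the computation, on made-up toy labels rather than the paper's data:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean IOU: per-class intersection/union over pixel label maps,
    averaged over classes that appear in either map."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:                     # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([0, 0, 1, 1, 2, 2])       # toy predicted labels
gt   = np.array([0, 1, 1, 1, 2, 0])       # toy ground truth
print(mean_iou(pred, gt, num_classes=3))  # -> 0.5
```

On PASCAL VOC the same idea is applied over all pixels of the test set and 21 classes (20 objects plus background).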


1. DeepLab v2 improvements

(1) Multi-scale feature extraction for better segmentation results

  • Objects occur at multiple scales. DeepLab v1 addressed this by combining multi-scale features with several MLPs, which improves performance but increases feature computation and storage.
  • Inspired by Spatial Pyramid Pooling (SPP), DeepLab v2 proposes an analogous structure: atrous convolutions with different sampling rates applied in parallel to the same input, which amounts to capturing image context at multiple scales. This is the ASPP (atrous spatial pyramid pooling) module.


DeepLab v2 architecture details

To be updated…


DeepLab v2 example applications

To be updated…

