DL: DeepLabv1, an Illustrated Guide to the Algorithm: Introduction (Paper Overview), Architecture Details, and Example Applications
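Contents
Introduction to the DeepLabv1 Algorithm (Paper Overview)
0. Experimental results
1. Limitations of FCN and improvements
Architecture of the DeepLabv1 Algorithm in Detail
Example Applications of the DeepLabv1 Algorithm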
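Related articles
DL: DeepLabv1, an Illustrated Guide to the Algorithm: Introduction (Paper Overview), Architecture Details, and Example Applications
DL: DeepLabv1, Architecture of the Algorithm in Detail
DL: DeepLab v2, an Illustrated Guide to the Algorithm: Introduction (Paper Overview), Architecture Details, and Example Applications
DL: DeepLab v2, Architecture of the Algorithm in Detail
DL: DeepLab v3 and DeepLab v3+, an Illustrated Guide to the Algorithms: Introduction (Paper Overview), Architecture Details, and Example Applications
DL: DeepLab v3 and DeepLab v3+, Architecture of the Algorithms in Detail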
Introduction to the DeepLabv1 Algorithm (Paper Overview)
The authors recognized the limitations of the FCN model and proposed DeepLabv1 as an improvement on it.
ABSTRACT
Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called "semantic image segmentation"). We show that responses at the final layer of DCNNs are not sufficiently localized for accurate object segmentation. This is due to the very invariance properties that make DCNNs good for high level tasks. We overcome this poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our "DeepLab" system is able to localize segment boundaries at a level of accuracy which is beyond previous methods. Quantitatively, our method sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 71.6% IOU accuracy in the test set. We show how these results can be obtained efficiently: Careful network re-purposing and a novel application of the 'hole' algorithm from the wavelet community allow dense computation of neural net responses at 8 frames per second on a modern GPU.
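The abstract reports its headline result as IOU (intersection over union) on PASCAL VOC-2012. As a rough illustration of how such a score can be computed from a predicted label map and a ground-truth label map, here is a minimal NumPy sketch. The function name `mean_iou`, the use of 255 as an ignored "void" label, and the choice to average only over classes that actually occur are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over classes present in pred or target.

    pred, target: integer label maps of the same shape.
    ignore_index: label value in `target` to skip (assumed void label).
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop pixels whose ground truth is marked as "ignore".
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)

    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection

    # Average only over classes that appear at least once.
    present = union > 0
    return float((intersection[present] / union[present]).mean())

if __name__ == "__main__":
    pred = [[0, 0], [1, 1]]
    target = [[0, 1], [1, 1]]
    print(mean_iou(pred, target, num_classes=2))
```

In the toy call, class 0 has intersection 1 and union 2, class 1 has intersection 2 and union 3, so the mean is the average of 1/2 and 2/3.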
DISCUSSION
Our work combines ideas from deep convolutional neural networks and fully-connected conditional random fields, yielding a novel method able to produce semantically accurate predictions and detailed segmentation maps, while being computationally efficient. Our experimental results show that the proposed method significantly advances the state-of-art in the challenging PASCAL VOC 2012 semantic image segmentation task.

There are multiple aspects in our model that we intend to refine, such as fully integrating its two main components (CNN and CRF) and train the whole system in an end-to-end fashion, similar to Krähenbühl & Koltun (2013); Chen et al. (2014); Zheng et al. (2015). We also plan to experiment with more datasets and apply our method to other sources of data such as depth maps or videos. Recently, we have pursued model training with weakly supervised annotations, in the form of bounding boxes or image-level labels (Papandreou et al., 2015).

At a higher level, our work lies in the intersection of convolutional neural networks and probabilistic graphical models. We plan to further investigate the interplay of these two powerful classes of methods and explore their synergistic potential for solving challenging computer vision tasks.
Paper
Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille.
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, ICLR, 2015.
https://arxiv.org/abs/1412.7062
0. Experimental results
1. The DCNN runs at 8 FPS on a Titan GPU; fully connected CRF inference takes 0.5 s on average.
2. Comparisons with state-of-the-art models on the val set
First row: images.
Second row: ground truths.
Third row: other recent models (left: FCN-8s, right: TTI-Zoomout-16).
Fourth row: our DeepLab-CRF.
3. Visualization results on VOC 2012 val
For each row, we show the input image, the segmentation result delivered by the DCNN (DeepLab), and the refined segmentation result of the Fully Connected CRF (DeepLab-CRF).
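To give a feel for how a fully connected CRF can refine a DCNN score map, here is a naive mean-field sketch in NumPy. It assumes the commonly used pairwise form with a Potts label compatibility, an appearance kernel (position and colour) and a smoothness kernel (position only). The function name `dense_crf_mean_field` and every parameter value below are illustrative assumptions, not the settings used in the paper. The sketch also builds the full N x N kernel matrix explicitly, so it is only usable on tiny images, whereas practical implementations rely on fast approximate filtering.

```python
import numpy as np

def dense_crf_mean_field(probs, image, w1=3.0, w2=3.0,
                         sigma_alpha=3.0, sigma_beta=0.1, sigma_gamma=1.0,
                         n_iters=5):
    """Naive mean-field inference for a fully connected CRF (Potts model).

    probs: (H, W, L) per-pixel class probabilities from the network.
    image: (H, W) or (H, W, C) float image, assumed scaled to [0, 1].
    All weights and bandwidths are hypothetical illustration values.
    """
    H, W, L = probs.shape
    N = H * W

    # Pixel coordinates and colour features, one row per pixel.
    ys, xs = np.mgrid[0:H, 0:W]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    col = np.asarray(image, dtype=float).reshape(N, -1)

    # Squared distances between every pair of pixels: O(N^2) memory.
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    d_col = ((col[:, None, :] - col[None, :, :]) ** 2).sum(axis=-1)

    # Appearance kernel plus smoothness kernel.
    K = (w1 * np.exp(-d_pos / (2 * sigma_alpha ** 2)
                     - d_col / (2 * sigma_beta ** 2))
         + w2 * np.exp(-d_pos / (2 * sigma_gamma ** 2)))
    np.fill_diagonal(K, 0.0)  # a pixel sends no message to itself

    unary = -np.log(np.clip(probs.reshape(N, L), 1e-8, 1.0))
    Q = probs.reshape(N, L).copy()

    for _ in range(n_iters):
        # With a Potts compatibility, agreeing neighbours raise the score
        # of a label by the kernel-weighted sum of their beliefs.
        logits = -unary + K @ Q
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        Q = np.exp(logits)
        Q /= Q.sum(axis=1, keepdims=True)

    return Q.reshape(H, W, L)

if __name__ == "__main__":
    image = np.zeros((4, 4, 1))
    image[:, 2:] = 1.0                      # dark left half, bright right half
    probs = np.zeros((4, 4, 2))
    probs[:, :2] = [0.8, 0.2]
    probs[:, 2:] = [0.2, 0.8]
    probs[1, 1] = [0.4, 0.6]                # one noisy pixel in the left half
    print(dense_crf_mean_field(probs, image).argmax(axis=-1))
```

In the toy input, the single mislabelled pixel sits inside a region of uniform colour, so its neighbours pull it back to the surrounding label while the boundary between the two halves stays where the colour changes.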
Failure modes
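1. Limitations of FCN and improvements
1. Limitations of FCN
- Pooling layers enlarge the receptive field of neurons and improve classification accuracy, but they reduce feature map resolution.
- Upsampling by too large a factor blurs FCN's segmentation boundaries.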
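The resolution loss described above can be made concrete with a small calculation. The sketch below assumes five stride-2 pooling stages, as in standard VGG-16, and ceiling rounding at each stage. The helper names and the second stride list, a hypothetical variant in which the last two stages keep stride 1, are illustrative assumptions rather than details taken from the text.

```python
def feature_map_size(input_size, strides):
    """Spatial size after a chain of downsampling stages (ceiling rounding)."""
    size = input_size
    for s in strides:
        size = -(-size // s)  # ceiling division
    return size

def output_stride(strides):
    """Overall downsampling factor: the product of the per-stage strides."""
    total = 1
    for s in strides:
        total *= s
    return total

# Five stride-2 pooling stages, as in standard VGG-16.
vgg16_pool_strides = [2, 2, 2, 2, 2]
# Hypothetical variant: the last two pooling stages no longer downsample.
reduced_pool_strides = [2, 2, 2, 1, 1]

if __name__ == "__main__":
    for name, strides in [("vgg16", vgg16_pool_strides),
                          ("reduced", reduced_pool_strides)]:
        print(name, output_stride(strides), feature_map_size(512, strides))
```

Under these assumptions, a 512-pixel side shrinks to 16 with all five stages, so the score map must be upsampled by a factor of 32 to return to input resolution. The variant stops at 64, which needs only a factor of 8.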
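2. Improvements over FCN
- Still based on VGG-16
- Removes some of the pooling layers (which shrinks the receptive field)
- Fine-tunes the new network starting from the pretrained VGG-16
- Replaces standard convolutions with atrous (dilated) convolutions, which enlarge the receptive field while increasing feature map resolution
- Uses a fully connected conditional random field to sharpen segmentation boundaries
- Uses multi-scale features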
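The atrous (dilated) convolution mentioned in the list can be illustrated in one dimension. The sketch below is a minimal NumPy version, not the network's actual implementation: `atrous_conv1d` and `insert_holes` are hypothetical helper names, and the operation is written as an unflipped "valid" correlation, which is how convolution layers are usually defined in deep learning frameworks.

```python
import numpy as np

def atrous_conv1d(x, w, rate=1):
    """'Valid' 1-D correlation with holes: y[i] = sum_k x[i + rate*k] * w[k]."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    span = rate * (len(w) - 1) + 1      # input extent covered by the kernel
    n_out = len(x) - span + 1
    if n_out <= 0:
        raise ValueError("input is shorter than the dilated kernel")
    y = np.zeros(n_out)
    for k, wk in enumerate(w):
        # Each tap reads the input at an offset of rate * k.
        y += wk * x[rate * k: rate * k + n_out]
    return y

def insert_holes(w, rate):
    """Equivalent dense kernel: (rate - 1) zeros between neighbouring taps."""
    w = np.asarray(w, dtype=float)
    dense = np.zeros(rate * (len(w) - 1) + 1)
    dense[::rate] = w
    return dense

if __name__ == "__main__":
    x = np.arange(10)
    w = [1.0, 0.0, -1.0]
    print(atrous_conv1d(x, w, rate=2))
    print(atrous_conv1d(x, insert_holes(w, 2), rate=1))
```

With `rate=2`, the 3-tap kernel covers 5 input samples instead of 3 while still using only 3 weights. This is the sense in which holes enlarge the receptive field without adding parameters or downsampling the signal.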
Architecture of the DeepLabv1 Algorithm in Detail
To be updated...
Example Applications of the DeepLabv1 Algorithm
To be updated...