A Survey of Image Segmentation [Deep Learning Methods]

Published: 2025/3/16

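CNN-based semantic image segmentation basically follows this recipe:

Downsampling + upsampling: Convolution + Deconvolution / Resize

Multi-scale feature fusion: element-wise addition of features / concatenation along the channel dimension

Pixel-level segment map: predict a class for every pixel

Even the more complex DeepLab v3+ still follows this basic recipe.

Figure 13: DeepLab v3+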

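Image Segmentation: comparison of network architectures

Network | Parent | Year | Added structures | Dropped structures | Strengths | Weaknesses
VGG16 | (inspiration for FCN) | - | - | - | - | -
FCN | VGG16 | 2014 | one Deconv layer (new) | all fc layers | simple | coarse
DeconvNet | FCN | 2015 | Unpooling layers (new), multiple Deconv layers (more layers), fc layers (new) | - | - | -
SegNet | DeconvNet | 2016 | max indices of every max_pooling | all fc layers | - | -
DeepLab | FCN | - | - | - | - | -
PSPNet | - | - | - | - | - | -
Mask-RCNN | - | 2017 | - | - | truly pixel-level | -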

Image Segmentation family tree

FCN

DeepLab
DeconvNet
- SegNet
PSPNet
Mask-RCNN

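Categories by segmentation goal

Ordinary segmentation

Separate the pixel regions that belong to different objects.
For example, separate the foreground from the background, or separate the dog region, the cat region, and the background.
Semantic segmentation

On top of ordinary segmentation, classify the semantics of each region (that is, what object the region is).
For example, assign a class to every object in the image.
Instance segmentation

On top of semantic segmentation, give every object its own number.
For example, this one is dog A in the image and that one is dog B.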

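Recommended papers:

Semantic segmentation is a very important task in computer vision. Its goal is to classify every pixel of an image. If image segmentation could be done quickly and accurately, many problems would become easy to solve. Its applications therefore include, but are not limited to, autonomous driving, image beautification, and 3D reconstruction.

Semantic segmentation is a very hard problem, and it was especially hard before deep learning. Deep learning has raised segmentation accuracy considerably; below we summarize the most representative methods and papers of recent years.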

Fully Convolutional Networks (FCN)

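The first paper we introduce is Fully Convolutional Networks for Semantic Segmentation, FCN for short. It is the first paper to successfully use deep learning for semantic image segmentation. It makes two main contributions:

It proposes the fully convolutional network. The fully connected layers are replaced with convolutional layers, so the network accepts images of any size and outputs a segmentation map the same size as the input. Only then can every pixel be classified.

It uses a deconvolution layer (Deconvolution). The feature maps of a classification network are usually only a fraction of the size of the input image. To map them back to the input size the feature maps must be upsampled, which is what the deconvolution layer does. Despite its name, it is not the inverse of convolution; a more fitting name is transposed convolution (Transposed Convolution), and its job is to produce a large feature map from a small one.

This is the seminal work on semantic segmentation with neural networks and should be understood thoroughly.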

DeepLab

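DeepLab comes in v1, v2, and v3. The v1 paper is titled Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, and the v2 paper is titled DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. This series of papers introduced the following important methods:

The first is convolution with holes, known in English as Dilated Convolution or Atrous Convolution. It is simply an ordinary convolution kernel with holes inserted between its elements, as shown in the figure below.

Its computational cost is the same as that of an ordinary convolution, and the benefit is a larger "field of view": an ordinary 3x3 convolution sees a 3x3 region, while after inserting one hole it sees a 5x5 region. With a larger field of view, a feature map downsampled by the same factor captures more global information about the image, which matters a great deal in semantic segmentation.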
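The 3x3 versus 5x5 claim can be checked with a few lines of Python. This is a minimal 1-D sketch rather than code from the DeepLab papers: the function names are hypothetical, and it assumes that "one hole" means a dilation rate of 2, i.e. one skipped position between neighbouring kernel taps.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are `dilation` positions apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D dilated convolution over a list of numbers."""
    span = effective_kernel_size(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Each output still uses len(kernel) multiplications,
        # no matter how large the dilation is.
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return outputs


print(effective_kernel_size(3, 1))  # ordinary 3-tap kernel
print(effective_kernel_size(3, 2))  # same kernel with one hole between taps
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], dilation=2))
```

The number of multiplications per output is unchanged; only the distance between the sampled inputs grows, which is where the wider field of view comes from.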

Pyramid Scene Parsing Network

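The core contribution of Pyramid Scene Parsing Network is Global Pyramid Pooling. It pools the feature map to several different sizes so that the features carry better global and multi-scale information, which is very useful for improving accuracy.

In fact, pyramid multi-scale features are useful for all kinds of vision problems, not just semantic segmentation.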
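The pooling idea can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the PSPNet implementation: the bin sizes (1, 2, 3, 6) are an assumption not given in the text above, the function names are hypothetical, nearest-neighbour upsampling stands in for whatever interpolation a real network would use, and any per-branch convolution is left out.

```python
import numpy as np


def adaptive_avg_pool(feat, bins):
    """Average-pool a (C, H, W) array down to (C, bins, bins)."""
    channels, height, width = feat.shape
    out = np.zeros((channels, bins, bins), dtype=float)
    for i in range(bins):
        r0 = (i * height) // bins
        r1 = ((i + 1) * height + bins - 1) // bins  # ceiling division
        for j in range(bins):
            c0 = (j * width) // bins
            c1 = ((j + 1) * width + bins - 1) // bins
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out


def upsample_nearest(feat, height, width):
    """Nearest-neighbour resize of a (C, h, w) array to (C, height, width)."""
    _, fh, fw = feat.shape
    rows = (np.arange(height) * fh) // height
    cols = (np.arange(width) * fw) // width
    return feat[:, rows][:, :, cols]


def pyramid_pooling(feat, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the input with pooled-and-upsampled copies of itself."""
    _, height, width = feat.shape
    branches = [feat]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(feat, bins)
        branches.append(upsample_nearest(pooled, height, width))
    return np.concatenate(branches, axis=0)


features = np.arange(2 * 6 * 6, dtype=float).reshape(2, 6, 6)
print(pyramid_pooling(features).shape)
```

The branch pooled to a single bin carries the global average of each channel to every pixel, while the finer bins keep progressively more local context.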

Mask R-CNN

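Mask R-CNN is a major work by Kaiming He that handles Object Detection and Semantic Segmentation together. Its main contributions are the following.

First, the network has multiple output branches. Mask R-CNN uses a framework similar to Faster R-CNN. Faster R-CNN outputs the bounding box and class of each object, while Mask R-CNN adds one more branch that predicts the object's segmentation mask. In other words, the network learns two tasks at once, and the two can reinforce each other.

Second, it uses a Binary Mask for segmentation. Conventional semantic segmentation predicts classes using numbers such as 0 1 2 3 4 to stand for each class. In Mask R-CNN the detection branch already predicts the class, so the segmentation branch only needs 0 and 1 to predict the shape mask of the object.

Third, Mask R-CNN proposes RoiAlign to replace the RoiPooling used in Faster R-CNN. The idea of RoiPooling is to map an arbitrary region of the input image to the corresponding region of the network's feature map. RoiPooling finds this region by rounding, so the mapping is offset from the true location. The offset is tolerable for classification, but it hurts segmentation, which needs finer precision.

To solve this, RoiAlign drops the rounding and uses linear interpolation to find a more accurate corresponding region. The result is better alignment, and the experiments confirm that it works well. A comparison with earlier methods is shown below; the lower image is Mask R-CNN, and it is visibly much finer.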
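The difference between rounding and interpolating can be illustrated with a single sample point. This is only a sketch of the sampling step under simplifying assumptions: the function names are hypothetical, the feature map is a plain list of lists, and a real RoiAlign layer samples several points per output bin and aggregates them, which is omitted here.

```python
import math


def sample_rounded(feature, y, x):
    """RoiPooling-style lookup: snap the coordinate to the nearest cell."""
    return feature[int(round(y))][int(round(x))]


def sample_bilinear(feature, y, x):
    """RoiAlign-style lookup: bilinear interpolation at a fractional coordinate."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(feature) - 1)
    x1 = min(x0 + 1, len(feature[0]) - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


feature_map = [[0.0, 10.0],
               [20.0, 30.0]]
print(sample_rounded(feature_map, 0.25, 0.75))
print(sample_bilinear(feature_map, 0.25, 0.75))
```

The rounded lookup returns the value of one neighbouring cell, while the interpolated lookup blends all four neighbours according to how close the sample point is to each.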

U-Net

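Official U-Net page by the original authors

U-Net is a segmentation network that its authors proposed for the ISBI Challenge; it copes with very small training sets (about 30 images). U-Net and FCN are both small segmentation networks: neither uses dilated convolution, and neither is followed by a CRF, so the structure is simple.

Figure 9: U-Net architecture

The whole U-Net architecture is shown in Figure 9 and looks like a large letter U: first Conv + Pooling for downsampling; then Deconv for upsampling, where the earlier low-level feature map is cropped and fused in; then upsampling again. This repeats until a 388x388x2 feature map is obtained, and finally a softmax yields the output segment map. Overall the idea is very similar to FCN.

Why bring up U-Net? Because U-Net fuses features in a completely different way from FCN: concatenation!

Figure 10: U-Net concat feature fusion

Unlike FCN's element-wise addition, U-Net concatenates features along the channel dimension to form "thicker" features. So:

Semantic segmentation networks have two ways to fuse features:

FCN-style element-wise addition, corresponding to Caffe's EltwiseLayer and TensorFlow's tf.add()

U-Net-style concatenation along the channel dimension, corresponding to Caffe's ConcatLayer and TensorFlow's tf.concat()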
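The two fusion styles differ mainly in what happens to the channel count. A small NumPy sketch makes this concrete; the function names are hypothetical and a channels-first (C, H, W) layout is assumed.

```python
import numpy as np


def fuse_add(low_level, upsampled):
    """FCN-style fusion: element-wise sum, shapes must match exactly."""
    if low_level.shape != upsampled.shape:
        raise ValueError("element-wise addition needs identical shapes")
    return low_level + upsampled


def fuse_concat(low_level, upsampled):
    """U-Net-style fusion: stack along the channel axis (axis 0 here)."""
    if low_level.shape[1:] != upsampled.shape[1:]:
        raise ValueError("concatenation needs identical spatial sizes")
    return np.concatenate([low_level, upsampled], axis=0)


skip = np.ones((4, 8, 8))          # low-level feature map, 4 channels
decoder = 2.0 * np.ones((4, 8, 8))  # upsampled feature map, 4 channels

print(fuse_add(skip, decoder).shape)     # channel count unchanged
print(fuse_concat(skip, decoder).shape)  # channel counts add up
```

Addition keeps the feature map the same thickness and requires equal channel counts, whereas concatenation keeps both sets of features intact and leaves the mixing to the layers that follow.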

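Overview

Semantic image segmentation, put simply, means: given an image, classify every pixel in it.

Visually, we need to turn a real scene image into a segmentation map like the one below:

Different colors stand for different classes. After reading "a lot of" papers and checking the PASCAL VOC Challenge performance evaluation server, I found that from the moment deep learning was introduced to this task (FCN) until now, a general framework has more or less been settled. Namely:

FCN: fully convolutional network
CRF: conditional random field
MRF: Markov random field

The front end uses an FCN for coarse feature extraction, the back end uses a CRF/MRF to refine the front end's output, and the result is the segmentation map.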

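Front end

Why do we need FCN?

The networks we use for classification usually end with several fully connected layers, which flatten the original two-dimensional matrix (the image) into one dimension and thereby lose the spatial information; training finally outputs a scalar, which is our class label.

The output of semantic segmentation, on the other hand, has to be a segmentation map, which is at least two-dimensional whatever its size. So we need to discard the fully connected layers and replace them with convolutional layers, and that gives a fully convolutional network. For the precise definition, see the paper: Fully Convolutional Networks for Semantic Segmentation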

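Front-end architectures

FCN

FCN here refers specifically to the architecture proposed in the paper Fully Convolutional Networks for Semantic Segmentation, not to fully convolutional networks in the broad sense.

The authors' FCN mainly uses three techniques:

Convolutionalization (Convolutional)
Upsampling (Upsample)
Skip architecture (Skip Layer)

Convolutionalization

Convolutionalization means taking an ordinary classification network, such as VGG16 or ResNet50/101, discarding its fully connected layers, and putting the corresponding convolutional layers in their place.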
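Why a fully connected layer has a "corresponding" convolutional layer can be shown directly: reshaping the fc weight matrix into a kernel that covers the whole input gives the same numbers, and the same kernel slides over larger inputs to produce a 2-D score map. The sketch below is a plain NumPy illustration with hypothetical names and made-up sizes, not code from the FCN paper.

```python
import numpy as np


def fc_as_conv(feature, weight):
    """Valid convolution (cross-correlation) of a (C, H, W) input with
    weights of shape (num_classes, C, kh, kw)."""
    num_classes, _, kh, kw = weight.shape
    _, height, width = feature.shape
    out = np.zeros((num_classes, height - kh + 1, width - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = feature[:, i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(weight, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
fc_weight = rng.normal(size=(3, 2 * 4 * 4))   # fc layer: 32 inputs -> 3 classes
conv_weight = fc_weight.reshape(3, 2, 4, 4)   # same numbers, viewed as a kernel

small = rng.normal(size=(2, 4, 4))            # the size the fc layer expects
large = rng.normal(size=(2, 6, 7))            # a larger input

print(np.allclose(fc_weight @ small.reshape(-1), fc_as_conv(small, conv_weight)[:, 0, 0]))
print(fc_as_conv(large, conv_weight).shape)   # a spatial map of class scores
```

On the original input size the convolution reproduces the fc output as a 1x1 map; on a larger input it yields one score vector per location, which is the coarse segmentation map that the later upsampling step enlarges.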

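Upsampling

Upsampling here means deconvolution (Deconvolution). The name differs between frameworks: Caffe and Keras call it Deconvolution, while TensorFlow calls it conv_transpose. The CS231n course says conv_transpose is the more fitting name.

As is well known, ordinary pooling (why "ordinary" is explained later) shrinks the image; for example, after the five poolings of VGG16 the image is 32 times smaller. To get a segmentation map as large as the original image, we need upsampling/deconvolution.

Deconvolution is similar to convolution: both are multiply-and-add operations. The difference is that convolution is many-to-one while deconvolution is one-to-many. The forward and backward passes of deconvolution are obtained simply by swapping the forward and backward passes of convolution, so neither optimization nor backpropagation is a problem. An illustration follows: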
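The "one-to-many" description can be made concrete in one dimension: every input value is multiplied by the whole kernel and scattered into a window of the output, and overlapping windows are summed. This is a minimal pure-Python sketch with a hypothetical function name, without padding or bias.

```python
def conv_transpose1d(signal, kernel, stride=1):
    """1-D transposed convolution: each input scatters into len(kernel) outputs."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            # One input value contributes to several output positions.
            out[i * stride + k] += value * weight
    return out


# Two inputs become five outputs with stride 2 and a 3-tap kernel;
# the middle position receives contributions from both inputs.
print(conv_transpose1d([1, 2], [1, 1, 1], stride=2))
```

The output length, (len(signal) - 1) * stride + len(kernel), is what lets a small feature map grow back towards the input resolution.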

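However, although the paper describes the deconvolution as learnable, the authors' actual code does not let it learn, possibly because of this one-to-many relationship. The code is as follows:

layer {
  name: "upscore"
  type: "Deconvolution"
  bottom: "score_fr"
  top: "upscore"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 21
    bias_term: false
    kernel_size: 64
    stride: 32
  }
}

As you can see, lr_mult is set to 0.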
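With the learning rate multiplier at 0, the layer acts as a fixed upsampling filter whose weights never change after initialization. A common choice for such a filter is bilinear interpolation. The sketch below builds a bilinear kernel and computes the output size for the kernel_size and stride shown above; the helper names are hypothetical, and the assumption that the fixed weights are bilinear is mine rather than something the text above states.

```python
import numpy as np


def deconv_output_size(input_size, kernel_size, stride, pad=0):
    """Spatial output size of a transposed convolution without output padding."""
    return stride * (input_size - 1) + kernel_size - 2 * pad


def bilinear_kernel(size):
    """2-D bilinear interpolation weights for a square kernel of `size`."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return (1 - abs(rows - center) / factor) * (1 - abs(cols - center) / factor)


# A 10x10 score map passed through kernel_size=64, stride=32.
print(deconv_output_size(10, kernel_size=64, stride=32))

# A small kernel is easier to inspect than the 64x64 one.
print(bilinear_kernel(4))
```

A 64x64 bilinear kernel with stride 32 upsamples by a factor of 32, which matches the 32x reduction produced by the five pooling layers.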

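Skip architecture

(The Chinese name used for this in the original post is the author's own translation; the structure is usually known as skip connections.) The purpose of this structure is to refine the result: directly upsampling the output of the fully convolutional layers gives a very coarse result, so the authors also upsample the outputs of different pooling layers and use them to refine the output. The structure is as follows:

A comparison of the results from the different upsampling structures:

Of course, you could also upsample the outputs of pool1 and pool2. However, the authors report that this brings little improvement.

This is the first architecture and the seminal work applying deep learning to semantic image segmentation, which earned it a best paper honorable mention at CVPR 2015. Still, some parts are handled rather crudely, as the comparison with later work will show.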

SegNet/DeconvNet

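I include these architectures here mainly because I find them structurally elegant; their results are not necessarily better than the previous one.

SegNet

DeconvNet

This symmetric structure feels rather like an autoencoder: encode first, then decode. It mainly relies on deconvolution and unpooling. Namely:

Deconvolution is as described above. Unpooling works by remembering, during pooling, the position of each output value; during unpooling the value is put back at that position and every other position is filled with 0.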
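The remember-then-restore step can be written out in NumPy. This is a sketch with hypothetical function names, assuming a single-channel map and non-overlapping 2x2 windows; it is not taken from the SegNet or DeconvNet code.

```python
import numpy as np


def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records where each max came from."""
    height, width = x.shape
    out_h, out_w = height // size, width // size
    pooled = np.zeros((out_h, out_w), dtype=x.dtype)
    indices = np.zeros((out_h, out_w), dtype=np.int64)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            r, c = np.unravel_index(np.argmax(window), window.shape)
            pooled[i, j] = window[r, c]
            # Flat index of the max inside the original map.
            indices[i, j] = (i * size + r) * width + (j * size + c)
    return pooled, indices


def max_unpool(pooled, indices, shape):
    """Put each pooled value back at its recorded position, zeros elsewhere."""
    flat = np.zeros(shape[0] * shape[1], dtype=pooled.dtype)
    flat[indices.reshape(-1)] = pooled.reshape(-1)
    return flat.reshape(shape)


x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [0, 0, 7, 0],
              [6, 0, 0, 8]])
pooled, indices = max_pool_with_indices(x)
print(pooled)
print(max_unpool(pooled, indices, x.shape))
```

The unpooled map is sparse: only the positions that won the max survive, which is why such decoders follow unpooling with convolution or deconvolution layers to fill in the rest.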

DeepLab

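Next comes a mature and elegant architecture, so much so that many current improvements are built on it.

First, let us point out one crude aspect of the first architecture, FCN: to keep the later output from becoming too small, the FCN authors pad the original image by 100 in the very first layer, which, as you can imagine, introduces noise.

How can we keep the output from becoming too small without resorting to a padding of 100? Some might suggest simply using fewer pooling layers. That works in theory, but it changes the existing architecture, and most importantly the pretrained parameters of that architecture can no longer be used for fine-tuning. So DeepLab takes a very elegant approach: change the stride of the pooling to 1 and add a padding of 1. The pooled feature map then does not shrink, and the pooling still aggregates features as before.

But that is not the end of the story. Because the pooling layers changed, the receptive field of the following convolutions changes too, so fine-tuning is again not possible. DeepLab therefore uses a different kind of convolution, convolution with holes: Atrous Convolution. Namely:

The receptive field changes as follows:

a is the result of ordinary pooling, and b is the result of the "elegant" pooling. Suppose we apply an ordinary convolution with kernel size 3 on a: the receptive field is 7. Doing the same on b gives a receptive field of 5, so the receptive field has shrunk. But with an Atrous Convolution with hole 1, the receptive field is 7 again.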
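The numbers 7, 5, and 7 can be reproduced with the usual receptive-field recurrence. The sketch below assumes a pooling kernel of size 3 (stride 2 in case a, stride 1 in case b) and reads "hole 1" as a dilation rate of 2; both are assumptions chosen because they match the stated numbers, and the function name is hypothetical.

```python
def receptive_field(layers):
    """Receptive field of a stack of layers, listed from input to output.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    field, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        effective = kernel_size + (kernel_size - 1) * (dilation - 1)
        field += (effective - 1) * jump
        jump *= stride
    return field


case_a = [(3, 2, 1), (3, 1, 1)]       # ordinary pooling (stride 2), then 3-tap conv
case_b = [(3, 1, 1), (3, 1, 1)]       # stride-1 pooling, then the same conv
case_atrous = [(3, 1, 1), (3, 1, 2)]  # stride-1 pooling, then conv with one hole

print(receptive_field(case_a), receptive_field(case_b), receptive_field(case_atrous))
```

Removing the stride halves the spacing between neighbouring outputs, and the dilation doubles the spacing between kernel taps, so the two changes cancel out.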

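So Atrous Convolution keeps the receptive field unchanged after this kind of pooling, which makes fine-tuning possible, and it also gives a finer output. Namely:

Summary

Three architectures were introduced here: FCN, SegNet/DeconvNet, and DeepLab. There are of course other approaches, such as RNN-based methods and the more practically relevant weakly-supervised methods.

Back end

Finally, the back end. This part covers several random fields and involves some math. My own understanding is not particularly deep, so criticism is welcome.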

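Fully connected conditional random field (DenseCRF)

Each pixel i has a class label x_i and a corresponding observation y_i. Taking every pixel as a node and the relationships between pixels as edges gives a conditional random field, and we infer the class label x_i of pixel i from the observed variable y_i. The conditional random field is as follows:

The conditional random field follows a Gibbs distribution (here I is the observation mentioned above):

P(X = x | I) = \frac{1}{Z(I)} \exp(-E(x | I))

where E(x | I) is the energy function. For brevity, the global observation I is omitted below:

E(x) = \sum_i \psi_u(x_i) + \sum_{i<j} \psi_p(x_i, x_j)

The unary potential \sum_i \psi_u(x_i) comes from the output of the front-end FCN. The pairwise potential is:

\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_{m=1}^{M} w^{(m)} k^{(m)}(f_i, f_j)

The pairwise potential describes the relationship between pixels: it encourages similar pixels to take the same label and pixels that differ a lot to take different labels, where this "distance" is defined in terms of color values and actual spatial distance. This is why a CRF tends to place segment boundaries along image edges.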
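To make the color-and-distance "similarity" concrete, the sketch below evaluates a pairwise term for two pixels using the two-Gaussian form commonly used with DenseCRF (an appearance kernel on position and color plus a smoothness kernel on position only) together with a Potts label compatibility. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def squared_distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def pairwise_kernel(pos_i, pos_j, color_i, color_j,
                    w_appearance=1.0, w_smooth=1.0,
                    theta_alpha=10.0, theta_beta=20.0, theta_gamma=3.0):
    """Similarity of two pixels based on position and color."""
    d_pos = squared_distance(pos_i, pos_j)
    d_col = squared_distance(color_i, color_j)
    appearance = w_appearance * math.exp(
        -d_pos / (2 * theta_alpha ** 2) - d_col / (2 * theta_beta ** 2))
    smoothness = w_smooth * math.exp(-d_pos / (2 * theta_gamma ** 2))
    return appearance + smoothness


def pairwise_potential(label_i, label_j, *pixel_args, **kernel_kwargs):
    """Potts model: similar pixels pay a penalty only if their labels differ."""
    if label_i == label_j:
        return 0.0
    return pairwise_kernel(*pixel_args, **kernel_kwargs)


red, also_red, blue = (200, 10, 10), (190, 15, 12), (10, 10, 200)
print(pairwise_potential(0, 1, (5, 5), (5, 6), red, also_red))  # similar neighbours
print(pairwise_potential(0, 1, (5, 5), (5, 6), red, blue))      # neighbours across an edge
```

Giving different labels to two nearby pixels of similar color costs more than doing so across a strong color edge, which is what pushes label boundaries towards image boundaries.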

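What sets the fully connected CRF apart is that its pairwise potential describes the relationship between every pixel and all other pixels, hence "fully connected".

Make of this pile of formulas what you will... Computing them directly is rather cumbersome (I think so too), so the mean-field approximation is generally used. The mean-field approximation is yet another pile of formulas, which I will not give here (I suspect you would rather not read them anyway); interested readers can go straight to the paper.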

CRFasRNN

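At first, DenseCRF was simply appended to the output of the FCN, which is clearly rather crude. In deep learning we want end-to-end systems, so the CRFasRNN paper truly integrates DenseCRF into the FCN.

This paper also uses the mean-field approximation. Because every step of the decomposition consists of multiply-and-add computations plus ordinary additions and subtractions (see the paper for the formulas), each step can conveniently be expressed as a convolution-like layer. It can thus be integrated into the neural network, and forward and backward propagation pose no problem.

The authors also iterate it; different numbers of iterations give different degrees of refinement (usually fewer than 10 iterations are used), which is why the paper says "as RNN". The refined results are shown below: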
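The repeated multiply-add structure is easy to see in a toy mean-field loop. The following NumPy sketch is a simplified illustration, not the CRFasRNN layer: it uses a dense kernel matrix on a 5-pixel 1-D "image", a fixed Potts compatibility, and no learned filter weights; all names and numbers are hypothetical.

```python
import numpy as np


def softmax(scores):
    scores = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)


def mean_field(unary, kernel, compatibility, num_iterations=5):
    """unary: (N, L) label costs; kernel: (N, N) pixel similarities with a
    zero diagonal; compatibility: (L, L) cost of pairing two labels."""
    q = softmax(-unary)                      # initialization
    for _ in range(num_iterations):          # the same step repeated: "as RNN"
        message = kernel @ q                 # message passing (filtering)
        pairwise = message @ compatibility   # compatibility transform
        q = softmax(-unary - pairwise)       # add unary term and normalize
    return q


# Pixel 2 weakly prefers label 1, while its neighbours clearly prefer label 0.
unary = np.array([[0.0, 2.0], [0.0, 2.0], [0.6, 0.4], [0.0, 2.0], [0.0, 2.0]])
positions = np.arange(5)
kernel = np.exp(-0.5 * (positions[:, None] - positions[None, :]) ** 2)
np.fill_diagonal(kernel, 0.0)
potts = 1.0 - np.eye(2)

print(softmax(-unary).argmax(axis=1))                  # labels before refinement
print(mean_field(unary, kernel, potts).argmax(axis=1))  # labels after refinement
```

Each iteration is two matrix products and a softmax, all differentiable, which is what allows the loop to be unrolled into network layers and trained together with the FCN.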

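Markov random field (MRF)

Deep Parsing Network uses an MRF. Its formulation is similar to that of the CRF, except that the authors modify the pairwise potential:

Here the term the authors add is the label context. The original compatibility term only captures how often two labels occur together, whereas the label context can penalize particular configurations: a person may well be next to a table, but is much less likely to be under it. This term can therefore learn the probability of different configurations. Likewise, the original distance term only relates two pixels; the authors add a triple penalty that also brings in nearby pixels, and describing this three-way relationship gives a richer local context. The structure is as follows:

The advantages of this structure are:

Mean field is formulated as a CNN
Joint training with one-pass inference, no iteration needed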

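Gaussian conditional random field (G-CRF)

This structure uses CNNs to learn the unary and pairwise potentials separately. This is the kind of structure we prefer:

The energy function here also differs from the earlier ones:

E(x) = \frac{1}{2} x^T (A + \lambda I) x - B x

When (A + \lambda I) is symmetric positive definite, minimizing E(x) is equivalent to solving:

(A + \lambda I) x = B

The advantages of G-CRF are:

The quadratic energy has a well-defined global minimum
Solving a linear system is much simpler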
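Both advantages can be seen on a tiny example. The sketch below assumes a quadratic energy of the form E(x) = 1/2 x^T (A + lambda I) x - B x with a symmetric positive definite system matrix; the symbol names, the function names, and the numbers are assumptions made for illustration, and a real G-CRF would obtain A and B from CNN outputs.

```python
import numpy as np


def gcrf_energy(x, a, b, lam):
    """Quadratic energy 1/2 x^T (A + lam I) x - b^T x."""
    system = a + lam * np.eye(len(b))
    return 0.5 * x @ system @ x - b @ x


def gcrf_inference(a, b, lam):
    """Exact minimizer: solve the linear system (A + lam I) x = b."""
    return np.linalg.solve(a + lam * np.eye(len(b)), b)


a = np.array([[2.0, -1.0],
              [-1.0, 2.0]])   # pairwise terms (symmetric)
b = np.array([1.0, 0.0])      # unary terms
lam = 1.0                     # keeps the system matrix positive definite

x_star = gcrf_inference(a, b, lam)
print(x_star, gcrf_energy(x_star, a, b, lam))
```

Setting the gradient (A + lambda I) x - B to zero gives the linear system directly, so inference is a single exact solve rather than an iterative approximation such as mean field.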

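Reflections

FCN is more like a technique. It keeps improving as the base networks (such as VGG and ResNet) get better.
Deep learning + probabilistic graphical models (PGM) is a trend. Put bluntly, DL does feature extraction, while PGMs give a good mathematical account of how things are related.
Turning probabilistic graphical models into networks. PGMs are usually awkward to add to DL models; once a PGM is expressed as network layers, its parameters can be learned automatically and the whole becomes an end-to-end system.

That's a wrap!

References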

[1]Fully Convolutional Networks for Semantic Segmentation

[2]Learning Deconvolution Network for Semantic Segmentation

[3]Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

[4]Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

[5]Conditional Random Fields as Recurrent Neural Networks

[6]DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

[7]Semantic Image Segmentation via Deep Parsing Network

[8]Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

[9]SegNet

Major resources for image segmentation:

Getting started

A 2017 Guide to Semantic Segmentation with Deep Learning (an overview of semantic segmentation with deep learning)

[http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review]
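Chinese translation: [http://simonduan.site/2017/07/23/notes-semantic-segmentation-deep-learning-review/]

From Fully Convolutional Networks to Large Kernels: A Complete Guide to Semantic Segmentation with Deep Learning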

[https://www.jiqizhixin.com/articles/2017-07-14-10]

Fully Convolutional Networks

[http://simtalk.cn/2016/11/01/Fully-Convolutional-Networks/]

Deep Learning Methods for Semantic Segmentation Explained: From FCN and SegNet to Each Generation of DeepLab

[https://zhuanlan.zhihu.com/p/27794982]

Semantic Image Segmentation: FCN and CRF

[https://zhuanlan.zhihu.com/p/22308032]

From Tesla to Computer Vision: Semantic Image Segmentation

[http://www.52cs.org/?p=1089]

Computer Vision: Semantic Segmentation

[http://blog.geohey.com/ji-suan-ji-shi-jue-zhi-yu-yi-fen-ge/]

Segmentation Results: VOC2012 PASCAL semantic segmentation competition leaderboard

[http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6]

Advanced papers

U-Net [https://arxiv.org/pdf/1505.04597.pdf]

SegNet [https://arxiv.org/pdf/1511.00561.pdf]

DeepLab [https://arxiv.org/pdf/1606.00915.pdf]

FCN [https://arxiv.org/pdf/1605.06211.pdf]

ENet [https://arxiv.org/pdf/1606.02147.pdf]

LinkNet [https://arxiv.org/pdf/1707.03718.pdf]

DenseNet [https://arxiv.org/pdf/1608.06993.pdf]

Tiramisu [https://arxiv.org/pdf/1611.09326.pdf]

DilatedNet [https://arxiv.org/pdf/1511.07122.pdf]

PixelNet [https://arxiv.org/pdf/1609.06694.pdf]

ICNet [https://arxiv.org/pdf/1704.08545.pdf]

ERFNet [http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17iv.pdf]

RefineNet [https://arxiv.org/pdf/1611.06612.pdf]

PSPNet [https://arxiv.org/pdf/1612.01105.pdf]

CRFasRNN [http://www.robots.ox.ac.uk/%7Eszheng/papers/CRFasRNN.pdf]

Dilated convolution [https://arxiv.org/pdf/1511.07122.pdf]

DeconvNet [https://arxiv.org/pdf/1505.04366.pdf]

FRRN [https://arxiv.org/pdf/1611.08323.pdf]

GCN [https://arxiv.org/pdf/1703.02719.pdf]

DUC, HDC [https://arxiv.org/pdf/1702.08502.pdf]

Segaware [https://arxiv.org/pdf/1708.04607.pdf]

Semantic Segmentation using Adversarial Networks [https://arxiv.org/pdf/1611.08408.pdf]

Surveys

A Review on Deep Learning Techniques Applied to Semantic Segmentation. Alberto Garcia-Garcia, Sergio Orts-Escolano, Sergiu Oprea, Victor Villena-Martinez, Jose Garcia-Rodriguez. 2017

[https://arxiv.org/abs/1704.06857]

Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art

[https://arxiv.org/abs/1704.05519]

A Survey of Content-Based Image Segmentation Methods. Jiang Feng, Gu Qing, Hao Huizhen, Li Na, Guo Yanwen, Chen Daoxu. 2017

[http://www.jos.org.cn/ch/reader/create_pdf.aspx?file_no=5136&journal_id=jos]

Tutorial

Semantic Image Segmentation with Deep Learning

[http://www.robots.ox.ac.uk/~sadeep/files/crfasrnn_presentation.pdf]

A 2017 Guide to Semantic Segmentation with Deep Learning

[http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review]

Image Segmentation with Tensorflow using CNNs and Conditional Random Fields

[http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/18/image-segmentation-with-tensorflow-using-cnns-and-conditional-random-fields/]

Video tutorials

CS231n: Convolutional Neural Networks for Visual Recognition, Lecture 11: Detection and Segmentation

[http://cs231n.stanford.edu/syllabus.html]

Machine Learning for Semantic Segmentation - Basics of Modern Image Analysis

[https://www.youtube.com/watch?v=psLChcm8aiU]

Code

Semantic segmentation

U-Net (https://arxiv.org/pdf/1505.04597.pdf)

https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/ (Caffe - Matlab)
https://github.com/jocicmarko/ultrasound-nerve-segmentation (Keras)
https://github.com/EdwardTyantov/ultrasound-nerve-segmentation (Keras)
https://github.com/ZFTurbo/ZF_UNET_224_Pretrained_Model (Keras)
https://github.com/yihui-he/u-net (Keras)
https://github.com/jakeret/tf_unet (Tensorflow)
https://github.com/DLTK/DLTK/blob/master/examples/Toy_segmentation/simple_dltk_unet.ipynb (Tensorflow)
https://github.com/divamgupta/image-segmentation-keras (Keras)
https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/akirasosa/mobile-semantic-segmentation (Keras)
https://github.com/orobix/retina-unet (Keras)

SegNet (https://arxiv.org/pdf/1511.00561.pdf)

https://github.com/alexgkendall/caffe-segnet (Caffe)
https://github.com/developmentseed/caffe/tree/segnet-multi-gpu (Caffe)
https://github.com/preddy5/segnet (Keras)
https://github.com/imlab-uiip/keras-segnet (Keras)
https://github.com/andreaazzini/segnet (Tensorflow)
https://github.com/fedor-chervinskii/segnet-torch (Torch)
https://github.com/0bserver07/Keras-SegNet-Basic (Keras)
https://github.com/tkuanlun350/Tensorflow-SegNet (Tensorflow)
https://github.com/divamgupta/image-segmentation-keras (Keras)
https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/chainer/chainercv/tree/master/examples/segnet (Chainer)
https://github.com/ykamikawa/keras-SegNet (Keras)

DeepLab (https://arxiv.org/pdf/1606.00915.pdf)

https://bitbucket.org/deeplab/deeplab-public/ (Caffe)
https://github.com/cdmh/deeplab-public (Caffe)
https://bitbucket.org/aquariusjay/deeplab-public-ver2 (Caffe)
https://github.com/TheLegendAli/DeepLab-Context (Caffe)
https://github.com/msracver/Deformable-ConvNets/tree/master/deeplab (MXNet)
https://github.com/DrSleep/tensorflow-deeplab-resnet (Tensorflow)
https://github.com/muyang0320/tensorflow-deeplab-resnet-crf (TensorFlow)
https://github.com/isht7/pytorch-deeplab-resnet (PyTorch)
https://github.com/bermanmaxim/jaccardSegment (PyTorch)
https://github.com/martinkersner/train-DeepLab (Caffe)
https://github.com/chenxi116/TF-deeplab (Tensorflow)

FCN (https://arxiv.org/pdf/1605.06211.pdf)

https://github.com/vlfeat/matconvnet-fcn (MatConvNet)
https://github.com/shelhamer/fcn.berkeleyvision.org (Caffe)
https://github.com/MarvinTeichmann/tensorflow-fcn (Tensorflow)
https://github.com/aurora95/Keras-FCN (Keras)
https://github.com/mzaradzki/neuralnets/tree/master/vgg_segmentation_keras (Keras)
https://github.com/k3nt0w/FCN_via_keras (Keras)
https://github.com/shekkizh/FCN.tensorflow (Tensorflow)
https://github.com/seewalker/tf-pixelwise (Tensorflow)
https://github.com/divamgupta/image-segmentation-keras (Keras)
https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/wkentaro/pytorch-fcn (PyTorch)
https://github.com/wkentaro/fcn (Chainer)
https://github.com/apache/incubator-mxnet/tree/master/example/fcn-xs (MxNet)
https://github.com/muyang0320/tf-fcn (Tensorflow)
https://github.com/ycszen/pytorch-seg (PyTorch)
https://github.com/Kaixhin/FCN-semantic-segmentation (PyTorch)

ENet (https://arxiv.org/pdf/1606.02147.pdf)

https://github.com/TimoSaemann/ENet (Caffe)
https://github.com/e-lab/ENet-training (Torch)
https://github.com/PavlosMelissinos/enet-keras (Keras)

LinkNet (https://arxiv.org/pdf/1707.03718.pdf)

https://github.com/e-lab/LinkNet (Torch)

DenseNet (https://arxiv.org/pdf/1608.06993.pdf)

https://github.com/flyyufelix/DenseNet-Keras (Keras)

Tiramisu (https://arxiv.org/pdf/1611.09326.pdf)

https://github.com/0bserver07/One-Hundred-Layers-Tiramisu (Keras)
https://github.com/SimJeg/FC-DenseNet (Lasagne)

DilatedNet (https://arxiv.org/pdf/1511.07122.pdf)

https://github.com/nicolov/segmentation_keras (Keras)

PixelNet (https://arxiv.org/pdf/1609.06694.pdf)

https://github.com/aayushbansal/PixelNet (Caffe)

ICNet (https://arxiv.org/pdf/1704.08545.pdf)

https://github.com/hszhao/ICNet (Caffe)

ERFNet (http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17iv.pdf)

https://github.com/Eromera/erfnet (Torch)

RefineNet (https://arxiv.org/pdf/1611.06612.pdf)

https://github.com/guosheng/refinenet (MatConvNet)

PSPNet (https://arxiv.org/pdf/1612.01105.pdf)

https://github.com/hszhao/PSPNet (Caffe)
https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/mitmul/chainer-pspnet (Chainer)
https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow (Keras/Tensorflow)
https://github.com/pudae/tensorflow-pspnet (Tensorflow)

CRFasRNN (http://www.robots.ox.ac.uk/%7Eszheng/papers/CRFasRNN.pdf)

https://github.com/torrvision/crfasrnn (Caffe)
https://github.com/sadeepj/crfasrnn_keras (Keras)

Dilated convolution (https://arxiv.org/pdf/1511.07122.pdf)

https://github.com/fyu/dilation (Caffe)
https://github.com/fyu/drn#semantic-image-segmentataion (PyTorch)
https://github.com/hangzhaomit/semantic-segmentation-pytorch (PyTorch)

DeconvNet (https://arxiv.org/pdf/1505.04366.pdf)

http://cvlab.postech.ac.kr/research/deconvnet/ (Caffe)
https://github.com/HyeonwooNoh/DeconvNet (Caffe)
https://github.com/fabianbormann/Tensorflow-DeconvNet-Segmentation (Tensorflow)

FRRN (https://arxiv.org/pdf/1611.08323.pdf)

https://github.com/TobyPDE/FRRN (Lasagne)

GCN (https://arxiv.org/pdf/1703.02719.pdf)

https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/ycszen/pytorch-seg (PyTorch)

DUC, HDC (https://arxiv.org/pdf/1702.08502.pdf)

https://github.com/ZijunDeng/pytorch-semantic-segmentation (PyTorch)
https://github.com/ycszen/pytorch-seg (PyTorch)

Segaware (https://arxiv.org/pdf/1708.04607.pdf)

https://github.com/aharley/segaware (Caffe)

Semantic Segmentation using Adversarial Networks (https://arxiv.org/pdf/1611.08408.pdf)

https://github.com/oyam/Semantic-Segmentation-using-Adversarial-Networks (Chainer)

Instance aware segmentation

FCIS [https://arxiv.org/pdf/1611.07709.pdf]

https://github.com/msracver/FCIS [MxNet]

MNC [https://arxiv.org/pdf/1512.04412.pdf]

https://github.com/daijifeng001/MNC [Caffe]

DeepMask [https://arxiv.org/pdf/1506.06204.pdf]

https://github.com/facebookresearch/deepmask [Torch]

SharpMask [https://arxiv.org/pdf/1603.08695.pdf]

https://github.com/facebookresearch/deepmask [Torch]

Mask-RCNN [https://arxiv.org/pdf/1703.06870.pdf]

https://github.com/CharlesShang/FastMaskRCNN [Tensorflow]
https://github.com/jasjeetIM/Mask-RCNN [Caffe]
https://github.com/TuSimple/mx-maskrcnn [MxNet]
https://github.com/matterport/Mask_RCNN [Keras]

RIS [https://arxiv.org/pdf/1511.08250.pdf]

https://github.com/bernard24/RIS [Torch]

FastMask [https://arxiv.org/pdf/1612.08843.pdf]

https://github.com/voidrank/FastMask [Caffe]

Satellite images segmentation

https://github.com/mshivaprakash/sat-seg-thesis
https://github.com/KGPML/Hyperspectral
https://github.com/lopuhin/kaggle-dstl
https://github.com/mitmul/ssai
https://github.com/mitmul/ssai-cnn
https://github.com/azavea/raster-vision
https://github.com/nshaud/DeepNetsForEO
https://github.com/trailbehind/DeepOSM

Video segmentation

https://github.com/shelhamer/clockwork-fcn
https://github.com/JingchunCheng/Seg-with-SPN

Autonomous driving

https://github.com/MarvinTeichmann/MultiNet
https://github.com/MarvinTeichmann/KittiSeg
https://github.com/vxy10/p5_VehicleDetection_Unet [Keras]
https://github.com/ndrplz/self-driving-car
https://github.com/mvirgo/MLND-Capstone

Annotation Tools:

https://github.com/AKSHAYUBHAT/ImageSegmentation
https://github.com/kyamagu/js-segment-annotator
https://github.com/CSAILVision/LabelMeAnnotationTool
https://github.com/seanbell/opensurfaces-segmentation-ui
https://github.com/lzx1413/labelImgPlus
https://github.com/wkentaro/labelme

Datasets

Stanford Background Dataset[http://dags.stanford.edu/projects/scenedataset.html]

Sift Flow Dataset[http://people.csail.mit.edu/celiu/SIFTflow/]

Barcelona Dataset[http://www.cs.unc.edu/~jtighe/Papers/ECCV10/]

Microsoft COCO dataset[http://mscoco.org/]

MSRC Dataset[http://research.microsoft.com/en-us/projects/objectclassrecognition/]

LITS Liver Tumor Segmentation Dataset[https://competitions.codalab.org/competitions/15595]

KITTI[http://www.cvlibs.net/datasets/kitti/eval_road.php]

Data from Games dataset[https://download.visinf.tu-darmstadt.de/data/from_games/]

Human parsing dataset[https://github.com/lemondan/HumanParsing-Dataset]

Silenko person database[https://github.com/Maxfashko/CamVid]

Mapillary Vistas Dataset[https://www.mapillary.com/dataset/vistas]

Microsoft AirSim[https://github.com/Microsoft/AirSim]

MIT Scene Parsing Benchmark[http://sceneparsing.csail.mit.edu/]

COCO 2017 Stuff Segmentation Challenge[http://cocodataset.org/#stuff-challenge2017]

ADE20K Dataset[http://groups.csail.mit.edu/vision/datasets/ADE20K/]

INRIA Annotations for Graz-02[http://lear.inrialpes.fr/people/marszalek/data/ig02/]

Competitions

MSRC-21 [http://rodrigob.github.io/are_we_there_yet/build/semantic_labeling_datasets_results.html]

Cityscapes [https://www.cityscapes-dataset.com/benchmarks/]

VOC2012 [http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6]

Domain experts

Jonathan Long

[http://people.eecs.berkeley.edu/~jonlong/]

Liang-Chieh Chen

[http://liangchiehchen.com/]

Hyeonwoo Noh

[http://cvlab.postech.ac.kr/~hyeonwoonoh/]

Bharath Hariharan

[http://home.bharathh.info/]

Fisher Yu

[http://www.yf.io/]

Vijay Badrinarayanan

[https://sites.google.com/site/vijaybacademichomepage/home/papers]

Guosheng Lin

[https://sites.google.com/site/guoshenglin/]
