
PASCAL VOC 2012 and SBD (the augmented dataset): A Summary

Published: 2023/12/14


While reading DeepLab, I noticed that the paper first introduces the PASCAL VOC 2012 dataset and then says that training uses an augmented version of it. The paper puts it this way:

The proposed models are evaluated on the PASCAL VOC 2012 semantic segmentation benchmark [1] which contains 20 foreground object classes and one background class. The original dataset contains 1,464 (train), 1,449 (val), and 1,456 (test) pixel-level annotated images. We augment the dataset by the extra annotations provided by [76], resulting in 10,582 (trainaug) training images. The performance is measured in terms of pixel intersection-over-union averaged across the 21 classes (mIOU).
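To make the metric in the quote concrete, here is a minimal pure-Python sketch of mean IoU over flat lists of per-pixel labels. The function name `mean_iou`, the choice to skip classes that appear in neither the ground truth nor the prediction, and the handling of an ignore label (assumed here to be 255) are illustrative assumptions, not the paper's evaluation code.

```python
def mean_iou(gt, pred, num_classes, ignore_label=255):
    """Mean intersection-over-union over classes (illustrative sketch).

    gt, pred: flat sequences of integer class labels, one per pixel.
    Pixels whose ground-truth label equals ignore_label are skipped.
    Predictions are assumed to lie in [0, num_classes).
    """
    inter = [0] * num_classes       # pixels where gt == pred == c
    gt_count = [0] * num_classes    # pixels labelled c in the ground truth
    pred_count = [0] * num_classes  # pixels predicted as c
    for g, p in zip(gt, pred):
        if g == ignore_label:
            continue
        gt_count[g] += 1
        pred_count[p] += 1
        if g == p:
            inter[g] += 1

    ious = []
    for c in range(num_classes):
        union = gt_count[c] + pred_count[c] - inter[c]
        if union > 0:  # assumption: skip classes absent from both gt and pred
            ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 2 classes: IoU is 1/2 for class 0 and 2/3 for class 1.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(round(score, 4))
```

For the benchmark described in the quote, `num_classes` would be 21 (20 foreground classes plus background).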

Next, let's look at how the original dataset and the augmented dataset differ.

1. PASCAL VOC 2012 segmentation

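The official VOC 2012 site states the split sizes very clearly: 1,464 (train), 1,449 (val), and 1,456 (test).
[Figure: detailed distribution of the VOC 2012 segmentation data]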

2. SBD dataset

The so-called augmented VOC dataset is also known as SBD, with 8,498 (train) and 2,857 (val) images. It comes from the paper "Semantic Contours from Inverse Detectors": http://home.bharathh.info/pubs/pdfs/BharathICCV2011.pdf
The authors also provide a website, http://home.bharathh.info/pubs/codes/SBD/download.html, which describes the dataset in more detail.

Two points to note:

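The SBD images are taken from VOC 2011 (11,355 images). VOC 2012 uses the same images as VOC 2011; the two differ only in the number of annotations.
The SBD train and val sets are different from the VOC ones. The authors explain this on the website linked above: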

Please note that the train and val splits included with this dataset are different from the splits in the PASCAL VOC dataset. In particular some “train” images might be part of VOC 2012 val.

In other words, the SBD training set contains some images from the VOC validation set.
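A quick way to see this overlap is to intersect the image ID lists of the two splits. The sketch below assumes each split is a plain text file with one image ID per line; the helper names and the toy IDs are hypothetical and only illustrate the check.

```python
from pathlib import Path


def read_ids(list_path):
    """Read one image ID per line, ignoring blank lines (assumed file layout)."""
    lines = Path(list_path).read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def leaked_ids(train_ids, val_ids):
    """Return the sorted IDs that appear in both a training and a validation list."""
    return sorted(set(train_ids) & set(val_ids))


# Toy IDs for illustration only; real lists would be loaded with read_ids(...).
toy_sbd_train = {"2008_000002", "2008_000003", "2008_000007"}
toy_voc_val = {"2008_000002", "2008_000009"}
print(leaked_ids(toy_sbd_train, toy_voc_val))
```

Any ID returned here must be dropped from the training list if VOC val is used for evaluation.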

3. 10582 trainaug

Where do the 10,582 trainaug images used in DeepLab come from?
See this post: https://www.sun11.me/blog/2018/how-to-use-10582-trainaug-images-on-DeeplabV3-code/
It gives a complete walkthrough of training DeepLabv3 in TensorFlow with the 10,582 trainaug images:

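Download VOC 2012 and SegmentationClassAug. The latter contains the extra annotations provided by SBD.

Save the trainaug file names from the linked address: create a trainaug.txt file and paste them in.
Opening the file in VS Code and scrolling to the end confirms that it has exactly 10,582 lines.

Finally, run the script provided by the author of the post linked above to generate the tfrecord files for the 10,582 trainaug images, which can then be used for training with TensorFlow.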
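The line-count check in the steps above can also be scripted instead of done by eye. This is a minimal sketch that assumes trainaug.txt holds one image ID per line; the `count_ids` helper and the demo file contents are hypothetical.

```python
import tempfile
from pathlib import Path


def count_ids(list_path):
    """Return (number of non-empty lines, number of duplicated entries)."""
    lines = Path(list_path).read_text().splitlines()
    ids = [line.strip() for line in lines if line.strip()]
    return len(ids), len(ids) - len(set(ids))


# Demo on a throwaway file; for the real check, point count_ids at trainaug.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_list = Path(tmp) / "trainaug.txt"
    demo_list.write_text("2007_000032\n2007_000039\n\n2007_000063\n")
    total, duplicates = count_ids(demo_list)

print(total, duplicates)
```

On the real list, the first value should be 10582 and the second 0 if the file was copied correctly.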

4. Where does 10582 come from?

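Finally, let's see how the number 10,582 is obtained. First, some figures.
VOC annotated images:
voc_trainval: 2913
voc_train: 1464
voc_val: 1449
SBD annotated images:
sbd_train: 8498
sbd_val: 2857
Since we have the file-name .txt lists for all of the splits above, we can compare them for overlapping image names. This gives:
sbd_train (8498) = images shared with voc_train (1133) + images shared with voc_val (545) + images that sbd_train actually adds (6820)
sbd_val (2857) = images shared with voc_train (1) + images shared with voc_val (558) + images that sbd_val actually adds (2298)
So the largest augmented dataset that can be built is:
voc_train (1464) + voc_val (1449) + new images from sbd_train (6820) + new images from sbd_val (2298) = 12031 annotated images
Keeping the original voc_val (1449) as the validation set, the remaining 12031 - 1449 = 10582 images can all be used for training. That is trainaug (10582).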
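The arithmetic above can be checked in a few lines, and the same construction can be written as set operations on the ID lists. The counts are the ones quoted in the text; the `build_trainaug` helper and the toy IDs are hypothetical and only sketch the idea.

```python
# Split sizes and overlap counts quoted in the text.
voc_train, voc_val = 1464, 1449
sbd_train, sbd_val = 8498, 2857
sbd_train_in_voc_train, sbd_train_in_voc_val = 1133, 545
sbd_val_in_voc_train, sbd_val_in_voc_val = 1, 558

# Images that each SBD split contributes beyond VOC train/val.
sbd_train_new = sbd_train - sbd_train_in_voc_train - sbd_train_in_voc_val
sbd_val_new = sbd_val - sbd_val_in_voc_train - sbd_val_in_voc_val

total_annotated = voc_train + voc_val + sbd_train_new + sbd_val_new
trainaug = total_annotated - voc_val
print(sbd_train_new, sbd_val_new, total_annotated, trainaug)


def build_trainaug(voc_train_ids, voc_val_ids, sbd_train_ids, sbd_val_ids):
    """All annotated IDs outside VOC val: (voc_train | sbd_train | sbd_val) - voc_val."""
    union = set(voc_train_ids) | set(sbd_train_ids) | set(sbd_val_ids)
    return union - set(voc_val_ids)


# Toy IDs for illustration only.
toy = build_trainaug({"a", "b"}, {"c", "d"}, {"a", "c", "e"}, {"d", "f"})
print(sorted(toy))
```

Subtracting the VOC val IDs last is what keeps the validation set clean, even though both SBD splits contain some of those images.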

