Setting Up a PASCAL VOC Format Dataset for R-CNN Series Experiments

Published: 2024/9/21

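When running experiments with the R-CNN family, we often need to lay out our dataset in the same format as the PASCAL VOC dataset. We could of course modify the data-loading code instead, but that is more trouble: every time our own data format changes, the code has to change again.
Taking VOC2008 as an example, start with the folder structure of VOCdevkit.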

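VOCdevkit also contains a devkit_doc.pdf file, where all the information about the PASCAL VOC dataset can be found.
We build our folders following the same tree structure, replacing VOC2007 and VOC2008 with the name of our own dataset (keeping just one of them is enough), and creating a folder with our dataset's name under local as well. SegmentationObject and SegmentationClass are not needed.
For the detection task, the dataset only needs the JPEGImages, Annotations, and ImageSets folders. The prerequisite is that we already have data: images, plus annotated class names and coordinates.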
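The folder layout described above can be created with a few lines of MATLAB. This is only a minimal sketch: the root folder name 'VOCdevkit' and the dataset name 'lml' (borrowed from the annotation script later in this post) are assumptions, so substitute your own.

```matlab
% Sketch: create the minimal VOC-style folder skeleton for a custom dataset.
% 'VOCdevkit' as the root and 'lml' as the dataset name are assumptions.
devkit_root = 'VOCdevkit';
dataset = 'lml';

dirs = {
    fullfile(devkit_root, dataset, 'JPEGImages')
    fullfile(devkit_root, dataset, 'Annotations')
    fullfile(devkit_root, dataset, 'ImageSets', 'Main')
    fullfile(devkit_root, 'local', dataset)
    };

for k = 1:numel(dirs)
    if ~exist(dirs{k}, 'dir')
        mkdir(dirs{k});   % mkdir also creates any missing parent folders
    end
end
```

Only the folders needed for detection are created; the segmentation folders are left out on purpose.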

JPEGImages

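Put the images of all classes into the JPEGImages folder, with names in the style of 000001.jpg, 000002.jpg, and so on. They do not have to follow numeric order, but no two images may share a name, and it is best to normalize the image sizes.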
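One way to get unique names is to copy the images into JPEGImages under six-digit numbers. The sketch below assumes the originals are .jpg files in a folder called 'raw_images', which is a made-up name for illustration.

```matlab
% Sketch: copy images into JPEGImages under unique six-digit names.
% 'raw_images' (the source folder) is a hypothetical name.
src_dir = 'raw_images';
dst_dir = 'JPEGImages';
if ~exist(dst_dir, 'dir')
    mkdir(dst_dir);
end

imgs = dir(fullfile(src_dir, '*.jpg'));
for k = 1:numel(imgs)
    new_name = sprintf('%06d.jpg', k);   % 000001.jpg, 000002.jpg, ...
    copyfile(fullfile(src_dir, imgs(k).name), fullfile(dst_dir, new_name));
end
```

If label files already exist, they need to be renamed in the same way so that each label still matches its image. If the images are also resized, the box coordinates have to be scaled by the same factors.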

Annotations

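The code in VOCcode already provides what is needed to write annotation files; my writexml is modeled on VOCwritexml from VOCdevkit. Suppose all my annotations are stored in txt files, each with the same name as its image. In each file, the first line is the class name and the second line holds the object's coordinates (x1, y1, x2, y2). Here every image contains a single object; annotating multiple objects works much the same way. The code below writes the Annotations files.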

% writeanno.m
% Convert per-image txt labels into PASCAL VOC style XML annotations.
path_image = 'JPEGImages/';
path_label = 'labels/';                 % folder holding the txt label files
files_all = dir(path_image);
for i = 3:length(files_all)             % start at 3 to skip the '.' and '..' entries
    stem = files_all(i).name(1:end-4);  % file name without its 4-character extension
    msg = textread(strcat(path_label, stem, '.txt'), '%s');
    clear rec;
    xml_path = ['./Annotations/' stem '.xml'];   % renamed from "path" to avoid shadowing MATLAB's path()
    fid = fopen(xml_path, 'w');
    rec.annotation.folder = 'lml';                           % dataset name
    rec.annotation.filename = stem;                          % image name
    rec.annotation.source.database = 'The lmls Database';    % arbitrary
    rec.annotation.source.annotation = 'The lmls Database';  % arbitrary
    rec.annotation.source.image = 'lml';                     % arbitrary
    rec.annotation.source.flickrid = '0';                    % arbitrary
    rec.annotation.owner.flickrid = 'I do not know';         % arbitrary
    rec.annotation.owner.name = 'I do not know';             % arbitrary
    img = imread([path_image files_all(i).name]);
    rec.annotation.size.width = int2str(size(img,2));
    rec.annotation.size.height = int2str(size(img,1));
    rec.annotation.size.depth = int2str(size(img,3));
    rec.annotation.segmented = '0';                % not used for segmentation
    rec.annotation.object.name = msg{1};           % class name
    rec.annotation.object.pose = 'Unspecified';    % pose not specified
    rec.annotation.object.truncated = '0';         % object is not truncated
    rec.annotation.object.difficult = '0';         % not a hard-to-recognize object
    rec.annotation.object.bndbox.xmin = msg{2};    % coordinate x1
    rec.annotation.object.bndbox.ymin = msg{3};    % coordinate y1
    rec.annotation.object.bndbox.xmax = msg{4};    % coordinate x2
    rec.annotation.object.bndbox.ymax = msg{5};    % coordinate y2
    writexml(fid, rec, 0);
    fclose(fid);
end

% writexml.m
% Recursively write a struct as tab-indented XML (modeled on VOCwritexml).
function writexml(fid, rec, depth)
fn = fieldnames(rec);
for i = 1:length(fn)
    f = rec.(fn{i});
    if ~isempty(f)
        if isstruct(f)
            % A struct (or struct array) becomes one nested element per entry.
            for j = 1:length(f)
                fprintf(fid, '%s', repmat(char(9), 1, depth));
                fprintf(fid, '<%s>\n', fn{i});
                writexml(fid, rec.(fn{i})(j), depth+1);
                fprintf(fid, '%s', repmat(char(9), 1, depth));
                fprintf(fid, '</%s>\n', fn{i});
            end
        else
            % Leaf values: strings or numeric scalars, optionally in a cell array.
            if ~iscell(f)
                f = {f};
            end
            for j = 1:length(f)
                fprintf(fid, '%s', repmat(char(9), 1, depth));
                fprintf(fid, '<%s>', fn{i});
                if ischar(f{j})
                    fprintf(fid, '%s', f{j});
                elseif isnumeric(f{j}) && numel(f{j}) == 1
                    fprintf(fid, '%s', num2str(f{j}));
                else
                    error('unsupported type');
                end
                fprintf(fid, '</%s>\n', fn{i});
            end
        end
    end
end
end
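For images with several objects, the same idea should carry over, since writexml writes one element per entry of a struct array. The sketch below parses a label file and builds rec.annotation.object as a struct array. The multi-object label layout (the class name and four coordinates simply repeated for each object) is my assumption, and the demo file and its values are invented for illustration.

```matlab
% Sketch: parse a label file and build rec.annotation.object for one or more objects.
% Assumed layout (not confirmed for the multi-object case): each object
% contributes five whitespace-separated tokens: name xmin ymin xmax ymax.
label_file = fullfile(tempdir, 'demo_label.txt');   % hypothetical demo file
fid = fopen(label_file, 'w');
fprintf(fid, 'cat\n48 240 195 371\ndog\n8 12 352 498\n');
fclose(fid);

fid = fopen(label_file, 'r');
c = textscan(fid, '%s');      % split on whitespace, like textread(..., '%s')
fclose(fid);
msg = c{1};

n_obj = numel(msg) / 5;
assert(n_obj == floor(n_obj), 'expected 5 tokens per object');

clear obj;
for k = 1:n_obj
    t = msg((k-1)*5 + (1:5));      % the five tokens of object k
    obj(k).name = t{1};
    obj(k).pose = 'Unspecified';
    obj(k).truncated = '0';
    obj(k).difficult = '0';
    obj(k).bndbox = struct('xmin', t{2}, 'ymin', t{3}, 'xmax', t{4}, 'ymax', t{5});
end
rec.annotation.object = obj;       % 1-by-n struct array, one entry per object
```

Passing such a rec to writexml should produce one object element per entry, because its struct branch loops over every element of the array.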

ImageSets

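Within ImageSets only the Main folder is needed, and within Main four files are mainly used:
- train.txt lists the file names of the images used for training
- trainval.txt lists the file names of the images used for training and validation
- val.txt lists the file names of the images used for validation
- test.txt lists the file names of the images used for testing
We want the split into training, validation, and test sets to be random. The code below picks the sample sets at random and writes the txt files: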

% writetxt.m
% Randomly split the annotated images into trainval/train/val/test lists.
file = dir('Annotations');
len = length(file) - 2;                               % ignore the '.' and '..' entries
num_trainval = sort(randperm(len, floor(9*len/10)));  % trainval is 9/10 of all data; adjust as needed
num_train = sort(num_trainval(randperm(length(num_trainval), floor(5*length(num_trainval)/6))));  % train is 5/6 of trainval; adjust as needed
num_val = setdiff(num_trainval, num_train);           % the rest of trainval becomes val
num_test = setdiff(1:len, num_trainval);              % the rest of all data becomes test
out_dir = fullfile('ImageSets', 'Main');              % fullfile keeps the path portable across operating systems

fid = fopen(fullfile(out_dir, 'trainval.txt'), 'w');  % 'w' overwrites, so a rerun does not append duplicates
for i = 1:length(num_trainval)
    s = file(num_trainval(i)+2).name;
    fprintf(fid, '%s\n', s(1:end-4));                 % write the name without '.xml'
end
fclose(fid);

fid = fopen(fullfile(out_dir, 'train.txt'), 'w');
for i = 1:length(num_train)
    s = file(num_train(i)+2).name;
    fprintf(fid, '%s\n', s(1:end-4));
end
fclose(fid);

fid = fopen(fullfile(out_dir, 'val.txt'), 'w');
for i = 1:length(num_val)
    s = file(num_val(i)+2).name;
    fprintf(fid, '%s\n', s(1:end-4));
end
fclose(fid);

fid = fopen(fullfile(out_dir, 'test.txt'), 'w');
for i = 1:length(num_test)
    s = file(num_test(i)+2).name;
    fprintf(fid, '%s\n', s(1:end-4));
end
fclose(fid);
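After writing the lists it is worth checking that they really are disjoint where they should be. The function below is a small sketch of such a check; check_voc_split and read_list are names I made up, and it assumes one image ID per line in each of the four files.

```matlab
function check_voc_split(main_dir)
% Sketch: sanity-check the four list files in ImageSets/Main.
% check_voc_split and read_list are hypothetical helper names.
trainval = read_list(fullfile(main_dir, 'trainval.txt'));
train    = read_list(fullfile(main_dir, 'train.txt'));
val      = read_list(fullfile(main_dir, 'val.txt'));
test     = read_list(fullfile(main_dir, 'test.txt'));

assert(isempty(intersect(train, val)), 'train and val overlap');
assert(isempty(intersect(trainval, test)), 'trainval and test overlap');
u = union(train, val);
assert(isequal(sort(trainval(:)), sort(u(:))), ...
    'trainval is not exactly the union of train and val');
fprintf('train %d, val %d, test %d\n', numel(train), numel(val), numel(test));
end

function names = read_list(list_path)
% Read one image ID per line into a column cell array of strings.
fid = fopen(list_path, 'r');
assert(fid ~= -1, 'cannot open %s', list_path);
c = textscan(fid, '%s');
fclose(fid);
names = c{1};
end
```

Calling check_voc_split(fullfile('ImageSets', 'Main')) raises an error if the sets overlap or if trainval.txt contains duplicated entries, and otherwise prints the size of each set.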

Finally, before training, edit VOCcode/VOCinit.m: change VOCopts.dataset (the dataset name) to your own dataset's name and VOCopts.classes (the class names) to your own class names.
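As a rough illustration, the two settings might end up looking like this. The dataset name 'lml' is the one used in the annotation script earlier in this post, and the class names are placeholders I made up.

```matlab
% Sketch of the two settings to change in VOCcode/VOCinit.m.
% 'lml' and the class names below are placeholders, not values from the devkit.
VOCopts.dataset = 'lml';
VOCopts.classes = {
    'cat'
    'dog'
    };
```

The dataset name should match the folder created under VOCdevkit, and the class names should match the ones written into the XML annotations.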
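One more note about using datasets in R-CNN series experiments: sometimes the AP reported at test time always comes out as 0. The problem turned out to be in imdb_eval_voc.m. Once the dataset name has been changed, it can no longer extract the year information it expects and so never computes the AP, so I made a small modification there as well.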

My code style is not great; experts, feel free to sneer.
