
End-to-End Training of py-faster-rcnn on Windows

Published 2024/9/21 by 豆豆

I. Building the dataset

1. Training images

Whether you collect images online or reuse someone else's dataset, keep one thing in mind: the images must not be too small. Width and height should ideally be no less than 150 pixels, and the images need to be JPEGs.

2. Creating the XML annotation files

1) LabelImg

If your dataset is fairly small, you can draw the boxes by hand with LabelImg: https://github.com/tzutalin/labelImg. I won't go into how to use it here; there are plenty of guides online. The XML files labelimg produces can be fed to frcnn for training directly.

2) Generating the XML yourself

For a small dataset the manual approach above also works, but once you have 10k+ images you should consider generating the XML files automatically. Most material online generates the XML from coordinates with MATLAB; here is a Python version:

# Requires lxml (pip install lxml) for pretty_print
from lxml.etree import Element, SubElement, tostring

def write_xml(bbox, w, h, iter):
    '''
    bbox: list of dicts, one per object, holding the class name and
          box coordinates for the current image
    w, h: width and height of the current image
    iter: the image's index (also used as its file name)
    '''
    root = Element("annotation")
    folder = SubElement(root, "folder")
    folder.text = "JPEGImages"
    filename = SubElement(root, "filename")
    filename.text = iter
    path = SubElement(root, "path")
    # change this to your own path
    path.text = 'D:\\py-faster-rcnn\\data\\VOCdevkit2007\\VOC2007\\JPEGImages' + '\\' + iter + '.jpg'
    source = SubElement(root, "source")
    database = SubElement(source, "database")
    database.text = "Unknown"
    size = SubElement(root, "size")
    width = SubElement(size, "width")
    height = SubElement(size, "height")
    depth = SubElement(size, "depth")
    width.text = str(w)
    height.text = str(h)
    depth.text = '3'
    segmented = SubElement(root, "segmented")
    segmented.text = '0'
    for i in bbox:
        obj = SubElement(root, "object")
        name = SubElement(obj, "name")
        name.text = i['cls']
        pose = SubElement(obj, "pose")
        pose.text = "Unspecified"
        truncated = SubElement(obj, "truncated")
        truncated.text = '0'
        difficult = SubElement(obj, "difficult")
        difficult.text = '0'
        bndbox = SubElement(obj, "bndbox")
        xmin = SubElement(bndbox, "xmin")
        ymin = SubElement(bndbox, "ymin")
        xmax = SubElement(bndbox, "xmax")
        ymax = SubElement(bndbox, "ymax")
        xmin.text = str(i['xmin'])
        ymin.text = str(i['ymin'])
        xmax.text = str(i['xmax'])
        ymax.text = str(i['ymax'])
    xml = tostring(root, pretty_print=True)
    # change this path to your own as well
    with open('D:/py-faster-rcnn/data/VOCdevkit2007/VOC2007/Annotations/' + iter + '.xml', 'w+') as f:
        f.write(xml)
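For a quick sanity check without touching disk, here is a minimal in-memory variant (my own sketch, not part of the original script: it uses the standard library's xml.etree.ElementTree instead of lxml, and the build_voc_xml name is made up):

```python
from xml.etree.ElementTree import Element, SubElement, tostring

def build_voc_xml(bbox, w, h, name):
    # Reduced version of write_xml above: builds the annotation tree
    # in memory and returns the XML as a string instead of writing it
    # to a hard-coded Windows path.
    root = Element("annotation")
    SubElement(root, "folder").text = "JPEGImages"
    SubElement(root, "filename").text = name
    size = SubElement(root, "size")
    for tag, val in (("width", w), ("height", h), ("depth", 3)):
        SubElement(size, tag).text = str(val)
    for i in bbox:
        obj = SubElement(root, "object")
        SubElement(obj, "name").text = i['cls']
        SubElement(obj, "difficult").text = '0'
        bnd = SubElement(obj, "bndbox")
        for k in ('xmin', 'ymin', 'xmax', 'ymax'):
            SubElement(bnd, k).text = str(i[k])
    return tostring(root, encoding='unicode')

xml = build_voc_xml([{'cls': 'seal', 'xmin': 10, 'ymin': 20,
                      'xmax': 110, 'ymax': 220}], w=500, h=375, name='000001')
```

This makes it easy to eyeball the output before committing thousands of files to the Annotations folder.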


3. Creating the train, test, and validation splits

There is plenty of reference material for this online; I'll simply reproduce a MATLAB script from 小咸鱼.

I'd suggest giving the train and trainval portions a somewhat larger share.

%%
% From the generated XML files, build VOC2007's trainval.txt, train.txt,
% test.txt and val.txt.
% trainval is 50% of the whole dataset and test the other 50%;
% train is 50% of trainval and val the other 50%.
% Adjust these percentages for your own dataset; if you have little
% data, test and val can be smaller.
%%
% Adjust the four values below
xmlfilepath = 'E:\Annotations';
txtsavepath = 'E:\ImageSets\Main\';
trainval_percent = 0.5;  % trainval's share of the whole dataset; the rest is test
train_percent = 0.5;     % train's share of trainval; the rest is val

%%
xmlfile = dir(xmlfilepath);
numOfxml = length(xmlfile) - 2;  % minus . and .. ; total dataset size

trainval = sort(randperm(numOfxml, floor(numOfxml * trainval_percent)));
test = sort(setdiff(1:numOfxml, trainval));

trainvalsize = length(trainval);  % size of trainval
train = sort(trainval(randperm(trainvalsize, floor(trainvalsize * train_percent))));
val = sort(setdiff(trainval, train));

ftrainval = fopen([txtsavepath 'trainval.txt'], 'w');
ftest = fopen([txtsavepath 'test.txt'], 'w');
ftrain = fopen([txtsavepath 'train.txt'], 'w');
fval = fopen([txtsavepath 'val.txt'], 'w');

for i = 1:numOfxml
    if ismember(i, trainval)
        fprintf(ftrainval, '%s\n', xmlfile(i+2).name(1:end-4));
        if ismember(i, train)
            fprintf(ftrain, '%s\n', xmlfile(i+2).name(1:end-4));
        else
            fprintf(fval, '%s\n', xmlfile(i+2).name(1:end-4));
        end
    else
        fprintf(ftest, '%s\n', xmlfile(i+2).name(1:end-4));
    end
end
fclose(ftrainval);
fclose(ftrain);
fclose(fval);
fclose(ftest);
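If you would rather stay in Python, the same split can be sketched as follows (my own equivalent of the MATLAB script, with the same 50/50 defaults; the make_splits name and the directory arguments are placeholders to adapt):

```python
import os
import random

def make_splits(xml_dir, out_dir, trainval_percent=0.5, train_percent=0.5):
    # Same logic as the MATLAB script above: first split the annotation
    # stems into trainval vs. test, then split trainval into train vs. val.
    stems = sorted(os.path.splitext(f)[0] for f in os.listdir(xml_dir)
                   if f.endswith('.xml'))
    shuffled = stems[:]
    random.shuffle(shuffled)
    n_tv = int(len(shuffled) * trainval_percent)
    trainval = sorted(shuffled[:n_tv])
    test = sorted(shuffled[n_tv:])
    picked = trainval[:]
    random.shuffle(picked)
    n_tr = int(len(picked) * train_percent)
    train = sorted(picked[:n_tr])
    val = sorted(picked[n_tr:])
    for name, split in [('trainval.txt', trainval), ('train.txt', train),
                        ('val.txt', val), ('test.txt', test)]:
        with open(os.path.join(out_dir, name), 'w') as f:
            f.write('\n'.join(split) + '\n')

# Example usage (adjust to your own paths):
# make_splits(r'E:\Annotations', r'E:\ImageSets\Main')
```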


4. Where to put the files

Save the jpg, txt, and xml files under data\VOCdevkit2007\VOC2007\ in the JPEGImages, ImageSets\Main, and Annotations folders respectively.
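That layout can be created in one go; a small sketch (the make_voc_dirs helper is my own, meant to be run from the py-faster-rcnn root):

```python
import os

def make_voc_dirs(root):
    # Create the VOC2007-style folders py-faster-rcnn expects:
    # JPEGImages for the .jpg files, ImageSets/Main for the split
    # .txt files, and Annotations for the .xml files.
    for sub in ('JPEGImages', os.path.join('ImageSets', 'Main'), 'Annotations'):
        os.makedirs(os.path.join(root, 'VOCdevkit2007', 'VOC2007', sub),
                    exist_ok=True)

# make_voc_dirs('data')  # run from the py-faster-rcnn root
```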

II. Modifying the files for your own dataset

1. Model configuration files

I train end to end, using vgg_cnn_m_1024 as the example. Start by opening models\pascal_voc\VGG_CNN_M_1024\faster_rcnn_end2end\train.prototxt; four places need changes.

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3"  # change to your number of classes + 1
  }
}

layer {
  name: 'roi-data'
  type: 'Python'
  bottom: 'rpn_rois'
  bottom: 'gt_boxes'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'rpn.proposal_target_layer'
    layer: 'ProposalTargetLayer'
    param_str: "'num_classes': 3"  # change to your number of classes + 1
  }
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 3  # change to your number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 12  # change to (your number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
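The two num_output values are plain arithmetic on the class count. For the two foreground classes this post uses later ('cn-character' and 'seal'):

```python
foreground_classes = ['cn-character', 'seal']   # your own labels
num_classes = len(foreground_classes) + 1       # +1 for __background__
cls_score_num_output = num_classes              # cls_score: one score per class
bbox_pred_num_output = 4 * num_classes          # bbox_pred: 4 box coords per class
print(cls_score_num_output, bbox_pred_num_output)  # → 3 12
```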
Then edit models\pascal_voc\VGG_CNN_M_1024\faster_rcnn_end2end\test.prototxt:

layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 3  # change to your number of classes + 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 12  # change to (your number of classes + 1) * 4
    weight_filler {
      type: "gaussian"
      std: 0.001
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
You can also tune the learning rate and other training parameters in the solver file; I won't cover that here.

================== The changes below are to files under lib ==================

2. Edit imdb.py

def append_flipped_images(self):
    num_images = self.num_images
    widths = [PIL.Image.open(self.image_path_at(i)).size[0]
              for i in xrange(num_images)]
    for i in xrange(num_images):
        boxes = self.roidb[i]['boxes'].copy()
        oldx1 = boxes[:, 0].copy()
        oldx2 = boxes[:, 2].copy()
        boxes[:, 0] = widths[i] - oldx2 - 1
        boxes[:, 2] = widths[i] - oldx1 - 1
        for b in range(len(boxes)):
            if boxes[b][2] < boxes[b][0]:
                boxes[b][0] = 0
        assert (boxes[:, 2] >= boxes[:, 0]).all()
        entry = {'boxes': boxes,
                 'gt_overlaps': self.roidb[i]['gt_overlaps'],
                 'gt_classes': self.roidb[i]['gt_classes'],
                 'flipped': True}
        self.roidb.append(entry)
    self._image_index = self._image_index * 2

Find this function and change it to the version above.
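The loop inserted before the assert is there to stop the common assert failure during flipping: a frequent cause is a box whose xmin is 0 in the XML, which can underflow during the 1-based-to-0-based conversion and leave the flipped xmax smaller than the flipped xmin, so the patch clamps xmin back to 0. The transform in isolation (a pure-Python sketch; flip_box is my own helper name):

```python
def flip_box(xmin, ymin, xmax, ymax, width):
    # Horizontally mirror a (xmin, ymin, xmax, ymax) box in an image
    # of the given width, with the same clamp as the patch above.
    nx1 = width - xmax - 1
    nx2 = width - xmin - 1
    if nx2 < nx1:     # degenerate box instead of a crashed assert
        nx1 = 0
    return nx1, ymin, nx2, ymax

print(flip_box(10, 5, 60, 40, width=100))  # → (39, 5, 89, 40)
```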

3. Edit the five files in the rpn layer

In the rpn directory under lib, change every occurrence of param_str_ in those files to param_str.


4. Edit config.py

Change the training and testing proposals to ground truth:

# Train using these proposals
__C.TRAIN.PROPOSAL_METHOD = 'gt'
# Test using these proposals
__C.TEST.PROPOSAL_METHOD = 'gt'
5. Edit pascal_voc.py

Since we train with the VOC layout, this is the main file to adapt.

def __init__(self, image_set, year, devkit_path=None):
    imdb.__init__(self, 'voc_' + year + '_' + image_set)
    self._year = year
    self._image_set = image_set
    self._devkit_path = self._get_default_path() if devkit_path is None \
                        else devkit_path
    self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
    self._classes = ('__background__',  # always index 0
                     'cn-character', 'seal')
    self._class_to_ind = dict(zip(self.classes, xrange(self.num_classes)))
    self._image_ext = '.jpg'
    self._image_index = self._load_image_set_index()
    # Default to roidb handler
    self._roidb_handler = self.selective_search_roidb
    self._salt = str(uuid.uuid4())
    self._comp_id = 'comp4'

In self._classes, '__background__' is the background class; leave it alone and replace the entries after it with your own labels.

Also modify the following two methods, otherwise the test step will certainly fail.

def _get_voc_results_file_template(self):
    # VOCdevkit/results/VOC2007/Main/<comp_id>_det_test_aeroplane.txt
    filename = self._get_comp_id() + '_det_' + self._image_set + '_{:s}.txt'
    path = os.path.join(
        self._devkit_path,
        'VOC' + self._year,
        'ImageSets',
        'Main',
        '{}' + '_test.txt')
    return path
def _write_voc_results_file(self, all_boxes):
    for cls_ind, cls in enumerate(self.classes):
        if cls == '__background__':
            continue
        print 'Writing {} VOC results file'.format(cls)
        filename = self._get_voc_results_file_template().format(cls)
        with open(filename, 'w+') as f:
            for im_ind, index in enumerate(self.image_index):
                dets = all_boxes[cls_ind][im_ind]
                if dets == []:
                    continue
                # the VOCdevkit expects 1-based indices
                for k in xrange(dets.shape[0]):
                    f.write('{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}\n'.
                            format(index, dets[k, -1],
                                   dets[k, 0] + 1, dets[k, 1] + 1,
                                   dets[k, 2] + 1, dets[k, 3] + 1))
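Each line written by _write_voc_results_file follows the VOCdevkit results format: image id, confidence score, then the box in 1-based coordinates. The formatting step in isolation (format_detection is my own helper name, for illustration):

```python
def format_detection(index, score, box_0based):
    # One results-file line: image id, confidence, and the box shifted
    # from 0-based to the 1-based indices the VOCdevkit expects.
    x1, y1, x2, y2 = box_0based
    return '{:s} {:.3f} {:.1f} {:.1f} {:.1f} {:.1f}'.format(
        index, score, x1 + 1, y1 + 1, x2 + 1, y2 + 1)

line = format_detection('000012', 0.987, (10.0, 20.0, 110.0, 220.0))
print(line)  # → 000012 0.987 11.0 21.0 111.0 221.0
```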
III. End-to-end training

1. Delete the cache files

Before every training run, delete the files under data\cache and data\VOCdevkit2007\annotations_cache.

2. Start training

Open a git bash in the py-faster-rcnn root directory and run:

./experiments/scripts/faster_rcnn_end2end.sh 0 VGG_CNN_M_1024 pascal_voc

You can of course adjust your own training parameters in experiments\scripts\faster_rcnn_end2end.sh, or train with the VGG16 or ZF models instead. Here I'll stick with the default parameters.

Once the training log starts scrolling, training is underway. vgg1024 trains quite quickly, though it depends on your hardware; on a 1080 Ti it takes about 85 minutes. I didn't let mine run to completion.

IV. Testing

1. Create your own demo.py

The easiest route is to copy the existing demo.py, change its labels to your own, and point it at your own model.

Here are the class and model parts of my demo, for reference:

CLASSES = ('__background__',
           'cn-character', 'seal')

NETS = {'vgg16': ('VGG16',
                  'vgg16_faster_rcnn_iter_70000.caffemodel'),
        'vgg1024': ('VGG_CNN_M_1024',
                    'vgg_cnn_m_1024_faster_rcnn_iter_70000.caffemodel'),
        'zf': ('ZF',
               'ZF_faster_rcnn_final.caffemodel')}
if __name__ == '__main__':
    cfg.TEST.HAS_RPN = True  # Use RPN for proposals

    args = parse_args()

    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                            'faster_rcnn_end2end', 'test.prototxt')
    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                              NETS[args.demo_net][1])

    if not os.path.isfile(caffemodel):
        raise IOError(('{:s} not found.\nDid you run ./data/script/'
                       'fetch_faster_rcnn_models.sh?').format(caffemodel))

    if args.cpu_mode:
        caffe.set_mode_cpu()
    else:
        caffe.set_mode_gpu()
        caffe.set_device(args.gpu_id)
        cfg.GPU_ID = args.gpu_id
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)

    print '\n\nLoaded network {:s}'.format(caffemodel)

    # Warmup on a dummy image
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in xrange(2):
        _, _ = im_detect(net, im)

    im_names = ['f1.jpg', 'f8.jpg', 'f7.jpg', 'f6.jpg', 'f5.jpg',
                'f4.jpg', 'f3.jpg', 'f2.jpg']
    for im_name in im_names:
        print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
        print 'Demo for data/demo/{}'.format(im_name)
        demo(net, im_name)

    plt.show()
List the images you want to test in im_names, and put the images in the data\demo folder.

2. The trained model

Copy the caffemodel you just trained from output\ into data\faster_rcnn_models.

3. Results

Run your own demo.py to get the results.

My initial results on this Chinese character recognition task are decent, but still need further improvement.
