Reading the py-faster-rcnn code: roidb.py

Published: 2023/12/10

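roidb is a fairly complex data structure that stores the RoI information of the dataset. The original roidb comes from the dataset itself; get_training_roidb(imdb) in train.py enlarges it with horizontally flipped copies of the images, and then prepare_roidb(imdb) (defined in roidb.py) adds some derived, descriptive attributes to each entry.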
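To make the flipping step concrete, here is a minimal sketch of how the boxes of a horizontally flipped image can be derived. It assumes boxes are stored as (x1, y1, x2, y2) in 0-based inclusive pixel coordinates; flip_boxes is a name made up for this note, not a function from the repository.

```python
import numpy as np

def flip_boxes(boxes, width):
    """Mirror (x1, y1, x2, y2) boxes around the vertical axis of an image.

    Hypothetical helper: assumes 0-based, inclusive pixel coordinates.
    """
    flipped = boxes.copy()
    old_x1 = boxes[:, 0].copy()
    old_x2 = boxes[:, 2].copy()
    # The old right edge becomes the new left edge, and vice versa.
    flipped[:, 0] = width - old_x2 - 1
    flipped[:, 2] = width - old_x1 - 1
    assert (flipped[:, 2] >= flipped[:, 0]).all()
    return flipped

boxes = np.array([[10, 20, 30, 40]])
print(flip_boxes(boxes, width=100))  # [[69 20 89 40]]
```

Flipping twice gives back the original boxes, which is a quick sanity check for the coordinate convention.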

Here are my notes on the roidb structure for now; I may correct them as I read further:

roidb is a list of dicts; roidb[img_index] holds the RoI information for the image with that index. Taking roidb[img_index] as the example:

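Keys and values in roidb[img_index]:

key            value
boxes          box coordinates, a box_num x 4 np array
gt_overlaps    overlap (IoU) of every box with each class, a box_num x class_num matrix
gt_classes     ground-truth class of every box, an array of length box_num
flipped        whether the image is horizontally flipped
image          path of the image, a string
width          image width
height         image height
max_overlaps   for each box, the maximum overlap over all classes, length box_num
max_classes    for each box, the class with the maximum overlap, length box_num
bbox_targets   for each box, its class plus the 4 offsets to the closest gt box, a box_num x 5 array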
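As an illustration of the keys listed above, here is a small hand-made entry. All values are invented, and gt_overlaps is a dense array here for simplicity (an assumption of this sketch). The last two lines show how max_overlaps and max_classes follow from gt_overlaps.

```python
import numpy as np

# Hypothetical image with 3 boxes and 4 classes (class 0 = background).
gt_overlaps = np.array([
    [0.0, 1.0, 0.0, 0.0],  # ground-truth box of class 1
    [0.0, 0.0, 0.0, 1.0],  # ground-truth box of class 3
    [0.0, 0.6, 0.0, 0.0],  # proposal overlapping the class-1 gt box (IoU 0.6)
], dtype=np.float32)

entry = {
    'boxes': np.array([[10, 10, 50, 60],
                       [30, 40, 90, 120],
                       [12, 8, 55, 58]]),
    'gt_overlaps': gt_overlaps,
    'gt_classes': np.array([1, 3, 0], dtype=np.int32),
    'flipped': False,
    'image': '/path/to/image.jpg',  # placeholder path
    'width': 200,
    'height': 150,
}

# Derived attributes: best overlap per box and the class that achieves it.
entry['max_overlaps'] = gt_overlaps.max(axis=1)     # 1.0, 1.0, 0.6
entry['max_classes'] = gt_overlaps.argmax(axis=1)   # 1, 3, 1
```

Ground-truth boxes end up with max_overlaps equal to 1, which is what _compute_targets later relies on to find them.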

# Imports this excerpt relies on
import numpy as np
from fast_rcnn.config import cfg
from fast_rcnn.bbox_transform import bbox_transform
from utils.cython_bbox import bbox_overlaps

def add_bbox_regression_targets(roidb):
    """Add information needed to train bounding-box regressors."""
    assert len(roidb) > 0
    assert 'max_classes' in roidb[0], 'Did you call prepare_roidb first?'

    num_images = len(roidb)
    # Infer number of classes from the number of columns in gt_overlaps
    # (roidb[0] holds the RoIs of image 0; gt_overlaps has one column per class)
    num_classes = roidb[0]['gt_overlaps'].shape[1]
    for im_i in xrange(num_images):
        rois = roidb[im_i]['boxes']
        max_overlaps = roidb[im_i]['max_overlaps']
        max_classes = roidb[im_i]['max_classes']
        # bbox_targets: the class of each box plus the 4 offsets
        # to its closest gt box
        roidb[im_i]['bbox_targets'] = \
                _compute_targets(rois, max_overlaps, max_classes)

    # This option is False in the config used here
    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:
        # Use fixed / precomputed "means" and "stds" instead of empirical values
        means = np.tile(
                np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS), (num_classes, 1))
        stds = np.tile(
                np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS), (num_classes, 1))
    else:
        # Compute values needed for means and stds
        # var(x) = E(x^2) - E(x)^2
        # Number of boxes seen per class; cfg.EPS avoids division by zero
        class_counts = np.zeros((num_classes, 1)) + cfg.EPS
        # 21 classes x 4 offsets: each box adds its 4 targets
        # to the row of its class
        sums = np.zeros((num_classes, 4))
        # 21 classes x 4 offsets: each box adds its 4 squared targets
        # to the row of its class
        squared_sums = np.zeros((num_classes, 4))
        for im_i in xrange(num_images):
            targets = roidb[im_i]['bbox_targets']
            for cls in xrange(1, num_classes):
                cls_inds = np.where(targets[:, 0] == cls)[0]
                # Accumulate the boxes whose class matches cls
                if cls_inds.size > 0:
                    class_counts[cls] += cls_inds.size
                    sums[cls, :] += targets[cls_inds, 1:].sum(axis=0)
                    squared_sums[cls, :] += \
                            (targets[cls_inds, 1:] ** 2).sum(axis=0)

        means = sums / class_counts  # mean
        stds = np.sqrt(squared_sums / class_counts - means ** 2)  # std dev

    print 'bbox target means:'
    print means
    print means[1:, :].mean(axis=0)  # ignore bg class
    print 'bbox target stdevs:'
    print stds
    print stds[1:, :].mean(axis=0)  # ignore bg class

    # Normalize targets (for every box)
    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS:
        print "Normalizing targets"
        for im_i in xrange(num_images):
            targets = roidb[im_i]['bbox_targets']
            for cls in xrange(1, num_classes):
                cls_inds = np.where(targets[:, 0] == cls)[0]
                roidb[im_i]['bbox_targets'][cls_inds, 1:] -= means[cls, :]
                roidb[im_i]['bbox_targets'][cls_inds, 1:] /= stds[cls, :]
    else:
        print "NOT normalizing targets"

    # These values will be needed for making predictions
    # (the predicts will need to be unnormalized and uncentered)
    return means.ravel(), stds.ravel()  # ravel() flattens to 1-D

def _compute_targets(rois, overlaps, labels):
    # rois only contains the boxes of the current image
    """Compute bounding-box regression targets for an image."""
    # Indices of ground-truth ROIs
    gt_inds = np.where(overlaps == 1)[0]
    if len(gt_inds) == 0:
        # Bail if the image has no ground-truth ROIs
        # (return an all-zero array)
        return np.zeros((rois.shape[0], 5), dtype=np.float32)
    # Indices of examples for which we try to make predictions:
    # only ROIs whose overlap with a gt box reaches the threshold
    # are used as training samples for bbox regression
    ex_inds = np.where(overlaps >= cfg.TRAIN.BBOX_THRESH)[0]

    # Get IoU overlap between each ex ROI and gt ROI
    # (the arrays are converted to float first)
    ex_gt_overlaps = bbox_overlaps(
        np.ascontiguousarray(rois[ex_inds, :], dtype=np.float),
        np.ascontiguousarray(rois[gt_inds, :], dtype=np.float))

    # Find which gt ROI each ex ROI has max overlap with:
    # this will be the ex ROI's gt target
    # (each row is an ex_roi, each column a gt_roi, each value their IoU)
    gt_assignment = ex_gt_overlaps.argmax(axis=1)  # index of the row maximum
    gt_rois = rois[gt_inds[gt_assignment], :]  # gt roi matched to each ex_roi
    ex_rois = rois[ex_inds, :]

    targets = np.zeros((rois.shape[0], 5), dtype=np.float32)
    targets[ex_inds, 0] = labels[ex_inds]  # column 0 is the label
    # columns 1 to 4 are the offsets between ex_box and gt_box
    targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
    return targets

Let's walk through these two functions.

1. _compute_targets(rois, overlaps, labels)

This function computes the regression offsets for the RoIs. It first checks whether the image has any ground-truth RoIs, which are identified by an overlap of exactly 1.

It then selects the boxes whose overlap reaches a threshold (cfg.TRAIN.BBOX_THRESH) and computes targets only for those.
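These two selection steps can be tried on a toy overlap vector. This is only a sketch: BBOX_THRESH is a local stand-in for cfg.TRAIN.BBOX_THRESH, and 0.5 is an assumed value for the example.

```python
import numpy as np

BBOX_THRESH = 0.5  # assumed stand-in for cfg.TRAIN.BBOX_THRESH

# max_overlaps of five hypothetical boxes in one image
overlaps = np.array([1.0, 0.7, 0.3, 1.0, 0.5])

# Ground-truth boxes overlap themselves perfectly.
gt_inds = np.where(overlaps == 1)[0]            # [0, 3]
# Boxes that are close enough to some gt box to be regression examples.
ex_inds = np.where(overlaps >= BBOX_THRESH)[0]  # [0, 1, 3, 4]
```

Note that the ground-truth boxes themselves also pass the threshold, so they appear in ex_inds as well.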


For the boxes that qualify, bbox_overlaps is called to recompute the overlap of each box with every ground-truth box, and each box is matched to the ground-truth box it overlaps most.
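The matching step is just a row-wise argmax over the IoU matrix. A small sketch with made-up numbers (three example boxes, two ground-truth boxes located at rows 0 and 3 of the image's box array):

```python
import numpy as np

# Rows: example boxes, columns: ground-truth boxes, values: IoU (invented).
ex_gt_overlaps = np.array([
    [0.90, 0.10],
    [0.20, 0.65],
    [0.00, 1.00],
])
gt_inds = np.array([0, 3])  # positions of the gt boxes in the full box array

# For each example box, the column of its best-overlapping gt box ...
gt_assignment = ex_gt_overlaps.argmax(axis=1)   # [0, 1, 1]
# ... mapped back to row indices of the full box array.
matched_gt_rows = gt_inds[gt_assignment]        # [0, 3, 3]
```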

bbox_transform from fast_rcnn.bbox_transform then computes the four offsets (the x and y offsets of the center point, and the width and height offsets).
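For reference, here is a sketch of the usual R-CNN style box parameterization that such a transform computes. It is written from the standard formulas, not copied from the repository, so treat bbox_transform_sketch as an illustrative name and the details (such as the +1 in widths) as assumptions.

```python
import numpy as np

def bbox_transform_sketch(ex_rois, gt_rois):
    """Offsets (dx, dy, dw, dh) that map each ex box onto its gt box."""
    ex_w = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_h = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    ex_cx = ex_rois[:, 0] + 0.5 * ex_w
    ex_cy = ex_rois[:, 1] + 0.5 * ex_h

    gt_w = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_h = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_cx = gt_rois[:, 0] + 0.5 * gt_w
    gt_cy = gt_rois[:, 1] + 0.5 * gt_h

    # Center shifts are measured in units of the example box size;
    # size changes are log ratios.
    dx = (gt_cx - ex_cx) / ex_w
    dy = (gt_cy - ex_cy) / ex_h
    dw = np.log(gt_w / ex_w)
    dh = np.log(gt_h / ex_h)
    return np.stack([dx, dy, dw, dh], axis=1)

ex = np.array([[0.0, 0.0, 9.0, 9.0]])
gt = np.array([[5.0, 0.0, 14.0, 19.0]])
offsets = bbox_transform_sketch(ex, gt)  # dx = 0.5, dy = 0.5, dw = 0, dh = log(2)
```

An example box that coincides with its ground-truth box gets all-zero offsets.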

The output is a 2-D array with one row per box and five columns: column 0 is the class and columns 1 to 4 are the offsets.

The code of bbox_overlaps is as follows:

def bbox_overlaps(
        np.ndarray[DTYPE_t, ndim=2] boxes,
        np.ndarray[DTYPE_t, ndim=2] query_boxes):
    """
    Parameters
    ----------
    boxes: (N, 4) ndarray of float
    query_boxes: (K, 4) ndarray of float
    Returns
    -------
    overlaps: (N, K) ndarray of overlap between boxes and query_boxes
    """
    cdef unsigned int N = boxes.shape[0]
    cdef unsigned int K = query_boxes.shape[0]
    cdef np.ndarray[DTYPE_t, ndim=2] overlaps = np.zeros((N, K), dtype=DTYPE)
    cdef DTYPE_t iw, ih, box_area
    cdef DTYPE_t ua
    cdef unsigned int k, n
    for k in range(K):
        box_area = (
            (query_boxes[k, 2] - query_boxes[k, 0] + 1) *
            (query_boxes[k, 3] - query_boxes[k, 1] + 1)
        )
        for n in range(N):
            iw = (
                min(boxes[n, 2], query_boxes[k, 2]) -
                max(boxes[n, 0], query_boxes[k, 0]) + 1
            )
            if iw > 0:
                ih = (
                    min(boxes[n, 3], query_boxes[k, 3]) -
                    max(boxes[n, 1], query_boxes[k, 1]) + 1
                )
                if ih > 0:
                    ua = float(
                        (boxes[n, 2] - boxes[n, 0] + 1) *
                        (boxes[n, 3] - boxes[n, 1] + 1) +
                        box_area - iw * ih
                    )
                    overlaps[n, k] = iw * ih / ua
    return overlaps
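To experiment with the overlap computation without compiling Cython, the same IoU can be sketched in plain NumPy with broadcasting. bbox_overlaps_np is a name invented for this note; it is meant to follow the loop above (including the +1 pixel convention) but is not part of the repository.

```python
import numpy as np

def bbox_overlaps_np(boxes, query_boxes):
    """IoU between (N, 4) boxes and (K, 4) query_boxes, returned as (N, K)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    query_boxes = np.asarray(query_boxes, dtype=np.float64)

    area_b = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    area_q = ((query_boxes[:, 2] - query_boxes[:, 0] + 1) *
              (query_boxes[:, 3] - query_boxes[:, 1] + 1))

    # Intersection width and height for every (box, query) pair.
    iw = (np.minimum(boxes[:, None, 2], query_boxes[None, :, 2]) -
          np.maximum(boxes[:, None, 0], query_boxes[None, :, 0]) + 1)
    ih = (np.minimum(boxes[:, None, 3], query_boxes[None, :, 3]) -
          np.maximum(boxes[:, None, 1], query_boxes[None, :, 1]) + 1)
    # Negative extents mean the boxes do not intersect.
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)

    union = area_b[:, None] + area_q[None, :] - inter
    return inter / union

boxes = [[0, 0, 9, 9]]
queries = [[0, 0, 9, 9], [5, 5, 14, 14], [20, 20, 29, 29]]
ov = bbox_overlaps_np(boxes, queries)  # 1.0, 25/175, 0.0
```

The second query box shares a 5 x 5 pixel patch with the first box, so the IoU is 25 / (100 + 100 - 25).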

2. add_bbox_regression_targets

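It mainly does two things:

    1. Computes the regression offsets, bbox_targets, for the boxes of every image in roidb.

    2. Computes the mean and standard deviation of the offsets for every class. The resulting matrices are 2-D: rows are classes (21 here) and columns are offsets (4 here).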
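The per-class statistics come from running sums and the identity var(x) = E(x^2) - E(x)^2. Here is a toy version with three classes and invented targets; EPS is a local stand-in for cfg.EPS with an assumed value.

```python
import numpy as np

EPS = 1e-14        # assumed stand-in for cfg.EPS
num_classes = 3    # class 0 is background

# Columns: class, dx, dy, dw, dh (values invented for illustration).
targets = np.array([
    [1,  0.1, 0.2, 0.0, -0.1],
    [1,  0.3, 0.0, 0.2,  0.1],
    [2, -0.2, 0.4, 0.1,  0.0],
    [2,  0.2, 0.0, 0.3,  0.2],
])

class_counts = np.zeros((num_classes, 1)) + EPS
sums = np.zeros((num_classes, 4))
squared_sums = np.zeros((num_classes, 4))
for cls in range(1, num_classes):
    cls_inds = np.where(targets[:, 0] == cls)[0]
    if cls_inds.size > 0:
        class_counts[cls] += cls_inds.size
        sums[cls, :] += targets[cls_inds, 1:].sum(axis=0)
        squared_sums[cls, :] += (targets[cls_inds, 1:] ** 2).sum(axis=0)

means = sums / class_counts                                # shape (3, 4)
stds = np.sqrt(squared_sums / class_counts - means ** 2)   # shape (3, 4)
```

For class 1 this agrees with targets[:2, 1:].mean(axis=0) and targets[:2, 1:].std(axis=0). The running-sum form is convenient because it can be accumulated image by image.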

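When target normalization is enabled (cfg.TRAIN.BBOX_NORMALIZE_TARGETS is True), the per-class mean is subtracted from every offset and the result is divided by the per-class standard deviation. The function returns the means and standard deviations flattened into 1-D vectors, which are needed at prediction time for the inverse operation (un-normalizing and un-centering the predictions).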
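The normalization and its inverse are a simple affine pair. A sketch for the four offsets of a single class, with made-up means and standard deviations:

```python
import numpy as np

# Hypothetical statistics for one class (dx, dy, dw, dh).
means = np.array([0.01, 0.02, 0.00, -0.01])
stds = np.array([0.10, 0.10, 0.20, 0.20])

raw_target = np.array([0.05, -0.10, 0.30, 0.00])

# Training side: center and scale the regression target.
normalized = (raw_target - means) / stds

# Prediction side: undo the scaling, then undo the centering.
restored = normalized * stds + means
```

A network trained on normalized targets outputs values on the normalized scale, so the same inverse has to be applied to its predictions before they are turned back into boxes.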



