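RepPoints: object contour point sets generated by deformable convolution
Paper: RepPoints: Point Set Representation for Object Detection
Summary of the idea:
The RepPoints representation
Traditional object detection represents an object with a 4-D vector $(x, y, w, h)$: the coordinates of the object's center point plus the width and height of its box. RepPoints instead represents the object with a set of points $\mathcal{R} = \{(x_k, y_k)\}_{k=1}^{n}$, where $n$ is the number of sample points (set to 9 in the paper). $n$ should be a perfect square.
As the figure shows, after the backbone network extracts features, the RepPointsHead turns them into 9 contour points for the object. These 9 points define a pseudo box around the object, which is then converted into a conventional bbox.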
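A review of traditional multi-stage object detection
The traditional two-stage detection pipeline:
Bounding box regression versus point set regression
Progressively refining both the bounding box localization and the feature extraction is critical to the success of multi-stage detectors.
For the bbox representation:
A 4-D regression vector $(\Delta x_p, \Delta y_p, \Delta w_p, \Delta h_p)$ maps the current bounding box proposal $\mathcal{B}_p = (x_p, y_p, w_p, h_p)$ to the refined box
$$\mathcal{B}_r = (x_p + w_p \Delta x_p,\; y_p + h_p \Delta y_p,\; w_p e^{\Delta w_p},\; h_p e^{\Delta h_p}).$$
Given the ground truth bounding box $\mathcal{B}_t = (x_t, y_t, w_t, h_t)$, the loss drives $\mathcal{B}_r$ toward $\mathcal{B}_t$, so the 4-D regression target is
$$\hat{\mathcal{F}}(\mathcal{B}_p, \mathcal{B}_t) = \left(\frac{x_t - x_p}{w_p},\; \frac{y_t - y_p}{h_p},\; \log\frac{w_t}{w_p},\; \log\frac{h_t}{h_p}\right).$$
For the RepPoints representation, the refinement is
$$\mathcal{R}_r = \{(x_k + \Delta x_k,\; y_k + \Delta y_k)\}_{k=1}^{n},$$
where $\{(\Delta x_k, \Delta y_k)\}_{k=1}^{n}$ are the predicted offsets of the points. The network therefore only has to learn the offsets, which are then added to the original point coordinates.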
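To make the contrast concrete, here is a minimal pure-Python sketch of the two refinement rules. The function names (`refine_bbox`, `bbox_target`, `refine_points`) and the sample numbers are made up for illustration; boxes are assumed to be in (center x, center y, width, height) form.

```python
import math


def refine_bbox(proposal, deltas):
    """Apply a 4-D regression vector to a (cx, cy, w, h) proposal."""
    x_p, y_p, w_p, h_p = proposal
    dx, dy, dw, dh = deltas
    return (x_p + w_p * dx, y_p + h_p * dy,
            w_p * math.exp(dw), h_p * math.exp(dh))


def bbox_target(proposal, gt):
    """4-D regression target that maps the proposal exactly onto the gt box."""
    x_p, y_p, w_p, h_p = proposal
    x_t, y_t, w_t, h_t = gt
    return ((x_t - x_p) / w_p, (y_t - y_p) / h_p,
            math.log(w_t / w_p), math.log(h_t / h_p))


def refine_points(points, offsets):
    """Point set refinement: add one (dx, dy) offset to every point."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, offsets)]


# Illustrative values only
proposal = (50.0, 40.0, 20.0, 10.0)
gt = (54.0, 42.0, 40.0, 5.0)
target = bbox_target(proposal, gt)
refined = refine_bbox(proposal, target)  # lands on gt up to rounding

points = [(50.0, 40.0), (50.0, 40.0), (50.0, 40.0)]
offsets = [(-5.0, -3.0), (0.0, 0.0), (5.0, 3.0)]
moved = refine_points(points, offsets)
```

The bbox rule scales the center shift by the proposal size and predicts width and height in log space, while the point rule is a plain addition per point.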
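RPDet: an anchor-free detector based on RepPoints
Its pipeline is shown in the figure below:
The main structure of the RPDet head is shown in the figure:
The localization subnet and the classification subnet both take as input the same image features extracted by the FPN backbone.
The key to generating RepPoints from a center point is the 3 × 3 deformable convolution in the localization subnet: its sampling positions, and therefore its receptive field over the object, are learned automatically.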
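The link between the point set and the deformable convolution can be sketched in a few lines of pure Python. This is an illustration under assumptions, not the library code: the helper names are invented here, offsets are stored as (y, x) pairs, and the learned offsets are taken to be relative to the center point.

```python
def base_offsets(kernel_size):
    """Regular sampling grid of a k x k convolution as (y, x) pairs, row by row."""
    pad = (kernel_size - 1) // 2
    axis = range(-pad, pad + 1)
    return [(float(y), float(x)) for y in axis for x in axis]


def sampling_positions(center, learned_offsets):
    """Absolute (y, x) positions: center point plus each learned offset."""
    cy, cx = center
    return [(cy + dy, cx + dx) for dy, dx in learned_offsets]


def dcn_offsets(learned_offsets, base):
    """A deformable conv expects offsets relative to its regular grid,
    so the base grid is subtracted from the learned offsets."""
    return [(dy - by, dx - bx)
            for (dy, dx), (by, bx) in zip(learned_offsets, base)]


base = base_offsets(3)  # 9 grid positions for a 3 x 3 kernel

# If the learned points coincide with the grid, the deformable conv
# degenerates to an ordinary 3 x 3 convolution (all offsets are zero).
identity = dcn_offsets(base, base)

# A made-up point set stretched 2x vertically and 3x horizontally
stretched = [(2 * y, 3 * x) for y, x in base]
positions = sampling_positions((10.0, 20.0), stretched)
```

With nine points and a 3 × 3 kernel, each kernel tap samples the feature map at exactly one of the predicted points.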
Three ways to convert RepPoints into a bbox:
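The paper names the three conversions min-max, partial min-max and moment-based; the config discussed in the code analysis selects `transform_method='moment'`. The sketch below is a simplified pure-Python illustration, not the mmdetection code. Which points form the subset in the partial variant, the use of the population standard deviation, and the `scale_x` / `scale_y` multipliers (standing in for the learnable scale factors) are assumptions made here.

```python
import math


def minmax_box(points):
    """Tightest axis-aligned box (x1, y1, x2, y2) around all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def partial_minmax_box(points, subset=(0, 1, 2, 3)):
    """Min-max over a subset of the points (subset choice is an assumption)."""
    return minmax_box([points[i] for i in subset])


def moment_box(points, scale_x=1.0, scale_y=1.0):
    """Box centered on the mean, with half-size = std * scale multiplier."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Population standard deviation (an assumption of this sketch)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in points) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in points) / n)
    half_w, half_h = std_x * scale_x, std_y * scale_y
    return (mean_x - half_w, mean_y - half_h,
            mean_x + half_w, mean_y + half_h)


# Nine made-up points on a regular grid
points = [(float(x), float(y)) for y in (10, 20, 30) for x in (0, 10, 20)]
full = minmax_box(points)
partial = partial_minmax_box(points)
moment = moment_box(points)
```

Min-max is sensitive to a single outlying point, whereas the moment-based box depends on all points through their mean and spread.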
Computing the loss:
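The config in the code analysis names the loss types: FocalLoss (gamma=2.0, alpha=0.25) for classification and SmoothL1Loss (beta=0.11) for both the initial and the refined localization. As a rough sketch, the standard per-element definitions of these two losses look like this. How they are weighted, reduced and matched to targets in RPDet is not covered here, and the function names are made up for illustration.

```python
import math


def smooth_l1(pred, target, beta=0.11):
    """Smooth L1: quadratic below beta, linear above it."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def sigmoid_focal_loss(prob, is_positive, gamma=2.0, alpha=0.25):
    """Focal loss on a sigmoid probability for one class of one location."""
    if is_positive:
        return -alpha * (1.0 - prob) ** gamma * math.log(prob)
    return -(1.0 - alpha) * prob ** gamma * math.log(1.0 - prob)


small_error = smooth_l1(0.05, 0.0)   # quadratic regime
large_error = smooth_l1(1.0, 0.0)    # linear regime
easy_negative = sigmoid_focal_loss(0.01, is_positive=False)
hard_negative = sigmoid_focal_loss(0.9, is_positive=False)
```

The focal term down-weights easy examples: the well-classified negative contributes far less than the misclassified one.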
Code analysis
The RPDet code is available at https://github.com/microsoft/RepPoints and has been merged into the mmdetection framework, so let's look at the code in mmdetection:
Config file: configs/reppoints/reppoints_moment_r50_fpn_1x.py
```python
# model definition
# norm_cfg is defined earlier in the config file and is not shown in this excerpt.
model = dict(
    type='RepPointsDetector',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs=True,
        num_outs=5,
        norm_cfg=norm_cfg),
    bbox_head=dict(
        type='RepPointsHead',
        num_classes=81,
        in_channels=256,
        feat_channels=256,
        point_feat_channels=256,
        stacked_convs=3,
        num_points=9,
        gradient_mul=0.1,
        point_strides=[8, 16, 32, 64, 128],
        point_base_scale=4,
        norm_cfg=norm_cfg,
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox_init=dict(type='SmoothL1Loss', beta=0.11, loss_weight=0.5),
        loss_bbox_refine=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0),
        transform_method='moment'))
```
The backbone is ResNet + FPN, which extracts multi-scale image features in the usual way.
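As a quick sanity check on the config, the sketch below derives the FPN output strides from `start_level=1` and `num_outs=5` and compares them with `point_strides`. It assumes the usual ResNet stage strides of 4, 8, 16 and 32 for `out_indices=(0, 1, 2, 3)`, that each extra FPN level halves the resolution, and a hypothetical 800 × 1344 input; the helper names are invented for this example.

```python
import math


def fpn_output_strides(backbone_strides, start_level, num_outs):
    """Strides of the FPN outputs: skip levels before start_level,
    then append stride-2 extra levels until num_outs is reached."""
    strides = list(backbone_strides[start_level:])
    while len(strides) < num_outs:
        strides.append(strides[-1] * 2)
    return strides


def feature_map_sizes(height, width, strides):
    """Approximate (rows, cols) of each level for a given input size."""
    return [(math.ceil(height / s), math.ceil(width / s)) for s in strides]


strides = fpn_output_strides([4, 8, 16, 32], start_level=1, num_outs=5)
sizes = feature_map_sizes(800, 1344, strides)  # hypothetical input size
```

Every location on each of these maps is a candidate center point from which the head predicts `num_points=9` points.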
With the structure diagram of the RepPoints head in mind, let's see how the two subnets do their work: mmdet/models/anchor_heads/reppoints_head.py
```python
import numpy as np
import torch
import torch.nn as nn

# HEADS, ConvModule and DeformConv come from mmdetection itself;
# their imports are omitted in this excerpt.


@HEADS.register_module
class RepPointsHead(nn.Module):

    def __init__(self, ...):  # signature and part of the initialization omitted
        # we use deformable conv to extract points features
        # The DCN kernel size follows from the number of points, i.e. the
        # receptive field of a single DCN represents the object contour.
        self.dcn_kernel = int(np.sqrt(num_points))
        self.dcn_pad = int((self.dcn_kernel - 1) / 2)
        assert self.dcn_kernel * self.dcn_kernel == num_points, \
            "The points number should be a square number."
        assert self.dcn_kernel % 2 == 1, \
            "The points number should be an odd square number."
        # Base (y, x) offsets of the deformable conv's regular sampling grid
        dcn_base = np.arange(-self.dcn_pad,
                             self.dcn_pad + 1).astype(np.float64)
        dcn_base_y = np.repeat(dcn_base, self.dcn_kernel)
        dcn_base_x = np.tile(dcn_base, self.dcn_kernel)
        dcn_base_offset = np.stack([dcn_base_y, dcn_base_x],
                                   axis=1).reshape((-1))
        self.dcn_base_offset = torch.tensor(dcn_base_offset).view(1, -1, 1, 1)
        self._init_layers()

    def _init_layers(self):
        self.relu = nn.ReLU(inplace=True)
        self.cls_convs = nn.ModuleList()
        self.reg_convs = nn.ModuleList()
        # Each of the two subnets stacks 3 x 3 convs (stacked_convs of them,
        # 3 in the config above) for feature extraction
        for i in range(self.stacked_convs):
            chn = self.in_channels if i == 0 else self.feat_channels
            self.cls_convs.append(
                ConvModule(
                    chn,
                    self.feat_channels,
                    3,
                    stride=1,
                    padding=1,
                    conv_cfg=self.conv_cfg,
                    norm_cfg=self.norm_cfg))
            self.reg_convs.append(
                ConvModule(
                    chn,
                    self.feat_channels,
                    3,
                    stride=1,
                    padding=1,
                    conv_cfg=self.conv_cfg,
                    norm_cfg=self.norm_cfg))
        # Layers with which RepPoints learns its offsets through the DCN
        pts_out_dim = 4 if self.use_grid_points else 2 * self.num_points
        self.reppoints_cls_conv = DeformConv(self.feat_channels,
                                             self.point_feat_channels,
                                             self.dcn_kernel, 1, self.dcn_pad)
        self.reppoints_cls_out = nn.Conv2d(self.point_feat_channels,
                                           self.cls_out_channels, 1, 1, 0)
        self.reppoints_pts_init_conv = nn.Conv2d(self.feat_channels,
                                                 self.point_feat_channels, 3,
                                                 1, 1)
        self.reppoints_pts_init_out = nn.Conv2d(self.point_feat_channels,
                                                pts_out_dim, 1, 1, 0)
        self.reppoints_pts_refine_conv = DeformConv(self.feat_channels,
                                                    self.point_feat_channels,
                                                    self.dcn_kernel, 1,
                                                    self.dcn_pad)
        self.reppoints_pts_refine_out = nn.Conv2d(self.point_feat_channels,
                                                  pts_out_dim, 1, 1, 0)

    # Forward pass for a single feature level
    def forward_single(self, x):
        dcn_base_offset = self.dcn_base_offset.type_as(x)
        # If we use center_init, the initial reppoints is from center points.
        # If we use bounding bbox representation, the initial reppoints is
        # from regular grid placed on a pre-defined bbox.
        if self.use_grid_points or not self.center_init:
            scale = self.point_base_scale / 2
            points_init = dcn_base_offset / dcn_base_offset.max() * scale
            bbox_init = x.new_tensor([-scale, -scale, scale,
                                      scale]).view(1, 4, 1, 1)
        else:
            points_init = 0
        cls_feat = x
        pts_feat = x
        for cls_conv in self.cls_convs:
            cls_feat = cls_conv(cls_feat)
        for reg_conv in self.reg_convs:
            pts_feat = reg_conv(pts_feat)
        # initialize reppoints
        pts_out_init = self.reppoints_pts_init_out(
            self.relu(self.reppoints_pts_init_conv(pts_feat)))
        if self.use_grid_points:
            pts_out_init, bbox_out_init = self.gen_grid_from_reg(
                pts_out_init, bbox_init.detach())
        else:
            pts_out_init = pts_out_init + points_init
        # refine and classify reppoints
        # Same forward value as pts_out_init, but the gradient flowing back
        # into it through this path is scaled by gradient_mul
        pts_out_init_grad_mul = (1 - self.gradient_mul) * pts_out_init.detach(
        ) + self.gradient_mul * pts_out_init
        dcn_offset = pts_out_init_grad_mul - dcn_base_offset
        cls_out = self.reppoints_cls_out(
            self.relu(self.reppoints_cls_conv(cls_feat, dcn_offset)))
        pts_out_refine = self.reppoints_pts_refine_out(
            self.relu(self.reppoints_pts_refine_conv(pts_feat, dcn_offset)))
        if self.use_grid_points:
            pts_out_refine, bbox_out_refine = self.gen_grid_from_reg(
                pts_out_refine, bbox_out_init.detach())
        else:
            pts_out_refine = pts_out_refine + pts_out_init.detach()
        return cls_out, pts_out_init, pts_out_refine
```
Summary and tips
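As I understand it, this paper is best read as an application of deformable convolution to object detection: the localization and classification losses supervise the offsets that the deformable convolution learns for each object, which makes what the convolution learns interpretable. It also suggests that deformable convolution could be driven by other kinds of supervision.
How RepPoints handles several objects occluding each other at the same location:
In RPDet, we show that this issue can be greatly alleviated by using the FPN structure [24] for the following reasons: first, objects of different scales will be assigned to different image feature levels, which addresses objects of different scales and the same center points locations; second, FPN has a high-resolution feature map for small objects, which also reduces the chance of two objects having centers located at the same feature position.
The authors argue that the FPN structure addresses this by assigning objects of different scales to different image feature levels.
Whether this can actually resolve occlusion among multiple people in a task such as pedestrian detection remains open to question.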
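The scale argument, and the doubt about it, can be illustrated with a small sketch. It assumes a level assignment of the form floor(log2(sqrt(w * h) / 4)) clamped to pyramid levels 3 to 7 (strides 8 to 128); that rule, the helper names and every number below are illustrative assumptions rather than code taken from the repository.

```python
import math


def assign_level(width, height, base_scale=4.0, min_level=3, max_level=7):
    """Pyramid level for an object of the given size (assumed rule)."""
    level = math.floor(math.log2(math.sqrt(width * height) / base_scale))
    return max(min_level, min(max_level, level))


def feature_cell(center, level):
    """Feature-map cell (col, row) that a center falls into at a level."""
    stride = 2 ** level
    cx, cy = center
    return (int(cx // stride), int(cy // stride))


# Same center, very different scales: separated by the pyramid
small_level = assign_level(32, 32)
large_level = assign_level(256, 256)

# Two hypothetical pedestrians of equal size with nearby centers
level_a = assign_level(60, 160)
level_b = assign_level(60, 160)
cell_a = feature_cell((100, 200), level_a)
cell_b = feature_cell((108, 204), level_b)
```

In this made-up case the two pedestrians land on the same level and in the same cell, which is the kind of situation the scale argument alone does not resolve.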