
RepPoints: Object Contour Point Sets Generated by Deformable Convolution

Published: 2023/12/19


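Paper: RepPoints: Point Set Representation for Object Detection

Summary of the idea:

It replaces the rectangular bounding box that object detection normally uses to represent an object with a point set that traces the object's extent.

After feature extraction, deformable convolution is used to learn offsets from each object's center point, which gives the positions of the points in the set.

It proposes three conversion functions that turn a point set into a rectangular box, so that the detector can be scored with the standard box-based metrics.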

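The RepPoints representation

Traditional object detection represents an object with a 4-D vector $\mathcal{B} = (x, y, w, h)$, whose components are the center-point coordinates of the object and the width and height of its box.

RepPoints instead uses a point set $\mathcal{R} = \{(x_k, y_k)\}_{k=1}^{n}$, where $n$ is the number of sample points (set to 9 in the paper). $n$ should be a perfect square; the head code further requires an odd square, so that the points fill a square deformable-convolution kernel.

As the figure in the original post shows, after the backbone network extracts features, the RepPointsHead turns them into 9 contour points for the object. These 9 points define a pseudo box around the object, which is then converted into a conventional detection bbox.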

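Review of traditional multi-stage object detection

The traditional two-stage detection pipeline:

1. Predefined anchors cover a range of bounding-box scales and aspect ratios.

2. For each anchor, the image feature at its center point is taken as the object feature. It produces a confidence score for whether the anchor contains an object and, through bounding-box regression, a refined bounding box (the bbox proposals).

3. In the second stage, RoI-pooling or RoI-Align extracts object features from the bbox proposals obtained in step 2.

4. Bounding-box regression on these refined features produces the final bounding-box targets.

For multi-stage methods, the refined features are also passed through bounding-box regression to produce intermediate refined bbox proposals (S2). This step can be repeated several times before the final bounding-box targets are generated, correcting the box boundaries each time.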

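Bounding-box regression versus point-set regression

Progressively refining both the bounding-box localization and the extracted features is essential to the success of multi-stage detectors.

For the bbox representation:

A 4-d regression vector $(\Delta x_p, \Delta y_p, \Delta w_p, \Delta h_p)$ maps the current bounding-box proposal $\mathcal{B}_p = (x_p, y_p, w_p, h_p)$ to the refined box

$$\mathcal{B}_r = (x_p + w_p \Delta x_p,\; y_p + h_p \Delta y_p,\; w_p e^{\Delta w_p},\; h_p e^{\Delta h_p}).$$

Given the ground-truth bounding box $\mathcal{B}_t = (x_t, y_t, w_t, h_t)$, the loss should pull $\mathcal{B}_r$ as close to the ground truth as possible, so the 4-d regression target is

$$\hat{\mathcal{F}}(\mathcal{B}_p, \mathcal{B}_t) = \left(\frac{x_t - x_p}{w_p},\; \frac{y_t - y_p}{h_p},\; \log\frac{w_t}{w_p},\; \log\frac{h_t}{h_p}\right),$$

and the loss is taken between the predicted 4-d vector and this target.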
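The box refinement and its regression target can be checked numerically. The following is a minimal sketch in plain Python; the function names `apply_deltas` and `encode_target` are made up for this illustration and are not part of mmdetection. It shows that decoding the encoded target recovers the ground-truth box.

```python
import math


def apply_deltas(proposal, deltas):
    """Refine a proposal (x, y, w, h) with a 4-d regression vector (dx, dy, dw, dh)."""
    x_p, y_p, w_p, h_p = proposal
    dx, dy, dw, dh = deltas
    # The center shifts in units of the proposal size; the size scales exponentially.
    return (x_p + w_p * dx, y_p + h_p * dy, w_p * math.exp(dw), h_p * math.exp(dh))


def encode_target(proposal, gt):
    """Regression target that maps the proposal exactly onto the ground-truth box."""
    x_p, y_p, w_p, h_p = proposal
    x_t, y_t, w_t, h_t = gt
    return ((x_t - x_p) / w_p, (y_t - y_p) / h_p,
            math.log(w_t / w_p), math.log(h_t / h_p))


proposal = (50.0, 40.0, 20.0, 10.0)   # (center x, center y, width, height)
gt = (54.0, 42.0, 30.0, 12.0)
target = encode_target(proposal, gt)
refined = apply_deltas(proposal, target)  # equals gt up to floating-point rounding
```

Because the center offsets are divided by the proposal size, the same pixel error counts for less on a large proposal than on a small one.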

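For the RepPoints representation, refinement is

$$\mathcal{R}_r = \{(x_k + \Delta x_k,\; y_k + \Delta y_k)\}_{k=1}^{n},$$

where $\{(\Delta x_k, \Delta y_k)\}_{k=1}^{n}$ are the predicted offsets of the new sample points relative to the old ones.

So we only need to learn the offsets and add them to the original point coordinates.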
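Point-set refinement is simpler than box refinement because no scale normalization is involved. A minimal sketch, assuming points and offsets are plain (x, y) tuples; `refine_points` and the offset values are invented for illustration:

```python
def refine_points(points, offsets):
    """Add one predicted (dx, dy) offset to each (x, y) sample point."""
    if len(points) != len(offsets):
        raise ValueError("need exactly one offset per point")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, offsets)]


center = (32.0, 48.0)
init_points = [center] * 9   # all nine points start at the object center
# Hypothetical offsets that spread the points over a 12 x 8 region.
offsets = [(dx, dy) for dy in (-4.0, 0.0, 4.0) for dx in (-6.0, 0.0, 6.0)]
refined = refine_points(init_points, offsets)
```

The same function can be applied a second time to the refined points, which mirrors the two-step init and refine scheme of the detector.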

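RPDet: an anchor-free RepPoints detector

The original post illustrates the pipeline with a figure; the steps are:

Center points serve as the initial representation of objects.

Starting from each center point, deformable convolution is used to learn the offsets of the points. For example, to represent an object with 9 point offsets, a 3 × 3 deformable convolution is used. The offsets are then used to regress the object's location.

After two rounds of offset regression and correction with deformable convolution, the RepPoints of the object are formed.

The original post shows the main structure of the RPDet head in a figure:

The localization subnet and the classification subnet both take as input the same image features, extracted by the FPN backbone network.

The key to generating RepPoints from a center point is the 3 × 3 deformable convolution in the localization subnet, whose sampling locations (its receptive field on the object) are learned automatically.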
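A deformable convolution does not take absolute point positions. It takes offsets relative to the regular kernel grid. The sketch below, in plain Python with the hypothetical helpers `dcn_base_offsets` and `to_dcn_offsets`, shows that regular grid in (y, x) order for a 3 × 3 kernel, and how predicted points are turned into offsets by subtracting it, which mirrors the `dcn_offset` line in the head code.

```python
import math


def dcn_base_offsets(num_points):
    """Regular k x k kernel grid as (dy, dx) pairs in row-major order."""
    k = math.isqrt(num_points)
    if k * k != num_points or k % 2 == 0:
        raise ValueError("num_points must be an odd square number")
    pad = (k - 1) // 2
    base = range(-pad, pad + 1)
    return [(float(dy), float(dx)) for dy in base for dx in base]


def to_dcn_offsets(points_yx, base):
    """Offsets handed to the deformable conv: predicted point minus regular grid position."""
    return [(py - by, px - bx) for (py, px), (by, bx) in zip(points_yx, base)]


base = dcn_base_offsets(9)
# If the predicted points coincide with the regular grid, all offsets are zero
# and the deformable convolution samples exactly like an ordinary 3 x 3 conv.
zero_offsets = to_dcn_offsets(base, base)
# Moving the first point to (-3, 2) gives a first offset of (-2, 3).
moved = to_dcn_offsets([(-3.0, 2.0)] + base[1:], base)
```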

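Three ways to convert RepPoints into a bbox:

Min-max function. A min-max operation over both axes of the RepPoints determines Bp, which is equivalent to the bounding box of all sample points.

Partial min-max function. The min-max operation is applied over both axes to a subset of the sample points to obtain the rectangular box.

Moment-based function. The mean and standard deviation of the RepPoints are used to compute the center point and the scale of the rectangular box Bp, where the scale is multiplied by the globally shared learnable multipliers λx and λy. (This is the default in the code.)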
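The three conversion functions can be sketched in plain Python. This is an illustration under stated assumptions, not the mmdetection implementation: the subset in `partial_minmax_box` is taken to be the first four points, the standard deviation is the sample standard deviation, and `lambda_x` and `lambda_y` are passed as plain numbers instead of being learned.

```python
import statistics


def minmax_box(points):
    """Tightest (x1, y1, x2, y2) box around all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def partial_minmax_box(points, subset_size=4):
    """Min-max over a subset only (assumption: the first `subset_size` points)."""
    return minmax_box(points[:subset_size])


def moment_box(points, lambda_x=1.0, lambda_y=1.0):
    """Box centered at the mean, with half-sizes lambda * standard deviation."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = statistics.mean(xs), statistics.mean(ys)
    half_w = lambda_x * statistics.stdev(xs)
    half_h = lambda_y * statistics.stdev(ys)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


points = [(x, y) for y in (5.0, 15.0, 25.0) for x in (10.0, 20.0, 30.0)]
box_minmax = minmax_box(points)
box_partial = partial_minmax_box(points)
box_moment = moment_box(points)
```

Min-max reacts to the single most extreme point, while the moment-based box depends on all points at once, so one outlying point moves it less.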

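Computing the loss:

Location loss: the RepPoints are first converted to a pseudo box, and the loss is then computed between the pseudo box and the ground-truth bounding box. (The paper uses the smooth L1 loss on the top-left and bottom-right corners to obtain the location loss.)

Classification loss: focal loss (FocalLoss) is used to handle class imbalance.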
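Both losses have short closed forms. The sketch below uses the standard definitions of smooth L1 and of the sigmoid focal loss for a single prediction; the default values of `beta`, `gamma` and `alpha` are copied from the config shown later in this post. Any normalization that the real training code applies is left out, and the function names are chosen for this example.

```python
import math


def smooth_l1(pred, target, beta=0.11):
    """Quadratic for errors below beta, linear above it."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def location_loss(pred_box, gt_box, beta=0.11):
    """Sum of smooth L1 over the corner coordinates (x1, y1, x2, y2)."""
    return sum(smooth_l1(p, t, beta) for p, t in zip(pred_box, gt_box))


def focal_loss(p, is_positive, gamma=2.0, alpha=0.25):
    """Focal loss for one predicted probability p of the positive class."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)   # keep log() finite
    if is_positive:
        return -alpha * (1.0 - p) ** gamma * math.log(p)
    return -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)


loc = location_loss((10.0, 5.0, 30.0, 25.0), (10.0, 5.0, 31.0, 25.0))
easy = focal_loss(0.9, is_positive=True)   # confident and correct: tiny loss
hard = focal_loss(0.1, is_positive=True)   # confident and wrong: large loss
```

The factor `(1 - p) ** gamma` is what suppresses the many easy examples, which is how focal loss counters class imbalance.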

Code analysis

The RPDet code is at https://github.com/microsoft/RepPoints and has been merged into the mmdetection framework, so we look at the code in mmdetection:

Config file: configs/reppoints/reppoints_moment_r50_fpn_1x.py

    # model definition
    # norm_cfg is defined at the top of the upstream config (GroupNorm)
    norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
    model = dict(
        type='RepPointsDetector',
        pretrained='torchvision://resnet50',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            style='pytorch'),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            start_level=1,
            add_extra_convs=True,
            num_outs=5,
            norm_cfg=norm_cfg),
        bbox_head=dict(
            type='RepPointsHead',
            num_classes=81,
            in_channels=256,
            feat_channels=256,
            point_feat_channels=256,
            stacked_convs=3,
            num_points=9,
            gradient_mul=0.1,
            point_strides=[8, 16, 32, 64, 128],
            point_base_scale=4,
            norm_cfg=norm_cfg,
            loss_cls=dict(
                type='FocalLoss',
                use_sigmoid=True,
                gamma=2.0,
                alpha=0.25,
                loss_weight=1.0),
            loss_bbox_init=dict(type='SmoothL1Loss', beta=0.11, loss_weight=0.5),
            loss_bbox_refine=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0),
            transform_method='moment'))

The backbone is ResNet + FPN, the usual way of extracting multi-scale image features.
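The five values of `point_strides` in the config correspond to the five FPN output levels. As a rough sketch of what that means for the head, assume that every cell of a feature map contributes one center point at image position (col * stride, row * stride). `level_points` is a hypothetical helper written for this example, and the 256 × 256 input size is an arbitrary choice.

```python
def level_points(feat_h, feat_w, stride):
    """One (x, y, stride) center point per feature-map cell, in row-major order."""
    return [(col * stride, row * stride, stride)
            for row in range(feat_h)
            for col in range(feat_w)]


point_strides = [8, 16, 32, 64, 128]
image_size = 256
# Number of candidate center points per level for a 256 x 256 input.
points_per_level = {
    stride: len(level_points(image_size // stride, image_size // stride, stride))
    for stride in point_strides
}
```

Fine levels hold many densely spaced centers for small objects, and coarse levels hold a few widely spaced centers for large ones.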

Next, with the structure diagram of the RepPoints head in mind, let's look at how the two subnets do their work: mmdet/models/anchor_heads/reppoints_head.py

    import numpy as np
    import torch
    import torch.nn as nn
    # ConvModule, DeformConv and the HEADS registry come from mmdetection;
    # their import paths depend on the mmdetection version.


    @HEADS.register_module
    class RepPointsHead(nn.Module):

        def __init__(self, num_points=9, **kwargs):
            super().__init__()
            # ... the remaining arguments and attribute assignments are
            # omitted in this excerpt.
            # We use deformable conv to extract points features. The DCN
            # kernel size follows from the number of points, i.e. the
            # receptive field of one DCN represents the object outline.
            self.dcn_kernel = int(np.sqrt(num_points))
            self.dcn_pad = int((self.dcn_kernel - 1) / 2)
            assert self.dcn_kernel * self.dcn_kernel == num_points, \
                "The points number should be a square number."
            assert self.dcn_kernel % 2 == 1, \
                "The points number should be an odd square number."
            # Initial (y, x) sampling grid of the deformable convolution.
            dcn_base = np.arange(-self.dcn_pad,
                                 self.dcn_pad + 1).astype(np.float64)
            dcn_base_y = np.repeat(dcn_base, self.dcn_kernel)
            dcn_base_x = np.tile(dcn_base, self.dcn_kernel)
            dcn_base_offset = np.stack([dcn_base_y, dcn_base_x],
                                       axis=1).reshape((-1))
            self.dcn_base_offset = torch.tensor(dcn_base_offset).view(1, -1, 1, 1)
            self._init_layers()

        def _init_layers(self):
            self.relu = nn.ReLU(inplace=True)
            self.cls_convs = nn.ModuleList()
            self.reg_convs = nn.ModuleList()
            # Each of the two subnets has three 3x3 convs for feature extraction.
            for i in range(self.stacked_convs):
                chn = self.in_channels if i == 0 else self.feat_channels
                self.cls_convs.append(
                    ConvModule(chn, self.feat_channels, 3, stride=1, padding=1,
                               conv_cfg=self.conv_cfg, norm_cfg=self.norm_cfg))
                self.reg_convs.append(
                    ConvModule(chn, self.feat_channels, 3, stride=1, padding=1,
                               conv_cfg=self.conv_cfg, norm_cfg=self.norm_cfg))
            # Layers with which RepPoints learns offsets through the DCN.
            pts_out_dim = 4 if self.use_grid_points else 2 * self.num_points
            self.reppoints_cls_conv = DeformConv(
                self.feat_channels, self.point_feat_channels,
                self.dcn_kernel, 1, self.dcn_pad)
            self.reppoints_cls_out = nn.Conv2d(
                self.point_feat_channels, self.cls_out_channels, 1, 1, 0)
            self.reppoints_pts_init_conv = nn.Conv2d(
                self.feat_channels, self.point_feat_channels, 3, 1, 1)
            self.reppoints_pts_init_out = nn.Conv2d(
                self.point_feat_channels, pts_out_dim, 1, 1, 0)
            self.reppoints_pts_refine_conv = DeformConv(
                self.feat_channels, self.point_feat_channels,
                self.dcn_kernel, 1, self.dcn_pad)
            self.reppoints_pts_refine_out = nn.Conv2d(
                self.point_feat_channels, pts_out_dim, 1, 1, 0)

        # Forward pass for a single feature level.
        def forward_single(self, x):
            dcn_base_offset = self.dcn_base_offset.type_as(x)
            # If we use center_init, the initial reppoints is from center points.
            # If we use bounding bbox representation, the initial reppoints is
            # from regular grid placed on a pre-defined bbox.
            if self.use_grid_points or not self.center_init:
                scale = self.point_base_scale / 2
                points_init = dcn_base_offset / dcn_base_offset.max() * scale
                bbox_init = x.new_tensor([-scale, -scale, scale,
                                          scale]).view(1, 4, 1, 1)
            else:
                points_init = 0
            cls_feat = x
            pts_feat = x
            for cls_conv in self.cls_convs:
                cls_feat = cls_conv(cls_feat)
            for reg_conv in self.reg_convs:
                pts_feat = reg_conv(pts_feat)
            # initialize reppoints
            pts_out_init = self.reppoints_pts_init_out(
                self.relu(self.reppoints_pts_init_conv(pts_feat)))
            if self.use_grid_points:
                pts_out_init, bbox_out_init = self.gen_grid_from_reg(
                    pts_out_init, bbox_init.detach())
            else:
                pts_out_init = pts_out_init + points_init
            # refine and classify reppoints
            pts_out_init_grad_mul = (
                (1 - self.gradient_mul) * pts_out_init.detach()
                + self.gradient_mul * pts_out_init)
            dcn_offset = pts_out_init_grad_mul - dcn_base_offset
            cls_out = self.reppoints_cls_out(
                self.relu(self.reppoints_cls_conv(cls_feat, dcn_offset)))
            pts_out_refine = self.reppoints_pts_refine_out(
                self.relu(self.reppoints_pts_refine_conv(pts_feat, dcn_offset)))
            if self.use_grid_points:
                pts_out_refine, bbox_out_refine = self.gen_grid_from_reg(
                    pts_out_refine, bbox_out_init.detach())
            else:
                pts_out_refine = pts_out_refine + pts_out_init.detach()
            return cls_out, pts_out_init, pts_out_refine
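One detail of `forward_single` deserves a short derivation: the `gradient_mul` blend applied to the initial points before they are used as deformable-convolution offsets. Write $p$ for `pts_out_init`, $g$ for `gradient_mul` and $\mathrm{sg}(\cdot)$ for `detach()` (this notation is introduced here and does not appear in the code). Then:

```latex
\tilde{p} = (1 - g)\,\mathrm{sg}(p) + g\,p
\quad\Longrightarrow\quad
\tilde{p} = p \ \ \text{(forward value)},
\qquad
\frac{\partial \tilde{p}}{\partial p} = g \ \ \text{(backward pass)}.
```

With $g = 0.1$ as in the config, the forward computation is unchanged, but the gradients that flow back into the initial points through the offsets of the classification and refinement branches are scaled to one tenth.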

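Summary and tips

As I understand it, this paper is best seen as applying deformable convolution to object detection. The localization and classification losses supervise the offsets that the deformable convolution learns for an object, which makes what the convolution learns interpretable. It also suggests that other kinds of supervision could be used to drive deformable convolution.

How RepPoints handles several objects that occlude one another at the same location: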

In RPDet, we show that this issue can be greatly alleviated by using the FPN structure [24] for the following reasons: first, objects of different scales will be assigned to different image feature levels, which addresses objects of different scales and the same center points locations; second, FPN has a high-resolution feature map for small objects, which also reduces the chance of two objects having centers located at the same feature position.

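The authors argue that the FPN structure resolves this by assigning objects of different scales to different image feature levels.

Whether this can handle cases such as pedestrian detection, where many pedestrians occlude one another, remains open to question.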
