YOLOv3 Loss Function Analysis
Code download: https://github.com/pakaqiu/yolov3_simple
Video link: https://www.bilibili.com/video/BV1MK4y1X74Q?p=1
The loss function was revised step by step from YOLOv1 through v3, and the further loss changes in v4 and v5 brought sizable gains in detection performance. v3 itself also improves considerably over v1 and v2, although those gains come not only from the loss function but also from changes and optimizations to the network architecture. This article focuses on the design of the v3 loss; first, a quick review of v1 and v2:
The v1 loss function:
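For reference, the v1 loss as written in the YOLOv1 paper (hatted symbols denote the ground truth):

$$
\begin{aligned}
L_{v1} ={}& \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}\Big[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\Big] \\
&+ \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}\Big[(\sqrt{w_i}-\sqrt{\hat{w}_i})^2+(\sqrt{h_i}-\sqrt{\hat{h}_i})^2\Big] \\
&+ \sum_{i=0}^{S^2}\sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}(C_i-\hat{C}_i)^2
+ \lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj}(C_i-\hat{C}_i)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj}\sum_{c\in \text{classes}}\big(p_i(c)-\hat{p}_i(c)\big)^2
\end{aligned}
$$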
The v2 loss function:
v2 only changes how the box width/height loss is computed relative to v1, removing the square roots on w and h:
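Per that description, the width/height term becomes

$$\lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}\Big[(w_i-\hat{w}_i)^2+(h_i-\hat{h}_i)^2\Big]$$

with the remaining terms keeping the v1 form.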
The biggest change in v3 relative to v2 is that the classification loss and the box-confidence loss are replaced with binary cross-entropy:
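The following is a reconstruction of the v3 loss, written so that its five terms match the labels (a)-(e) analyzed below and the code at the end (coordinate and size terms as squared error, confidence and class terms as binary cross-entropy); as in the text, a tilde marks ground-truth values:

$$
\begin{aligned}
L_{v3} ={}& \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{obj}\Big[(x_i-\tilde{x}_i)^2+(y_i-\tilde{y}_i)^2\Big] && (a)\\
&+ \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{obj}\Big[(w_i-\tilde{w}_i)^2+(h_i-\tilde{h}_i)^2\Big] && (b)\\
&- \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{obj}\Big[\tilde{C}_{j}^{i}\log C_{j}^{i} + (1-\tilde{C}_{j}^{i})\log(1-C_{j}^{i})\Big] && (c)\\
&- \lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{noobj}\Big[\tilde{C}_{j}^{i}\log C_{j}^{i} + (1-\tilde{C}_{j}^{i})\log(1-C_{j}^{i})\Big] && (d)\\
&- \sum_{i=0}^{S^2}\sum_{j=0}^{B} I_{ij}^{obj}\sum_{c\in \text{classes}}\Big[\tilde{p}_{i}(c)\log p_{i}(c) + \big(1-\tilde{p}_{i}(c)\big)\log\big(1-p_{i}(c)\big)\Big] && (e)
\end{aligned}
$$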
In the loss above, S is the side length of the output grid and B is the number of anchors per cell; the network output is an S×S feature map, i.e. an S×S grid. Each cell has B anchors, so there are S×S×B bounding boxes in total. With that many boxes, how does the loss function carry out the regression? Terms (a)-(e) of the v3 loss are analyzed one by one below:
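For the standard YOLOv3 configuration (416×416 input, three output scales with 13×13, 26×26 and 52×52 grids, B = 3 anchors per cell), this amounts to (13² + 26² + 52²) × 3 = 10647 candidate boxes per image, of which only the few "responsible" ones contribute to the coordinate and classification terms.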
$I_{ij}^{obj}$ refers to the j-th anchor of the i-th grid cell: if that anchor is responsible for the object, then $I_{ij}^{obj} = 1$, otherwise it is 0. Each grid cell has B anchors, and the responsible one is the anchor among those B with the largest IOU against the ground-truth box.
$I_{ij}^{noobj}$ indicates that the j-th anchor of the i-th grid cell is not responsible for the object.
Term (a) of the loss is the error in the object's center coordinates. During training, what is regressed is the offset of the center within its cell rather than the raw coordinates; this is best understood alongside the code. Each cell has B anchors, and only the anchor with the largest IOU is responsible for the regression at that cell.
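Concretely (and matching build_targets below): if the ground-truth center in grid units is $(g_x, g_y)$, the cell indices are $(\lfloor g_x\rfloor, \lfloor g_y\rfloor)$ and the regression targets are the offsets $\tilde{t}_x = g_x - \lfloor g_x\rfloor$, $\tilde{t}_y = g_y - \lfloor g_y\rfloor$; the prediction is decoded as $b_x = \sigma(t_x) + c_x$, where $c_x$ is the cell's x index. For example, a center at $g_x = 3.4$ lands in cell 3 with target offset 0.4.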
Term (b) is the error in the object's width and height. Training does not regress the width and height directly; instead, they are regressed relative to the best-matching anchor of the cell.
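Concretely, the width target for the matched anchor is $\tilde{t}_w = \log(g_w / p_w)$, where $p_w$ is the anchor width, and the prediction is decoded as $b_w = p_w e^{t_w}$ (this is exactly the tw / torch.exp(w) pair in the code below); height is treated the same way.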
Terms (c) and (d) are the errors in the object-confidence score, computed with binary cross-entropy. The confidence error is computed whether or not a cell is responsible for some object; since most of the input image contains no objects and only a small part does, a weight is applied to constrain the confidence loss of the boxes that contain no object. Here $\tilde{C}_{j}^{i}$ is the ground truth, determined by whether the cell's bounding box is responsible for predicting an object: 1 if responsible, 0 otherwise; $C_{j}^{i}$ is the predicted value.
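Per box, terms (c) and (d) take the binary cross-entropy form $-\big[\tilde{C}_{j}^{i}\log C_{j}^{i} + (1-\tilde{C}_{j}^{i})\log(1-C_{j}^{i})\big]$, and the object and no-object parts are weighted separately (the obj_scale and noobj_scale factors in the code below).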
Term (e) is the classification error, also computed with cross-entropy. Only when the j-th anchor of the i-th cell is responsible for a ground-truth object does its bounding box contribute to the classification loss.
具體代碼實現:
def forward(self, x, targets=None, img_dim=None):
    # Excerpt of the YOLO layer; assumes `import torch` and the helpers
    # build_targets, bbox_wh_iou, bbox_iou and to_cpu from the repository's utils.
    FloatTensor = torch.cuda.FloatTensor if x.is_cuda else torch.FloatTensor
    LongTensor = torch.cuda.LongTensor if x.is_cuda else torch.LongTensor
    ByteTensor = torch.cuda.ByteTensor if x.is_cuda else torch.ByteTensor

    self.img_dim = img_dim
    num_samples = x.size(0)
    grid_size = x.size(2)

    # Reshape to (batch, anchors, grid, grid, 5 + num_classes)
    prediction = (
        x.view(num_samples, self.num_anchors, self.num_classes + 5, grid_size, grid_size)
        .permute(0, 1, 3, 4, 2)
        .contiguous()
    )

    x = torch.sigmoid(prediction[..., 0])          # center x offset
    y = torch.sigmoid(prediction[..., 1])          # center y offset
    w = prediction[..., 2]                         # width (log space)
    h = prediction[..., 3]                         # height (log space)
    pred_conf = torch.sigmoid(prediction[..., 4])  # objectness
    pred_cls = torch.sigmoid(prediction[..., 5:])  # per-class probabilities

    # Recompute grid offsets if the grid size changed
    if grid_size != self.grid_size:
        self.compute_grid_offsets(grid_size, cuda=x.is_cuda)

    # Decode predictions: add cell offsets and scale by the anchors
    pred_boxes = FloatTensor(prediction[..., :4].shape)
    pred_boxes[..., 0] = x.data + self.grid_x
    pred_boxes[..., 1] = y.data + self.grid_y
    pred_boxes[..., 2] = torch.exp(w.data) * self.anchor_w
    pred_boxes[..., 3] = torch.exp(h.data) * self.anchor_h

    output = torch.cat(
        (
            pred_boxes.view(num_samples, -1, 4) * self.stride,
            pred_conf.view(num_samples, -1, 1),
            pred_cls.view(num_samples, -1, self.num_classes),
        ),
        -1,
    )

    if targets is None:
        return output, 0
    else:
        iou_scores, class_mask, obj_mask, noobj_mask, tx, ty, tw, th, tcls, tconf = build_targets(
            pred_boxes=pred_boxes,
            pred_cls=pred_cls,
            target=targets,
            anchors=self.scaled_anchors,
            ignore_thres=self.ignore_thres,
        )

        # Coordinate losses, terms (a) and (b): MSE on responsible anchors only
        loss_x = self.mse_loss(x[obj_mask], tx[obj_mask])
        loss_y = self.mse_loss(y[obj_mask], ty[obj_mask])
        loss_w = self.mse_loss(w[obj_mask], tw[obj_mask])
        loss_h = self.mse_loss(h[obj_mask], th[obj_mask])
        # Confidence losses, terms (c) and (d): BCE, weighted for obj / noobj
        loss_conf_obj = self.bce_loss(pred_conf[obj_mask], tconf[obj_mask])
        loss_conf_noobj = self.bce_loss(pred_conf[noobj_mask], tconf[noobj_mask])
        loss_conf = self.obj_scale * loss_conf_obj + self.noobj_scale * loss_conf_noobj
        # Classification loss, term (e): BCE on responsible anchors only
        loss_cls = self.bce_loss(pred_cls[obj_mask], tcls[obj_mask])
        total_loss = loss_x + loss_y + loss_w + loss_h + loss_conf + loss_cls

        # Training metrics
        cls_acc = 100 * class_mask[obj_mask].mean()
        conf_obj = pred_conf[obj_mask].mean()
        conf_noobj = pred_conf[noobj_mask].mean()
        conf50 = (pred_conf > 0.5).float()
        iou50 = (iou_scores > 0.5).float()
        iou75 = (iou_scores > 0.75).float()
        detected_mask = conf50 * class_mask * tconf
        precision = torch.sum(iou50 * detected_mask) / (conf50.sum() + 1e-16)
        recall50 = torch.sum(iou50 * detected_mask) / (obj_mask.sum() + 1e-16)
        recall75 = torch.sum(iou75 * detected_mask) / (obj_mask.sum() + 1e-16)

        self.metrics = {
            "loss": to_cpu(total_loss).item(),
            "x": to_cpu(loss_x).item(),
            "y": to_cpu(loss_y).item(),
            "w": to_cpu(loss_w).item(),
            "h": to_cpu(loss_h).item(),
            "conf": to_cpu(loss_conf).item(),
            "cls": to_cpu(loss_cls).item(),
            "cls_acc": to_cpu(cls_acc).item(),
            "recall50": to_cpu(recall50).item(),
            "recall75": to_cpu(recall75).item(),
            "precision": to_cpu(precision).item(),
            "conf_obj": to_cpu(conf_obj).item(),
            "conf_noobj": to_cpu(conf_noobj).item(),
            "grid_size": grid_size,
        }

        return output, total_loss
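The targets and masks used above are produced by build_targets, which picks the best-matching anchor for each ground-truth box and fills in tx, ty, tw, th, tcls, tconf together with obj_mask and noobj_mask: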
def build_targets(pred_boxes, pred_cls, target, anchors, ignore_thres):
    ByteTensor = torch.cuda.ByteTensor if pred_boxes.is_cuda else torch.ByteTensor
    FloatTensor = torch.cuda.FloatTensor if pred_boxes.is_cuda else torch.FloatTensor

    nB = pred_boxes.size(0)  # batch size
    nA = pred_boxes.size(1)  # number of anchors
    nC = pred_cls.size(-1)   # number of classes
    nG = pred_boxes.size(2)  # grid size

    # Output tensors
    obj_mask = ByteTensor(nB, nA, nG, nG).fill_(0)
    noobj_mask = ByteTensor(nB, nA, nG, nG).fill_(1)
    class_mask = FloatTensor(nB, nA, nG, nG).fill_(0)
    iou_scores = FloatTensor(nB, nA, nG, nG).fill_(0)
    tx = FloatTensor(nB, nA, nG, nG).fill_(0)
    ty = FloatTensor(nB, nA, nG, nG).fill_(0)
    tw = FloatTensor(nB, nA, nG, nG).fill_(0)
    th = FloatTensor(nB, nA, nG, nG).fill_(0)
    tcls = FloatTensor(nB, nA, nG, nG, nC).fill_(0)

    # Convert targets to grid units
    target_boxes = target[:, 2:6] * nG
    gxy = target_boxes[:, :2]
    gwh = target_boxes[:, 2:]

    # Find the anchor with the best (shape) IOU for each ground-truth box
    ious = torch.stack([bbox_wh_iou(anchor, gwh) for anchor in anchors])
    best_ious, best_n = ious.max(0)

    # Separate target values
    b, target_labels = target[:, :2].long().t()
    gx, gy = gxy.t()
    gw, gh = gwh.t()
    gi, gj = gxy.long().t()

    # Set masks: the best anchor of the cell is responsible for the object
    obj_mask[b, best_n, gj, gi] = 1
    noobj_mask[b, best_n, gj, gi] = 0

    # Ignore (no noobj loss for) anchors whose IOU exceeds the threshold
    for i, anchor_ious in enumerate(ious.t()):
        noobj_mask[b[i], anchor_ious > ignore_thres, gj[i], gi[i]] = 0

    # Center offsets within the cell
    tx[b, best_n, gj, gi] = gx - gx.floor()
    ty[b, best_n, gj, gi] = gy - gy.floor()
    # Width and height relative to the matched anchor (log space)
    tw[b, best_n, gj, gi] = torch.log(gw / anchors[best_n][:, 0] + 1e-16)
    th[b, best_n, gj, gi] = torch.log(gh / anchors[best_n][:, 1] + 1e-16)
    # One-hot encoding of the class label
    tcls[b, best_n, gj, gi, target_labels] = 1
    # Label correctness and IOU of the best anchor (for metrics)
    class_mask[b, best_n, gj, gi] = (pred_cls[b, best_n, gj, gi].argmax(-1) == target_labels).float()
    iou_scores[b, best_n, gj, gi] = bbox_iou(pred_boxes[b, best_n, gj, gi], target_boxes, x1y1x2y2=False)

    tconf = obj_mask.float()
    return iou_scores, class_mask, obj_mask, noobj_mask, tx, ty, tw, th, tcls, tconf
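The excerpt calls a few helpers that are not shown here (bbox_wh_iou, bbox_iou, to_cpu). As a minimal sketch, assuming the usual PyTorch-YOLOv3-style utilities: bbox_wh_iou compares shapes only (both boxes treated as sharing the same center), bbox_iou is the ordinary IOU on (cx, cy, w, h) boxes, and to_cpu just detaches tensors for logging. The exact implementations in the linked repository may differ.

import torch

def bbox_wh_iou(wh1, wh2):
    # IOU between one anchor's (w, h) and N ground-truth (w, h) pairs,
    # treating both boxes as if they shared the same center.
    wh2 = wh2.t()
    w1, h1 = wh1[0], wh1[1]
    w2, h2 = wh2[0], wh2[1]
    inter_area = torch.min(w1, w2) * torch.min(h1, h2)
    union_area = (w1 * h1 + 1e-16) + w2 * h2 - inter_area
    return inter_area / union_area

def to_cpu(tensor):
    # Detach from the graph and move to CPU for metric logging.
    return tensor.detach().cpu()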
My understanding is limited; corrections are welcome. Thank you!