
Soft NMS+Softer NMS+KL Loss

Published: 2023/12/14


Paper 1: Soft-NMS – Improving Object Detection With One Line of Code (ICCV 2017)
Paper 2: Softer-NMS – Bounding Box Regression with Uncertainty for Accurate Object Detection (CVPR 2019)

Contents

- Problems Addressed
- Soft-NMS
    - Soft-NMS Algorithm
    - Soft-NMS Implementation
    - Experiments
- Softer-NMS
    - Bounding Box Regression with KL Loss
    - Softer-NMS Algorithm
    - Experiments
- References

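Problems Addressed

Traditional NMS has the following problems:

When two objects of the same class overlap heavily, one of them is easily deleted by mistake, as in Figure 1.
What if none of the predicted boxes near an object is good? In the case of Figure 2 (a), neither box is a good choice, so which one should be kept?
IoU and classification score are not strongly correlated, so the box with the highest score is not necessarily the best one, as in Figure 2 (b).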

Soft-NMS

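Soft-NMS Algorithm

NMS removes boxes in too "hard" a way and easily deletes correct ones. Soft-NMS is the resulting improvement: when a box's IoU with the top-scoring box exceeds the threshold, it is not discarded immediately as a duplicate. Its score is lowered instead, and low-scoring boxes are removed at the end. The procedure is roughly as follows.

Traditional NMS is blunt: any box over the threshold is deleted, which easily causes friendly fire (the boxes of two different objects are treated as belonging to a single object):
$s_i = \begin{cases} s_i, & iou(\mathcal M,b_i) < N_t \\ 0, & iou(\mathcal M,b_i) \geq N_t \end{cases}$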
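As a point of comparison for what follows, here is a minimal NumPy sketch of the hard suppression rule above. The helper names `iou_one_to_many` and `hard_nms` are illustrative, and the sketch assumes boxes in `(x1, y1, x2, y2)` form with continuous coordinates (no `+1` pixel convention).

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as (x1, y1, x2, y2)."""
    iw = np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0])
    ih = np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1])
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def hard_nms(boxes, scores, nt=0.3):
    """Classic NMS: keep the top box M, drop every box with iou(M, b_i) >= nt."""
    order = np.argsort(-scores)          # indices sorted by descending score
    keep = []
    while order.size > 0:
        m = order[0]
        keep.append(int(m))
        rest = order[1:]
        ious = iou_one_to_many(boxes[m], boxes[rest])
        order = rest[ious < nt]          # survivors keep their score unchanged
    return keep

# Two heavily overlapping boxes (possibly two real objects) and one distant box.
boxes = np.array([[0, 0, 10, 10], [3, 0, 13, 10], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = hard_nms(boxes, scores, nt=0.3)
print(kept)
```

The second box has an IoU of 70/130 with the first, so it is removed outright even though its score is high; this is the mistaken-deletion case that Soft-NMS targets.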

Soft-NMS is gentler: it gives a heavily overlapping box a chance to prove itself by lowering its score (the higher the IoU, the lower the score should be) and sending it back into the queue. The most obvious choice is a linear penalty on the overlap; the paper multiplies the score by $(1-iou(\mathcal M,b_i))$:
$s_i = \begin{cases} s_i, & iou(\mathcal M,b_i) < N_t \\ s_i(1-iou(\mathcal M,b_i)), & iou(\mathcal M,b_i) \geq N_t \end{cases}$

However, this function is not continuous at $N_t$, so a Gaussian penalty is used in practice:
$s_i = s_ie^{-\frac{iou(\mathcal M,b_i)^2}{\sigma}},\forall b_i \notin \mathcal D$
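To make the difference between the three rescoring rules concrete, the sketch below evaluates each weight at a few IoU values. The function names are illustrative, and the defaults `nt=0.3` and `sigma=0.5` are taken from the implementation in this article.

```python
import math

def hard_weight(iou, nt=0.3):
    """Classic NMS: the score is zeroed once the overlap reaches the threshold."""
    return 0.0 if iou >= nt else 1.0

def linear_weight(iou, nt=0.3):
    """Linear Soft-NMS: scale by (1 - iou) above the threshold; jumps at nt."""
    return 1.0 - iou if iou >= nt else 1.0

def gaussian_weight(iou, sigma=0.5):
    """Gaussian Soft-NMS: continuous decay applied to every box, no threshold."""
    return math.exp(-(iou * iou) / sigma)

for iou in (0.0, 0.29, 0.3, 0.6, 0.9):
    print(f"iou={iou:.2f}  hard={hard_weight(iou):.2f}  "
          f"linear={linear_weight(iou):.2f}  gaussian={gaussian_weight(iou):.3f}")
```

The linear weight drops abruptly from 1 to 0.7 as the IoU crosses 0.3, whereas the Gaussian weight decreases smoothly from 1 at zero overlap.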

Of course, a suitable score threshold is still needed at the end to remove the duplicate boxes.

Soft-NMS Implementation

    # Cython source (.pyx); boxes is an (N, 5) float32 array of [x1, y1, x2, y2, score]
    import numpy as np
    cimport numpy as np

    def cpu_soft_nms(np.ndarray[float, ndim=2] boxes, float sigma=0.5, float Nt=0.3, float threshold=0.001, unsigned int method=0):
        cdef unsigned int N = boxes.shape[0]
        cdef float iw, ih, box_area
        cdef float ua
        cdef int pos = 0
        cdef float maxscore = 0
        cdef int maxpos = 0
        cdef float x1,x2,y1,y2,tx1,tx2,ty1,ty2,ts,area,weight,ov

        for i in range(N):
            # find the box with the highest confidence from position i onward; record it as maxpos
            maxscore = boxes[i, 4]
            maxpos = i

            tx1 = boxes[i,0]
            ty1 = boxes[i,1]
            tx2 = boxes[i,2]
            ty2 = boxes[i,3]
            ts = boxes[i,4]

            pos = i + 1
            # find the max box
            while pos < N:
                if maxscore < boxes[pos, 4]:
                    maxscore = boxes[pos, 4]
                    maxpos = pos
                pos = pos + 1

            # swap the data at maxpos and at i
            # add max box as a detection
            boxes[i,0] = boxes[maxpos,0]
            boxes[i,1] = boxes[maxpos,1]
            boxes[i,2] = boxes[maxpos,2]
            boxes[i,3] = boxes[maxpos,3]
            boxes[i,4] = boxes[maxpos,4]

            # swap ith box with position of max box
            boxes[maxpos,0] = tx1
            boxes[maxpos,1] = ty1
            boxes[maxpos,2] = tx2
            boxes[maxpos,3] = ty2
            boxes[maxpos,4] = ts

            tx1 = boxes[i,0]
            ty1 = boxes[i,1]
            tx2 = boxes[i,2]
            ty2 = boxes[i,3]
            ts = boxes[i,4]
            # swap done

            # start the inner loop over the remaining boxes
            pos = i + 1
            while pos < N:
                # record the data of the inner-loop box b_i
                x1 = boxes[pos, 0]
                y1 = boxes[pos, 1]
                x2 = boxes[pos, 2]
                y2 = boxes[pos, 3]
                s = boxes[pos, 4]

                # compute the IoU
                area = (x2 - x1 + 1) * (y2 - y1 + 1)
                # width of the intersection rectangle; if it is <= 0 the boxes
                # do not overlap and nothing further needs to be checked
                iw = (min(tx2, x2) - max(tx1, x1) + 1)
                if iw > 0:
                    ih = (min(ty2, y2) - max(ty1, y1) + 1)  # same for the height
                    if ih > 0:
                        ua = float((tx2 - tx1 + 1) * (ty2 - ty1 + 1) + area - iw * ih)  # union area
                        ov = iw * ih / ua  # iou between max box and detection box

                        if method == 1:  # linear
                            if ov > Nt:
                                weight = 1 - ov
                            else:
                                weight = 1
                        elif method == 2:  # gaussian
                            weight = np.exp(-(ov * ov)/sigma)
                        else:  # original NMS
                            if ov > Nt:
                                weight = 0
                            else:
                                weight = 1

                        boxes[pos, 4] = weight*boxes[pos, 4]

                        # if box score falls below threshold, discard the box by swapping with last box
                        # update N
                        if boxes[pos, 4] < threshold:
                            boxes[pos,0] = boxes[N-1, 0]
                            boxes[pos,1] = boxes[N-1, 1]
                            boxes[pos,2] = boxes[N-1, 2]
                            boxes[pos,3] = boxes[N-1, 3]
                            boxes[pos,4] = boxes[N-1, 4]
                            N = N - 1
                            pos = pos - 1

                pos = pos + 1

        keep = [i for i in range(N)]
        return keep
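For readers who want to experiment without compiling Cython, the following is a rough pure-NumPy sketch of the same idea. It is not a port of the code above: the function name `soft_nms` and the string values of `method` are illustrative, it returns indices into the original input instead of reordering `boxes` in place, and it assumes continuous coordinates without the `+1` pixel convention.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, nt=0.3, threshold=0.001, method="gaussian"):
    """Return (kept indices in selection order, their decayed scores)."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()   # do not modify the caller's scores
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    idx = np.arange(len(scores))                      # boxes still in the queue
    keep, keep_scores = [], []
    while idx.size > 0:
        top = int(np.argmax(scores[idx]))             # current best box M
        m = idx[top]
        keep.append(int(m))
        keep_scores.append(float(scores[m]))
        idx = np.delete(idx, top)
        if idx.size == 0:
            break
        # IoU between M and every remaining box
        iw = np.minimum(boxes[m, 2], boxes[idx, 2]) - np.maximum(boxes[m, 0], boxes[idx, 0])
        ih = np.minimum(boxes[m, 3], boxes[idx, 3]) - np.maximum(boxes[m, 1], boxes[idx, 1])
        inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
        iou = inter / (areas[m] + areas[idx] - inter)
        if method == "linear":
            weight = np.where(iou > nt, 1.0 - iou, 1.0)
        elif method == "gaussian":
            weight = np.exp(-(iou * iou) / sigma)
        else:                                         # any other value: classic NMS
            weight = np.where(iou > nt, 0.0, 1.0)
        scores[idx] = scores[idx] * weight            # decay instead of deleting
        idx = idx[scores[idx] >= threshold]           # drop boxes whose score collapsed
    return keep, keep_scores

boxes = np.array([[0, 0, 10, 10], [3, 0, 13, 10], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep_soft, scores_soft = soft_nms(boxes, scores, method="gaussian")
keep_hard, _ = soft_nms(boxes, scores, method="nms")
print(keep_soft, scores_soft)
print(keep_hard)
```

With the Gaussian penalty the overlapping second box survives with a reduced score and is ranked after the distant third box; with classic NMS it is removed.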

Experiments

Softer-NMS

Soft-NMS addresses only the mistaken-deletion problem and does not consider the other two. Explaining Softer-NMS first requires introducing Bounding Box Regression with KL Loss.

Bounding Box Regression with KL Loss

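On top of the original classification and regression branches, an additional branch is added that regresses the box std (the distance between the ground-truth box and its corresponding predicted box). The network thus estimates a localization confidence while localizing, and this confidence guides the refinement of the predicted box position.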

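Assume the distance between the predicted box location and the ground-truth box location follows a Gaussian distribution, where $x_e$ denotes the predicted box location and $\sigma$ the standard deviation. As a one-dimensional Gaussian:
$P_\Theta(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-x_e)^2}{2\sigma^2}}$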

The ground-truth box location is treated as a Dirac delta distribution (the target is either at a location or it is not):
$P_D(x) = \delta(x - x_g)$

Property of the Dirac delta function: $\int^{+\infty}_{-\infty}P_D(x)dx=1$

KL distance: a measure of the difference between two distributions, also called the KL divergence (Kullback-Leibler divergence) or relative entropy. Let $p$ and $q$ be the true and the assumed distribution respectively; the KL distance between them is:
$\begin{aligned} D_{KL}(p||q)&=E_p\bigg[\log\frac{\overbrace{p(x)}^{\color{blue}\text{true distribution}} }{\underbrace{q(x)}_{\color{blue}\text{assumed distribution}} } \bigg]=\sum_{x\in\chi} p(x)\log\frac{p(x)}{q(x)}\\ &=\sum_{x\in\chi}[p(x)\log p(x)-p(x)\log q(x)]\\ &=\sum_{x\in\chi}p(x)\log p(x)-\sum_{x\in\chi}p(x)\log q(x)\\ \end{aligned}$
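A small numeric illustration of the discrete KL divergence defined above may help. The function name `kl_divergence` and the two example distributions are made up for this sketch; terms with $p(x)=0$ are skipped, following the usual convention $0\log 0 = 0$.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_x p(x) * log(p(x) / q(x)) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]   # "true" distribution
q = [0.9, 0.1]   # "assumed" distribution

print(kl_divergence(p, p))   # identical distributions
print(kl_divergence(p, q))
print(kl_divergence(q, p))   # not symmetric in general
```

The divergence is zero only when the two distributions coincide, and swapping the arguments generally gives a different value, which is why it is not a true distance despite the name.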

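So the optimization goal is to bring the predicted-box distribution close to the ground-truth distribution:
$\hat\Theta = \arg\min_\Theta \frac1N\sum D_{KL}(P_D(x)||P_\Theta(x))$

Furthermore:
$\begin{aligned} L_{reg} &= D_{KL}(P_D(x)||P_\Theta(x)) = \int P_D(x)\log P_D(x)dx-\int P_D(x)\log P_\Theta(x)dx\\ &=\frac{(x_g-x_e)^2}{2\sigma^2}+\frac{\log(\sigma^2)}{2}+\frac{\log(2\pi)}{2}-H(P_D(x)) \end{aligned}$

The last two terms do not depend on the parameters being estimated, so $L_{reg} \propto \frac{(x_g-x_e)^2}{2\sigma^2}+\frac12\log(\sigma^2)$.

Taking partial derivatives with respect to $x_e$ and $\sigma$:
$\frac{\partial L_{reg}}{\partial x_e}=\frac{x_e-x_g}{\sigma^2},\qquad \frac{\partial L_{reg}}{\partial \sigma}=-\frac{(x_e-x_g)^2}{\sigma^3}+\frac1\sigma$

Because $\sigma$ appears in the denominator, $\alpha=\log(\sigma^2)$ is substituted for $\sigma$ to avoid exploding gradients:
$L_{reg} \propto \frac{e^{-\alpha}}{2}(x_g-x_e)^2+\frac12\alpha$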
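The effect of the substitution can be checked numerically. The sketch below assumes the regression loss has the form $\frac{(x_g-x_e)^2}{2\sigma^2}+\frac12\log(\sigma^2)$, i.e. $\frac{e^{-\alpha}}{2}(x_g-x_e)^2+\frac12\alpha$ after the substitution, and compares the gradient with respect to $\sigma$ against the gradient with respect to $\alpha$ for a very small $\sigma$. The function names are illustrative.

```python
import math

def grad_wrt_sigma(diff, sigma):
    """d/d(sigma) of diff^2 / (2 sigma^2) + log(sigma)."""
    return -(diff ** 2) / sigma ** 3 + 1.0 / sigma

def grad_wrt_alpha(diff, alpha):
    """d/d(alpha) of exp(-alpha) / 2 * diff^2 + alpha / 2."""
    return -0.5 * math.exp(-alpha) * diff ** 2 + 0.5

sigma = 1e-3
alpha = math.log(sigma ** 2)      # the same uncertainty expressed as alpha
for diff in (0.0, 0.01):          # diff stands for x_g - x_e
    print(f"diff={diff}: dL/dsigma={grad_wrt_sigma(diff, sigma):.1f}  "
          f"dL/dalpha={grad_wrt_alpha(diff, alpha):.1f}")
```

In this parameterization the $1/\sigma$ term becomes the constant $\frac12$, and the remaining term grows like $1/\sigma^2$ instead of $1/\sigma^3$, so the gradients are considerably better behaved when the predicted $\sigma$ is small.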

Following the form of the $\text{Smooth}\ L_1$ loss:
$\text{Smooth}_{L_1}(x)=\begin{cases} 0.5x^2, & \text{if } |x|<1\\ |x|-0.5, & \text{otherwise} \end{cases}$
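For reference, the Smooth L1 function above can be written in a few lines of NumPy. This is a minimal sketch and the function name `smooth_l1` is illustrative.

```python
import numpy as np

def smooth_l1(x):
    """0.5 * x^2 where |x| < 1, and |x| - 0.5 elsewhere (elementwise)."""
    x = np.asarray(x, dtype=float)
    absx = np.abs(x)
    return np.where(absx < 1.0, 0.5 * x ** 2, absx - 0.5)

values = smooth_l1([0.5, 1.0, -2.0])
print(values)
```

The two branches agree at $|x| = 1$ (both give 0.5), so the function is continuous, and it grows only linearly for large errors.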

When $|x_g-x_e| >1$, $L_{reg}$ takes the following form:
$L_{reg}=e^{-\alpha}\left(|x_g-x_e|-\frac12\right)+\frac12\alpha$

The final loss is:
${\color{blue}\text{Smooth}_{L_{reg}}}(x)=\begin{cases} \frac{e^{-\alpha}}{2}(x_g-x_e)^2+\frac12\alpha, & \text{if } |x_g-x_e|<1\\ e^{-\alpha}\left(|x_g-x_e|-\frac12\right)+\frac12\alpha, & \text{otherwise} \end{cases}$
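The piecewise loss above translates directly into NumPy. The following is a minimal sketch, assuming scalar or array inputs for one box coordinate; the function name `kl_loss` and its argument names are illustrative and not taken from the paper's code.

```python
import numpy as np

def kl_loss(x_e, alpha, x_g):
    """Smooth-L1-style KL loss for one coordinate, with alpha = log(sigma^2)."""
    diff = np.abs(np.asarray(x_g, dtype=float) - np.asarray(x_e, dtype=float))
    alpha = np.asarray(alpha, dtype=float)
    quadratic = 0.5 * np.exp(-alpha) * diff ** 2     # branch for |x_g - x_e| < 1
    linear = np.exp(-alpha) * (diff - 0.5)           # branch for larger errors
    return np.where(diff < 1.0, quadratic, linear) + 0.5 * alpha

# With alpha = 0 (sigma = 1) the loss reduces to the plain Smooth L1 loss.
print(kl_loss(0.0, 0.0, 0.5), kl_loss(0.0, 0.0, 2.0))

# For a badly localized box, predicting a larger variance lowers the loss.
print(kl_loss(0.0, 0.0, 3.0), kl_loss(0.0, 1.0, 3.0))

# For an accurate box, predicting a small variance is rewarded instead.
print(kl_loss(0.0, 0.0, 0.1), kl_loss(0.0, -2.0, 0.1))
```

The $\frac12\alpha$ term is what stops the network from always predicting a huge variance: a large $\alpha$ only pays off when the localization error is large as well.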

Softer-NMS Algorithm

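Softer-NMS differs from standard NMS in that boxes whose IoU exceeds the threshold are not discarded outright; instead, they are merged into the final box by an average weighted by their IoU confidence.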

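The IoU confidence depends on two factors:

Variance: a large variance means low confidence (the box is considered far from the target)
IoU: a small IoU with the highest-scoring box means low confidence

The classification score does not enter the weight, because a lower-scoring box may have a higher localization confidence.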

Softer-NMS algorithm flow
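The merging step can be sketched as follows. This is one possible reading of the weighting described above, in which each neighbouring box contributes with weight $p_i/\sigma_i^2$ and $p_i = e^{-(1-IoU)^2/\sigma_t}$; the function names, the parameter `sigma_t` and its default value, and the choice to include every box with non-zero overlap are assumptions of this sketch, not a definitive implementation of the paper's algorithm.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as (x1, y1, x2, y2)."""
    iw = np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0])
    ih = np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1])
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def vote_box(boxes, variances, m, sigma_t=0.02):
    """Refine box m as a weighted mean of the boxes that overlap it.

    boxes:     (N, 4) coordinates
    variances: (N, 4) predicted sigma^2 for each coordinate
    The classification score is deliberately not part of the weight.
    """
    ious = iou_one_to_many(boxes[m], boxes)      # includes box m itself (IoU = 1)
    mask = ious > 0
    p = np.exp(-((1.0 - ious[mask]) ** 2) / sigma_t)   # closer boxes weigh more
    w = p[:, None] / variances[mask]                   # lower variance weighs more
    return (w * boxes[mask]).sum(axis=0) / w.sum(axis=0)

boxes = np.array([[0, 0, 10, 10], [1, 0, 11, 10], [50, 50, 60, 60]], dtype=float)
variances = np.array([[1.0, 1.0, 1.0, 1.0],      # top-scoring box, but uncertain
                      [0.01, 0.01, 0.01, 0.01],  # neighbour with confident coordinates
                      [1.0, 1.0, 1.0, 1.0]])
refined = vote_box(boxes, variances, m=0)
print(refined)
```

In this example the selected box is pulled toward its more confident neighbour, while the distant box has zero overlap and does not contribute.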

Experiments

References

[1] Soft(er)-NMS: Two Improved Non-Maximum Suppression Algorithms
[2] A Detailed Study of the Softer-NMS Paper
[3] Soft NMS Algorithm Notes
