[Paper Notes] CVPR2017_Joint Detection and Identification Feature Learning for Person Search

Published: 2023/12/13

Title: Joint Detection and Identification Feature Learning for Person Search

The first arXiv version of this paper was titled End-to-End Deep Learning for Person Search.

Authors: Tong Xiao^1*; Shuang Li^1*; Bochao Wang^2; Liang Lin^2; Xiaogang Wang^1

Affiliations: 1. The Chinese University of Hong Kong; 2. Sun Yat-Sen University

Paper | Code

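On my first read I looked at the first version of the paper and only glanced at the architecture figure. It seemed to be a minor modification of Faster R-CNN, and that version had no OIM loss, so I thought the novelty was modest. Later I noticed that several follow-up papers use the OIM loss, so I went back, read the paper carefully, and found that it has many interesting ideas. My mistake!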

Motivation

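Person re-id usually retrieves with pedestrian patches that have already been cropped, deciding whether the query and an image in the gallery show the same identity. This setup has several problems:

　　① In practice, retrieval is done directly on the raw scene images, not on cropped images produced after detection;

　　② Many datasets use manually annotated boxes, whereas in reality both the accuracy of the detector and any missed detections affect the re-identification result.

The authors therefore propose end-to-end person search, which merges detection and re-id into a single problem.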

Model

The network takes the whole image as input.
Pedestrian proposal net: the input passes through the first part of ResNet-50 (conv1 to conv4_3), which outputs feature maps with 1024 channels at 1/16 of the input resolution. As in an RPN, the feature map first goes through a $512\times3\times3$ convolution; at each location of the resulting features, the 9 anchors are fed to a softmax classifier (person/non-person) and a linear layer (bbox regression). After NMS on the boxes, 128 final proposals are kept.
Identification net: each proposal is RoI-pooled into a $1024\times14\times14$ feature, which is fed to the second part of ResNet-50 (conv4_4 to conv5_3) and then through global average pooling (GAP), giving a 2048-d feature vector. This vector feeds three branches: ① a softmax binary classifier; ② a linear regression for the box location; ③ a projection, in practice an FC layer, into a 256-d, L2-normalized subspace, which gives the 256-d id-feat. At inference the id-feat is used to compute cosine similarity; during training it is used to compute the OIM loss.
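To make the inference step concrete, here is a minimal sketch of ranking gallery boxes by cosine similarity of their id-feat. It is plain Python, not the authors' code; the helper names (`l2_normalize`, `cosine_similarity`, `rank_gallery`) and the toy 3-d features standing in for the 256-d id-feat are my own assumptions.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm (an all-zero vector is returned as-is)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)

def cosine_similarity(a, b):
    """Cosine similarity; for unit-norm inputs this is just the dot product."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_feat, g) for g in gallery_feats]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy 3-d features standing in for the 256-d id-feat (values are made up).
query = [1.0, 0.0, 0.0]
gallery = [[0.0, 1.0, 0.0], [2.0, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_gallery(query, gallery)
```

Because the id-feat is already L2-normalized by the network, the normalization inside `cosine_similarity` would be a no-op there; it is kept so the sketch also works on unnormalized toy vectors.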

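Online Instance Matching Loss (OIM loss)

Note that the OIM loss is computed on the 256-d id-feat of all the final proposals.

The training set contains $L$ labeled identities, which are assigned class-ids (1 to $L$); it also contains many unlabeled identities, as well as many background regions and false detections. OIM only considers the first two.

Approach: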

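For labeled identities: let $x\in\mathbb{R}^D$ be the feature of a labeled identity in the mini-batch, where $D$ is the feature dimension. A lookup table (LUT) $V\in\mathbb{R}^{D \times L}$ is maintained to store the id-feat of all labeled identities.
- Forward pass: $V^Tx$ gives the cosine similarities between the mini-batch sample and all labeled identities.
- Backward pass: if the target class-id is $t$, the $t$-th column of the LUT is updated as $v_t \leftarrow \gamma v_t+(1-\gamma)x$, where $\gamma\in[0,1]$. (I don't see why the update takes this form.)
For unlabeled identities: since their number varies, the authors use a circular queue $U\in\mathbb{R}^{D \times Q}$ to store them, where $Q$ is the queue size. Likewise, $U^Tx$ gives the cosine similarities between the mini-batch sample and the unlabeled identities in the queue. At each iteration the new feature vectors are pushed and old ones are popped, so the queue size stays constant.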
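The two memory structures can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names, the list-of-rows layout (`lut[t]` plays the role of column $v_t$), `gamma=0.5`, and the toy sizes are all hypothetical. One way to read the LUT update is as a moving average: $v_t$ drifts toward the most recent feature of identity $t$, with $\gamma$ controlling how much of the old value is kept. The rescaling to unit length after the update is also an assumption on my part, added so that dot products with $v_t$ remain cosine similarities.

```python
import math
from collections import deque

def l2_normalize(v):
    """Scale a vector to unit L2 norm (an all-zero vector is returned as-is)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)

def update_lut(lut, t, x, gamma=0.5):
    """Moving-average update of the LUT entry for class-id t, then renormalize."""
    blended = [gamma * v + (1.0 - gamma) * xi for v, xi in zip(lut[t], x)]
    lut[t] = l2_normalize(blended)

def push_unlabeled(queue, x):
    """Append a new unlabeled feature; deque(maxlen=Q) drops the oldest one."""
    queue.append(list(x))

# Toy setup: L = 2 labeled identities, D = 2, queue size Q = 2 (all hypothetical).
lut = [[1.0, 0.0], [0.0, 1.0]]   # lut[t] stands for column v_t of V
queue = deque(maxlen=2)          # stands for U
update_lut(lut, 0, [0.0, 1.0], gamma=0.5)
for feat in ([1.0, 0.0], [0.0, 1.0], [0.6, 0.8]):
    push_unlabeled(queue, feat)
```

With `gamma=0.5`, the entry for identity 0 ends up halfway between its old value and the new feature, and the queue holds only the two most recent unlabeled features.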
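Based on the two structures above, the probability that $x$ is recognized as class-id $i$ is given by a softmax function, where $\tau$ is a temperature parameter (a larger $\tau$ gives a softer distribution):

$$p_i = \frac{\exp(v_i^T x/\tau)}{\sum_{j=1}^{L}\exp(v_j^T x/\tau) + \sum_{k=1}^{Q}\exp(u_k^T x/\tau)}$$

Similarly, the probability that $x$ is recognized as the $i$-th unlabeled identity in the queue is

$$q_i = \frac{\exp(u_i^T x/\tau)}{\sum_{j=1}^{L}\exp(v_j^T x/\tau) + \sum_{k=1}^{Q}\exp(u_k^T x/\tau)}$$

The OIM objective is to maximize the expected log-likelihood

$$\mathcal{L} = \mathbb{E}_x\left[\log p_t\right]$$

whose gradient with respect to $x$ is

$$\frac{\partial \mathcal{L}}{\partial x} = \frac{1}{\tau}\left[(1-p_t)\,v_t - \sum_{j=1,\, j\neq t}^{L} p_j v_j - \sum_{k=1}^{Q} q_k u_k\right]$$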
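As a sanity check on the probabilities and the gradient, here is a small pure-Python sketch. It assumes a softmax over the similarities of $x$ to every LUT entry and every queue entry, divided by a temperature `tau`; the function names, the default `tau=0.1`, and the toy vectors are hypothetical, and this is not the authors' implementation.

```python
import math

def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def oim_probs(x, lut, queue, tau=0.1):
    """Softmax over similarities to all LUT entries and all queue entries."""
    logits = [_dot(v, x) / tau for v in lut] + [_dot(u, x) / tau for u in queue]
    m = max(logits)                            # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs[:len(lut)], probs[len(lut):]  # (p over labeled, q over unlabeled)

def oim_loss(x, t, lut, queue, tau=0.1):
    """Negative log-likelihood of the target class-id t."""
    p, _ = oim_probs(x, lut, queue, tau)
    return -math.log(p[t])

def oim_grad_loglik(x, t, lut, queue, tau=0.1):
    """Gradient of log p_t with respect to x."""
    p, q = oim_probs(x, lut, queue, tau)
    grad = [0.0] * len(x)
    for j, v in enumerate(lut):
        coeff = (1.0 - p[j]) if j == t else -p[j]
        grad = [g + coeff * vi for g, vi in zip(grad, v)]
    for k, u in enumerate(queue):
        grad = [g - q[k] * ui for g, ui in zip(grad, u)]
    return [g / tau for g in grad]

# Toy data: 2 labeled identities, 1 unlabeled feature in the queue (made up).
lut = [[1.0, 0.0], [0.0, 1.0]]
queue = [[-1.0, 0.0]]
x = [0.8, 0.6]
p, q = oim_probs(x, lut, queue)
grad = oim_grad_loglik(x, 0, lut, queue)
```

The gradient pulls $x$ toward the target entry $v_t$ and pushes it away from every other labeled entry and from the queued unlabeled features, each weighted by its probability. That is how the unlabeled identities contribute as negatives even though they have no class-id.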

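Why not classify directly with a softmax loss?

First, there are too many classes and too few positive samples per class, which makes training difficult.
Second, it cannot make use of the unlabeled identities, because they have no labels.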

Dataset

The authors introduce a new dataset for person search, CUHK-SYSU, which contains street-view images and video frames.

Evaluation Protocols and Metrics

Person search naturally inherits its evaluation metrics from detection and re-ID: cumulative matching characteristics (CMC top-K) and mean averaged precision (mAP). Note how these two metrics are similar to, and different from, their person re-id counterparts.

CMC

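Original text: a matching is counted if there is at least one of the top-K predicted bounding boxes overlaps with the ground truths with intersection-over-union (IoU) greater or equal to 0.5.

This one is relatively easy to understand: among the output bboxes, those whose IoU with the GT is at least 0.5 count as candidates; then, as in re-id, we check whether the top K contains one of them, and if so the query counts as matched. False detections and missed detections are ignored.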
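My reading of the CMC top-K rule can be sketched as follows. The data layout is an assumption made for illustration: each query has a ranked list of `(image_id, box)` pairs sorted by descending similarity, and a dict from image id to the ground-truth box of the query person; boxes are `(x1, y1, x2, y2)`.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def topk_hit(ranked, gt_boxes, k, thr=0.5):
    """True if any of the top-k boxes overlaps a ground truth with IoU >= thr."""
    for image_id, box in ranked[:k]:
        gt = gt_boxes.get(image_id)
        if gt is not None and iou(box, gt) >= thr:
            return True
    return False

def cmc_top_k(all_ranked, all_gt, k):
    """Fraction of queries with at least one correct box among the top k."""
    hits = sum(topk_hit(r, g, k) for r, g in zip(all_ranked, all_gt))
    return hits / len(all_ranked)

# Two toy queries (image ids and boxes are made up).
all_ranked = [
    [("img2", (0, 0, 10, 10)), ("img1", (0, 0, 10, 10))],
    [("img3", (5, 5, 15, 15))],
]
all_gt = [{"img1": (0, 0, 10, 12)}, {"img3": (5, 5, 15, 15)}]
top1 = cmc_top_k(all_ranked, all_gt, 1)
top2 = cmc_top_k(all_ranked, all_gt, 2)
```

In the toy data the first query's best-ranked box lies in an image that does not contain the person, so it only counts as matched from K = 2 onward.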

mAP

Original text: (mAP) is inspired from the object detection tasks. We follow the ILSVRC object detection criterion [29] to judge the correctness of predicted bounding boxes. An averaged precision (AP) is calculated for each query based on the precision-recall curve, and then we average the APs across all the queries to get the final result.

This should differ considerably from the re-id mAP; presumably each query is treated as one class, and a detection-style AP is computed for it.
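Under that reading, a per-query AP could be computed as below. This is only a sketch under my own assumptions: it uses the common non-interpolated AP (mean of the precision at each correct match, divided by the number of ground-truth instances), which may not match the exact ILSVRC protocol, and the input format is hypothetical. The point it illustrates is that `num_gt` counts every ground-truth instance of the query, so a missed detection lowers the AP, unlike in re-id on cropped boxes.

```python
def average_precision(hits, num_gt):
    """hits: booleans over the ranked boxes (True = correct match, e.g. IoU >= 0.5).
    num_gt: ground-truth instances of the query in the gallery,
    including any that the detector missed."""
    if num_gt == 0:
        return 0.0
    correct, total = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            correct += 1
            total += correct / rank   # precision at this recall point
    return total / num_gt

def mean_average_precision(per_query):
    """per_query: list of (hits, num_gt) pairs, one per query."""
    aps = [average_precision(h, n) for h, n in per_query]
    return sum(aps) / len(aps)

# Toy example: query A finds both instances; query B has 2 instances
# but the detector only produced a box for one of them.
ap_a = average_precision([True, False, True], num_gt=2)
ap_b = average_precision([False, True], num_gt=2)
m_ap = mean_average_precision([([True, False, True], 2), ([False, True], 2)])
```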

Reposted from: https://www.cnblogs.com/xiaoaoran/p/11125791.html
