Computer Vision: Image Recognition and Search with the Bag-of-Words Algorithm

Published: 2023/11/27

Principle
- Overview
- Basic pipeline
Results and analysis
- Dataset
- Results and analysis
- Summary
- Source code
Errors encountered and solutions

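Principle

Overview

Bag of words is, as the name suggests, a "bag of words" model. Here a "word" is an image feature extracted from the image database; each feature is one word (see the accompanying figure). To find matches for a query image, we look for the images whose word frequencies are most similar to it. From the query's word histogram we compute the Euclidean distance to every image in the database and treat the images that fall within a chosen threshold as matches, which gives us image recognition and search.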
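
The thresholded Euclidean-distance lookup described above can be sketched as follows. This is a minimal illustration rather than the code used in this post: the function name `search_by_histogram`, the toy 4-word histograms, and the threshold value are all assumptions made for the example.

```python
import numpy as np

def search_by_histogram(query_hist, db_hists, threshold):
    """Return (index, distance) pairs for the database histograms that lie
    within `threshold` of the query, sorted by increasing Euclidean distance."""
    query = np.asarray(query_hist, dtype=float)
    db = np.asarray(db_hists, dtype=float)
    # Euclidean distance between the query and every database histogram
    dists = np.linalg.norm(db - query, axis=1)
    order = np.argsort(dists)
    return [(int(i), float(dists[i])) for i in order if dists[i] <= threshold]

# Hypothetical 4-word histograms for three database images (made-up numbers)
db_hists = [[5, 0, 1, 2], [0, 6, 3, 0], [4, 1, 1, 2]]
matches = search_by_histogram([5, 0, 1, 2], db_hists, threshold=2.0)
print(matches)
```

In this toy run the second database image is far from the query and is filtered out by the threshold, while the other two are returned nearest first.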

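Basic pipeline

1. Feature extraction
Extract every feature point, i.e. every SIFT feature point, from the dataset images (see the accompanying figure).

2. Learning the "visual dictionary"
Group all the extracted feature points into classes to obtain the "words", and collect all the words into a "visual dictionary".

3. Quantization
Quantize the feature set of each input image against the visual dictionary.

4. Word frequency counting
Convert each input image into a frequency histogram of visual words.

5. Clustering
The clustering can be done with the K-means algorithm.
First, randomly initialize K cluster centers. Then assign each feature to a center (class) according to distance, and for each class recompute the cluster center from the features assigned to it. Repeat until the algorithm converges, which completes the clustering.
Once the dictionary is available, classification is also possible: given the BoW histogram of an input image, find its k nearest neighbor images in the database and take a vote over the class labels of those k neighbors to obtain the classification result.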
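
The steps above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the PCV implementation used later in this post: the function names (`kmeans`, `quantize`, `bow_histogram`, `knn_vote`) are hypothetical, and the 2-D toy "descriptors" stand in for real 128-dimensional SIFT descriptors.

```python
import numpy as np
from collections import Counter

def quantize(features, centers):
    """Assign each feature to the index of its nearest center (visual word)."""
    features = np.asarray(features, dtype=float)
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(features, k, n_iter=20, seed=0):
    """Plain K-means: returns k cluster centers, i.e. the visual dictionary."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    # Randomly pick k distinct features as the initial centers
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = quantize(features, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # converged
        centers = new_centers
    return centers

def bow_histogram(features, centers):
    """Count how often each visual word occurs among an image's features."""
    return np.bincount(quantize(features, centers), minlength=len(centers))

def knn_vote(query_hist, db_hists, db_labels, k):
    """Label a query by majority vote among its k nearest database histograms."""
    diff = np.asarray(db_hists, dtype=float) - np.asarray(query_hist, dtype=float)
    nearest = np.argsort(np.linalg.norm(diff, axis=1))[:k]
    return Counter(db_labels[i] for i in nearest).most_common(1)[0][0]

# Toy 2-D "descriptors" forming two well-separated groups (made-up data)
feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers = kmeans(feats, k=2)
hist = bow_histogram(feats[:4], centers)
label = knn_vote([2, 0], [[3, 0], [3, 1], [0, 3]], ['beach', 'beach', 'whale'], k=3)
print(hist, label)
```

The histogram counts how many of an image's features fall on each word, and the vote simply picks the most common label among the nearest histograms; a production system would normally add weighting such as TF-IDF, which this sketch leaves out.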

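Results and analysis

Dataset

The dataset used in this post consists of photos taken on site at natural scenic spots and museums in Xiamen. The images were processed in Photoshop into a uniform format, 500×376 JPG, for a total of 130 images; a sample of them appears in the accompanying figure.

The database generated after feature extraction and the other processing steps is shown in the accompanying figure. It records each image's name, the address of its feature-match values, and related data.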

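Results and analysis

With this algorithm, we can input one image and find several similar images in the dataset.
In the result figure, the first image is the input and images 2 to 5 are the matches. The algorithm successfully matched several photos of the same beach, but it also returned a completely unrelated image: the last one, an olive ridley turtle specimen. This happens because wrong feature matches can occur when SIFT features are used, and those errors then produce mismatches when the feature histograms are compared during search. Since these are only local mismatches, however, after re-ranking the unrelated image's feature-match ratio is lower than that of images that really resemble the input, such as the second image.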
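
The effect of re-ranking can be illustrated with a small sketch. The helper `rerank_by_inliers`, the image names, and the inlier counts below are hypothetical and made up for the example; the idea is only that candidates are re-sorted by how many geometrically consistent feature matches they share with the query.

```python
def rerank_by_inliers(candidates, inlier_counts):
    """Re-order candidate image ids by descending inlier count.
    Candidates missing from `inlier_counts` are treated as having zero inliers."""
    return sorted(candidates, key=lambda c: inlier_counts.get(c, 0), reverse=True)

# Hypothetical histogram-based ranking and inlier counts (made-up numbers)
candidates = ['beach_2', 'turtle', 'beach_3']
inliers = {'beach_2': 42, 'turtle': 3, 'beach_3': 27}
reranked = rerank_by_inliers(candidates, inliers)
print(reranked)
```

An unrelated image that slipped into the histogram-based ranking typically has few consistent matches, so it drops toward the end of the list.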

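At the same time, a look at the training dataset shows that it contains several more photos of this beach that were not matched correctly. I am still unsure why. My guess is that beach scenes are hard to extract features from: they have few corner points, and many feature points in the key regions cannot be extracted.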

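Next, we query with the gray whale specimen image and return 9 similar images. The matches are mostly accurate, or, like the last few, they at least share a roughly similar outline; the few images that differ strongly appear for the reason described above. As for the images with similar outlines shown in the accompanying figure, I believe the cause is that features are detected and added to the dictionary in grayscale. Features of different colors may therefore be identified as the same word, which leads to wrong feature matches.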
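
The grayscale explanation above can be made concrete with a small sketch. It assumes the common BT.601 luma weights for the RGB-to-gray conversion; this post does not say which conversion the pipeline actually uses, and the two colors are made-up examples.

```python
def to_gray(rgb):
    """Convert an (R, G, B) triple to a gray level.
    Uses BT.601 luma weights, which is an assumption for this illustration."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

# Two clearly different colors (hypothetical pixel values)
red = (255, 0, 0)
green = (0, 130, 0)
gray_red, gray_green = to_gray(red), to_gray(green)
print(gray_red, gray_green)
```

The two colors end up at almost the same gray level, so a descriptor computed on the grayscale image cannot tell them apart.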

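Summary

Problem	Cause	Solution
Mismatches between very different images	SIFT feature recognition errors
Similar images not matched	Monotonous scenery, repetitive features
Mismatches between images with similar outlines	Grayscale representation	Switch to an RGB-based dictionary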

Source code

1. Generating the vocabulary

import pickle
from PCV.imagesearch import vocabulary
from PCV.tools.imtools import get_imlist
from PCV.localdescriptors import sift
from PCV.imagesearch import imagesearch
from PCV.geometry import homography
from sqlite3 import dbapi2 as sqlite

# Get the list of images
imlist = get_imlist('datasets/')
nbr_images = len(imlist)
# Build the list of feature files (replace the image extension with 'sift')
featlist = [imlist[i][:-3] + 'sift' for i in range(nbr_images)]

# Extract the SIFT features of every image in the folder
for i in range(nbr_images):
    sift.process_image(imlist[i], featlist[i])

# Generate the vocabulary
voc = vocabulary.Vocabulary('ukbenchtest')
# Train a vocabulary on featlist with k-means (888 words).
# Note that subsampling (every 10th descriptor) is used to speed up training.
# The descriptors are then projected onto the vocabulary to build histograms.
voc.train(featlist, 888, 10)

# Save the vocabulary
with open('BOW/vocabulary.pkl', 'wb') as f:
    pickle.dump(voc, f)
print('vocabulary is:', voc.name, voc.nbr_words)

2. Generating the database

import pickle
from PCV.imagesearch import vocabulary
from PCV.tools.imtools import get_imlist
from PCV.localdescriptors import sift
from PCV.imagesearch import imagesearch
from PCV.geometry import homography
from sqlite3 import dbapi2 as sqlite  # use SQLite as the database

# Get the list of images
imlist = get_imlist('datasets/')
nbr_images = len(imlist)
# Build the list of feature files
featlist = [imlist[i][:-3] + 'sift' for i in range(nbr_images)]

# Load the vocabulary
with open('BOW/vocabulary.pkl', 'rb') as f:
    voc = pickle.load(f)

# Create the index: the Indexer class creates the tables and indexes
# and writes the image data to the database
indx = imagesearch.Indexer('testImaAdd.db', voc)
indx.create_tables()  # create the tables

# Go through all images, project their features onto the vocabulary and insert
for i in range(nbr_images)[:888]:
    locs, descr = sift.read_features_from_file(featlist[i])
    # add_to_index takes an image with its feature descriptors, projects them
    # onto the vocabulary, and stores the encoded word histogram of the image
    indx.add_to_index(imlist[i], descr)

# Commit to the database
indx.db_commit()

con = sqlite.connect('testImaAdd.db')
print(con.execute('select count (filename) from imlist').fetchone())
print(con.execute('select * from imlist').fetchone())

3. Image search

# -*- coding: utf-8 -*-
# Representing images with visual words discards the positions of the image features
import pickle
from PCV.localdescriptors import sift
from PCV.imagesearch import imagesearch
from PCV.geometry import homography
from PCV.tools.imtools import get_imlist

# Load the image list
#imlist = get_imlist('E:/Python37_course/test7/first1000/')
imlist = get_imlist('datasets/')
nbr_images = len(imlist)
# Load the feature list
featlist = [imlist[i][:-3] + 'sift' for i in range(nbr_images)]

# Load the vocabulary
'''with open('E:/Python37_course/test7/first1000/vocabulary.pkl', 'rb') as f:
    voc = pickle.load(f)'''
with open('BOW/vocabulary.pkl', 'rb') as f:
    voc = pickle.load(f)

# The Searcher class reads the images' word histograms and runs the queries
src = imagesearch.Searcher('testImaAdd.db', voc)

# Index of the query image and number of results to return
q_ind = 0
nbr_results = 130

# Regular query (results sorted by Euclidean distance)
res_reg = [w[1] for w in src.query(imlist[q_ind])[:nbr_results]]
print('top matches (regular):', res_reg)

# Load the features of the query image for matching
q_locs, q_descr = sift.read_features_from_file(featlist[q_ind])
fp = homography.make_homog(q_locs[:, :2].T)

# RANSAC model for homography fitting
model = homography.RansacModel()
rank = {}

# Load the features of each candidate image
for ndx in res_reg[1:]:
    try:
        locs, descr = sift.read_features_from_file(featlist[ndx])
    except Exception:
        continue
    #locs,descr = sift.read_features_from_file(featlist[ndx])  # because 'ndx' is a rowid of the DB that starts at 1
    # Get matches
    matches = sift.match(q_descr, descr)
    ind = matches.nonzero()[0]
    ind2 = matches[ind]
    tp = homography.make_homog(locs[:, :2].T)
    # Compute the homography and count inliers;
    # if there are not enough matches, fall back to an empty list
    try:
        H, inliers = homography.H_from_ransac(fp[:, ind], tp[:, ind2], model, match_theshold=4)
    except Exception:
        inliers = []
    # Store the inlier count
    rank[ndx] = len(inliers)

# Sort the dictionary to get the most inliers first; this gives the re-ranked results
sorted_rank = sorted(rank.items(), key=lambda t: t[1], reverse=True)
res_geom = [res_reg[0]] + [s[0] for s in sorted_rank]
print('top matches (homography):', res_geom)

# Show the query results
imagesearch.plot_results(src, res_reg[:6])   # regular query
imagesearch.plot_results(src, res_geom[:6])  # re-ranked results

Errors encountered and solutions

https://blog.csdn.net/qq_43605229/article/details/117608008
