Python Computer Vision: SIFT Features

Published: 2023/11/27


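Contents

Python Computer Vision: SIFT Features
- Preface
- 1 SIFT Algorithm Steps
  - 1.1 Scale-Space Extrema Detection
  - 1.2 Keypoint Localization
  - 1.3 Orientation Assignment
  - 1.4 Keypoint Description
- 2 Experimental Analysis
- 3 Keypoint Matching
- 4 Matching Geotagged Images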

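Preface

SIFT stands for scale-invariant feature transform. Earlier object detection algorithms were very sensitive to image size and rotation. SIFT is built on local interest points, so it is insensitive to image size and rotation and also holds up well against illumination changes and noise. In both performance and range of application it was a qualitative step beyond the algorithms that preceded it. SIFT has the following characteristics: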

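- Invariance: SIFT features are local image features that are invariant to rotation, scaling, and brightness change, and that stay reasonably stable under viewpoint change, affine transformation, and noise.
- Distinctiveness: the features are information-rich and well suited to fast, accurate matching against a very large feature database.
- Quantity: even a handful of objects yields a large number of SIFT features.
- Speed: an optimized SIFT matching algorithm can even run in real time.
- Extensibility: SIFT features combine easily with other feature vectors.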

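SIFT feature detection achieves these properties in four steps:

- Scale-space extrema detection: search the images at all scales, using a difference-of-Gaussian function to identify potential interest points that are invariant to scale and rotation.
- Keypoint localization: at each candidate location, fit a detailed model to determine location and scale; keypoints are selected according to their stability.
- Orientation assignment: assign one or more orientations to each keypoint location based on the local image gradient directions. All later operations are performed relative to the keypoint's orientation, scale, and location, which is what makes the features invariant to these transformations.
- Keypoint description: measure the local image gradients at the selected scale in a neighborhood around each keypoint. The gradients are transformed into a representation that tolerates fairly large local shape deformation and illumination change.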

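1 SIFT Algorithm Steps

1.1 Scale-Space Extrema Detection

This step searches the images at all scales and uses a difference-of-Gaussian (DoG) pyramid to identify potential interest points that are invariant to scale and rotation. Three concepts in that sentence need explaining: scale space, the DoG pyramid, and interest points.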

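Scale-space theory was first proposed in 1962. Its main idea is to apply scale transformations to the original image and obtain a representation of the image at multiple scales. This allows edges and corners to be detected and features to be extracted at different resolutions, which gives the feature points their scale invariance. The images in a scale space become progressively more blurred as the scale grows, imitating how an object forms on the retina as the viewer moves from near to far. The larger the scale, the blurrier the image.

The DoG pyramid can be thought of as an image filter. Why Gaussian? Because the Gaussian kernel is the only (linear) kernel that can generate a multi-scale space. A DoG image shows how pixel values change across the image: where nothing changes there is no feature, and good features are the points where the change is greatest. The DoG image therefore traces the outline of the object. Layer 1 of octave 1 of the DoG pyramid is obtained by subtracting layer 1 of octave 1 of the Gaussian pyramid from layer 2 of the same octave. Repeating this layer by layer and octave by octave produces all the difference images, which together form the DoG pyramid. Within each octave the DoG pyramid has one layer fewer than the Gaussian pyramid. All subsequent SIFT feature point extraction is done on the DoG pyramid.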
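The construction of one octave can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used later in the article: the function names, the base scale `sigma0=1.6`, the scale step of √2, and the 3-sigma kernel truncation are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel, truncated at 3 sigma (assumed cutoff)."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")  # replicate border pixels
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def dog_octave(image, sigma0=1.6, levels=5):
    """Return (gaussians, dogs) for one octave.

    sigma0 and the sqrt(2) step between levels are illustrative values.
    """
    k = np.sqrt(2.0)
    gaussians = [gaussian_blur(image, sigma0 * k ** i) for i in range(levels)]
    # DoG layer i = Gaussian layer i+1 minus Gaussian layer i
    dogs = [gaussians[i + 1] - gaussians[i] for i in range(levels - 1)]
    return gaussians, dogs
```

With `levels=5` the octave holds five Gaussian images and four DoG images, which matches the "one layer fewer" relationship described above. On a constant image every DoG layer is zero, reflecting that an image without change carries no features.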

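Which points count as interest points? They are highly salient points that do not disappear when illumination, scale, or rotation changes, such as corners, edge points, bright spots in dark regions, and dark spots in bright regions.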

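1.2 Keypoint Localization

The extrema detected so far are extrema of a discrete space. This step fits a three-dimensional quadratic function to determine the location and scale of each keypoint precisely. It also removes low-contrast keypoints and unstable edge responses (the DoG operator responds strongly to edges), which makes matching more stable and more resistant to noise. An extremum of the discrete space is not the true extremum: for a two-dimensional function, the extremum found on the sampling grid generally differs from the extremum of the underlying continuous function. Estimating the continuous extremum by interpolating the known discrete samples is called sub-pixel interpolation.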
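A one-dimensional version makes the idea of sub-pixel interpolation concrete. The sketch below fits a parabola through three equally spaced samples and returns where its vertex lies relative to the middle sample. The function name `refine_peak_1d` is hypothetical, and the 1-D setting is a simplification of the 3-D fit that SIFT performs over (x, y, σ).

```python
def refine_peak_1d(left, center, right):
    """Fit a parabola through three equally spaced samples.

    Returns (offset, value): the vertex position relative to the center
    sample and the interpolated function value at the vertex.
    """
    first = 0.5 * (right - left)           # central-difference first derivative
    second = left - 2.0 * center + right   # second difference
    if second == 0:
        return 0.0, center                 # samples are collinear, nothing to refine
    offset = -first / second               # where the fitted derivative is zero
    value = center + 0.5 * first * offset  # fitted value at the vertex
    return offset, value
```

For samples taken from f(x) = 2 − (x − 0.3)² at x = −1, 0, 1, the discrete maximum sits at x = 0, while the function returns an offset of 0.3 and a peak value of 2, recovering the true extremum.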

To make the keypoints more stable, a curve is fitted to the scale-space DoG function. The Taylor expansion of the DoG function in scale space (the fitting function) is:
$D(X)=D+\frac{\partial D^{T}}{\partial X} X+\frac{1}{2} X^{T} \frac{\partial^{2} D}{\partial X^{2}} X$
where $X=(x, y, \sigma)^{T}$. Taking the derivative with respect to $X$ and setting it to zero gives the offset of the extremum:
$\hat{X}=-\left(\frac{\partial^{2} D}{\partial X^{2}}\right)^{-1} \frac{\partial D}{\partial X}$
The value of the function at the extremum is:
$D(\hat{X})=D+\frac{1}{2} \frac{\partial D^{T}}{\partial X} \hat{X}$
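The value at the extremum follows from substituting the offset $\hat{X}$ back into the Taylor expansion. The intermediate step, written out here for completeness, uses the fact that the second-derivative matrix applied to $\hat{X}$ equals $-\frac{\partial D}{\partial X}$:

```latex
\begin{aligned}
D(\hat{X}) &= D+\frac{\partial D^{T}}{\partial X} \hat{X}
            +\frac{1}{2} \hat{X}^{T} \frac{\partial^{2} D}{\partial X^{2}} \hat{X} \\
           &= D+\frac{\partial D^{T}}{\partial X} \hat{X}
            -\frac{1}{2} \hat{X}^{T} \frac{\partial D}{\partial X} \\
           &= D+\frac{1}{2} \frac{\partial D^{T}}{\partial X} \hat{X}
\end{aligned}
```

The last line holds because $\hat{X}^{T} \frac{\partial D}{\partial X}$ is a scalar and therefore equals $\frac{\partial D^{T}}{\partial X} \hat{X}$.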
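The candidate extrema themselves are found by comparing each pixel with all of its neighbors, in both the image domain and the scale domain (8 in its own layer and 9 in each adjacent layer, 26 in total), and checking whether it is larger or smaller than every one of them.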
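The neighbor comparison can be sketched as follows, assuming the DoG layers of one octave are stacked into a 3-D NumPy array indexed as (scale, row, column). The function name and the array layout are assumptions made for this example, and the point being tested must not lie on the border of the stack.

```python
import numpy as np

def is_local_extremum(dog_stack, s, r, c):
    """True if dog_stack[s, r, c] is strictly larger or strictly smaller
    than all 26 neighbors in its 3x3x3 scale-space neighborhood."""
    cube = dog_stack[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
    center = dog_stack[s, r, c]
    # The center is element 13 of the flattened 27-element cube
    neighbors = np.delete(cube.ravel(), 13)
    return bool(center > neighbors.max() or center < neighbors.min())
```

A point in a perfectly flat region is neither larger nor smaller than its neighbors, so it is not reported as an extremum.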

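The DoG function also responds strongly along image edges, so edge responses must be rejected. A poorly localized DoG peak on an edge has a large principal curvature across the edge and a small one along it. The principal curvatures can be obtained from the 2×2 Hessian matrix computed at the keypoint's location and scale, with the derivatives estimated from differences of neighboring samples:
$H=\left[\begin{array}{ll} D_{x x} & D_{x y} \\ D_{x y} & D_{y y} \end{array}\right]$
$D_{xx}$ is the second derivative in the x direction of the DoG image at the given scale. The principal curvatures of D are proportional to the eigenvalues of H. Let $\alpha$ and $\beta$ be the eigenvalues; then:
$\begin{aligned} \frac{\operatorname{Tr}(H)^{2}}{\operatorname{Det}(H)} &=\frac{(\alpha+\beta)^{2}}{\alpha \beta} \\ \operatorname{Det}(H) &=\alpha \beta \\ \operatorname{Tr}(H) &=\alpha+\beta \end{aligned}$
This ratio reaches its minimum when the two eigenvalues are equal and grows as they move apart. A keypoint is kept when $\frac{\operatorname{Tr}(H)^{2}}{\operatorname{Det}(H)}<T$ for a chosen threshold $T$, and discarded otherwise.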
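The edge test needs only the three second derivatives, not the eigenvalues themselves. In the sketch below the threshold is written as (r + 1)² / r with r = 10, the form and value used in Lowe's SIFT paper; the article itself does not state a threshold, so treat both as assumptions. The function name is hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) is below (r + 1)^2 / r.

    dxx, dyy, dxy are the entries of the 2x2 Hessian of the DoG image.
    r is the largest tolerated ratio between the two eigenvalues
    (r = 10 is an assumed value, not taken from the article).
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures with opposite signs (or a flat direction): reject
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r
```

If α = rβ, the ratio equals (r + 1)² / r, which is why bounding the ratio is the same as bounding how elongated the local structure may be. For example, eigenvalues 1 and 1 give a ratio of 4 and pass, while eigenvalues 20 and 1 give 22.05 and are rejected as an edge.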

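1.3 Orientation Assignment

Orientation assignment gives each keypoint location one or more orientations based on the local image gradient directions. All later operations are performed relative to the keypoint's orientation, scale, and location, which is what makes the features invariant to these transformations. The orientation is determined with a gradient histogram: taking the keypoint as the origin, the pixels within a surrounding region each contribute to the keypoint's orientation. For each detected keypoint, the gradient magnitude and direction are computed as follows, where $L$ is the Gaussian-smoothed image at the keypoint's scale:
$\begin{aligned} m(x, y)&=\sqrt{(L(x+1, y)-L(x-1, y))^{2}+(L(x, y+1)-L(x, y-1))^{2}} \\ \theta(x, y)&=\tan ^{-1}\left(\frac{L(x, y+1)-L(x, y-1)}{L(x+1, y)-L(x-1, y)}\right) \end{aligned}$
Once the gradients are computed, a histogram accumulates the gradient magnitudes and directions of the pixels in the neighborhood. The histogram divides the 0 to 360 degree range into 36 bins of 10 degrees each. The direction of the histogram peak is the keypoint's main orientation. (Illustrations of this step are often simplified to show only eight directions.)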

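Main orientation of a keypoint: the highest peak of the gradient histogram of the region around the extremum is taken as the keypoint's orientation.

Auxiliary orientation of a keypoint: when the gradient histogram contains another peak with at least 80% of the energy of the main peak, that direction is taken as an auxiliary orientation of the keypoint.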
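The histogram and the 80% rule can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, `patch[row, col]` is treated as L(x = col, y = row), `arctan2` is used so the angle covers the full 0 to 360 degree range, and every bin reaching 80% of the maximum is returned (a fuller implementation would weight samples by distance from the keypoint and keep only local peaks).

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Return (histogram, orientations in degrees) for an image patch.

    Orientations are the centers of all bins whose accumulated gradient
    magnitude is at least peak_ratio times the largest bin.
    """
    patch = np.asarray(patch, dtype=float)
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]   # L(x+1, y) - L(x-1, y)
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]   # L(x, y+1) - L(x, y-1)
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    bin_width = 360.0 / num_bins
    bins = (angle // bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=num_bins)
    peak = hist.max()
    if peak == 0:
        return hist, []                        # flat patch: no orientation
    keep = np.flatnonzero(hist >= peak_ratio * peak)
    return hist, [float((b + 0.5) * bin_width) for b in keep]
```

A patch whose intensity increases only along x puts all gradient energy into the first bin (centered at 5 degrees), while a patch increasing equally along x and y yields the bin centered at 45 degrees.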

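1.4 Keypoint Description

After the steps above, every keypoint carries three pieces of information: location, scale, and orientation. The next step builds a descriptor for each keypoint, a vector that describes the keypoint in a way that does not change under variations such as illumination or viewpoint. The descriptor covers not only the keypoint itself but also the surrounding pixels that contribute to it, and it should be highly distinctive so that feature points are more likely to be matched correctly. The SIFT descriptor is a representation of gradient statistics of the Gaussian image in the keypoint's neighborhood. The image region around the keypoint is divided into blocks, a gradient histogram is computed within each block, and the results form a distinctive vector that abstracts the image information of that region. Based on his experiments, Lowe recommends computing gradient information in 8 directions over a 4×4 window in the keypoint's scale space, giving a vector of 4×4×8 = 128 dimensions.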
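The 4×4×8 layout can be illustrated with a stripped-down descriptor. The sketch assumes an 18×18 patch that has already been sampled at the keypoint's scale and rotated to its main orientation, so that central differences give a 16×16 gradient field split into 4×4 cells. The function name is hypothetical, and the Gaussian weighting, interpolation between bins, and clipping used by full SIFT implementations are deliberately left out.

```python
import numpy as np

def sift_like_descriptor(patch):
    """Build a 128-d vector: 4x4 cells, 8 orientation bins per cell.

    patch: 18x18 array (assumed to be scale- and rotation-normalized).
    """
    patch = np.asarray(patch, dtype=float)
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    bins = (angle // 45.0).astype(int) % 8     # 8 bins of 45 degrees

    descriptor = np.zeros((4, 4, 8))
    for i in range(4):
        for j in range(4):
            cell = (slice(4 * i, 4 * i + 4), slice(4 * j, 4 * j + 4))
            descriptor[i, j] = np.bincount(
                bins[cell].ravel(),
                weights=magnitude[cell].ravel(),
                minlength=8)

    vector = descriptor.ravel()                # 4 * 4 * 8 = 128 values
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector
```

Normalizing the vector to unit length is what gives the descriptor its tolerance to uniform changes in contrast.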

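2 Experimental Analysis

A SIFT object detects keypoints with DoG and computes a feature vector for the region around each keypoint. It involves two main operations, detection and computation, which return the keypoint information and the descriptors. Finally the keypoints are drawn on the image, and the image is displayed with the imshow function.

The first step turns an image into a text file that stores the important information extracted from it. That is the job of the following function: give it an image and it produces a text file.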

    import os
    from PIL import Image

    def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
        """ process an image and save the results in a file"""
        # "__file__" is a string literal here, so dirname() returns '' and
        # path resolves to the parent of the current working directory
        path = os.path.abspath(os.path.join(os.path.dirname("__file__"), os.path.pardir))
        path = path + "\\ch02\\sift.exe "
        if imagename[-3:] != 'pgm':
            # create a pgm file
            im = Image.open(imagename).convert('L')
            im.save('tmp.pgm')
            imagename = 'tmp.pgm'
        cmmd = str(path + imagename + " --output=" + resultname + " " + params)
        os.system(cmmd)
        print('processed', imagename, 'to', resultname)

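Reading the text file: the first four values of each row are the interest point's coordinates, scale, and orientation angle, followed by the 128-dimensional vector of the corresponding descriptor. Each feature point is thus represented by a 128-dimensional vector, which can be thought of as its identity card. In the output examined here, the first two rows have the same coordinates but different orientations. This happens whenever one interest point has more than one significant orientation.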
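The row layout described above can be checked with a few lines of NumPy. The sketch splits each 132-value row into its 4 location values and 128 descriptor values, then lists the locations that occur with more than one orientation. Both function names are hypothetical, and the sample rows in the usage note are synthetic, not taken from the article's output file.

```python
import numpy as np

def split_features(rows):
    """Split rows of 4 + 128 values into (locations, descriptors)."""
    rows = np.atleast_2d(rows)
    return rows[:, :4], rows[:, 4:]

def repeated_locations(locs):
    """Map (x, y) -> list of orientation angles for every location
    that appears with more than one orientation."""
    groups = {}
    for x, y, _scale, angle in locs:
        groups.setdefault((float(x), float(y)), []).append(float(angle))
    return {key: angles for key, angles in groups.items() if len(angles) > 1}
```

For three synthetic rows whose first four values are (10, 20, 2, 0.5), (10, 20, 2, 2.1), and (30, 40, 3, 1.0), `repeated_locations` reports only the point (10, 20), with orientations 0.5 and 2.1.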

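With this feature information available, the feature values can be read and returned in matrix form:

    from numpy import loadtxt

    def read_features_from_file(filename):
        """ read feature properties and return in matrix form"""
        f = loadtxt(filename)
        return f[:, :4], f[:, 4:]  # feature locations, descriptors

The function returns two values. The first holds the four numbers for coordinates, scale, and orientation angle, and the second holds the 128-dimensional vectors. These two functions are enough to implement SIFT feature detection. The combined code is as follows: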

    from PIL import Image
    from numpy import *
    from pylab import *
    import os

    def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
        """ process an image and save the results in a file"""
        path = os.path.abspath(os.path.join(os.path.dirname("__file__"), os.path.pardir))
        path = path + "\\ch02\\sift.exe "
        if imagename[-3:] != 'pgm':
            # create a pgm file
            im = Image.open(imagename).convert('L')
            im.save('tmp.pgm')
            imagename = 'tmp.pgm'
        cmmd = str(path + imagename + " --output=" + resultname + " " + params)
        os.system(cmmd)
        print('processed', imagename, 'to', resultname)

    def read_features_from_file(filename):
        """ read feature properties and return in matrix form"""
        f = loadtxt(filename)
        return f[:, :4], f[:, 4:]  # feature locations, descriptors

    def plot_features(im, locs, circle=False):
        """ show image with features. input: im (image as array),
        locs (row, col, scale, orientation of each feature) """

        def draw_circle(c, r):
            t = arange(0, 1.01, .01) * 2 * pi
            x = r * cos(t) + c[0]
            y = r * sin(t) + c[1]
            plot(x, y, 'b', linewidth=2)

        imshow(im)
        if circle:
            # one circle per feature, with radius equal to the feature scale
            for p in locs:
                draw_circle([p[0], p[1]], p[2])
        else:
            plot(locs[:, 0], locs[:, 1], 'ob')
        axis('off')

    if __name__ == '__main__':
        imname = (r' ')
        im = Image.open(imname)
        process_image(imname, 'luda.sift')
        # l1: interest point coordinates, scale, and orientation angle;
        # d1: the corresponding 128-dimensional descriptors
        l1, d1 = read_features_from_file('luda.sift')
        figure(dpi=100)
        gray()
        plot_features(im, l1, circle=True)
        title('sift-features')
        show()

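The result confirms that SIFT extracts local image features that are invariant to rotation, scaling, and brightness change, and that stay reasonably stable under viewpoint change, affine transformation, and noise. It also shows that even a few objects produce a large number of SIFT features.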

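3 Keypoint Matching

Once SIFT features are available, keypoints can be matched. Feature matching with SIFT has three stages: (1) extract the keypoints; (2) attach detailed information (local features) to each keypoint, that is, the descriptor; (3) compare the feature points (keypoints with their feature vectors) pairwise to find the pairs that match each other, which establishes the correspondence between the scenes.

First prepare two images of the same size that share some similar content, and run SIFT on each to detect its features.

Then match the keypoints. In the output figure, the lower half shows the two original images and the upper half shows the images with the matches drawn.

It is also possible to match two identical images, in which case the match results of the two images coincide.

Analysis: when the shooting angle and distance differ enough that the scene changes substantially (for example, only half of the Shang Building (尚大樓) remains visible), the number of matched feature points between the two images falls below the threshold and no match is found. In other words, the more two images differ, the fewer feature points can be matched and the fewer matches result. Even for the same scene, if the images share no common feature target, or too few matching features, they may not be matched. Conversely, if the two images are identical, the feature points match completely.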
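The matching in this section accepts a pair only when the nearest descriptor is clearly closer than the second nearest, using the angle between normalized descriptors as the distance. A vectorized sketch of that ratio test is shown below. It is not the article's `match` function: the name `ratio_test_matches` is hypothetical, unmatched entries are marked with −1 rather than 0 so that a match to index 0 stays distinguishable, and `np.clip` replaces the 0.9999 scaling used in the full listing to keep `arccos` within its domain.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, dist_ratio=0.6):
    """For each row of desc1, return the index of its match in desc2,
    or -1 if the nearest neighbor fails the ratio test.

    desc2 must contain at least two descriptors.
    """
    d1 = desc1 / np.linalg.norm(desc1, axis=1, keepdims=True)
    d2 = desc2 / np.linalg.norm(desc2, axis=1, keepdims=True)
    # Angle between every pair of unit-length descriptors
    angles = np.arccos(np.clip(d1 @ d2.T, -1.0, 1.0))
    order = np.argsort(angles, axis=1)
    rows = np.arange(len(d1))
    best, second = order[:, 0], order[:, 1]
    accepted = angles[rows, best] < dist_ratio * angles[rows, second]
    return np.where(accepted, best, -1)
```

A descriptor that lies almost equally close to two candidates is rejected, which is the mechanism behind the observation above that very different views produce few accepted matches.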

The full code is as follows:

    from PIL import Image
    import os
    from numpy import *
    from pylab import *

    def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
        """ process an image and save the results in a file"""
        path = os.path.abspath(os.path.join(os.path.dirname("__file__"), os.path.pardir))
        path = path + "\\ch02\\sift.exe "
        if imagename[-3:] != 'pgm':
            # create a pgm file
            im = Image.open(imagename).convert('L')
            im.save('tmp.pgm')
            imagename = 'tmp.pgm'
        cmmd = str(path + imagename + " --output=" + resultname + " " + params)
        os.system(cmmd)
        print('processed', imagename, 'to', resultname)

    def read_features_from_file(filename):
        """ read feature properties and return in matrix form"""
        f = loadtxt(filename)
        return f[:, :4], f[:, 4:]  # feature locations, descriptors

    def write_features_to_file(filename, locs, desc):
        """ save feature location and descriptor to file"""
        savetxt(filename, hstack((locs, desc)))

    def plot_features(im, locs, circle=False):
        """ show image with features. input: im (image as array),
        locs (row, col, scale, orientation of each feature) """

        def draw_circle(c, r):
            t = arange(0, 1.01, .01) * 2 * pi
            x = r * cos(t) + c[0]
            y = r * sin(t) + c[1]
            plot(x, y, 'b', linewidth=2)

        imshow(im)
        if circle:
            for p in locs:
                draw_circle([p[0], p[1]], p[2])
        else:
            plot(locs[:, 0], locs[:, 1], 'ob')
        axis('off')

    def match(desc1, desc2):
        """ for each descriptor in the first image, select its match in the second image.
        input: desc1 (descriptors for the first image), desc2 (same for second image). """
        desc1 = array([d / linalg.norm(d) for d in desc1])
        desc2 = array([d / linalg.norm(d) for d in desc2])
        dist_ratio = 0.6
        desc1_size = desc1.shape
        # 0 means "no match", so a true match to index 0 is not representable
        matchscores = zeros((desc1_size[0], 1))
        desc2t = desc2.T  # precompute matrix transpose
        for i in range(desc1_size[0]):
            dotprods = dot(desc1[i, :], desc2t)  # vector of dot products
            dotprods = 0.9999 * dotprods
            # inverse cosine and sort, return index for features in second image
            indx = argsort(arccos(dotprods))
            # check if nearest neighbor has angle less than dist_ratio times 2nd
            if arccos(dotprods)[indx[0]] < dist_ratio * arccos(dotprods)[indx[1]]:
                matchscores[i] = int(indx[0])
        return matchscores

    def appendimages(im1, im2):
        """ return a new image that appends the two images side-by-side."""
        # select the image with the fewest rows and fill in enough empty rows
        # (the padding assumes 2-D grayscale arrays)
        rows1 = im1.shape[0]
        rows2 = im2.shape[0]
        if rows1 < rows2:
            im1 = concatenate((im1, zeros((rows2 - rows1, im1.shape[1]))), axis=0)
        elif rows1 > rows2:
            im2 = concatenate((im2, zeros((rows1 - rows2, im2.shape[1]))), axis=0)
        # if none of these cases they are equal, no filling needed.
        return concatenate((im1, im2), axis=1)

    def plot_matches(im1, im2, locs1, locs2, matchscores, show_below=True):
        """ show a figure with lines joining the accepted matches
        input: im1,im2 (images as arrays), locs1,locs2 (location of features),
        matchscores (as output from 'match'), show_below (if images should be shown below). """
        im3 = appendimages(im1, im2)
        if show_below:
            im3 = vstack((im3, im3))
        # show image
        imshow(im3)
        # draw lines for matches
        cols1 = im1.shape[1]
        for i in range(len(matchscores)):
            if matchscores[i, 0] > 0:
                j = int(matchscores[i, 0])
                plot([locs1[i, 0], locs2[j, 0] + cols1], [locs1[i, 1], locs2[j, 1]], 'c')
        axis('off')

    def match_twosided(desc1, desc2):
        """ two-sided symmetric version of match(). """
        matches_12 = match(desc1, desc2)
        matches_21 = match(desc2, desc1)
        ndx_12 = matches_12.nonzero()[0]
        # remove matches that are not symmetric
        for n in ndx_12:
            if matches_21[int(matches_12[n, 0]), 0] != n:
                matches_12[n] = 0
        return matches_12

    if __name__ == "__main__":
        imname1 = (r'  ')
        process_image(imname1, 'tmp.sift')
        l, d = read_features_from_file('tmp.sift')
        im = array(Image.open(imname1))
        # figure()
        # plot_features(im, l, True)

        imname2 = (r'  ')
        process_image(imname2, 'tmp2.sift')
        l2, d2 = read_features_from_file('tmp2.sift')
        im2 = array(Image.open(imname2))
        # figure()
        # plot_features(im2, l2, True)

        m = match_twosided(d, d2)
        figure()
        plot_matches(im, im2, l, l2, m)
        show()

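4 Matching Geotagged Images

Matching geotagged images means taking a sequence of images of the same scene, matching them pairwise with SIFT, building a connection matrix, and finally visualizing how the images are connected. First prepare a sequence of images and extract local descriptors from them. Then compute the connection matrix, and finally visualize the result with the pydot package. To build a graph that shows likely image groups, an edge connects two image nodes whenever their number of matches exceeds a threshold. The longest side of each thumbnail is limited to 100 pixels. The code is as follows: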

    # -*- coding: utf-8 -*-
    import os
    from pylab import *
    from PIL import Image
    import sift
    import imtools
    import pydot

    download_path = "E:\\master_workspace\\pcv-book-code-master\\ch02\\pano_imgs"  # set this to the path where you downloaded the panoramio images
    path = "E:\\master_workspace\\pcv-book-code-master\\ch02\\pano_imgs\\results\\"  # path to save thumbnails (pydot needs the full system path)

    # list of downloaded filenames
    # imlist = imtools.get_imlist(download_path)
    imlist = [os.path.join(download_path, f) for f in os.listdir(download_path) if f.endswith('.jpeg')]
    nbr_images = len(imlist)

    # extract features
    featlist = [imname[:-4] + 'sift' for imname in imlist]
    for i, imname in enumerate(imlist):
        sift.process_image(imname, featlist[i])

    matchscores = zeros((nbr_images, nbr_images))

    for i in range(nbr_images):
        for j in range(i, nbr_images):  # only compute upper triangle
            # print('comparing ', imlist[i], imlist[j])
            l1, d1 = sift.read_features_from_file(featlist[i])
            l2, d2 = sift.read_features_from_file(featlist[j])
            matches = sift.match_twosided(d1, d2)
            nbr_matches = sum(matches > 0)
            # print('number of matches = ', nbr_matches)
            matchscores[i, j] = nbr_matches
    # print("The match scores is: %d", matchscores)

    # copy values
    for i in range(nbr_images):
        for j in range(i + 1, nbr_images):  # no need to copy diagonal
            matchscores[j, i] = matchscores[i, j]

    threshold = 1  # min number of matches needed to create link

    g = pydot.Dot(graph_type='graph')  # don't want the default directed graph

    for i in range(nbr_images):
        for j in range(i + 1, nbr_images):
            if matchscores[i, j] > threshold:
                print(i, j)
                # first image in pair
                im = Image.open(imlist[i])
                im.thumbnail((100, 100))
                filename = path + str(i) + '.png'
                im.save(filename)  # need temporary files of the right size
                g.add_node(pydot.Node(str(i), fontcolor='transparent', shape='rectangle', image=filename))

                # second image in pair
                im = Image.open(imlist[j])
                im.thumbnail((100, 100))
                filename = path + str(j) + '.png'
                im.save(filename)  # need temporary files of the right size
                g.add_node(pydot.Node(str(j), fontcolor='transparent', shape='rectangle', image=filename))

                g.add_edge(pydot.Edge(str(i), str(j)))

    g.write_png('jmu.png')

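Analysis: in the first experiment the threshold is set to 1. A low threshold links more image pairs, but confidence drops as well, which is why incorrect matches appear in the result.

With the threshold set to 2, fewer images are linked, but the confidence, that is, the accuracy of the matches, is clearly higher. The experiments show that SIFT copes with some variation in viewing angle, illumination, and clutter when matching geographic scenes. It also has shortcomings. For the same scene, if the images share no common feature target, or too few matching features, they may not be matched. Conversely, if two different scenes look too much alike, they may be mistaken for the same scene.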
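The effect of the threshold can be reproduced without images or pydot by applying the same comparison to a small score matrix. In this sketch the function name `linked_pairs` is hypothetical and the scores are made-up numbers chosen only to illustrate the comparison; they are not results from the experiment.

```python
def linked_pairs(matchscores, threshold):
    """Return the (i, j) pairs with i < j whose match count exceeds
    the threshold, i.e. the edges the graph would contain."""
    n = len(matchscores)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if matchscores[i][j] > threshold]

# Synthetic symmetric match-count matrix for three images
scores = [
    [50, 2, 0],
    [2, 40, 5],
    [0, 5, 60],
]
edges_low = linked_pairs(scores, threshold=1)   # [(0, 1), (1, 2)]
edges_high = linked_pairs(scores, threshold=2)  # [(1, 2)]
```

Raising the threshold from 1 to 2 removes the weakly supported link between images 0 and 1, the same trade-off between the number of links and their reliability described above.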
