Methods for Computing Image Similarity
Contents
- 1. Cosine Similarity
- 2. Computing Image Similarity with Hash Algorithms
- 3. Computing Image Similarity with Histograms
- 4. Computing Image Similarity with SSIM (Structural Similarity)
- 5. Computing Image Similarity from Mutual Information
1. Cosine Similarity
Represent each image as a vector, then use the cosine distance between the two vectors as a measure of how similar the images are.
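The idea in miniature, on toy pixel vectors (the values are invented purely for illustration):

```python
import numpy as np

# Two toy 4-pixel "images" as grey-level vectors; values are made up for illustration
a = np.array([10.0, 20.0, 30.0, 40.0])
b = np.array([20.0, 40.0, 60.0, 80.0])

# Cosine similarity: dot product of the vectors divided by the product of their norms
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos)  # ~1.0: b is a scaled copy of a, so the angle between them is zero
```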
from PIL import Image
from numpy import array, average, dot, linalg

def get_thum(image, size=(64, 64), greyscale=False):
    # Shrink to a small thumbnail so the vectors are short and size-independent
    image = image.resize(size, Image.LANCZOS)  # Image.ANTIALIAS was renamed LANCZOS in newer Pillow
    if greyscale:
        image = image.convert('L')
    return image

def image_similarity_vectors_via_numpy(image1, image2):
    image1 = get_thum(image1)
    image2 = get_thum(image2)
    vectors = []
    norms = []
    for image in [image1, image2]:
        vector = []
        for pixel_tuple in image.getdata():
            # Average the channel values so each pixel contributes one number
            vector.append(average(pixel_tuple))
        vectors.append(vector)
        norms.append(linalg.norm(vector, 2))
    a, b = vectors
    a_norm, b_norm = norms
    # Convert to arrays and divide by the L2 norms, so the dot product is the cosine
    res = dot(array(a) / a_norm, array(b) / b_norm)
    return res

image1 = Image.open('010.jpg')
image2 = Image.open('011.jpg')
cosin = image_similarity_vectors_via_numpy(image1, image2)
print('Image cosine similarity:', cosin)
2. Computing Image Similarity with Hash Algorithms
Perceptual hashing is an umbrella term for a family of algorithms, including aHash, pHash, and dHash. As the name suggests, a perceptual hash is not computed in a strict cryptographic sense but in a more relative way, because "similar or not" is itself a relative judgement.
Comparing the variants:
aHash: average hash. Fast, but often not very accurate.
pHash: perceptual hash. More accurate, but slower.
dHash: difference hash. Quite accurate, and also very fast.
For the average, difference, and perceptual hash algorithms, a smaller score means higher similarity; the score ranges from 0 to 64 and is the Hamming distance, i.e. how many of the 64 hash bits differ. The three-channel and single-channel histogram scores range from 0 to 1, where larger values mean higher similarity.
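The Hamming-distance scoring works like this in miniature (the 8-bit hashes here are invented for illustration; real aHash/dHash/pHash values are 64 bits):

```python
# Toy 8-bit hashes, invented for illustration; real hashes are 64 bits
h1 = '10110100'
h2 = '10010101'

distance = sum(c1 != c2 for c1, c2 in zip(h1, h2))  # Hamming distance: count differing bits
similarity = 1 - distance / len(h1)                 # map distance into [0, 1]
print(distance, similarity)  # 2 0.75
```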
import cv2
import numpy as np
from PIL import Image
import requests
from io import BytesIO
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

def aHash(img):
    # Average hash: shrink to 8x8, then compare each grey pixel to the mean
    img = cv2.resize(img, (8, 8))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    s = 0
    hash_str = ''
    for i in range(8):
        for j in range(8):
            s = s + gray[i, j]
    avg = s / 64
    for i in range(8):
        for j in range(8):
            if gray[i, j] > avg:
                hash_str = hash_str + '1'
            else:
                hash_str = hash_str + '0'
    return hash_str

def dHash(img):
    # Difference hash: shrink to 9x8, then compare each pixel to its right neighbour
    img = cv2.resize(img, (9, 8))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    hash_str = ''
    for i in range(8):
        for j in range(8):
            if gray[i, j] > gray[i, j + 1]:
                hash_str = hash_str + '1'
            else:
                hash_str = hash_str + '0'
    return hash_str

def pHash(img):
    # Perceptual hash: 32x32 DCT, keep the low-frequency top-left 8x8 block
    img = cv2.resize(img, (32, 32))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dct = cv2.dct(np.float32(gray))
    dct_roi = dct[0:8, 0:8]
    hash_list = []
    avg = np.mean(dct_roi)
    for i in range(dct_roi.shape[0]):
        for j in range(dct_roi.shape[1]):
            if dct_roi[i, j] > avg:
                hash_list.append(1)
            else:
                hash_list.append(0)
    return hash_list

def calculate(image1, image2):
    # Single-channel histogram similarity, in [0, 1]
    hist1 = cv2.calcHist([image1], [0], None, [256], [0.0, 255.0])
    hist2 = cv2.calcHist([image2], [0], None, [256], [0.0, 255.0])
    degree = 0
    for i in range(len(hist1)):
        if hist1[i] != hist2[i]:
            degree = degree + \
                (1 - abs(hist1[i] - hist2[i]) / max(hist1[i], hist2[i]))
        else:
            degree = degree + 1
    degree = degree / len(hist1)
    return degree

def classify_hist_with_split(image1, image2, size=(256, 256)):
    # Split into B, G, R channels and average the per-channel histogram similarity
    image1 = cv2.resize(image1, size)
    image2 = cv2.resize(image2, size)
    sub_image1 = cv2.split(image1)
    sub_image2 = cv2.split(image2)
    sub_data = 0
    for im1, im2 in zip(sub_image1, sub_image2):
        sub_data += calculate(im1, im2)
    sub_data = sub_data / 3
    return sub_data

def cmpHash(hash1, hash2):
    # Hamming distance between two hashes; -1 if the lengths differ
    n = 0
    if len(hash1) != len(hash2):
        return -1
    for i in range(len(hash1)):
        if hash1[i] != hash2[i]:
            n = n + 1
    return n

def getImageByUrl(url):
    # Download an image from a URL into a PIL Image
    # (verify=False disables TLS verification; only suitable for quick tests)
    html = requests.get(url, verify=False)
    image = Image.open(BytesIO(html.content))
    return image

def PILImageToCV():
    # PIL Image -> OpenCV ndarray (RGB -> BGR)
    path = "/Users/waldenz/Documents/Work/doc/TestImages/t3.png"
    img = Image.open(path)
    plt.subplot(121)
    plt.imshow(img)
    print(isinstance(img, np.ndarray))
    img = cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)
    print(isinstance(img, np.ndarray))
    plt.subplot(122)
    plt.imshow(img)
    plt.show()

def CVImageToPIL():
    # OpenCV ndarray -> PIL Image (BGR -> RGB)
    path = "/Users/waldenz/Documents/Work/doc/TestImages/t3.png"
    img = cv2.imread(path)
    plt.subplot(121)
    plt.imshow(img)
    img2 = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.subplot(122)
    plt.imshow(img2)
    plt.show()

def bytes_to_cvimage(filebytes):
    # File-like bytes -> OpenCV ndarray
    image = Image.open(filebytes)
    img = cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)
    return img

def runAllImageSimilaryFun(para1, para2):
    # Accept either URLs or local file paths, then run every similarity measure
    if para1.startswith("http"):
        img1 = getImageByUrl(para1)
        img1 = cv2.cvtColor(np.asarray(img1), cv2.COLOR_RGB2BGR)
        img2 = getImageByUrl(para2)
        img2 = cv2.cvtColor(np.asarray(img2), cv2.COLOR_RGB2BGR)
    else:
        img1 = cv2.imread(para1)
        img2 = cv2.imread(para2)

    hash1 = aHash(img1)
    hash2 = aHash(img2)
    n1 = cmpHash(hash1, hash2)
    print('aHash (average hash) Hamming distance:', n1)

    hash1 = dHash(img1)
    hash2 = dHash(img2)
    n2 = cmpHash(hash1, hash2)
    print('dHash (difference hash) Hamming distance:', n2)

    hash1 = pHash(img1)
    hash2 = pHash(img2)
    n3 = cmpHash(hash1, hash2)
    print('pHash (perceptual hash) Hamming distance:', n3)

    n4 = classify_hist_with_split(img1, img2)
    print('Three-channel histogram similarity:', n4)

    n5 = calculate(img1, img2)
    print("Single-channel histogram similarity:", n5)

    print("%d %d %d %.2f %.2f " % (n1, n2, n3, round(n4[0], 2), n5[0]))
    print("%.2f %.2f %.2f %.2f %.2f " % (1 - float(n1 / 64), 1 - float(n2 / 64),
                                         1 - float(n3 / 64), round(n4[0], 2), n5[0]))

    plt.subplot(121)
    plt.imshow(Image.fromarray(cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)))
    plt.subplot(122)
    plt.imshow(Image.fromarray(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)))
    plt.show()

if __name__ == "__main__":
    p1 = "https://ww3.sinaimg.cn/bmiddle/007INInDly1g336j2zziwj30su0g848w.jpg"
    p2 = "https://ww2.sinaimg.cn/bmiddle/007INInDly1g336j10d32j30vd0hnam6.jpg"
    runAllImageSimilaryFun(p1, p2)
3. Computing Image Similarity with Histograms
A histogram comparison only looks at the global distribution of colours and cannot analyse local colour at all; converting the same image to greyscale makes the histogram gap even larger. For greyscale images, one remedy is to divide the image into equal blocks and compute the similarity block by block.
from PIL import Image

def make_regalur_image(img, size=(64, 64)):
    # Normalise size and colour mode so the histograms are comparable
    gray_image = img.resize(size).convert('RGB')
    return gray_image

def hist_similar(lh, rh):
    assert len(lh) == len(rh)
    # Per-bin similarity: 1 when bins are equal, otherwise 1 minus the relative difference
    hist = sum(1 - (0 if l == r else float(abs(l - r)) / max(l, r))
               for l, r in zip(lh, rh)) / len(lh)
    return hist

def calc_similar(li, ri):
    calc_sim = hist_similar(li.histogram(), ri.histogram())
    return calc_sim

if __name__ == '__main__':
    image1 = Image.open('123.jpg')
    image1 = make_regalur_image(image1)
    image2 = Image.open('456.jpg')
    image2 = make_regalur_image(image2)
    print("Histogram similarity:", calc_similar(image1, image2))
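The block-splitting remedy for greyscale images mentioned above is not implemented in the snippet; here is a minimal sketch. The 4x4 grid and 64x64 thumbnail size are arbitrary choices, and `block_hist_similarity` is a name invented here, not a library function:

```python
from PIL import Image

def block_hist_similarity(img1, img2, blocks=4, size=(64, 64)):
    # Compare greyscale histograms block by block instead of globally,
    # so local differences are not averaged away
    img1 = img1.resize(size).convert('L')
    img2 = img2.resize(size).convert('L')
    w, h = size[0] // blocks, size[1] // blocks
    scores = []
    for bx in range(blocks):
        for by in range(blocks):
            box = (bx * w, by * h, (bx + 1) * w, (by + 1) * h)
            lh = img1.crop(box).histogram()
            rh = img2.crop(box).histogram()
            # Same per-bin rule as hist_similar above
            scores.append(sum(1 - (0 if l == r else abs(l - r) / max(l, r))
                              for l, r in zip(lh, rh)) / len(lh))
    return sum(scores) / len(scores)

img = Image.new('L', (64, 64), color=128)
print(block_hist_similarity(img, img))  # identical images score 1.0
```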
4. Computing Image Similarity with SSIM (Structural Similarity)
SSIM is a full-reference image quality metric that measures similarity along three axes: luminance, contrast, and structure. SSIM lies in [0, 1], and larger values mean less distortion. In practice, a sliding window divides the image into N blocks; to account for the window shape, the mean, variance, and covariance of each window are computed with Gaussian weighting, the structural similarity SSIM of each corresponding pair of blocks is computed, and the average over all windows is taken as the structural similarity of the two images (mean SSIM).
import numpy as np
from imageio import imread  # scipy.misc.imread was removed in SciPy 1.2
# compare_ssim moved in scikit-image 0.16+; older versions: from skimage.measure import compare_ssim
from skimage.metrics import structural_similarity as compare_ssim

img1 = imread('../dataset/100002.png')
img2 = imread('../dataset/100001.png')
# Note: np.resize tiles or truncates the data to fit the new shape;
# it does not rescale the image the way cv2.resize would
img2 = np.resize(img2, (img1.shape[0], img1.shape[1], img1.shape[2]))
print(img1.shape)
print(img2.shape)
# Newer scikit-image versions use channel_axis=-1 instead of multichannel=True
ssim = compare_ssim(img1, img2, multichannel=True)
print(ssim)
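The per-window formula that the library averages can also be written out directly. Below is a sketch of the global (single-window) version with the usual constants C1 = (0.01L)^2 and C2 = (0.03L)^2, where L is the data range; `global_ssim` is a name chosen here, not a library function:

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    # Single-window SSIM from the standard formula; the library version instead
    # averages this quantity over Gaussian-weighted local windows
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    C1 = (0.01 * data_range) ** 2
    C2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) / \
           ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64))
print(global_ssim(img, img))  # an image compared with itself scores 1
```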
5. Computing Image Similarity from Mutual Information
The mutual information between two images can characterise their similarity reasonably well when the images have the same size. In most cases, however, the sizes differ, and resizing one image to match the other discards much of the original information, so the method is hard to control. Practical experiments confirm that it is indeed difficult to use reliably.
import numpy as np
from sklearn import metrics as mr
from imageio import imread  # scipy.misc.imread was removed in SciPy 1.2

img1 = imread('1.jpg')
img2 = imread('2.jpg')
# np.resize tiles or truncates data to match shapes; it does not rescale the image
img2 = np.resize(img2, (img1.shape[0], img1.shape[1], img1.shape[2]))
# Flatten both images so each pixel becomes one label for mutual_info_score
img1 = np.reshape(img1, -1)
img2 = np.reshape(img2, -1)
print(img2.shape)
print(img1.shape)
mutual_infor = mr.mutual_info_score(img1, img2)
print(mutual_infor)
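What `mutual_info_score` computes can also be sketched from the joint histogram directly; `mutual_information` below is a helper written for this sketch, not part of sklearn:

```python
import numpy as np

def mutual_information(x, y, bins=256):
    # MI = sum over bins of p(i,j) * log( p(i,j) / (p(i) * p(j)) ), in nats
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)   # marginal over x
    py = pxy.sum(axis=0)   # marginal over y
    nz = pxy > 0           # skip empty bins to avoid log(0)
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz]))

rng = np.random.default_rng(0)
x = rng.integers(0, 256, 10000)
y = rng.integers(0, 256, 10000)
print(mutual_information(x, x))  # high: a signal shares all information with itself
print(mutual_information(x, y))  # much lower for unrelated noise
```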