
Computer Vision: Custom Object Detectors

Published: 2023/11/29


1. Template matching

Run: python template_matching.py --source 3.jpg --template 2.jpg

import argparse
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-s", "--source", required=True, help="Path to the source image")
ap.add_argument("-t", "--template", required=True, help="Path to the template image")
args = vars(ap.parse_args())

source = cv2.imread(args["source"])
template = cv2.imread(args["template"])
(tempH, tempW) = template.shape[:2]

# Arguments: source image, template image, matching method
result = cv2.matchTemplate(source, template, cv2.TM_CCOEFF)
# For TM_CCOEFF the best match is the maximum, so take its (x, y) location
(minVal, maxVal, minLoc, (x, y)) = cv2.minMaxLoc(result)
# Draw the bounding box of the match on the source image
cv2.rectangle(source, (x, y), (x + tempW, y + tempH), (0, 255, 0), 2)
# Show the result (added so the script produces visible output)
cv2.imshow("Source", source)
cv2.waitKey(0)
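To make the scoring behind cv2.TM_CCOEFF concrete, here is a minimal pure-Python sketch of the same idea on a tiny grid: subtract the mean from the template and from each window, then sum the products. The function name match_template_ccoeff and the toy data are assumptions for illustration; this is not how OpenCV implements it internally.

```python
def match_template_ccoeff(source, template):
    """Return (best_score, (x, y)) using a mean-subtracted correlation score."""
    th, tw = len(template), len(template[0])
    sh, sw = len(source), len(source[0])
    t_mean = sum(map(sum, template)) / float(th * tw)
    best_score, best_loc = None, None
    # Slide the template over every position where it fits entirely
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            window = [row[x:x + tw] for row in source[y:y + th]]
            w_mean = sum(map(sum, window)) / float(th * tw)
            score = sum(
                (template[j][i] - t_mean) * (window[j][i] - w_mean)
                for j in range(th) for i in range(tw)
            )
            # Higher is better for this score, so keep the maximum
            if best_score is None or score > best_score:
                best_score, best_loc = score, (x, y)
    return best_score, best_loc


# Toy 4x5 "image" with the 2x2 template embedded at x=2, y=1
source = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [1, 9],
    [9, 1],
]
score, (x, y) = match_template_ccoeff(source, template)
print(score, (x, y))
```

The top-left corner returned here plays the same role as the maximum location that cv2.minMaxLoc reports in the script above.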

2. Training your own object detector

Purpose: use the CALTECH-101 dataset and its .mat annotation files to train an object detector, which dlib saves as a linear SVM.

train_detector.py

Run: python train_detector.py --class stop_sign_images --annotations stop_sign_annotations \
    --output output/stop_sign_detector.svm

from __future__ import print_function
from imutils import paths
from scipy.io import loadmat
from skimage import io
import argparse
import dlib

ap = argparse.ArgumentParser()
# Directory of images for the CALTECH-101 class to train on
ap.add_argument("-c", "--class", required=True, help="Path to the CALTECH-101 class images")
# Directory of bounding-box annotations (.mat files) for that class
ap.add_argument("-a", "--annotations", required=True, help="Path to the CALTECH-101 class annotations")
# Where to write the trained detector
ap.add_argument("-o", "--output", required=True, help="Path to the output detector")
args = vars(ap.parse_args())

print("[INFO] gathering images and bounding boxes...")
options = dlib.simple_object_detector_training_options()
images = []
boxes = []

# Loop over the images to train on
for imagePath in paths.list_images(args["class"]):
    # Extract the image ID from the file name, then use it to load
    # the matching annotation (the bounding boxes) from disk
    imageID = imagePath[imagePath.rfind("/") + 1:].split("_")[1]
    imageID = imageID.replace(".jpg", "")
    p = "{}/annotation_{}.mat".format(args["annotations"], imageID)
    annotations = loadmat(p)["box_coord"]

    # Build dlib rectangles from the (top, bottom, left, right) rows.
    # int() replaces the Python 2-only long().
    bb = [dlib.rectangle(left=int(x), top=int(y), right=int(w), bottom=int(h))
          for (y, h, x, w) in annotations]
    boxes.append(bb)
    # dlib needs both the images and their boxes to train the detector
    images.append(io.imread(imagePath))

# The listing in the source stops here. Training presumably follows, e.g.:
# detector = dlib.train_simple_object_detector(images, boxes, options)
# detector.save(args["output"])
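The two pieces of bookkeeping in the loop above, deriving the annotation file name from the image path and reordering a box_coord row, can be tried without dlib or the dataset. This is a small sketch; the helper names image_id_from_path and box_coord_to_ltrb and the sample values are made up for illustration, and the (top, bottom, left, right) order is taken from how the listing unpacks each row.

```python
import os


def image_id_from_path(image_path):
    """'.../image_0007.jpg' -> '0007' (file naming as used in the listing)."""
    name = os.path.basename(image_path)      # "image_0007.jpg"
    stem = os.path.splitext(name)[0]         # "image_0007"
    return stem.split("_")[1]                # "0007"


def box_coord_to_ltrb(box):
    """Reorder a (top, bottom, left, right) row into (left, top, right, bottom)."""
    (top, bottom, left, right) = box
    return (int(left), int(top), int(right), int(bottom))


image_id = image_id_from_path("stop_sign_images/image_0007.jpg")
annotation_path = "{}/annotation_{}.mat".format("stop_sign_annotations", image_id)
ltrb = box_coord_to_ltrb((20.0, 180.0, 35.0, 210.0))
print(annotation_path, ltrb)
```

Using os.path.basename instead of rfind("/") also keeps the ID extraction working on Windows-style paths.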

test_detector.py

Run: python test_detector.py --detector output/stop_sign_detector.svm --testing stop_sign_testing

Purpose: test how well the custom object detector works

from imutils import paths
import argparse
import dlib
import cv2

ap = argparse.ArgumentParser()
# The trained linear SVM detector
ap.add_argument("-d", "--detector", required=True, help="Path to trained object detector")
# Directory of stop sign images to test on
ap.add_argument("-t", "--testing", required=True, help="Path to directory of testing images")
args = vars(ap.parse_args())

detector = dlib.simple_object_detector(args["detector"])

# Loop over the testing images
for testingPath in paths.list_images(args["testing"]):
    image = cv2.imread(testingPath)
    # dlib expects RGB, OpenCV loads BGR
    boxes = detector(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    for b in boxes:
        # w and h hold the right and bottom edges here, not a width and height
        (x, y, w, h) = (b.left(), b.top(), b.right(), b.bottom())
        cv2.rectangle(image, (x, y), (w, h), (0, 255, 0), 2)

    cv2.imshow("Image", image)
    cv2.waitKey(0)

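3.1 Image pyramid

Purpose: repeatedly shrink the image by a fixed ratio and return each scaled copy.

Key point: the yield keyword hands back a value without ending the function; think of it as returning results lazily, one at a time.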
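A minimal sketch of that behaviour, using a made-up generator (halve_until is not part of the original code): nothing inside the function runs until a value is requested, and each yield pauses the function until the next request.

```python
def halve_until(value, minimum):
    """Yield value, value // 2, value // 4, ... while the result stays >= minimum."""
    print("generator body started")
    yield value
    while True:
        value = value // 2
        if value < minimum:
            break
        yield value


gen = halve_until(64, 10)   # no output yet: the body has not started running
first = next(gen)           # runs up to the first yield
rest = list(gen)            # drains the remaining values
print(first, rest)
```

The pyramid function follows the same pattern: the original image is yielded first, then one smaller layer per loop iteration until the size limit ends the loop.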

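helpers.py:

import imutils

# Image pyramid generator.
# image: source image; scale: shrink factor per layer;
# minSize: smallest (width, height) a layer may have
def pyramid(image, scale=1.5, minSize=(30, 30)):
    # The first layer of the pyramid is the original image
    yield image

    while True:
        # Shrink the width; imutils.resize keeps the aspect ratio
        w = int(image.shape[1] / scale)
        image = imutils.resize(image, width=w)

        # Stop once the layer is smaller than the minimum size
        if image.shape[0] < minSize[1] or image.shape[1] < minSize[0]:
            break

        yield image

# Sliding window generator.
# image: image to scan; stepSize: pixels skipped per step;
# windowSize: (width, height) of the window
def sliding_window(image, stepSize, windowSize):
    # range() replaces the Python 2-only xrange()
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])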

test_pyramid.py

Example: python test_pyramid.py --image florida_trip.png --scale 1.5

# Using the pyramid function
from pyimagesearch.object_detection.helpers import pyramid
import argparse
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="path to the input image")
# Factor by which the image shrinks at each layer
ap.add_argument("-s", "--scale", type=float, default=1.5, help="scale factor size")
args = vars(ap.parse_args())

image = cv2.imread(args["image"])

for (i, layer) in enumerate(pyramid(image, scale=args["scale"])):
    cv2.imshow("Layer {}".format(i + 1), layer)
    cv2.waitKey(0)
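The layer sizes the pyramid produces can be previewed without loading an image. The sketch below mirrors the arithmetic of the pyramid function; it assumes imutils.resize derives the new height as int(height * new_width / old_width), and the name pyramid_sizes is made up for illustration.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=(30, 30)):
    """List the (width, height) of every pyramid layer, original first."""
    sizes = [(width, height)]
    while True:
        new_width = int(width / scale)
        # Assumed aspect-preserving resize: height scales by the same ratio
        ratio = new_width / float(width)
        height = int(height * ratio)
        width = new_width
        if height < min_size[1] or width < min_size[0]:
            break
        sizes.append((width, height))
    return sizes


layers = pyramid_sizes(300, 200)
print(layers)
```

With a 300x200 input and the defaults, the loop stops when the next layer's height would drop below 30 pixels.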

3.2 Sliding window

test_sliding_window.py

Purpose: combine the image pyramid with the sliding window

Run: python test_sliding_window.py --image florida_trip.png --width 64 --height 64

from pyimagesearch.object_detection.helpers import sliding_window
from pyimagesearch.object_detection.helpers import pyramid
import argparse
import time
import cv2

ap = argparse.ArgumentParser()
# Image to process
ap.add_argument("-i", "--image", required=True, help="path to the input image")
# Width of the sliding window
ap.add_argument("-w", "--width", type=int, help="width of sliding window")
# Height of the sliding window
ap.add_argument("-t", "--height", type=int, help="height of sliding window")
# Resize factor of the image pyramid
ap.add_argument("-s", "--scale", type=float, default=1.5, help="scale factor size")
args = vars(ap.parse_args())

image = cv2.imread(args["image"])
(winW, winH) = (args["width"], args["height"])

for layer in pyramid(image, scale=args["scale"]):
    for (x, y, window) in sliding_window(layer, stepSize=32, windowSize=(winW, winH)):
        # Skip windows cut off by the image border
        if window.shape[0] != winH or window.shape[1] != winW:
            continue

        # Draw the current window on a copy of the layer
        clone = layer.copy()
        cv2.rectangle(clone, (x, y), (x + winW, y + winH), (0, 255, 0), 2)
        cv2.imshow("Window", clone)
        cv2.waitKey(1)
        time.sleep(0.025)
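To get a feel for how much work the nested loops do, the number of full-size windows per layer can be counted with plain arithmetic. This is a sketch only; count_full_windows is a made-up helper that applies the same "skip windows cut off by the border" rule as the script.

```python
def count_full_windows(layer_w, layer_h, step, win_w, win_h):
    """Count sliding-window positions whose window lies fully inside the layer."""
    count = 0
    for y in range(0, layer_h, step):
        for x in range(0, layer_w, step):
            if x + win_w <= layer_w and y + win_h <= layer_h:
                count += 1
    return count


# 300x200 layer, step of 32 pixels, 64x64 window (values chosen for illustration)
n = count_full_windows(300, 200, 32, 64, 64)
print(n)
```

Halving the step size roughly quadruples this count, which is why the step size is one of the main speed and accuracy trade-offs of a sliding-window detector.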

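4.1 The six steps of building a custom detection framework

The problem with Haar cascades (the Viola-Jones detector): when detecting faces, people, or any other object with OpenCV, you end up spending a lot of time tuning the cv2.detectMultiScale parameters.

The Viola-Jones detector is not our only option for object detection. We can also use keypoint-based detection, local invariant descriptors, and bag-of-visual-words models.

The six-step framework:

Step 1: Sample P positive examples from the training data of the object you want to detect and extract HOG descriptors from them. In practice, take the bounding box of the object in each training image, compute HOG features over that ROI, and use those features as the positive examples.

Step 2: Sample negative examples from a negative training set that contains none of the objects to detect, and extract HOG descriptors from them as well. In practice the negatives far outnumber the positives.

Step 3: Train a linear support vector machine on the positive and negative samples.

Step 4: Apply hard negative mining. For every image in the negative training set, and every scale of that image (the image pyramid), slide a window across the image. Every window the classifier wrongly reports as the object is a false positive and is recorded. This step reduces the number of false positives in the final detector.

Step 5: Take the false positives found during hard negative mining, sort them by confidence (probability), and retrain the classifier with them as additional negative samples.

Step 6: The classifier is now trained and can be applied to the test set. As in step 4, apply the sliding window to every image in the test set at every scale. In each window, extract the HOG descriptor and apply the classifier. If the classifier detects the object with a high enough probability, record the window's bounding box. After the image has been scanned, apply non-maximum suppression to remove redundant and overlapping bounding boxes.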
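Steps 4 and 5 can be sketched with a toy scorer standing in for the trained SVM. Everything here is illustrative: toy_score is a made-up stand-in for the classifier's probability output, and the "windows" are short lists of numbers rather than HOG vectors.

```python
def mine_hard_negatives(negative_windows, score_fn, min_prob):
    """Return (prob, window) pairs the classifier wrongly accepts, most confident first."""
    scored = [(score_fn(w), w) for w in negative_windows]
    # Every window comes from a negative image, so any acceptance is a false positive
    hard = [(p, w) for (p, w) in scored if p > min_prob]
    hard.sort(key=lambda pw: pw[0], reverse=True)
    return hard


def toy_score(window):
    """Hypothetical classifier: mean of the window values scaled into [0, 1]."""
    return sum(window) / (10.0 * len(window))


negatives = [[1, 2, 1], [7, 8, 9], [5, 5, 5], [9, 8, 10]]
hard = mine_hard_negatives(negatives, toy_score, min_prob=0.7)
hard_windows = [w for (p, w) in hard]
print(hard_windows)
```

In the real pipeline the retained windows are appended to the negative set with label -1 before the SVM is retrained.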

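Extensions and other approaches:

The HOG + linear SVM approach to object detection is simple and easy to understand. The method used by dlib differs slightly from the standard six-step framework.

The first change concerns the HOG sliding window and non-maximum suppression. Instead of extracting features from separate positive and negative datasets, dlib optimizes the number of mistakes the HOG sliding window makes on each training image. This means the entire training image is used to (1) extract the positive examples and (2) extract negative samples from all other regions of the image. This removes the need for a negative training set and for hard negative mining, and it is one of the reasons the Max-Margin Object Detection method is so fast.

Second, dlib also takes non-maximum suppression into account during the training phase itself. We normally apply NMS only to obtain the final bounding boxes, but here NMS is used while training. This substantially reduces false positives and again lessens the need for hard negative mining.

Finally, dlib uses a highly accurate algorithm to find the optimal hyperplane separating the two classes. The method achieves higher accuracy (with a lower false-positive rate) than many other state-of-the-art object detectors.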

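5. Preparing the experiment and training data

Full directory structure of the framework: (the conf directory, which holds the JSON files, and the datasets directory, which holds the data sets, sit alongside the pyimagesearch directory)

Experiment configuration: a JSON configuration file

Advantages of a JSON configuration file:

1. There is no need to define an endless list of command-line arguments; the only thing to supply is the path to the configuration file.

2. The configuration file gathers all related parameters in one place.

3. We never have to remember which command-line options each Python script takes, because every option is defined in the configuration file.

4. Each object detector we create can have its own configuration file. This is a big advantage: a detector is defined by editing a single file.

cars.json:

{
    #######
    # DATASET PATHS
    #######
    # Path to the "positive" images, the base data to train on
    "image_dataset": "datasets/caltech101/101_ObjectCategories/car_side",
    # Directory holding the bounding box of each image in image_dataset
    "image_annotations": "datasets/caltech101/Annotations/car_side",
    # "Negative" images that contain no examples of the object to detect
    "image_distractions": "datasets/sceneclass13"
}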

explore_dims.py

Purpose: read the .mat annotation files of the CALTECH-101 data, loop over every bounding box, and derive a size for the sliding window.

Topics: 1. Working with the CALTECH-101 dataset and reading its .mat files

2. Looping over the files in a directory with glob.glob()

Run: python explore_dims.py --conf conf/cars.json

from __future__ import print_function
from pyimagesearch.utils import Conf
from scipy import io
import numpy as np
import argparse
import glob

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
args = vars(ap.parse_args())

# Load the configuration file
conf = Conf(args["conf"])
widths = []   # widths of the annotated objects
heights = []  # heights of the annotated objects

# Loop over the annotation files
for p in glob.glob(conf["image_annotations"] + "/*.mat"):
    # Load the bounding box of the annotation and record its width and height
    (y, h, x, w) = io.loadmat(p)["box_coord"][0]
    widths.append(w - x)
    heights.append(h - y)

# Compute the average width and height
(avgWidth, avgHeight) = (np.mean(widths), np.mean(heights))
print("[INFO] avg. width: {:.2f}".format(avgWidth))
print("[INFO] avg. height: {:.2f}".format(avgHeight))
print("[INFO] aspect ratio: {:.2f}".format(avgWidth / avgHeight))
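The averaging itself needs nothing beyond plain Python, which makes it easy to check by hand. The boxes below are invented values in the (top, bottom, left, right) order the script unpacks; average_dims is a hypothetical helper, not part of the original code.

```python
def average_dims(boxes):
    """Return (average width, average height, aspect ratio) of the boxes."""
    widths = [right - left for (top, bottom, left, right) in boxes]
    heights = [bottom - top for (top, bottom, left, right) in boxes]
    avg_w = sum(widths) / float(len(widths))
    avg_h = sum(heights) / float(len(heights))
    return avg_w, avg_h, avg_w / avg_h


# Two made-up annotations: both 120 pixels wide and 40 pixels tall
boxes = [(10, 50, 20, 140), (5, 45, 10, 130)]
avg_w, avg_h, aspect = average_dims(boxes)
print(avg_w, avg_h, aspect)
```

An aspect ratio near 3:1 is consistent with a window such as the 96x32 window_dim used in the configuration later in this article: the window keeps the average shape of the object while staying small enough to scan quickly.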

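conf.py: parsing the configuration file

Purpose: a class that parses cars.json
What the Python special method __getitem__ does:

If a class defines __getitem__(), an instance of it (call it P) supports lookups of the form P[key]. Evaluating P[key] calls the class's __getitem__() method.

import commentjson as json

class Conf:
    def __init__(self, confPath):
        # Load the JSON file (commentjson tolerates the # comments)
        # and expose its keys as instance attributes
        with open(confPath) as f:
            conf = json.loads(f.read())
        self.__dict__.update(conf)

    def __getitem__(self, k):
        # conf["key"] returns the value, or None if the key is missing
        return self.__dict__.get(k, None)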
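The same pattern can be tried with only the standard library. Note the swap: this sketch uses the built-in json module instead of commentjson, so the file it reads must not contain # comments. The temporary file and its two keys are made up for the demonstration.

```python
import json
import os
import tempfile


class Conf(object):
    def __init__(self, conf_path):
        # Standard json here, so the file may not contain comments
        with open(conf_path) as f:
            self.__dict__.update(json.load(f))

    def __getitem__(self, key):
        # Missing keys return None instead of raising KeyError
        return self.__dict__.get(key, None)


# Write a throwaway configuration file, load it, then clean up
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"window_dim": [96, 32], "C": 0.01}, f)
    tmp_path = f.name

conf = Conf(tmp_path)
os.remove(tmp_path)
print(conf["window_dim"], conf["C"], conf["no_such_key"])
```

Returning None for unknown keys is convenient but also hides typos in key names, so a misspelled option fails later and less obviously than a KeyError would.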

6. Building the HOG descriptor

cars.json:

{
    #######
    # DATASET PATHS
    #######
    "image_dataset": "datasets/caltech101/101_ObjectCategories/car_side",
    "image_annotations": "datasets/caltech101/Annotations/car_side",
    "image_distractions": "datasets/sceneclass13",

    #######
    # FEATURE EXTRACTION
    #######
    "features_path": "output/cars/car_features.hdf5",
    "percent_gt_images": 0.5,
    "offset": 5,
    "use_flip": true,
    "num_distraction_images": 500,
    "num_distractions_per_image": 10,

    #######
    # HISTOGRAM OF ORIENTED GRADIENTS DESCRIPTOR
    #######
    "orientations": 9,
    # pixels_per_cell must divide the sliding-window dimensions evenly
    "pixels_per_cell": [4, 4],
    "cells_per_block": [2, 2],
    "normalize": true,

    #######
    # OBJECT DETECTOR (defines the sliding-window size)
    #######
    "window_step": 4,
    "overlap_thresh": 0.3,
    "pyramid_scale": 1.5,
    "window_dim": [96, 32],
    "min_probability": 0.7
}
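The divisibility requirement and the resulting descriptor size can be checked with a few lines of arithmetic. This sketch assumes the usual HOG block layout in which blocks slide one cell at a time (as in skimage.feature.hog); hog_feature_length is a made-up helper, and the HOG class used by this article is not shown in the source, so treat the number as an estimate.

```python
def hog_feature_length(window_dim, pixels_per_cell, cells_per_block, orientations):
    """Length of a HOG vector for a window, assuming blocks overlap with a one-cell stride."""
    (win_w, win_h) = window_dim
    (cell_w, cell_h) = pixels_per_cell
    if win_w % cell_w != 0 or win_h % cell_h != 0:
        raise ValueError("pixels_per_cell must divide window_dim evenly")
    cells_x, cells_y = win_w // cell_w, win_h // cell_h
    blocks_x = cells_x - cells_per_block[0] + 1
    blocks_y = cells_y - cells_per_block[1] + 1
    return blocks_x * blocks_y * cells_per_block[0] * cells_per_block[1] * orientations


# Values from cars.json: 96x32 window, 4x4 cells, 2x2 blocks, 9 orientations
length = hog_feature_length((96, 32), (4, 4), (2, 2), 9)
print(length)
```

Under these assumptions each 96x32 window becomes a vector of a few thousand values, which is the per-sample row width stored in the HDF5 file (plus one column for the label).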

dataset.py

Purpose: helper functions for reading and writing an HDF5 database with h5py

Topics: 1. Using an HDF5 database through h5py

Question: what do the arguments of create_dataset() mean?

Argument 1: the dataset name; argument 2: the dataset shape; argument 3: the data type

Background: an HDF5 file is a container for two kinds of objects, datasets and groups. A dataset is an array-like collection of data, much like a NumPy array. A group is a folder-like container that behaves like a Python dictionary, with keys and values. A group can hold datasets or other groups, and each key is the name of a group member.

import numpy as np
import h5py

# Write feature vectors and labels to an HDF5 dataset on disk.
# data: list of feature vectors to write; labels: the label of each feature vector;
# path: where the HDF5 file is stored on disk; datasetName: name of the dataset
# inside the file; writeMethod: file mode ("w" overwrites, "a" appends)
def dump_dataset(data, labels, path, datasetName, writeMethod="w"):
    db = h5py.File(path, writeMethod)
    # One row per sample: the label in column 0, the features after it
    dataset = db.create_dataset(datasetName, (len(data), len(data[0]) + 1), dtype="float")
    dataset[0:len(data)] = np.c_[labels, data]
    db.close()

# Load the feature vectors and labels stored under datasetName
def load_dataset(path, datasetName):
    db = h5py.File(path, "r")
    (labels, data) = (db[datasetName][:, 0], db[datasetName][:, 1:])
    db.close()
    return (data, labels)
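The row layout that dump_dataset writes and load_dataset reads back can be seen with NumPy alone, without touching HDF5. The labels and feature values below are made up.

```python
import numpy as np

labels = [1, -1]
data = [[0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6]]

# np.c_ places the labels in front of the features: one row per sample
table = np.c_[labels, data]

# Reading back: column 0 is the label, the remaining columns are the features
loaded_labels = table[:, 0]
loaded_data = table[:, 1:]
print(table.shape, loaded_labels)
```

Storing the label in the same float array as the features keeps the file to a single dataset, at the cost of labels coming back as floats (1.0 and -1.0).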

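helpers.py:

Purpose: return the ROI (the bounding rectangle) of the object in each image

import imutils
import cv2

def crop_ct101_bb(image, bb, padding=10, dstSize=(32, 32)):
    # CALTECH-101 boxes are unpacked as (top, bottom, left, right)
    (y, h, x, w) = bb
    # Pad the box, clamping the top-left corner at the image origin
    (x, y) = (max(x - padding, 0), max(y - padding, 0))
    roi = image[y:h + padding, x:w + padding]
    # Resize the ROI to the fixed (width, height) of the window
    roi = cv2.resize(roi, dstSize, interpolation=cv2.INTER_AREA)
    return roi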

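extract_features.py:

Purpose: extract HOG feature vectors from the images to provide training data for the SVC classifier

Topics: 1. Looping over files with the paths module from imutils

Questions: 1. What the progressbar module does:

    It creates an object that displays a progress bar.

    Meaning of the optional widgets:

    'Progress: ': the text shown in front of the bar

    Percentage(): shows the percentage completed

    Bar('#'): sets the character used to draw the bar

    ETA(): shows the estimated time remaining

    Timer(): shows the elapsed time

2. Detailed explanations of HOG

https://blog.csdn.net/zhazhiqiang/article/details/20221143

https://baike.baidu.com/item/HOG/9738560?fr=aladdin

3. What does random.sample do?

sample(seq, n) picks n elements from the sequence seq at random, without replacement.

4. What does random.choice do?

choice(seq) returns one random element of the sequence seq.

More from the random module:

1) random() returns a random float n with 0 <= n < 1.

2) getrandbits(n) returns an integer made of n random bits.
3) shuffle(seq) shuffles the sequence seq in place.

5. What does extract_patches_2d from sklearn.feature_extraction.image do? It cuts a 2D image into patches of a given size; when max_patches is set, it returns that many randomly sampled patches.

6. Warning seen: Default value of `block_norm`==`L1` is deprecated and will be changed to `L2-Hys` in v0.15. Open question: are fewer features produced than in Py?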
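The difference between random.sample and random.choice, as they are used to pick training and distraction images, shows up in a short experiment. The file names are invented, and a seeded random.Random instance is used only so the run is repeatable.

```python
import random

rng = random.Random(42)  # seeded generator so repeated runs pick the same items
image_paths = ["image_%04d.jpg" % i for i in range(10)]

# sample: half of the paths, each chosen at most once (no replacement)
subset = rng.sample(image_paths, int(len(image_paths) * 0.5))

# choice: one path per call, so repeated calls may return the same path again
picks = [rng.choice(image_paths) for _ in range(20)]

print(len(subset), len(set(subset)), len(set(picks)))
```

This is why the feature-extraction script can ask random.choice for 500 distraction images even if the directory holds fewer files, whereas random.sample raises ValueError when asked for more items than the sequence contains.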

Run: python extract_features.py --conf conf/cars.json

# import the necessary packages
from __future__ import print_function
from sklearn.feature_extraction.image import extract_patches_2d
from pyimagesearch.object_detection import helpers
from pyimagesearch.descriptors import HOG
from pyimagesearch.utils import dataset
from pyimagesearch.utils import Conf
from imutils import paths
from scipy import io
import numpy as np
import progressbar
import argparse
import random
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
args = vars(ap.parse_args())

# Load the configuration file
conf = Conf(args["conf"])

# Initialize the HOG descriptor
hog = HOG(orientations=conf["orientations"], pixelsPerCell=tuple(conf["pixels_per_cell"]),
    cellsPerBlock=tuple(conf["cells_per_block"]), normalize=conf["normalize"])
data = []
labels = []

# Randomly sample a share of the car images for training
trnPaths = list(paths.list_images(conf["image_dataset"]))
trnPaths = random.sample(trnPaths, int(len(trnPaths) * conf["percent_gt_images"]))
print("[INFO] describing training ROIs...")

widgets = ["Extracting: ", progressbar.Percentage(), " ", progressbar.Bar(), " ", progressbar.ETA()]
pbar = progressbar.ProgressBar(maxval=len(trnPaths), widgets=widgets).start()

# Loop over the training images
for (i, trnPath) in enumerate(trnPaths):
    image = cv2.imread(trnPath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Derive the image ID from the file name and load its bounding box
    imageID = trnPath[trnPath.rfind("_") + 1:].replace(".jpg", "")
    p = "{}/annotation_{}.mat".format(conf["image_annotations"], imageID)
    bb = io.loadmat(p)["box_coord"][0]
    roi = helpers.crop_ct101_bb(image, bb, padding=conf["offset"], dstSize=tuple(conf["window_dim"]))

    # Decide whether the horizontal flip of the ROI is used as extra training data
    rois = (roi, cv2.flip(roi, 1)) if conf["use_flip"] else (roi,)

    # Extract HOG features from each ROI and update the data and label lists
    for roi in rois:
        features = hog.describe(roi)
        data.append(features)
        labels.append(1)

    pbar.update(i)

pbar.finish()
dstPaths = list(paths.list_images(conf["image_distractions"]))
pbar = progressbar.ProgressBar(maxval=conf["num_distraction_images"], widgets=widgets).start()
print("[INFO] describing distraction ROIs...")

# Loop over the negative ("distraction") images
for i in np.arange(0, conf["num_distraction_images"]):
    image = cv2.imread(random.choice(dstPaths))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # extract_patches_2d expects (height, width) while window_dim is (width, height),
    # so the tuple is reversed here (the original passed it unreversed)
    patches = extract_patches_2d(image, tuple(conf["window_dim"][::-1]),
        max_patches=conf["num_distractions_per_image"])

    for patch in patches:
        features = hog.describe(patch)
        data.append(features)
        labels.append(-1)

    pbar.update(i)

pbar.finish()
print("[INFO] dumping features and labels to file...")
dataset.dump_dataset(data, labels, conf["features_path"], "features")

7. The initial training phase

cars.json:

{
    #######
    # DATASET PATHS
    #######
    "image_dataset": "datasets/caltech101/101_ObjectCategories/car_side",
    "image_annotations": "datasets/caltech101/Annotations/car_side",
    "image_distractions": "datasets/sceneclass13",

    #######
    # FEATURE EXTRACTION
    #######
    "features_path": "output/cars/car_features.hdf5",
    "percent_gt_images": 0.5,
    "offset": 5,
    "use_flip": true,
    "num_distraction_images": 500,
    "num_distractions_per_image": 10,

    #######
    # HISTOGRAM OF ORIENTED GRADIENTS DESCRIPTOR
    #######
    "orientations": 9,
    "pixels_per_cell": [4, 4],
    "cells_per_block": [2, 2],
    "normalize": true,

    #######
    # OBJECT DETECTOR
    #######
    "window_step": 4,
    "overlap_thresh": 0.3,
    "pyramid_scale": 1.5,
    "window_dim": [96, 32],
    "min_probability": 0.7,

    #######
    # LINEAR SVM
    #######
    # Where the trained classifier is stored
    "classifier_path": "output/cars/model.cpickle",
    "C": 0.01
}

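train_model.py

Purpose: train a linear SVC on the extracted HOG feature vectors

Questions: 1. Details of the sklearn module?

2. What is args["hard_negatives"] for? It is a flag: when it is greater than 0, the hard negatives are loaded and added to the training data.

3. What numpy.stack() does:

It joins a sequence of arrays along a new axis.

Argument 1: the sequence of arrays; argument 2: the axis along which to stack

4. What numpy.hstack() does:

It stacks arrays horizontally (column-wise); vstack() does the opposite and stacks them vertically (row-wise).

The tup argument can be a tuple, a list, or a NumPy array; the result is a NumPy array.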
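The three functions are easiest to tell apart by the shapes they return. The arrays below are toy values; the last lines mimic how train_model.py appends hard negatives to the existing features and labels.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stack0 = np.stack([a, b], axis=0)   # new leading axis: shape (2, 3)
stack1 = np.stack([a, b], axis=1)   # new trailing axis: shape (3, 2)
h = np.hstack([a, b])               # 1-D inputs are joined end to end: shape (6,)
v = np.vstack([a, b])               # each 1-D input becomes a row: shape (2, 3)

# Appending hard negatives, as in train_model.py (toy feature values)
data = np.array([[0.1, 0.2], [0.3, 0.4]])
labels = np.array([1.0, -1.0])
hard_data = np.array([[0.9, 0.8]])
hard_labels = np.array([-1.0])
data = np.vstack([data, hard_data])          # one more row of features
labels = np.hstack([labels, hard_labels])    # one more label
print(stack0.shape, stack1.shape, h.shape, v.shape, data.shape, labels.shape)
```

The features are a 2-D array (rows are samples), so they grow with vstack; the labels are 1-D, so hstack is the call that extends them.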

Run: python train_model.py --conf conf/cars.json

from __future__ import print_function
from pyimagesearch.utils import dataset
from pyimagesearch.utils import Conf
from sklearn.svm import SVC
import numpy as np
import argparse
import pickle

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
ap.add_argument("-n", "--hard-negatives", type=int, default=-1,
    help="flag indicating whether or not hard negatives should be used")
args = vars(ap.parse_args())

print("[INFO] loading dataset...")
conf = Conf(args["conf"])
# Grab the extracted feature vectors and labels
(data, labels) = dataset.load_dataset(conf["features_path"], "features")

if args["hard_negatives"] > 0:
    print("[INFO] loading hard negatives...")
    (hardData, hardLabels) = dataset.load_dataset(conf["features_path"], "hard_negatives")
    data = np.vstack([data, hardData])
    labels = np.hstack([labels, hardLabels])

print("[INFO] training classifier...")
model = SVC(kernel="linear", C=conf["C"], probability=True, random_state=42)
model.fit(data, labels)

# Dump the classifier to file. pickle replaces the Python 2-only cPickle,
# and the file is opened in binary mode because pickle produces bytes.
print("[INFO] dumping classifier...")
with open(conf["classifier_path"], "wb") as f:
    f.write(pickle.dumps(model))

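objectdetector.py:

Purpose: run the sliding window over every layer of the image pyramid and collect the bounding boxes whose probability exceeds the threshold.
Open question: changing the probability parameter did not change the number of boxes returned?
The pyramid function (3.1) and the sliding_window function (3.2) live in helpers.py.

# Relative import of helpers.py from the same package (the original had the typo "iimport helpers")
from . import helpers

class ObjectDetector:
    def __init__(self, model, desc):
        self.model = model
        self.desc = desc

    # image: the image to search; winDim: (width, height) of the sliding window
    def detect(self, image, winDim, winStep=4, pyramidScale=1.5, minProb=0.7):
        boxes = []
        probs = []

        # Loop over the layers of the image pyramid
        for layer in helpers.pyramid(image, scale=pyramidScale, minSize=winDim):
            # Ratio between the original image and the current layer
            scale = image.shape[0] / float(layer.shape[0])

            for (x, y, window) in helpers.sliding_window(layer, winStep, winDim):
                (winH, winW) = window.shape[:2]

                # Only classify windows that are not cut off by the border
                if winH == winDim[1] and winW == winDim[0]:
                    features = self.desc.describe(window).reshape(1, -1)
                    prob = self.model.predict_proba(features)[0][1]

                    if prob > minProb:
                        # Map the window back to coordinates in the original image
                        (startX, startY) = (int(scale * x), int(scale * y))
                        endX = int(startX + (scale * winW))
                        endY = int(startY + (scale * winH))
                        boxes.append((startX, startY, endX, endY))
                        probs.append(prob)

        return (boxes, probs)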
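The step that maps a window found on a shrunken pyramid layer back onto the original image is plain arithmetic and can be checked by hand. The function window_to_original is a made-up name that repeats the calculation from detect; the sizes are example values.

```python
def window_to_original(x, y, win_w, win_h, orig_h, layer_h):
    """Map a window at (x, y) on a pyramid layer to a box on the original image."""
    scale = orig_h / float(layer_h)
    start_x, start_y = int(scale * x), int(scale * y)
    end_x = int(start_x + scale * win_w)
    end_y = int(start_y + scale * win_h)
    return (start_x, start_y, end_x, end_y)


# A 96x32 window at (32, 16) on a layer 133 pixels tall, original 200 pixels tall
box = window_to_original(32, 16, 96, 32, orig_h=200, layer_h=133)
# On the first layer the scale is 1, so the box is unchanged
same = window_to_original(10, 10, 96, 32, orig_h=200, layer_h=200)
print(box, same)
```

Because the window has a fixed size on every layer, detections on smaller layers turn into larger boxes on the original image, which is how a fixed-size window finds objects of different sizes.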

test_model_no_nms.py (in the same directory as pyimagesearch)

Purpose: check whether the boxes in the returned list fall on the object

Questions:

1. Details of SVC in the sklearn.svm module

Parameter reference: https://blog.csdn.net/szlcw1/article/details/52336824

2. Warning seen: Default value of `block_norm`==`L1` is deprecated and will be changed to `L2-Hys` in v0.15

Run: python test_model_no_nms.py --conf conf/cars.json --image datasets/caltech101/101_ObjectCategories/car_side/image_0004.jpg

from pyimagesearch.object_detection import ObjectDetector
from pyimagesearch.descriptors import HOG
from pyimagesearch.utils import Conf
import imutils
import argparse
import pickle
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
ap.add_argument("-i", "--image", required=True, help="path to the image to be classified")
args = vars(ap.parse_args())

conf = Conf(args["conf"])

# Load the trained linear SVC (binary mode, pickle instead of cPickle)
with open(conf["classifier_path"], "rb") as f:
    model = pickle.loads(f.read())
# HOG descriptor used to extract the feature vectors
hog = HOG(orientations=conf["orientations"], pixelsPerCell=tuple(conf["pixels_per_cell"]),
    cellsPerBlock=tuple(conf["cells_per_block"]), normalize=conf["normalize"])
od = ObjectDetector(model, hog)

image = cv2.imread(args["image"])
image = imutils.resize(image, width=min(260, image.shape[1]))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

(boxes, probs) = od.detect(gray, conf["window_dim"], winStep=conf["window_step"],
    pyramidScale=conf["pyramid_scale"], minProb=conf["min_probability"])

# Draw every detection, overlapping ones included
for (startX, startY, endX, endY) in boxes:
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 0, 255), 2)

cv2.imshow("Image", image)
cv2.waitKey(0)

Question 1: changing the probability threshold still returns the same regions?

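8. Non-maximum suppression

Purpose: collapse overlapping bounding boxes and keep the best-matching one

nms.py (in the object_detection directory):

Questions: 1. What numpy's argsort() does:

    It returns the indices that would sort the array in ascending order.

2. What numpy's concatenate() does:

    It joins arrays together.

3. How is idxs handled inside the while loop, and what is the syntax?

Worked explanation: https://blog.csdn.net/scut_salmon/article/details/79318387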

nms.py

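import numpy as np

# boxes: array of (x1, y1, x2, y2) bounding boxes; probs: the probability of each box;
# overlapThresh: overlap ratio above which a box is suppressed
def non_max_suppression(boxes, probs, overlapThresh):
    # Nothing to do for an empty list of boxes
    if len(boxes) == 0:
        return []

    # Convert integer boxes to floats so the divisions below are exact
    if boxes.dtype.kind == "i":
        boxes = boxes.astype("float")

    # Indexes of the boxes we keep, and the corner coordinates of every box
    pick = []
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    # Area of every box; indexes sorted by ascending probability
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(probs)

    while len(idxs) > 0:
        # Keep the most probable remaining box (last in the sorted list)
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)

        # Intersection of the kept box with every other remaining box
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)

        # Overlap as a fraction of each remaining box's own area
        overlap = (w * h) / area[idxs[:last]]

        # Drop the kept box and every box that overlaps it too much
        idxs = np.delete(idxs, np.concatenate(([last],
            np.where(overlap > overlapThresh)[0])))

    return boxes[pick].astype("int")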
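The index bookkeeping inside the while loop is easier to follow when one iteration is traced by hand. In this sketch the probabilities and the overlap values are made up; the overlap array stands in for the values the loop would compute for the boxes listed in idxs[:last].

```python
import numpy as np

probs = np.array([0.6, 0.9, 0.7])

# Ascending order of probability: box 0 (0.6), box 2 (0.7), box 1 (0.9)
idxs = np.argsort(probs)

last = len(idxs) - 1
i = idxs[last]                      # box 1, the most probable, is kept

# Pretend overlaps of the kept box with idxs[:last], i.e. with boxes 0 and 2
overlap = np.array([0.8, 0.1])

# np.where returns positions within idxs (not box numbers) that overlap too much
suppress = np.where(overlap > 0.3)[0]

# Remove position `last` (the kept box) and the suppressed positions from idxs
idxs = np.delete(idxs, np.concatenate(([last], suppress)))
print(int(i), suppress.tolist(), idxs.tolist())
```

After this iteration only box 2 is left in idxs, so the next pass keeps it and the loop ends with boxes 1 and 2 picked.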

test_model.py (in the same directory as pyimagesearch):

Purpose: test how well the overlapping boxes are resolved

Run: python test_model.py --conf conf/cars.json --image datasets/caltech101/101_ObjectCategories/car_side/image_0004.jpg

from pyimagesearch.object_detection import non_max_suppression
from pyimagesearch.object_detection import ObjectDetector
from pyimagesearch.descriptors import HOG
from pyimagesearch.utils import Conf
import numpy as np
import imutils
import argparse
import pickle
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
ap.add_argument("-i", "--image", required=True, help="path to the image to be classified")
args = vars(ap.parse_args())

conf = Conf(args["conf"])

# Load the trained classifier (binary mode, pickle instead of cPickle)
with open(conf["classifier_path"], "rb") as f:
    model = pickle.loads(f.read())
hog = HOG(orientations=conf["orientations"], pixelsPerCell=tuple(conf["pixels_per_cell"]),
    cellsPerBlock=tuple(conf["cells_per_block"]), normalize=conf["normalize"])
od = ObjectDetector(model, hog)

image = cv2.imread(args["image"])
image = imutils.resize(image, width=min(260, image.shape[1]))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

(boxes, probs) = od.detect(gray, conf["window_dim"], winStep=conf["window_step"],
    pyramidScale=conf["pyramid_scale"], minProb=conf["min_probability"])
pick = non_max_suppression(np.array(boxes), probs, conf["overlap_thresh"])
orig = image.copy()

# Red: every raw detection, drawn on the copy
for (startX, startY, endX, endY) in boxes:
    cv2.rectangle(orig, (startX, startY), (endX, endY), (0, 0, 255), 2)

# Green: the boxes that survive non-maximum suppression
for (startX, startY, endX, endY) in pick:
    cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

cv2.imshow("Original", orig)
cv2.imshow("Image", image)
cv2.waitKey(0)

9. Hard negative mining

Purpose: collect features from regions that have nothing to do with the object to detect, usually background scenes, which is why the sceneclass13 dataset is used. The windows the current classifier wrongly reports there become hard negatives, and training on them reduces false positives. The hard negatives are also written to the HDF5 database.

Run: python hard_negative_mine.py --conf conf/cars.json

hard_negative_mine.py

from __future__ import print_function
from pyimagesearch.object_detection.objectdetector import ObjectDetector
from pyimagesearch.descriptors.hog import HOG
from pyimagesearch.utils import dataset
from pyimagesearch.utils.conf import Conf
from imutils import paths
import numpy as np
import progressbar
import argparse
import pickle
import random
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
args = vars(ap.parse_args())

conf = Conf(args["conf"])
data = []

# Load the classifier from the initial training phase
with open(conf["classifier_path"], "rb") as f:
    model = pickle.loads(f.read())
hog = HOG(orientations=conf["orientations"], pixelsPerCell=tuple(conf["pixels_per_cell"]),
    cellsPerBlock=tuple(conf["cells_per_block"]), normalize=conf["normalize"])
od = ObjectDetector(model, hog)

# Sample the negative images to mine. The hn_* keys are read from cars.json;
# the configuration listings above do not show them, so they have to be added.
dstPaths = list(paths.list_images(conf["image_distractions"]))
dstPaths = random.sample(dstPaths, conf["hn_num_distraction_images"])

widgets = ["Mining: ", progressbar.Percentage(), " ", progressbar.Bar(), " ", progressbar.ETA()]
pbar = progressbar.ProgressBar(maxval=len(dstPaths), widgets=widgets).start()

for (i, imagePath) in enumerate(dstPaths):
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Every detection on a negative image is a false positive
    (boxes, probs) = od.detect(gray, conf["window_dim"], winStep=conf["hn_window_step"],
        pyramidScale=conf["hn_pyramid_scale"], minProb=conf["hn_min_probability"])

    for (prob, (startX, startY, endX, endY)) in zip(probs, boxes):
        roi = cv2.resize(gray[startY:endY, startX:endX], tuple(conf["window_dim"]),
            interpolation=cv2.INTER_AREA)
        features = hog.describe(roi)
        # Store the probability in front of the features so rows can be sorted by it
        data.append(np.hstack([[prob], features]))

    pbar.update(i)

pbar.finish()
print("[INFO] sorting by probability...")
data = np.array(data)
data = data[data[:, 0].argsort()[::-1]]

print("[INFO] dumping hard negatives to file...")
dataset.dump_dataset(data[:, 1:], [-1] * len(data), conf["features_path"], "hard_negatives",
    writeMethod="a")

10. Retraining the object detector

Purpose: add the hard negatives to the SVM's training data to reduce false detections

Run: python train_model.py --conf conf/cars.json --hard-negatives 1

train_model.py

from __future__ import print_function
from pyimagesearch.utils import dataset
from pyimagesearch.utils.conf import Conf
from sklearn.svm import SVC
import argparse
import pickle
import numpy as np

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--conf", required=True, help="path to the configuration file")
ap.add_argument("-n", "--hard-negatives", type=int, default=-1,
    help="flag indicating whether or not hard negatives should be used")
args = vars(ap.parse_args())

print("[INFO] loading dataset...")
conf = Conf(args["conf"])
(data, labels) = dataset.load_dataset(conf["features_path"], "features")

# With --hard-negatives 1 the mined false positives join the training data
if args["hard_negatives"] > 0:
    print("[INFO] loading hard negatives...")
    (hardData, hardLabels) = dataset.load_dataset(conf["features_path"], "hard_negatives")
    data = np.vstack([data, hardData])
    labels = np.hstack([labels, hardLabels])

print("[INFO] training classifier...")
model = SVC(kernel="linear", C=conf["C"], probability=True, random_state=42)
model.fit(data, labels)

print("[INFO] dumping classifier...")
with open(conf["classifier_path"], "wb") as f:
    f.write(pickle.dumps(model))

11. Using imglab

Preparation: generate an XML file, then use the imglab tool to mark the bounding boxes of the object.

Step 1: imglab -c <output XML path> <image folder> creates the XML file. Step 2: imglab <XML file> opens it so the object regions can be boxed by hand.

Purpose: take the annotated bounding boxes and train an SVM-based detector on them.

Run: python train_detector.py --xml face_detector/faces_annotations.xml --detector face_detector/detector.svm

from __future__ import print_function
import argparse
import dlib

ap = argparse.ArgumentParser()
ap.add_argument("-x", "--xml", required=True, help="path to input XML file")
ap.add_argument("-d", "--detector", required=True, help="path to output detector")
args = vars(ap.parse_args())

print("[INFO] training detector....")
options = dlib.simple_object_detector_training_options()
options.C = 1.0
options.num_threads = 4
# The original had the typo "bei_verbose"
options.be_verbose = True
dlib.train_simple_object_detector(args["xml"], args["detector"], options)

print("[INFO] training accuracy: {}".format(
    dlib.test_simple_object_detector(args["xml"], args["detector"])))

# Visualize the learned HOG filter
detector = dlib.simple_object_detector(args["detector"])
win = dlib.image_window()
win.set_image(detector)
dlib.hit_enter_to_continue()

test_detector.py

Purpose: test the trained linear SVM detector

Run: python test_detector.py --detector face_detector/detector.svm --testing face_detector/testing

from imutils import paths
import argparse
import dlib
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--detector", required=True, help="Path to trained object detector")
ap.add_argument("-t", "--testing", required=True, help="Path to directory of testing images")
args = vars(ap.parse_args())

detector = dlib.simple_object_detector(args["detector"])

for testingPath in paths.list_images(args["testing"]):
    image = cv2.imread(testingPath)
    # dlib expects RGB, OpenCV loads BGR
    boxes = detector(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    for b in boxes:
        # w and h hold the right and bottom edges here, not a width and height
        (x, y, w, h) = (b.left(), b.top(), b.right(), b.bottom())
        cv2.rectangle(image, (x, y), (w, h), (0, 255, 0), 2)

    # Show the result, as in the test script of section 2
    cv2.imshow("Image", image)
    cv2.waitKey(0)

Reposted from: https://www.cnblogs.com/w-x-me/p/7528427.html
