An Example of HOG + SVM Vehicle Detection with OpenCV for Autonomous Driving
Vehicle detection is an essential function of the perception module of an autonomous vehicle. The goals of this section are the following:

- Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a linear SVM classifier
- Apply a color transform and append binned color features, as well as histograms of color, to the HOG feature vector
- For those first two steps, don't forget to normalize your features and randomize the selection of training and test data
- Implement a sliding-window technique and use your trained classifier to search for vehicles in images
- Run the pipeline on a video stream (start with test_video.mp4, later on the full project_video.mp4) and create a frame-by-frame heat map of recurring detections to reject outliers and track detected vehicles
- Estimate a bounding box for each detected vehicle
Histogram of Oriented Gradients (HOG)
The Histogram of Oriented Gradients (HOG) is a feature descriptor used in computer vision and image processing for object detection. The technique counts occurrences of gradient orientation in localized portions of an image. It is similar to edge orientation histograms, scale-invariant feature transform descriptors, and shape contexts, but differs in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.
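As a toy illustration of the idea (not the author's code; a minimal sketch of how one cell's orientation histogram could be accumulated, with hypothetical data):

```python
import numpy as np

# A hypothetical 8x8 grayscale cell (values in [0, 1])
rng = np.random.default_rng(0)
cell = rng.random((8, 8))

# Gradients via simple central differences
gy, gx = np.gradient(cell)
magnitude = np.sqrt(gx**2 + gy**2)
# Unsigned orientation in [0, 180) degrees, as commonly used by HOG
orientation = np.rad2deg(np.arctan2(gy, gx)) % 180

# Accumulate a 9-bin histogram of orientations, weighted by gradient magnitude
hist, _ = np.histogram(orientation, bins=9, range=(0, 180), weights=magnitude)
print(hist.shape)  # (9,)
```

A full HOG descriptor concatenates such per-cell histograms over a dense grid and normalizes them over overlapping blocks, which is what skimage.hog() does below.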
從訓(xùn)練圖像中提取HOG特征
The code for this step is contained in the method get_hog_features in the file vehicle_detection.py:
```python
# Define a function to return HOG features and visualization
def get_hog_features(self, img, orient, pix_per_cell, cell_per_block,
                     vis=False, feature_vec=True):
    # Call with two outputs if vis==True
    if vis == True:
        features, hog_image = hog(img, orientations=orient,
                                  pixels_per_cell=(pix_per_cell, pix_per_cell),
                                  cells_per_block=(cell_per_block, cell_per_block),
                                  transform_sqrt=True,
                                  visualise=vis, feature_vector=feature_vec)
        return features, hog_image
    # Otherwise call with one output
    else:
        features = hog(img, orientations=orient,
                       pixels_per_cell=(pix_per_cell, pix_per_cell),
                       cells_per_block=(cell_per_block, cell_per_block),
                       transform_sqrt=True,
                       visualise=vis, feature_vector=feature_vec)
        return features
```

I started by reading in all the vehicle and non-vehicle images. Here is an example of one image from each of the vehicle and non-vehicle classes:
I then explored different color spaces and different skimage.hog() parameters (orientations, pixels_per_cell, and cells_per_block). I grabbed random images from each of the two classes and displayed them to get a feel for what the skimage.hog() output looks like. After several trials with images of different contrast and brightness, it worked best to use the YCrCb color space combined with HOG features extracted with orientations=9, pixels_per_cell=(8, 8), and cells_per_block=(2, 2).
這是一個(gè)使用YCrCb色彩空間和HOG參數(shù)的例子orientations=9,pixels_per_cell=(8, 8)并且cells_per_block=(2, 2):
提取顏色特征的空間分級
To make the algorithm more robust when identifying cars, another kind of feature is added alongside HOG. Template matching is not a particularly reliable way to find vehicles unless you know exactly what your target object looks like. However, raw pixel values are still quite useful to include in your feature vector when searching for cars.
While it could be cumbersome to include three color channels of a full-resolution image, we can perform spatial binning on an image and still retain enough information to help find vehicles.
As you can see in the example below, even going all the way down to a 32x32 pixel resolution, the car itself is still clearly identifiable by eye, which means the relevant features are still preserved at this resolution.
OpenCV's cv2.resize() is a convenient function for scaling down the resolution of an image.
```python
# Define a function to compute binned color features
def bin_spatial(self, img, size=(32, 32)):
    # Resize each channel and flatten it into a 1-D feature vector
    color1 = cv2.resize(img[:,:,0], size).ravel()
    color2 = cv2.resize(img[:,:,1], size).ravel()
    color3 = cv2.resize(img[:,:,2], size).ravel()
    return np.hstack((color1, color2, color3))
```

Histograms of color features
在這個(gè)項(xiàng)目中使用的另一個(gè)技術(shù),使更多的功能是顏色強(qiáng)度的直方圖,如下圖所示。
并執(zhí)行如下所示:
```python
# Define a function to compute color histogram features
# NEED TO CHANGE bins_range if reading .png files with mpimg!
def color_hist(self, img, nbins=32, bins_range=(0, 256)):
    # Compute the histogram of the color channels separately
    channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
    channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
    channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)
    # Concatenate the histograms into a single feature vector
    hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))
    # Return the feature vector
    return hist_features
```

Combining and normalizing the features
現(xiàn)在我們的工具箱中已經(jīng)有了幾個(gè)特征提取方法,我們幾乎已經(jīng)準(zhǔn)備好對分類器進(jìn)行訓(xùn)練了,但是首先,就像在任何機(jī)器學(xué)習(xí)應(yīng)用程序中一樣,我們需要規(guī)范化數(shù)據(jù)。Python的sklearn包為您提供了StandardScaler()方法來完成這個(gè)任務(wù)。
將單個(gè)圖像的所有不同特征組合為一組特征:
```python
def convert_color(self, image, color_space='RGB'):
    if color_space == 'HSV':
        image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    elif color_space == 'LUV':
        image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)
    elif color_space == 'HLS':
        image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)
    elif color_space == 'YUV':
        image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
    elif color_space == 'YCrCb':
        image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
    return image

# Define a function to extract features from a single image
# Have this function call bin_spatial() and color_hist()
def extract_features(self, image, color_space='RGB', spatial_size=(32, 32),
                     hist_bins=32, orient=9, pix_per_cell=8, cell_per_block=2,
                     hog_channel=0, spatial_feat=True, hist_feat=True, hog_feat=True):
    file_features = []
    # Apply color conversion if other than 'RGB'
    if color_space != 'RGB':
        feature_image = self.convert_color(image, color_space)
    else:
        feature_image = np.copy(image)
    if spatial_feat == True:
        spatial_features = self.bin_spatial(feature_image, size=spatial_size)
        file_features.append(spatial_features)
    if hist_feat == True:
        # Apply color_hist()
        hist_features = self.color_hist(feature_image, nbins=hist_bins)
        file_features.append(hist_features)
    if hog_feat == True:
        # Call get_hog_features() with vis=False, feature_vec=True
        if hog_channel == 'ALL':
            hog_features = []
            for channel in range(feature_image.shape[2]):
                hog_features.append(self.get_hog_features(feature_image[:,:,channel],
                                                          orient, pix_per_cell, cell_per_block,
                                                          vis=False, feature_vec=True))
            hog_features = np.ravel(hog_features)
        else:
            hog_features = self.get_hog_features(feature_image[:,:,hog_channel],
                                                 orient, pix_per_cell, cell_per_block,
                                                 vis=False, feature_vec=True)
        # Append the new feature vector to the features list
        file_features.append(hog_features)
    return file_features
```

Normalization is needed so that no one feature type outweighs the others:
```python
# Extract features from all not-car images
notcar_features = []
for file in notcar_filenames:
    # Read in each one by one
    image = mpimg.imread(file)
    features = self.extract_features(image, color_space=self.color_space,
                                     spatial_size=self.spatial_size, hist_bins=self.hist_bins,
                                     orient=self.orient, pix_per_cell=self.pix_per_cell,
                                     cell_per_block=self.cell_per_block, hog_channel=self.hog_channel,
                                     spatial_feat=self.spatial_feat, hist_feat=self.hist_feat,
                                     hog_feat=self.hog_feat)
    notcar_features.append(np.concatenate(features))
X = np.vstack((car_features, notcar_features)).astype(np.float64)
# Fit a per-column scaler
self.X_scaler = StandardScaler().fit(X)
# Apply the scaler to X
scaled_X = self.X_scaler.transform(X)
```

Training a classifier with the normalized features
I trained a linear SVM using the two classes of images, vehicles and non-vehicles. The images are loaded first, then the normalized features are extracted, and the data is shuffled and split into a training set (80%) and a test set (20%). Before training the classifier, the features are scaled to zero mean and unit variance using StandardScaler(). The source code can be found in vehicle_detection.py:
```python
def __train(self):
    print('Training the model ...')
    # Read in and make lists of car and not-car images
    car_filenames = glob.glob(self.__train_directory+'/vehicles/*/*')
    notcar_filenames = glob.glob(self.__train_directory+'/non-vehicles/*/*')
    # Extract features from all car images
    car_features = []
    for file in car_filenames:
        # Read in each one by one
        image = mpimg.imread(file)
        features = self.extract_features(image, color_space=self.color_space,
                                         spatial_size=self.spatial_size, hist_bins=self.hist_bins,
                                         orient=self.orient, pix_per_cell=self.pix_per_cell,
                                         cell_per_block=self.cell_per_block, hog_channel=self.hog_channel,
                                         spatial_feat=self.spatial_feat, hist_feat=self.hist_feat,
                                         hog_feat=self.hog_feat)
        car_features.append(np.concatenate(features))
    # Extract features from all not-car images
    notcar_features = []
    for file in notcar_filenames:
        image = mpimg.imread(file)
        features = self.extract_features(image, color_space=self.color_space,
                                         spatial_size=self.spatial_size, hist_bins=self.hist_bins,
                                         orient=self.orient, pix_per_cell=self.pix_per_cell,
                                         cell_per_block=self.cell_per_block, hog_channel=self.hog_channel,
                                         spatial_feat=self.spatial_feat, hist_feat=self.hist_feat,
                                         hog_feat=self.hog_feat)
        notcar_features.append(np.concatenate(features))
    X = np.vstack((car_features, notcar_features)).astype(np.float64)
    # Fit a per-column scaler
    self.X_scaler = StandardScaler().fit(X)
    # Apply the scaler to X
    scaled_X = self.X_scaler.transform(X)
    # Define the labels vector
    y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
    # Split up data into randomized training and test sets
    rand_state = np.random.randint(0, 100)
    X_train, X_test, y_train, y_test = train_test_split(scaled_X, y,
                                                        test_size=0.2, random_state=rand_state)
    print('Using:', self.orient, 'orientations', self.pix_per_cell,
          'pixels per cell and', self.cell_per_block, 'cells per block')
    print('Feature vector length:', len(X_train[0]))
    # Use a linear SVC
    self.svc = LinearSVC()
    self.svc.fit(X_train, y_train)
    # Check the score of the SVC
    print('Test Accuracy of SVC = ', round(self.svc.score(X_test, y_test), 4))
    # Pickle to save time for subsequent runs
    binary = {}
    binary["svc"] = self.svc
    binary["X_scaler"] = self.X_scaler
    pickle.dump(binary, open(self.__train_directory + '/' + self.__binary_filename, "wb"))

def __load_binary(self):
    '''Load a previously trained classifier'''
    with open(self.__train_directory + '/' + self.__binary_filename, mode='rb') as f:
        binary = pickle.load(f)
    self.svc = binary['svc']
    self.X_scaler = binary['X_scaler']

def get_data(self):
    '''Getter for the trained data. On the first call it generates it.'''
    if os.path.isfile(self.__train_directory + '/' + self.__binary_filename):
        self.__load_binary()
    else:
        self.__train()
    return self.svc, self.X_scaler
```

The whole data set (training + test) contains 17,767 items, evenly distributed between vehicles and non-vehicles. After training, train.p is saved to disk in a subfolder for later reuse. The accuracy of the trained linear SVM classifier on the test data set is quite high, about 0.989.
滑動(dòng)窗口搜索
I decided to search for vehicles with overlapping sliding windows over the lower part of the image only. Searching only the lower part avoids looking for vehicles in the sky and makes the algorithm faster. The window size is 64 pixels, with 8 cells per window edge and 8 pixels per cell. On each slide, the window moves 2 cells to the right or down. To avoid extracting features repeatedly for every window and to speed up the search, feature extraction is done only once and each sliding window uses just its portion of the image. The detection could also be made more robust by using windows of different scales, to accommodate vehicles at both short and long range.
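The step arithmetic behind this search can be sketched in isolation (the 1280x256 search strip below is a hypothetical size, chosen only to make the numbers concrete):

```python
pix_per_cell = 8
cell_per_block = 2
window = 64                      # window size in pixels
cells_per_step = 2               # slide by 2 cells = 16 pixels

search_w, search_h = 1280, 256   # hypothetical lower-image search strip

# Number of HOG block positions along each axis
nxblocks = search_w // pix_per_cell - (cell_per_block - 1)          # 159
nyblocks = search_h // pix_per_cell - (cell_per_block - 1)          # 31
nblocks_per_window = window // pix_per_cell - (cell_per_block - 1)  # 7

# Number of window positions along each axis
nxsteps = (nxblocks - nblocks_per_window) // cells_per_step  # 76
nysteps = (nyblocks - nblocks_per_window) // cells_per_step  # 12

print(nxsteps, nysteps, nxsteps * nysteps)  # 76 12 912
```

So even this single scale evaluates on the order of 900 windows per frame, which is why computing the HOG array once and sub-sampling it matters for speed.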
該實(shí)現(xiàn)可以在vehicle_detection.py中找到:
```python
# Define a single function that extracts features using HOG sub-sampling and makes predictions
def find_cars(self, img, plot=False):
    bbox_list = []
    draw_img = np.copy(img)
    img = img.astype(np.float32)/255
    img_tosearch = img[self.ystart:self.ystop,:,:]
    ctrans_tosearch = self.convert_color(img_tosearch, color_space='YCrCb')
    if self.scale != 1:
        imshape = ctrans_tosearch.shape
        ctrans_tosearch = cv2.resize(ctrans_tosearch,
                                     (int(imshape[1]/self.scale), int(imshape[0]/self.scale)))
    ch1 = ctrans_tosearch[:,:,0]
    ch2 = ctrans_tosearch[:,:,1]
    ch3 = ctrans_tosearch[:,:,2]
    # Define blocks and steps as above
    nxblocks = (ch1.shape[1] // self.pix_per_cell)-1
    nyblocks = (ch1.shape[0] // self.pix_per_cell)-1
    nfeat_per_block = self.orient*self.cell_per_block**2
    # 64 was the original sampling rate, with 8 cells and 8 pix per cell
    window = 64
    nblocks_per_window = (window // self.pix_per_cell)-1
    cells_per_step = 2  # Instead of overlap, define how many cells to step
    nxsteps = (nxblocks - nblocks_per_window) // cells_per_step
    nysteps = (nyblocks - nblocks_per_window) // cells_per_step
    # Compute individual channel HOG features for the entire image
    hog1 = self.get_hog_features(ch1, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)
    hog2 = self.get_hog_features(ch2, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)
    hog3 = self.get_hog_features(ch3, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)
    bbox_all_list = []
    for xb in range(nxsteps+1):
        for yb in range(nysteps):
            ypos = yb*cells_per_step
            xpos = xb*cells_per_step
            # Extract HOG for this patch
            hog_feat1 = hog1[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel()
            hog_feat2 = hog2[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel()
            hog_feat3 = hog3[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel()
            hog_features = np.concatenate((hog_feat1, hog_feat2, hog_feat3))
            xleft = xpos*self.pix_per_cell
            ytop = ypos*self.pix_per_cell
            # Extract the image patch
            subimg = cv2.resize(ctrans_tosearch[ytop:ytop+window, xleft:xleft+window], (64,64))
            # Get color features
            spatial_features = self.bin_spatial(subimg, size=self.spatial_size)
            hist_features = self.color_hist(subimg, nbins=self.hist_bins)
            # Scale features and make a prediction
            test_features = self.X_scaler.transform(
                np.hstack((spatial_features, hist_features, hog_features)).reshape(1, -1))
            test_prediction = self.svc.predict(test_features)
            # Compute the current size of the window
            xbox_left = int(xleft*self.scale)
            ytop_draw = int(ytop*self.scale)
            win_draw = int(window*self.scale)
            bbox = ((xbox_left, ytop_draw+self.ystart),
                    (xbox_left+win_draw, ytop_draw+win_draw+self.ystart))
            if test_prediction == 1:
                bbox_list.append(bbox)
            bbox_all_list.append(bbox)
    if(plot==True):
        draw_img_detected = np.copy(draw_img)
        # Draw all searched windows
        for bbox in bbox_all_list:
            cv2.rectangle(draw_img, bbox[0], bbox[1], (0,0,255), 3)
        for bbox in bbox_list:
            cv2.rectangle(draw_img_detected, bbox[0], bbox[1], (0,0,255), 3)
        fig = plt.figure()
        plt.subplot(121)
        plt.imshow(draw_img)
        plt.title('Searched sliding windows')
        plt.subplot(122)
        plt.imshow(draw_img_detected, cmap='hot')
        plt.title('Detected vehicle windows')
        fig.tight_layout()
        plt.show()
    return bbox_list

def draw_labeled_bboxes(self, img, labels):
    # Iterate through all detected cars
    for car_number in range(1, labels[1]+1):
        # Find pixels with each car_number label value
        nonzero = (labels[0] == car_number).nonzero()
        # Identify x and y values of those pixels
        nonzeroy = np.array(nonzero[0])
        nonzerox = np.array(nonzero[1])
        # Define a bounding box based on min/max x and y
        bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
        # Draw the box on the image
        cv2.rectangle(img, bbox[0], bbox[1], (0,0,255), 6)
    # Return the image
    return img
```

As can be seen in the figure, the two cars are detected correctly, but there are also some false positives.
To reject the false positives, a heat map is used. Every hit window adds heat to the map, so overlapping detections accumulate higher values. Values above a certain threshold are kept as true positives.
```python
def add_heat(self, heatmap, bbox_list):
    # Iterate through the list of bboxes
    for box in bbox_list:
        # Add += 1 for all pixels inside each bbox
        # Assuming each "box" takes the form ((x1, y1), (x2, y2))
        heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1
    # Return updated heatmap
    return heatmap

def apply_threshold(self, heatmap, threshold):
    # Zero out pixels below the threshold
    heatmap[heatmap <= threshold] = 0
    # Return thresholded map
    return heatmap

box_list = vehicle_detector.find_cars(image, plot=plot)
heat = np.zeros_like(image[:,:,0]).astype(np.float64)
# Add heat to each box in the box list
heat = vehicle_detector.add_heat(heat, box_list)
# Apply threshold to help remove false positives
heat = vehicle_detector.apply_threshold(heat, 1)
# Visualize the heat map when displaying
heatmap = np.clip(heat, 0, 255)
```
To find the final boxes from the heat map, the label function is used.
```python
from scipy.ndimage.measurements import label

# Find final boxes from heatmap using the label function
labels = label(heatmap)
if(plot==True):
    #print(labels[1], 'cars found')
    plt.imshow(labels[0], cmap='gray')
    plt.show()
```
管道處理一個(gè)圖像
As shown in the code below, first we extract the bounding boxes, including both true and false positives. Then, using the heat map, we discard the false positives. After that, the final boxes are computed with the scipy.ndimage.measurements.label() method. Finally, the boxes are rendered.
```python
def process_image(image, plot=False):
    box_list = vehicle_detector.find_cars(image, plot=plot)
    heat = np.zeros_like(image[:,:,0]).astype(np.float64)
    # Add heat to each box in the box list
    heat = vehicle_detector.add_heat(heat, box_list)
    # Apply threshold to help remove false positives
    heat = vehicle_detector.apply_threshold(heat, 1)
    # Visualize the heat map when displaying
    heatmap = np.clip(heat, 0, 255)
    # Find final boxes from heatmap using the label function
    labels = label(heatmap)
    if(plot==True):
        #print(labels[1], 'cars found')
        plt.imshow(labels[0], cmap='gray')
        plt.show()
    new_image = vehicle_detector.draw_labeled_bboxes(image, labels)
    if(plot==True):
        fig = plt.figure()
        plt.subplot(121)
        plt.imshow(new_image)
        plt.title('Car Positions')
        plt.subplot(122)
        plt.imshow(heatmap, cmap='hot')
        plt.title('Heat Map')
        fig.tight_layout()
        plt.show()
    return new_image

def process_test_images(vehicle_detector, plot=False):
    test_filenames = glob.glob(TEST_DIRECTORY+'/'+TEST_FILENAME)
    # Process each test image
    for image_filename in test_filenames:
        # Read in each image
        image = cv2.imread(image_filename)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # RGB is standard in matplotlib
        image = process_image(image, plot)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # convert back to BGR for cv2.imwrite
        cv2.imwrite(OUTPUT_DIRECTORY+'/'+image_filename.split('/')[-1], image)
```

Here is the result of the pipeline on one of the test images:
管道處理一個(gè)視頻
The same pipeline used for a single image, process_image(image, plot=False), is used for video processing. Each frame is extracted from the video, processed by the image pipeline, and merged into the final video using VideoFileClip and ffmpeg.
```python
from moviepy.editor import VideoFileClip

def process_video(video_filename, vehicle_detector, plot=False):
    video_input = VideoFileClip(video_filename + ".mp4")
    video_output = video_input.fl_image(process_image)
    video_output.write_videofile(video_filename + "_output.mp4", audio=False)

process_test_images(vehicle_detector, plot=False)
```

Conclusion
當(dāng)前使用SVM分類器的實(shí)現(xiàn)對于測試的圖像和視頻來說工作良好,這主要是因?yàn)閳D像和視頻被記錄在類似的環(huán)境中。用一個(gè)非常不同的環(huán)境測試這個(gè)分類器不會(huì)有類似的好結(jié)果。使用深度學(xué)習(xí)和卷積神經(jīng)網(wǎng)絡(luò)的更健壯的分類器將更好地推廣到未知數(shù)據(jù)。
當(dāng)前實(shí)現(xiàn)的另一個(gè)問題是在視頻處理流水線中不考慮后續(xù)幀。保持連續(xù)幀之間的熱圖將更好地丟棄誤報(bào)。
目前的實(shí)施還有一個(gè)更大的改進(jìn)是多尺寸滑動(dòng)窗口,這將更好地概括查找短距離和長距離的車輛。
轉(zhuǎn)自:https://mp.weixin.qq.com/s/mNkLu15ZrnssnlrPvXSg7w