當(dāng)前位置：首頁(yè) > 编程资源 > 编程问答 >内容正文

编程问答

RANSAC算法做直线拟合

發(fā)布時(shí)間：2025/3/21 编程问答 32 豆豆

生活随笔收集整理的這篇文章主要介紹了 RANSAC算法做直线拟合小編覺(jué)得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.

RANSAC算法之前了解過(guò)相關(guān)的原理，這兩天利用晚上閑暇的時(shí)間，看了一下RANSAC算法的Python代碼實(shí)現(xiàn)，這方面的資料很多了，這里就不在重復(fù)。在分析該RANSAC.py代碼之前，想用自己的對(duì)RANSAC的理解對(duì)其做下總結(jié)。

在實(shí)際應(yīng)用中獲取到的數(shù)據(jù)，常常會(huì)包含有噪聲數(shù)據(jù)，這些噪聲數(shù)據(jù)會(huì)使對(duì)模型的構(gòu)建造成干擾，我們稱這樣的噪聲數(shù)據(jù)點(diǎn)為outliers，那些對(duì)于模型構(gòu)建起積極作用的我們稱它們?yōu)閕nliers，RANSAC做的一件事就是先隨機(jī)的選取一些點(diǎn)，用這些點(diǎn)去獲得一個(gè)模型(這個(gè)講得有點(diǎn)玄，如果是在做直線擬合的話，這個(gè)所謂的模型其實(shí)就是斜率)，然后用此模型去測(cè)試剩余的點(diǎn)，如果測(cè)試的數(shù)據(jù)點(diǎn)在誤差允許的范圍內(nèi)，則將該數(shù)據(jù)點(diǎn)判為inlier，否則判為outlier。inliers的數(shù)目如果達(dá)到了某個(gè)設(shè)定的閾值，則說(shuō)明此次選取的這些數(shù)據(jù)點(diǎn)集達(dá)到了可以接受的程度，否則繼續(xù)前面的隨機(jī)選取點(diǎn)集后所有的步驟，不斷重復(fù)此過(guò)程，直到找到選取的這些數(shù)據(jù)點(diǎn)集達(dá)到了可以接受的程度為止，此時(shí)得到的模型便可認(rèn)為是對(duì)數(shù)據(jù)點(diǎn)的最優(yōu)模型構(gòu)建。

在Cookbook/RANSAC中給出的是一個(gè)用RANSAC做直線擬合的例子。這個(gè)例子非常的直觀，而且代碼也很簡(jiǎn)短易懂，為便于后面詳細(xì)解讀該代碼，這里把它貼出來(lái)：

1 # -*- coding: utf-8 -*- 2 import numpy 3 import scipy # use numpy if scipy unavailable 4 import scipy.linalg # use numpy if scipy unavailable 5 import pylab 6 7 ## Copyright (c) 2004-2007, Andrew D. Straw. All rights reserved. 8 9 def ransac(data,model,n,k,t,d,debug=False,return_all=False): 10 """fit model parameters to data using the RANSAC algorithm 11 12 This implementation written from pseudocode found at 13 http://en.wikipedia.org/w/index.php?title=RANSAC&oldid=116358182 14 15 Given: 16 data - a set of observed data points # 可觀測(cè)數(shù)據(jù)點(diǎn)集 17 model - a model that can be fitted to data points # 18 n - the minimum number of data values required to fit the model # 擬合模型所需的最小數(shù)據(jù)點(diǎn)數(shù)目 19 k - the maximum number of iterations allowed in the algorithm # 最大允許迭代次數(shù) 20 t - a threshold value for determining when a data point fits a model #確認(rèn)某一數(shù)據(jù)點(diǎn)是否符合模型的閾值 21 d - the number of close data values required to assert that a model fits well to data 22 Return: 23 bestfit - model parameters which best fit the data (or nil if no good model is found) 24 """ 25 iterations = 0 26 bestfit = None 27 besterr = numpy.inf 28 best_inlier_idxs = None 29 while iterations < k: 30 maybe_idxs, test_idxs = random_partition(n,data.shape[0]) 31 maybeinliers = data[maybe_idxs,:] 32 test_points = data[test_idxs] 33 maybemodel = model.fit(maybeinliers) 34 test_err = model.get_error( test_points, maybemodel) 35 also_idxs = test_idxs[test_err < t] # select indices of rows with accepted points 36 alsoinliers = data[also_idxs,:] 37 if debug: 38 print 'test_err.min()',test_err.min() 39 print 'test_err.max()',test_err.max() 40 print 'numpy.mean(test_err)',numpy.mean(test_err) 41 print 'iteration %d:len(alsoinliers) = %d'%( 42 iterations,len(alsoinliers)) 43 if len(alsoinliers) > d: 44 betterdata = numpy.concatenate( (maybeinliers, alsoinliers) ) 45 bettermodel = model.fit(betterdata) 46 better_errs = model.get_error( betterdata, bettermodel) 47 thiserr = numpy.mean( better_errs ) 48 if thiserr < besterr: 49 bestfit = bettermodel 50 besterr = thiserr 51 best_inlier_idxs = numpy.concatenate( (maybe_idxs, also_idxs) ) 52 iterations+=1 53 if bestfit is None: 54 raise ValueError("did not meet fit acceptance criteria") 55 if return_all: 56 return bestfit, {'inliers':best_inlier_idxs} 57 else: 58 return bestfit 59 60 def random_partition(n,n_data): 61 """return n random rows of data (and also the other len(data)-n rows)""" 62 all_idxs = numpy.arange( n_data ) 63 numpy.random.shuffle(all_idxs) 64 idxs1 = all_idxs[:n] 65 idxs2 = all_idxs[n:] 66 return idxs1, idxs2 67 68 class LinearLeastSquaresModel: 69 """linear system solved using linear least squares 70 71 This class serves as an example that fulfills the model interface 72 needed by the ransac() function. 73 74 """ 75 def __init__(self,input_columns,output_columns,debug=False): 76 self.input_columns = input_columns 77 self.output_columns = output_columns 78 self.debug = debug 79 def fit(self, data): 80 A = numpy.vstack([data[:,i] for i in self.input_columns]).T 81 B = numpy.vstack([data[:,i] for i in self.output_columns]).T 82 x,resids,rank,s = scipy.linalg.lstsq(A,B) 83 return x 84 def get_error( self, data, model): 85 A = numpy.vstack([data[:,i] for i in self.input_columns]).T 86 B = numpy.vstack([data[:,i] for i in self.output_columns]).T 87 B_fit = scipy.dot(A,model) 88 err_per_point = numpy.sum((B-B_fit)**2,axis=1) # sum squared error per row 89 return err_per_point 90 91 def test(): 92 # generate perfect input data 93 n_samples = 500 94 n_inputs = 1 95 n_outputs = 1 96 A_exact = 20*numpy.random.random((n_samples,n_inputs) ) # x坐標(biāo) 97 perfect_fit = 60*numpy.random.normal(size=(n_inputs,n_outputs) ) # the model(斜率) 98 B_exact = scipy.dot(A_exact,perfect_fit) # y坐標(biāo) 99 assert B_exact.shape == (n_samples,n_outputs) #驗(yàn)證y坐標(biāo)數(shù)組的大小 100 #pylab.plot( A_exact, B_exact, 'b.', label='data' ) 101 #pylab.show() 102 103 # add a little gaussian noise (linear least squares alone should handle this well) 104 A_noisy = A_exact + numpy.random.normal(size=A_exact.shape ) # x坐標(biāo)添加高斯噪聲 105 B_noisy = B_exact + numpy.random.normal(size=B_exact.shape ) # y坐標(biāo).... 106 #pylab.plot( A_noisy, B_noisy, 'b.', label='data' ) 107 108 if 1: 109 # add some outliers 110 n_outliers = 100 # 500個(gè)數(shù)據(jù)點(diǎn)有100個(gè)是putliers 111 all_idxs = numpy.arange( A_noisy.shape[0] ) 112 numpy.random.shuffle(all_idxs) # 索引隨機(jī)排列 113 outlier_idxs = all_idxs[:n_outliers] # 選取all_idxs前100個(gè)做outlier_idxs 114 non_outlier_idxs = all_idxs[n_outliers:] # 后面的不是outlier_idxs 115 A_noisy[outlier_idxs] = 20*numpy.random.random((n_outliers,n_inputs) ) # 外點(diǎn)的橫坐標(biāo) 116 B_noisy[outlier_idxs] = 50*numpy.random.normal(size=(n_outliers,n_outputs) ) # 外點(diǎn)的縱坐標(biāo) 117 #pylab.plot( A_noisy, B_noisy, 'b.', label='data' ) 118 #pylab.show() 119 120 121 # setup model 122 123 all_data = numpy.hstack( (A_noisy,B_noisy) ) # 組成坐標(biāo)對(duì) 124 input_columns = range(n_inputs) # the first columns of the array 125 output_columns = [n_inputs+i for i in range(n_outputs)] # the last columns of the array 126 debug = False 127 model = LinearLeastSquaresModel(input_columns,output_columns,debug=debug) 128 129 linear_fit,resids,rank,s = scipy.linalg.lstsq(all_data[:,input_columns], 130 all_data[:,output_columns]) 131 132 # run RANSAC algorithm 133 ransac_fit, ransac_data = ransac(all_data,model, 134 50, 1000, 7e3, 300, # misc. parameters 135 debug=debug,return_all=True) 136 if 1: 137 import pylab 138 139 sort_idxs = numpy.argsort(A_exact[:,0]) # 對(duì)A_exact排序， sort_idxs為排序索引 140 A_col0_sorted = A_exact[sort_idxs] # maintain as rank-2 array 141 142 if 1: 143 pylab.plot( A_noisy[:,0], B_noisy[:,0], 'k.', label='data' ) 144 pylab.plot( A_noisy[ransac_data['inliers'],0], B_noisy[ransac_data['inliers'],0], 'bx', label='RANSAC data' ) 145 else: 146 pylab.plot( A_noisy[non_outlier_idxs,0], B_noisy[non_outlier_idxs,0], 'k.', label='noisy data' ) 147 pylab.plot( A_noisy[outlier_idxs,0], B_noisy[outlier_idxs,0], 'r.', label='outlier data' ) 148 pylab.plot( A_col0_sorted[:,0], 149 numpy.dot(A_col0_sorted,ransac_fit)[:,0], 150 label='RANSAC fit' ) 151 pylab.plot( A_col0_sorted[:,0], 152 numpy.dot(A_col0_sorted,perfect_fit)[:,0], 153 label='exact system' ) 154 pylab.plot( A_col0_sorted[:,0], 155 numpy.dot(A_col0_sorted,linear_fit)[:,0], 156 label='linear fit' ) 157 pylab.legend() 158 pylab.show() 159 160 if __name__=='__main__': 161 test()

上面代碼跟原版的代碼相比，我刪除了一些冗余的東西。在test()中做的是直線擬合。在看test()部分之前，我們先來(lái)看看RANSAC部分的代碼，傳入RANSAC函數(shù)中的參數(shù)有8個(gè)，前面6個(gè)是比較重要的。data就是全部的數(shù)據(jù)點(diǎn)集，model注釋里給出的是擬合點(diǎn)集的模型，放到這個(gè)直線擬合的實(shí)例下，就是斜率，n就是擬合時(shí)所需要的最小數(shù)據(jù)點(diǎn)數(shù)目，放在這里直線擬合的例子中，就是用于選取的用于去做直線擬合的數(shù)據(jù)點(diǎn)數(shù)目，k就是最大允許的迭代次數(shù)，t是人為設(shè)定的用于判斷誤差接受許可的范圍。這幾個(gè)參數(shù)的含義知道了，剩下的就是理解while循環(huán)里面的內(nèi)容了。在每一次循環(huán)中，選對(duì)所有的數(shù)據(jù)點(diǎn)做一個(gè)隨機(jī)的劃分，將數(shù)據(jù)點(diǎn)集分成兩堆，分別對(duì)應(yīng)maybeinliers和test_points，maybeinliers這部分?jǐn)?shù)據(jù)用于做直線擬合，這里直線擬合采用的是最小二乘法，得到擬合到的直線的斜率maybemodel，然后用該直線及測(cè)試數(shù)據(jù)的橫坐標(biāo)去估計(jì)測(cè)試數(shù)據(jù)的縱坐標(biāo)，也就是在該模型下測(cè)試數(shù)據(jù)的估計(jì)值，測(cè)試數(shù)據(jù)的估計(jì)值和測(cè)試數(shù)據(jù)的真實(shí)值做一個(gè)平方和便得到誤差，將得到的誤差分別和設(shè)定的可接受誤差進(jìn)行判斷，在誤差范圍內(nèi)的判定為inlier，否者判斷為outlier。當(dāng)inliers的數(shù)目達(dá)到了設(shè)定的數(shù)目的要求是，再講inliers和maybeinliers放一下再做一下最小二乘擬合，便得到最終的最佳斜率了。

test()部分的內(nèi)容很簡(jiǎn)單，先生成在某條直線上的一些離散點(diǎn)，這里某條直線的斜率就是精確的模型：然后添加高斯平穩(wěn)高斯噪聲：將其中的某些點(diǎn)變?yōu)閛utliers:最后用RANSAC擬合出來(lái)的結(jié)果如下：整個(gè)過(guò)程就醬紫，后面有時(shí)間繼續(xù)前面在BoW圖像檢索Python實(shí)戰(zhàn)用RANSAC做一個(gè)重排過(guò)程。

from:?http://yongyuan.name/blog/fitting-line-with-ransac.html

總結(jié)

以上是生活随笔為你收集整理的RANSAC算法做直线拟合的全部?jī)?nèi)容，希望文章能夠幫你解決所遇到的問(wèn)題。

如果覺(jué)得生活随笔網(wǎng)站內(nèi)容還不錯(cuò)，歡迎將生活随笔推薦給好友。

上一篇： VLFeat SLIC超像素分割(Cpp
下一篇： GMM、fisher vector、SI