[Computer Science] [2011.05] [Source Code Included] SVM Classification and Margin Distance Analysis of Microarray Data

Published: 2024/3/12 · 豆豆

This is a master's thesis from the University of Akron, USA (author: Ameer Basha Shaik Abdul), 84 pages in total.

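A support vector machine (SVM) is a statistical classification algorithm that classifies data by separating two classes with a functional hyperplane. SVMs perform well on noisy, high-dimensional data such as microarrays. (Note: A DNA microarray, also called an oligonucleotide array, is a product of the progressive implementation of the Human Genome Project (HGP) and of the rapid development and application of molecular biology. Inspired by the manufacture and widespread use of computer chips, biologists combined microelectronics, life science, computer science, and photoelectrochemistry to develop this new technology on the basis of nucleic acid hybridization (Northern, Southern). It is one of the main technologies of the third revolution (the genomic revolution) and is one kind of biochip. Its principle is to integrate gene probes of known sequence on a solid surface. Large numbers of labeled nucleic acid sequences from the cells or tissue under test hybridize with this probe array, and detecting the hybridized probes at each position allows genetic information to be read out quickly.)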

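The marginal region of the functional hyperplane is called the danger zone. It is defined as the region between two parallel hyperplanes, which are determined by the average distances from the support vectors of the two classes to the functional hyperplane. The main aim of this study is to determine how the margin distance, that is, the width of the danger zone, affects classifier accuracy, and to analyze the role of margin distance in feature selection. The study uses three microarray datasets. For each dataset, the equation of the functional hyperplane separating the two classes is derived and the corresponding support vectors are obtained. The relationship between the width of the danger zone and classification accuracy is investigated, along with the rate of change of the margin distance with respect to the number of features used to build the SVM.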
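The danger-zone definition above can be made concrete with a small sketch. It assumes a linear hyperplane `w·x + b = 0` and takes the width of the danger zone to be the sum of the two per-class average support-vector distances. That reading of "margin distance" is an interpretation of the abstract, not code from the thesis (whose appendix is in MATLAB), and the function names and sample values are hypothetical.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def danger_zone_width(w, b, sv_pos, sv_neg):
    """Return (width, avg_pos, avg_neg).

    avg_pos / avg_neg are the mean distances of each class's support
    vectors to the hyperplane; width is assumed to be their sum.
    """
    avg_pos = sum(distance_to_hyperplane(w, b, x) for x in sv_pos) / len(sv_pos)
    avg_neg = sum(distance_to_hyperplane(w, b, x) for x in sv_neg) / len(sv_neg)
    return avg_pos + avg_neg, avg_pos, avg_neg


if __name__ == "__main__":
    # Hypothetical 2-D hyperplane and support vectors, for illustration only.
    w, b = [3.0, 4.0], -5.0
    sv_pos = [[3.0, 4.0], [1.0, 3.0]]
    sv_neg = [[0.0, 0.0], [1.0, -1.0]]
    width, avg_pos, avg_neg = danger_zone_width(w, b, sv_pos, sv_neg)
    print(f"avg_pos={avg_pos:.3f} avg_neg={avg_neg:.3f} width={width:.3f}")
```

In practice `w`, `b`, and the support vectors would come from a trained linear SVM; with a nonlinear kernel the distances would have to be computed in feature space instead.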

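The results show that, although the correlation between margin distance and classification accuracy is not very strong, the rate of change of classification accuracy with respect to margin distance can be used to determine the optimal number of features for building a high-performance SVM.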
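As a rough illustration of the quantity mentioned above, the rate of change of accuracy with respect to margin distance can be approximated by finite differences across SVMs trained with increasing numbers of features. The sketch below only computes those slopes. How the thesis turns them into a choice of feature count is not stated in this summary, so no selection rule is implemented, and the input numbers are made-up placeholders, not results from the thesis.

```python
def accuracy_rate_of_change(feature_counts, margins, accuracies):
    """Finite-difference slope d(accuracy)/d(margin) between consecutive runs.

    Returns a list of (feature_count, slope) pairs, one per run after the
    first. A slope is NaN when the margin did not change between runs.
    """
    if not (len(feature_counts) == len(margins) == len(accuracies)):
        raise ValueError("all inputs must have the same length")
    rates = []
    for i in range(1, len(margins)):
        d_margin = margins[i] - margins[i - 1]
        d_acc = accuracies[i] - accuracies[i - 1]
        slope = d_acc / d_margin if d_margin != 0 else float("nan")
        rates.append((feature_counts[i], slope))
    return rates


if __name__ == "__main__":
    # Placeholder values for illustration only.
    feature_counts = [10, 20, 40, 80]
    margins = [1.0, 1.5, 2.5, 3.0]
    accuracies = [0.80, 0.90, 0.92, 0.91]
    for n, slope in accuracy_rate_of_change(feature_counts, margins, accuracies):
        print(f"{n} features: d(acc)/d(margin) = {slope:.3f}")
```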

A support vector machine (SVM) is a statistical classification algorithm that classifies data by separating two classes with the help of a functional hyperplane. SVM is known for good performance on noisy and high-dimensional data such as microarray data. A marginal region of the functional hyperplane, named the "danger zone", is defined to be the region between two parallel hyperplanes that are determined by the average distances of the support vectors from the two classes to the functional hyperplane. The main aim of this study was to determine the effect of margin distance, the width of the danger zone, on the accuracy of the classifier and to analyze the role of margin distance in feature selection. The study was carried out using three microarray datasets. For each dataset, the equation of the functional hyperplane separating the two classes of data was derived. The corresponding support vectors were obtained. The average distances between support vectors from the two classes and the functional hyperplane were calculated. The relations between the width of the danger zone and the classification accuracy were investigated. The rate of change of the margin distance with respect to the number of features used for constructing the support vector machine was also examined. The results indicate that, although the correlation between margin and accuracy is not very strong, the rate of change of classification accuracy with respect to margin distance can be employed to determine the optimal number of features for constructing a high-performance support vector machine for classifying microarray samples.

1 Introduction

2 Review of Related Literature

3 Data and Methods

4 Results and Discussion

5 Conclusions

Appendix: MATLAB Source Code

Appendix A: Random Generation of Training and Test Data

Appendix B: Scaling of the Training and Test Datasets

Appendix C: T-Test on the Scaled Training Data

Appendix D: Computing the Margin Distance of the SVM Classifier

Download link for the original English text:

http://page2.dfpan.com/fs/3lcj02214291a659985/
