日韩性视频-久久久蜜桃-www中文字幕-在线中文字幕av-亚洲欧美一区二区三区四区-撸久久-香蕉视频一区-久久无码精品丰满人妻-国产高潮av-激情福利社-日韩av网址大全-国产精品久久999-日本五十路在线-性欧美在线-久久99精品波多结衣一区-男女午夜免费视频-黑人极品ⅴideos精品欧美棵-人人妻人人澡人人爽精品欧美一区-日韩一区在线看-欧美a级在线免费观看

歡迎訪問 生活随笔!

生活随笔

當前位置: 首頁 > 编程资源 > 编程问答 >内容正文

编程问答

r语言degseq2_第二次RNA-seq实战总结(3)-用DESeq2进行基因表达差异分析

發布時間:2023/12/20 编程问答 37 豆豆
生活随笔 收集整理的這篇文章主要介紹了 r语言degseq2_第二次RNA-seq实战总结(3)-用DESeq2进行基因表达差异分析 小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

DESeq2是一個用于分析基因表達差異的R包,具體操作要在R語言中運行

1.R語言安裝DESeq2>source("https://bioconductor.org/biocLite.R")

>biocLite("DESeq2")

2.載入基因表達量文件,添加列名>?setwd("C:\\Users\\18019\\Desktop\\counts")

>?options(stringsAsFactors=FALSE)

>?control1

>?head(control1)

gene_id?control11?ENSG00000000003.14_2?????15762??ENSG00000000005.5_2????????03?ENSG00000000419.12_2??????7564?ENSG00000000457.13_3??????3015?ENSG00000000460.16_5??????7646?ENSG00000000938.12_2????????0>?control2

>?treat1

>treat2

3.數據整合>?raw_count?

>?head(raw_count)

gene_id?control1?control2??treat1??treat21?__alignment_not_unique??7440131??2973831?7861484?86768842????????????__ambiguous???976485???412543?1014239?11790513???????????__no_feature??1860117???768637?1289737?18120564??????????__not_aligned??1198545???572588?1256232?13480685????????__too_low_aQual????????0????????0???????0???????06???ENSG00000000003.14_2?????1576??????713????1589????1969#刪除前五行>raw_count_filt?aENSEMBL?row.names(raw_count_filt)?

>?raw_count_filt?

>?colnames(raw_count_filt)[1]?

>head(raraw_count_filt?)

ensembl_gene_id??????????????gene_id?control1?control2?treat1?treat2

ENSG00000000003?ENSG00000000003?ENSG00000000003.14_2?????1576??????713???1589???1969ENSG00000000005?ENSG00000000005??ENSG00000000005.5_2????????0????????0??????0??????1ENSG00000000419?ENSG00000000419?ENSG00000000419.12_2??????756??????384????806????984ENSG00000000457?ENSG00000000457?ENSG00000000457.13_3??????301??????151????217????324ENSG00000000460?ENSG00000000460?ENSG00000000460.16_5??????764??????312????564????784ENSG00000000938?ENSG00000000938?ENSG00000000938.12_2????????0????????0??????0??????0

4.對基因進行注釋-獲取gene_symbol

用bioMart對ensembl_id轉換成gene_symbol>?library("biomaRt")

>?library("curl")

>?mart?

>?my_ensembl_gene_id?

>??options(timeout?=?4000000)

>?hg_symbols

>?head(readcount)

ensembl_gene_id??????????????gene_id?control1?control2?treat1?treat2?hgnc_symbol?chromosome_name?start_position1?ENSG00000000003?ENSG00000000003.14_2?????1576??????713???1589???1969??????TSPAN6???????????????X??????1006271092?ENSG00000000005??ENSG00000000005.5_2????????0????????0??????0??????1????????TNMD???????????????X??????1005848023?ENSG00000000419?ENSG00000000419.12_2??????756??????384????806????984????????DPM1??????????????20???????509348674?ENSG00000000457?ENSG00000000457.13_3??????301??????151????217????324???????SCYL3???????????????1??????1698496315?ENSG00000000460?ENSG00000000460.16_5??????764??????312????564????784????C1orf112???????????????1??????1696620076?ENSG00000000938?ENSG00000000938.12_2????????0????????0??????0??????0?????????FGR???????????????1???????27612064

end_position???band1????100639991??q22.12????100599885??q22.13?????50958555?q13.134????169894267??q24.25????169854080??q24.26?????27635277??p35.3#輸出count表達矩陣>?write.csv(readcount,?file='readcount_all.csv')

>?readcount

>?write.csv(readcount,?file='readcount.csv')

>?head(readcount)

control1?control2?treat1?treat2

ENSG00000000003?????1576??????713???1589???1969ENSG00000000005????????0????????0??????0??????1ENSG00000000419??????756??????384????806????984ENSG00000000457??????301??????151????217????324ENSG00000000460??????764??????312????564????784ENSG00000000938????????0????????0??????0??????0

5.DEseq2篩選差異表達基因并注釋(bioMart)#載入數據(countData和colData)>?mycounts

>?head(mycounts)

control1?control2?treat1?treat2

ENSG00000000003?????1576??????713???1589???1969

ENSG00000000005????????0????????0??????0??????1

ENSG00000000419??????756??????384????806????984

ENSG00000000457??????301??????151????217????324

ENSG00000000460??????764??????312????564????784

ENSG00000000938????????0????????0??????0??????0

>?condition?

>?condition

[1]?control?control?treat???treat

Levels:?control?treat

>?colData?

>?colData

condition

control1???control

control2???control

treat1???????treat

treat2???????treat

構建dds對象,開始DESeq流程>library("DESeq2")

>?dds?

>?dds?

estimating?size?factors

estimating?dispersions

gene-wise?dispersion?estimates

mean-dispersion?relationship

final?dispersion?estimates

fitting?model?and?testing

>?ddsclass:?DESeqDataSet?dim:?60880?4?metadata(1):?version

assays(4):?counts?mu?H?cooks

rownames(60880):?ENSG00000000003?ENSG00000000005?...?ENSG00000285993?ENSG00000285994

rowData?names(22):?baseMean?baseVar?...?deviance?maxCooks

colnames(4):?control1?control2?treat1?treat2

colData?names(2):?condition?sizeFactor#查看總體結果>?res?=?results(dds,?contrast=c("condition",?"control",?"treat"))

>?res?=?res[order(res$pvalue),]

>?head(res)

log2?fold?change?(MLE):?condition?control?vs?treat

Wald?test?p-value:?condition?control?vs?treat

DataFrame?with?6?rows?and?6?columns

baseMean???log2FoldChange?????????????lfcSE?????????????stat???????????????pvalue

ENSG00000178691?1025.66218695436?2.83012875791025?0.225513526042636?12.5497073615672?3.98981786210676e-36ENSG00000135535?2415.77359618136?1.22406336488047?0.183431131037356?6.67314952460929??2.5037106203736e-11ENSG00000164172?531.425786834548?1.30449018960413?0.207785830749451?6.27805170785243?3.42841906697453e-10ENSG00000172239?483.998634607265?1.31701332235233?0.215453141699223?6.11275988813803?9.79226522597759e-10ENSG00000237296?53.0114998109978?2.70139282483841?0.480033904207378?5.62750422660019?1.82835684560772e-08

ENSG00000196504?3592.67315807893?1.09372324353448?0.200308218929736?5.46020153031335?4.75594407815571e-08

padj

ENSG00000178691?3.90682965057494e-32ENSG00000135535?1.22581671973492e-07ENSG00000164172?1.11903598346049e-06ENSG00000172239?2.39714652731931e-06ENSG00000237296???????????????????NA

ENSG00000196504?9.31404088266014e-05>?summary(res)

out?of?33100?with?nonzero?total?read?count

adjusted?p-value??0?(up)???????:?78,?0.24%

LFC?

outliers?[1]???????:?0,?0%

low?counts?[2]?????:?23308,?70%

(mean?count?

[1]?see?'cooksCutoff'?argument?of??results

[2]?see?'independentFiltering'?argument?of??results#這里可以看到有78個基因上調,15個基因下調#將分析結果輸出>?write.csv(res,file="All_results.csv")

提取差異表達基因

這里我用的方法是倍差法

獲取padj(p值經過多重校驗校正后的值)小于0.05,表達倍數取以2為對數后大于1或者小于-1的差異表達基因>?diff_gene_deseq2??1)

>?dim(diff_gene_deseq2)

[1]?21??6

>?head(diff_gene_deseq2)

log2?fold?change?(MLE):?condition?control?vs?treat

Wald?test?p-value:?condition?control?vs?treat

DataFrame?with?6?rows?and?6?columns

baseMean???log2FoldChange?????????????lfcSE?????????????stat???????????????pvalue

ENSG00000178691?1025.66218695436?2.83012875791025?0.225513526042636?12.5497073615672?3.98981786210676e-36

ENSG00000135535?2415.77359618136?1.22406336488047?0.183431131037356?6.67314952460929??2.5037106203736e-11

ENSG00000164172?531.425786834548?1.30449018960413?0.207785830749451?6.27805170785243?3.42841906697453e-10

ENSG00000172239?483.998634607265?1.31701332235233?0.215453141699223?6.11275988813803?9.79226522597759e-10

ENSG00000196504?3592.67315807893?1.09372324353448?0.200308218929736?5.46020153031335?4.75594407815571e-08

ENSG00000163848?633.066990185649?1.15489622775117?0.219655131372136?5.25777030810433?1.45812478575117e-07

padj

ENSG00000178691?3.90682965057494e-32

ENSG00000135535?1.22581671973492e-07

ENSG00000164172?1.11903598346049e-06

ENSG00000172239?2.39714652731931e-06

ENSG00000196504?9.31404088266014e-05

ENSG00000163848?0.000230253090268928#輸出差異基因>?write.csv(diff_gene_deseq2,file=?"DEG_treat_vs_control.csv")#用bioMart對差異表達基因進行注釋>?library("biomaRt")

>?library("curl")

>?hg_symbols

>?head(hg_symbols)

ensembl_gene_id?external_gene_name

1?ENSG00000011405????????????PIK3C2A

2?ENSG00000100731??????????????PCNX1

3?ENSG00000128512??????????????DOCK4

4?ENSG00000135535??????????????CD164

5?ENSG00000140526??????????????ABHD2

6?ENSG00000144228??????????????SPOPL

description

1?phosphatidylinositol-4-phosphate?3-kinase?catalytic?subunit?type?2?alpha?[Source:HGNC?Symbol;Acc:HGNC:8971]

2???????????????????????????????????????????????????????????????pecanex?1?[Source:HGNC?Symbol;Acc:HGNC:19740]

3??????????????????????????????????????????????dedicator?of?cytokinesis?4?[Source:HGNC?Symbol;Acc:HGNC:19192]

4???????????????????????????????????????????????????????????CD164?molecule?[Source:HGNC?Symbol;Acc:HGNC:1632]

5?????????????????????????????????????????abhydrolase?domain?containing?2?[Source:HGNC?Symbol;Acc:HGNC:18717]

6???????????????????????????????????????speckle?type?BTB/POZ?protein?like?[Source:HGNC?Symbol;Acc:HGNC:27934]#合并數據:res結果hg_symbols合并成一個文件>?ensembl_gene_id

>?diff_gene_deseq2

>?colnames(diff_gene_deseq2)[1]

>?diff_name

>?head(diff_name)

DataFrame?with?6?rows?and?9?columns

ensembl_gene_id?????????baseMean???log2FoldChange?????????????lfcSE?????????????stat???????????????pvalue

1?ENSG00000011405?1600.01408863821?1.07722909393382??0.24714564887963?4.35868120202462?1.30848557424083e-05

2?ENSG00000100731?1162.93822827396??1.0006257630015?0.214393389946423?4.66724166846545?3.05270197242525e-06

3?ENSG00000128512?368.442571635954?1.19657846347522?0.262780839813213??4.5535224878867?5.27550292947225e-06

4?ENSG00000135535?2415.77359618136?1.22406336488047?0.183431131037356?6.67314952460929??2.5037106203736e-11

5?ENSG00000140526?796.447227235737?1.05296203760187??0.23492350092969?4.48214858639031?7.38952622958053e-06

6?ENSG00000144228?293.746859588111?1.10903132755747?0.283181091639851?3.91633255290906??8.9906210067011e-05

padj?external_gene_name

1??0.00533862114290257????????????PIK3C2A

2????0.002491004809499??????????????PCNX1

3??0.00319558231320097??????????????DOCK4

4?1.22581671973492e-07??????????????CD164

5??0.00364483675888146??????????????ABHD2

6???0.0214722343652725??????????????SPOPL

description

1?phosphatidylinositol-4-phosphate?3-kinase?catalytic?subunit?type?2?alpha?[Source:HGNC?Symbol;Acc:HGNC:8971]

2???????????????????????????????????????????????????????????????pecanex?1?[Source:HGNC?Symbol;Acc:HGNC:19740]

3??????????????????????????????????????????????dedicator?of?cytokinesis?4?[Source:HGNC?Symbol;Acc:HGNC:19192]

4???????????????????????????????????????????????????????????CD164?molecule?[Source:HGNC?Symbol;Acc:HGNC:1632]

5?????????????????????????????????????????abhydrolase?domain?containing?2?[Source:HGNC?Symbol;Acc:HGNC:18717]

6???????????????????????????????????????speckle?type?BTB/POZ?protein?like?[Source:HGNC?Symbol;Acc:HGNC:27934]#輸出含注釋的差異基因文件write.csv(diff_name,file=?"diff_gene.csv")

作者:孤獨巡禮_435a

鏈接:https://www.jianshu.com/p/4d0812195b65

總結

以上是生活随笔為你收集整理的r语言degseq2_第二次RNA-seq实战总结(3)-用DESeq2进行基因表达差异分析的全部內容,希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯,歡迎將生活随笔推薦給好友。