Quality control of sequencing data

Published: 2023/12/20

What is quality control?

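Quality control is the process of improving data quality by removing identifiable errors from the data. It is the first step performed after the data is acquired.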

How critical is quality control?

The more unknowns about the genome under study, the more important it is to correct any errors.
When aligning against well-studied and understood genomes, we can recognize and identify errors by their
alignments. When assembling a de-novo genome, errors can derail the process; hence, it is more
important to apply a higher stringency for filtering.

When do we perform quality control?

Quality control is performed at different stages:
Pre-alignment: "raw data" - the protocols are the same regardless of what analysis will follow.
Post-alignment: "data filtering" - the protocols are specific to the analysis that is being performed.

How reliable are QC tools?

Does quality control introduce errors?

How does read quality trimming work?

Originally, the reliability of sequencing decreased along the read. A common correction is to work backwards from the end of each read and remove low-quality measurements from it. This is called trimming.
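To make the idea concrete, here is a minimal Python sketch of trimming from the 3' end. It assumes Phred+33 encoded quality strings and a hypothetical cutoff of 20; the function names are made up for illustration, and real trimmers typically use more robust strategies (for example sliding windows) than this base-by-base walk.

```python
def phred33_to_scores(quality_string):
    """Convert a Phred+33 encoded quality string into integer scores."""
    return [ord(char) - 33 for char in quality_string]


def trim_low_quality_tail(seq, scores, min_quality=20):
    """Walk backwards from the end of the read and drop bases
    whose quality score is below min_quality (an assumed cutoff)."""
    if len(seq) != len(scores):
        raise ValueError("sequence and quality scores differ in length")
    end = len(seq)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return seq[:end], scores[:end]


# Toy read: the last three bases have very low quality ('#' = 2, '!' = 0).
read = "ACGTACGTAC"
scores = phred33_to_scores("IIIIIII#!#")
trimmed_seq, trimmed_scores = trim_low_quality_tail(read, scores)
print(trimmed_seq)
```

Note that the walk stops at the first base that passes the cutoff, so low-quality bases in the middle of a read are left untouched.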

Why do we need to trim adapters?

How do we trim adapters?

Trim adapters with Trimmomatic:

trimmomatic SE SRR519926_1.fastq output.fq ILLUMINACLIP:adapter.fa:2:30:5
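Conceptually, an adapter trimmer searches each read for the adapter sequence and cuts the read where the adapter begins. The Python sketch below illustrates that idea only; it is not Trimmomatic's algorithm. The function name, the `min_overlap` and `max_mismatches` parameters, and the toy reads are assumptions made for this example.

```python
def clip_3prime_adapter(read, adapter, min_overlap=5, max_mismatches=0):
    """Cut the read at the first position where the adapter (or a prefix
    of it that runs to the end of the read) matches.

    min_overlap and max_mismatches are illustrative parameters."""
    for start in range(len(read)):
        # Number of adapter bases that can be compared at this position.
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # the remaining tail is too short to call an adapter
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatches:
            return read[:start]
    return read


adapter = "AGATCGGAAGAGC"          # start of a common Illumina adapter
insert = "ACGTTGCA"                # toy insert sequence

full = clip_3prime_adapter(insert + adapter, adapter)
partial = clip_3prime_adapter(insert + adapter[:7], adapter)
clean = clip_3prime_adapter("ACGTTGCATTTT", adapter)
print(full, partial, clean)
```

Requiring a minimum overlap trades sensitivity for specificity: a very short adapter prefix at the end of a read is indistinguishable from genuine sequence, so this sketch leaves it in place.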

Trimming adapter sequences - is it necessary?

Removal of adapter sequences, in a process called read trimming or clipping, is one of the first steps in analyzing NGS data. With more than 30 published adapter trimming tools, there is ample choice. Yet there is a debate about whether this step really is as important as the number of tools suggests, or whether this time-consuming step can be skipped for many NGS applications.

Why do adapters contaminate my sequences?
Adapters have to be ligated to every single DNA molecule during library preparation. For Illumina short read sequencing, the corresponding protocols involve (in most cases) a DNA fragmentation step, followed by the ligation of certain oligonucleotides to the 5’ and 3’ ends. These 5’ and 3’ adapter sequences have important functions in Illumina sequencing, since they hold barcoding sequences, forward/reverse primers (for paired-end sequencing) and the important binding sequences for immobilizing the fragments to the flowcell and allowing bridge-amplification.

When are adapter sequences observed in the reads?
In common short read sequencing, the DNA insert (original molecule to be sequenced) is downstream from the read primer, meaning that the 5’ adapters will not appear in the sequenced read. But, if the fragment is shorter than the number of bases sequenced, one will sequence into the 3’ adapter. To make it clear: In Illumina sequencing, adapter sequences will only occur at the 3’ end of the read and only if the DNA insert is shorter than the number of sequencing cycles (see picture below)!
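The rule above reduces to simple arithmetic. As a small Python sketch (the function name and the example lengths are chosen here for illustration), the number of sequenced bases that fall beyond the insert is the read length minus the insert length, and never less than zero:

```python
def bases_past_insert(insert_length, read_length):
    """Number of sequenced bases that lie beyond the 3' end of the insert.

    Zero means the read ends inside the insert and contains no adapter."""
    return max(0, read_length - insert_length)


# A 24 nt small-RNA insert sequenced for 50 cycles: 26 bases run past the insert.
short_insert = bases_past_insert(insert_length=24, read_length=50)

# A 300 nt insert sequenced for 150 cycles: the read stays inside the insert.
long_insert = bases_past_insert(insert_length=300, read_length=150)

print(short_insert, long_insert)
```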

How often that happens depends largely on the NGS protocol used. Think about it: how often will you sequence into the 3’ adapters when performing common RNA-Seq? After mRNA enrichment, cDNA creation (using a reverse transcriptase) and DNA fragmentation, the protocols typically involve a size selection. When using a MiSeq in 2x300 paired-end mode, one will select molecules that are longer than the read length, in our example greater than 600 nucleotides in length. However, it is technically impossible to obtain one specific fragment size; one will rather get a distribution of fragment lengths (see picture). Thus, one will still obtain a certain fraction of adapter contamination even when selecting for large fragment sizes. For RNA-Seq you will observe that only 0.2 - 2% of reads contain adapter sequences.
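To get a feeling for how a fragment length distribution translates into adapter contamination, one can estimate the share of fragments that are shorter than the read length. The Python sketch below assumes, purely for illustration, that fragment lengths are normally distributed; the mean, standard deviation and read length are hypothetical values, and real libraries often have skewed distributions.

```python
from statistics import NormalDist


def fraction_with_adapter(mean_fragment, sd_fragment, read_length):
    """Estimate the fraction of fragments shorter than the read length,
    i.e. those whose reads run into the 3' adapter.

    Assumes normally distributed fragment lengths (an illustrative model)."""
    lengths = NormalDist(mu=mean_fragment, sigma=sd_fragment)
    return lengths.cdf(read_length)


# Hypothetical library: mean fragment 350 nt, SD 50 nt, 250 nt reads.
estimate = fraction_with_adapter(mean_fragment=350, sd_fragment=50, read_length=250)
print(f"{estimate:.2%} of fragments are shorter than the read length")
```

Shifting the size selection towards longer fragments, or sequencing fewer cycles, lowers this estimate; it does not reach zero because the distribution always has a short tail.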

Summary
Adapter contamination will lead to NGS alignment errors and an increased number of unaligned reads, since the adapter sequences are synthetic and do not occur in the genomic sequence. There are applications (e.g. small RNA sequencing) where adapter trimming is essential: with a fragment size of around 24 nucleotides, one will definitely sequence into the 3’ adapter. But there are also applications (transcriptome sequencing, whole genome sequencing, etc.) where adapter contamination can be expected to be so small (due to an appropriate size selection) that one could consider skipping adapter removal and thereby save time and effort.
