Trimmomatic SE (single-end): adapter removal and low-quality trimming for SE data
This analysis module wraps Trimmomatic, a read-trimming tool for Illumina high-throughput sequencing data that supports both paired-end and single-end reads.
• ILLUMINACLIP: Cut adapter and other Illumina-specific sequences from the read.
• SLIDINGWINDOW: Perform sliding-window trimming, cutting once the average quality within the window falls below a threshold.
• MINLEN: Drop the read if it is shorter than a specified length.
• LEADING: Cut bases off the start of a read if they are below a threshold quality.
• TRAILING: Cut bases off the end of a read if they are below a threshold quality.
• CROP: Cut the read to a specified length by removing bases from the end.
• HEADCROP: Cut the specified number of bases from the start of the read.
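As a rough illustration of how the quality-based steps behave, the following Python sketch applies SLIDINGWINDOW, TRAILING and MINLEN style logic to a list of per-base Phred scores. It is a simplified model under stated assumptions, not Trimmomatic's actual implementation: the function names are hypothetical, the default values are toy values chosen for a short example, and reads shorter than the window are simply left untouched by the sliding-window step.

```python
def sliding_window_keep(quals, window, min_avg):
    """Number of leading bases to keep: cut at the start of the first
    window whose average quality is below min_avg (simplified model)."""
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) < min_avg * window:
            return start
    return len(quals)  # no failing window, or read shorter than the window


def trailing_keep(quals, min_q):
    """Number of bases to keep after dropping low-quality bases at the 3' end."""
    keep = len(quals)
    while keep > 0 and quals[keep - 1] < min_q:
        keep -= 1
    return keep


def trim_read(seq, quals, window=4, min_avg=20, trailing_q=20, min_len=3):
    """Apply the three steps in order; return (seq, quals) or None if dropped.

    The defaults are toy values for illustration, not the module's settings.
    """
    keep = sliding_window_keep(quals, window, min_avg)
    seq, quals = seq[:keep], quals[:keep]
    keep = trailing_keep(quals, trailing_q)
    seq, quals = seq[:keep], quals[:keep]
    if len(seq) < min_len:
        return None  # MINLEN-style filter: read is too short to keep
    return seq, quals


# Good bases followed by a low-quality tail: the tail is cut away.
example = trim_read("ACGTACGT", [30, 30, 30, 30, 10, 10, 10, 10])
```

In this model the order of the calls matters, just as the order of steps matters when they are chained in a real run.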
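Set the quality encoding parameter: "Illumina 1.3-1.7 Phred+64" corresponds to early Illumina platforms, and "Illumina 1.8+ Phred+33" corresponds to the latest Illumina platforms. The default is Illumina 1.8+ Phred+33.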
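The difference between the two encodings is only the ASCII offset used to store each quality score. A minimal Python sketch, with hypothetical helper names, shows how a FASTQ quality string maps to Phred scores under each offset, along with a crude heuristic for telling them apart (it can misclassify Phred+33 reads that contain only very high scores, and it expects a non-empty string):

```python
def decode_quality(qual_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores."""
    return [ord(ch) - offset for ch in qual_string]


def guess_offset(qual_string):
    """Heuristic only: characters below '@' (ASCII 64) cannot occur in
    Phred+64 data, so their presence implies Phred+33."""
    return 33 if min(map(ord, qual_string)) < 64 else 64


phred33_scores = decode_quality("II5", offset=33)  # Illumina 1.8+ style
phred64_scores = decode_quality("hhT", offset=64)  # Illumina 1.3-1.7 style
```

Both example strings decode to the same scores, which is why choosing the wrong encoding shifts every quality value by 31 and distorts all quality-based trimming.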
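For single-end data, the output is the trimmed and filtered clean data, written as a single FASTQ file.
For paired-end data, two FASTQ files (R1-paired and R2-paired) contain the reads for which both mates (R1 and R2) passed quality control.
Two additional FASTQ files (R1-unpaired and R2-unpaired) contain the reads for which only one mate (R1 or R2) passed quality control while the other failed, so only the surviving mate is kept.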
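The paired-end routing described above can be sketched as a small decision function. This is an illustrative model only; the function name is hypothetical and the returned labels simply reuse the file names from the text.

```python
def route_pair(r1_survived, r2_survived):
    """Return the output files a read pair is written to, given whether
    each mate passed trimming and filtering."""
    if r1_survived and r2_survived:
        return ["R1-paired", "R2-paired"]
    if r1_survived:
        return ["R1-unpaired"]
    if r2_survived:
        return ["R2-unpaired"]
    return []  # both mates were dropped


outputs = route_pair(True, False)
```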
Perform initial ILLUMINACLIP step: Yes
Maximum mismatch count which will still allow a full match to be performed: 2
How accurate the match between the two 'adapter ligated' reads must be for PE palindrome read alignment: 30
How accurate the match between any adapter etc. sequence must be against a read: 10
Perform Sliding window trimming (SLIDINGWINDOW): Yes
Number of bases to average across: 20
Average quality required: 20
Drop reads below a specified length (MINLEN): Yes
Minimum length of reads to be kept: 35
Cut bases off the end of a read, if below a threshold quality (TRAILING): Yes
Minimum quality required to keep a base: 20
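For reference, the settings above correspond roughly to the following Trimmomatic single-end command line, assembled here in Python. This is a sketch under assumptions: the jar path, the input, output and adapter file names are hypothetical placeholders, and the step order simply follows the order in which the settings are listed, which may differ from what the module actually runs.

```python
def build_se_command(reads_in, reads_out, adapters,
                     jar="trimmomatic-0.32.jar", phred=33):
    """Build the argument list for a Trimmomatic SE run (not executed here).

    jar, reads_in, reads_out and adapters are placeholder paths.
    """
    return [
        "java", "-jar", jar, "SE", f"-phred{phred}",
        reads_in, reads_out,
        # adapter file : seed mismatches : palindrome threshold : simple threshold
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "SLIDINGWINDOW:20:20",  # window size : required average quality
        "MINLEN:35",            # drop reads shorter than 35 bases
        "TRAILING:20",          # cut 3' bases with quality below 20
    ]


cmd = build_se_command("sample.fastq", "sample.clean.fastq", "adapters.fa")
```

The list can be passed to subprocess.run(cmd, check=True) to launch the job. Trimmomatic applies steps in the order given on the command line, so with this ordering reads shortened by TRAILING are not re-checked against MINLEN; placing MINLEN last avoids that.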
This analysis module uses Trimmomatic v0.32 ( ).
Bolger, A.M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics, btu170.