本帖最后由 ydchen 于 2017-8-26 21:09 编辑
现在主流的测序仪都是phred33系统的质量值,也就是说你看到你的fastq序列的第四行质量值应该都是大写字母。
如果你看到是小写字母,那么你就麻烦了,需要格外的主意。
我就是忽略了,导致我的bam文件里面是phred64,这样GATK就报错。
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.5-0-g36282e4):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: SAM/BAM/CRAM file htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter@33ffe19c appears to be using the wrong encoding for quality scores: we encountered an extremely high quality score of 65. Please see https://www.broadinstitute.org/gatk/guide?id=6470 for more details and options related to this error.
##### ERROR ------------------------------------------------------------------------------------------
刚开始看到这个错误一脸懵逼,这个pipeline我用了3年了,木有任何问题。
仔细谷歌才发现是这个问题,这个数据是一个朋友委托我处理的公共数据,而公共数据一般很老旧。
搜索了一下,有一个脚本可以解决
[Shell] 纯文本查看 复制代码 samtools view file.bam | \
perl -lane 'if('/^#/){print; next;}
@qual=split //, $F[$qualIndex]; $_=chr(ord($_)+33) for(@qual); $F[$qualIndex]=join("",@qual); print join("\t",@F)' | \
samtools -bS > file.corrected.bam
其实就是改变第11列即可。
|