[mRNA-seq] Sequencing depth strongly affects RNA-seq results

Posted on 2016-12-7 16:29:35
http://journals.plos.org/plosone ... ournal.pone.0066883

Recent advances in RNA sequencing (RNA-Seq) have enabled the discovery of novel transcriptomic variations that are not possible with traditional microarray-based methods. Tissue and cell specific transcriptome changes during pathophysiological stress in disease cases versus controls and in response to therapies are of particular interest to investigators studying cardiometabolic diseases. Thus, knowledge on the relationships between sequencing depth and detection of transcriptomic variation is needed for designing RNA-Seq experiments and for interpreting results of analyses. Using deeply sequenced Illumina HiSeq 2000 101 bp paired-end RNA-Seq data derived from adipose of a healthy individual before and after systemic administration of endotoxin (LPS), we investigated the sequencing depths needed for studies of gene expression and alternative splicing (AS). In order to detect expressed genes and AS events, we found that ∼100 to 150 million (M) filtered reads were needed. However, the requirement on sequencing depth for the detection of LPS modulated differential expression (DE) and differential alternative splicing (DAS) was much higher. To detect 80% of events, ∼300 M filtered reads were needed for DE analysis whereas at least 400 M filtered reads were necessary for detecting DAS. Although the majority of expressed genes and AS events can be detected with modest sequencing depths (∼100 M filtered reads), the estimated gene expression levels and exon/intron inclusion levels were less accurate. We report the first study that evaluates the relationship between RNA-Seq depth and the ability to detect DE and DAS in human adipose. Our results suggest that a much higher sequencing depth is needed to reliably identify DAS events than for DE genes.
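The abstract does not say how the lower depths were produced, so the following is only a minimal sketch that assumes random subsampling of reads (binomial thinning of per-gene counts). The gene names, counts, and the `min_reads` detection threshold are hypothetical toy values, not data from the paper.

```python
import random

def thin_counts(counts, fraction, rng):
    """Keep each read independently with probability `fraction` (binomial thinning)."""
    return {
        gene: sum(rng.random() < fraction for _ in range(n))
        for gene, n in counts.items()
    }

def detected_genes(counts, min_reads=1):
    """Count genes with at least `min_reads` reads (hypothetical detection rule)."""
    return sum(1 for n in counts.values() if n >= min_reads)

# Hypothetical per-gene read counts at full depth (toy values).
full_depth = {"geneA": 500, "geneB": 120, "geneC": 15, "geneD": 3, "geneE": 1}

rng = random.Random(0)  # fixed seed so the toy run is repeatable
for fraction in (1.0, 0.5, 0.2, 0.05):
    subsample = thin_counts(full_depth, fraction, rng)
    print(fraction, detected_genes(subsample))
```

Lowly expressed genes such as `geneD` and `geneE` tend to drop out first as the fraction shrinks, which is one way to see why detection depends on depth.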
The data:
We obtained 912 million (M) and 1,040 M reads for the pre- and post-LPS samples, respectively, with a high mapping rate, 85% and 82% of the reads mapped to the reference genome for the pre- and post-LPS samples, respectively, and 72% and 69% of the reads uniquely mapped and properly filtered.
In our analysis, we only considered reads from autosomal and sex chromosomes, and this left 482 M filtered reads pre-LPS and 519 M filtered reads post-LPS. For ease of notation, we denote the 482 M and 519 M filtered datasets both as 500 M, and assume results from the analyses of these two datasets provide a comprehensive catalogue of transcriptomic variation.
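As a quick arithmetic check on the numbers quoted above, this small sketch computes what fraction of the raw reads survived filtering in each sample. The read totals come from the quoted text; the variable and function names are only illustrative.

```python
# Read totals in millions, taken from the quoted text above.
raw_reads_m = {"pre-LPS": 912, "post-LPS": 1040}
filtered_reads_m = {"pre-LPS": 482, "post-LPS": 519}

def retained_fraction(raw, filtered):
    """Fraction of raw reads that remain after filtering."""
    return filtered / raw

for sample, raw in raw_reads_m.items():
    fraction = retained_fraction(raw, filtered_reads_m[sample])
    print(f"{sample}: {fraction:.1%} of raw reads retained")
```

Roughly half of the raw reads remain in each sample, so a depth target stated in filtered reads implies about twice as many raw reads in this dataset.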

Defining highly and lowly expressed genes:
In order to assess the impact of gene expression levels on our results, we looked at highly-expressed genes and lowly-expressed genes separately based on their FPKM values in the 500 M-read datasets. “Highly-expressed genes” were defined as those with FPKM values >75th percentile for both the pre-LPS (75th percentile FPKM = 11.46) and post-LPS (75th percentile FPKM = 9.09) samples, and “lowly-expressed genes” were defined as those with the FPKM values <25th percentile (25th percentile FPKM = 1.51 pre-LPS, 0.85 post-LPS).
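The percentile rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the FPKM values are hypothetical toy numbers, the quartiles use the default method of Python's `statistics.quantiles`, and requiring the low cutoff in both samples is an assumption (the text states "both" explicitly only for the highly expressed genes).

```python
from statistics import quantiles

def classify_genes(pre, post):
    """Label each gene 'high', 'low', or 'other' from its FPKM in two samples.

    pre, post: dicts mapping gene name -> FPKM (same genes in both).
    """
    pre_q1, _, pre_q3 = quantiles(pre.values(), n=4)    # 25th, 50th, 75th percentiles
    post_q1, _, post_q3 = quantiles(post.values(), n=4)
    labels = {}
    for gene in pre:
        if pre[gene] > pre_q3 and post[gene] > post_q3:
            labels[gene] = "high"
        elif pre[gene] < pre_q1 and post[gene] < post_q1:
            # Assumption: the low cutoff must hold in both samples.
            labels[gene] = "low"
        else:
            labels[gene] = "other"
    return labels

# Hypothetical toy FPKM values (not from the paper).
pre_fpkm = {"g1": 0.2, "g2": 0.9, "g3": 2.0, "g4": 4.0,
            "g5": 8.0, "g6": 12.0, "g7": 30.0, "g8": 100.0}
post_fpkm = {"g1": 0.1, "g2": 0.5, "g3": 1.5, "g4": 5.0,
             "g5": 6.0, "g6": 10.0, "g7": 40.0, "g8": 80.0}

labels = classify_genes(pre_fpkm, post_fpkm)
print(labels)
```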

The data can be downloaded:
RNA-seq data have been deposited in the Gene Expression Omnibus (GEO) database (accession number GSE46323).



Mint, posted on 2016-12-8 05:42:51
How much coverage is sufficient? It really depends on what your aim is.
Posted on 2019-6-8 04:47:33
Mint wrote on 2016-12-8 05:42:
How much coverage is sufficient? It really depends on what your aim is.

Hello, if the aim is to detect differentially alternatively spliced genes in a 3 vs 3 sample comparison, how much coverage do you suggest? Thanks.
生信技能树 (粤ICP备15016384号)
