搜索
查看: 621|回复: 0

[workflow] DEXSeq package

[复制链接]

29

主题

29

帖子

149

积分

管理员

Rank: 9Rank: 9Rank: 9

积分
149
发表于 2018-9-26 17:03:57 | 显示全部楼层 |阅读模式
1)Introduction

DEXSeq是一种在多个比较RNA-seq实验中,检验差异外显子使用情况的方法。 差异外显子使用(DEU),指的是由实验条件引起的外显子相对使用的变化。 外显子的相对使用定义为:
number of transcripts from the gene that contain this exon / number of all transcripts from the gene
大致思想:. For each exon (or part of an exon) and each sample, we count how many reads map to this exon and how many reads map to any of the other exons of the same gene. We consider the ratio of these two counts, and how it changes across conditions, to infer changes in the relative exon usage。
2)安装

[Python] 纯文本查看 复制代码
if("DEXSeq" %in% rownames(installed.packages()) == FALSE) {source("http://bioconductor.org/biocLite.R");biocLite("DEXSeq")}
suppressMessages(library(DEXSeq))
ls('package:DEXSeq')
pythonScriptsDir = system.file( "python_scripts", package="DEXSeq" )
list.files(pythonScriptsDir)
## [1] "dexseq_count.py" "dexseq_prepare_annotation.py"  #查看是否含有这两个脚本
python dexseq_prepare_annotation.py Drosophila_melanogaster.BDGP5.72.gtf Dmel.BDGP5.25.62.DEXSeq.chr.gff #GTF转化为GFF with collapsed exon counting bins.
python dexseq_count.py Dmel.BDGP5.25.62.DEXSeq.chr.gff untreated1.sam untreated1fb.txt #count

3) 用自带实验数据集(数据预处理)

[Python] 纯文本查看 复制代码
suppressMessages(library(pasilla))
inDir = system.file("extdata", package="pasilla")
countFiles = list.files(inDir, pattern="fb.txt$", full.names=TRUE)  #countfile(如果不是自带数据集,可以由dexseq_count.py脚本生成)
basename(countFiles)
flattenedFile = list.files(inDir, pattern="gff$", full.names=TRUE)
basename(flattenedFile)                                       #gff文件(如果不是自带数据集,可以由dexseq_prepare_annotation.py脚本生成)
########构造数据框sampleTable,包含sample名字,实验,文库类型等信息#######################
sampleTable = data.frame(
  row.names = c( "treated1", "treated2", "treated3",
                 "untreated1", "untreated2", "untreated3", "untreated4" ),
  condition = c("knockdown", "knockdown", "knockdown",
                "control", "control", "control", "control" ),
  libType = c( "single-end", "paired-end", "paired-end",
               "single-end", "single-end", "paired-end", "paired-end" ) )
sampleTable
##############构建 DEXSeqDataSet object#############################
dxd = DEXSeqDataSetFromHTSeq(
  countFiles,
  sampleData=sampleTable,
  design= ~ sample + exon + condition:exon,
  flattenedfile=flattenedFile )   #四个参数

4)Standard analysis work-flow
[Python] 纯文本查看 复制代码
########以下是简单的实验设计#####genesForSubset = read.table(file.path(inDir, "geneIDsinsubset.txt"),stringsAsFactors=FALSE)[[1]] #基因子集ID  
dxd = dxd[geneIDs( dxd ) %in% genesForSubset,]     #取子集,减少运行量
head(colData(dxd))
head( counts(dxd), 5 )
split( seq_len(ncol(dxd)), colData(dxd)$exon )
sampleAnnotation( dxd )
############# dispersion estimates and the size factors#############
dxd = estimateSizeFactors( dxd )  ##Normalisation
dxd = estimateDispersions( dxd )
plotDispEsts( dxd )  #图1
#################Testing for differential exon usage############
dxd = testForDEU( dxd )
dxd = estimateExonFoldChanges( dxd, fitExpToVar="condition")
dxr1 = DEXSeqResults( dxd )
dxr1
mcols(dxr1)$description
table ( dxr1$padj < 0.1 )
table ( tapply( dxr1$padj < 0.1, dxr1$groupID, any ) )
plotMA( dxr1, cex=0.8 )  #图2


[Python] 纯文本查看 复制代码
############以下是更复杂的实验设计##################
formulaFullModel = ~ sample + exon + libType:exon + condition:exon
formulaReducedModel = ~ sample + exon + libType:exon
dxd = estimateDispersions( dxd, formula = formulaFullModel )
dxd = testForDEU( dxd,
                  reducedModel = formulaReducedModel,
                  fullModel = formulaFullModel )
dxr2 = DEXSeqResults( dxd )
table( dxr2$padj < 0.1 )
table( before = dxr1$padj < 0.1, now = dxr2$padj < 0.1 )##和简单的实验设计比较

5)可视化
[Python] 纯文本查看 复制代码
plotDEXSeq( dxr2, "FBgn0010909", legend=TRUE, cex.axis=1.2, cex=1.3,
            lwd=2 )
plotDEXSeq( dxr2, "FBgn0010909", displayTranscripts=TRUE, legend=TRUE,
            cex.axis=1.2, cex=1.3, lwd=2 )
plotDEXSeq( dxr2, "FBgn0010909", expression=FALSE, norCounts=TRUE,
            legend=TRUE, cex.axis=1.2, cex=1.3, lwd=2 )
plotDEXSeq( dxr2, "FBgn0010909", expression=FALSE, splicing=TRUE,
            legend=TRUE, cex.axis=1.2, cex=1.3, lwd=2 )
DEXSeqHTML( dxr2, FDR=0.1, color=c("#FF000080", "#0000FF80") )








本帖子中包含更多资源

您需要 登录 才可以下载或查看,没有帐号?立即注册

x
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

QQ|手机版|小黑屋|生信技能树    

GMT+8, 2019-4-26 10:42 , Processed in 0.038241 second(s), 25 queries .

Powered by Discuz! X3.2

© 2001-2013 Comsenz Inc.