Using an R package for RNA-Seq, Ribo-Seq, and VAR-Seq workflow analysis

Posted on 2017-4-20 10:38:52
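R package: http://bioconductor.org/packages/devel/bioc/html/systemPipeR.html
It provides detailed scripts for the RNA-Seq, Ribo-Seq, and VAR-Seq analysis workflows, so you can learn them simply by following the examples. The workflow diagram is as follows:

[workflow diagram]

To run these workflows, you need some familiarity with the R language.

In fact, there is quite a lot to learn: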
Function name            Description
genWorkenvir             Generates workflow templates provided by systemPipeRdata helper package
systemArgs               Constructs SYSargs workflow control module (S4 object) from targets and param files
runCommandline           Executes command-line software on samples and parameters specified in SYSargs
clusterRun               Runs command-line software in parallel mode on a computer cluster
preprocessReads          Filtering and/or trimming of short reads using predefined or custom parameters
seeFASTQ/seeFASTQplot    Generates quality reports for any number of FASTQ files
alignStats               Generates alignment statistics, such as total number of reads and alignment frequency
run_edgeR/run_DESeq2     Runs edgeR or DESeq2 for any number of pairwise sample comparisons
filterDEGs               Filters and plots DEG results based on user-defined parameters
overLapper/vennPlot      Computation of Venn intersects for 2-20 or more samples and 2-5 way Venn diagrams
GOCluster_Report         GO term enrichment analysis for large numbers of gene sets
variantReport            Generates a variant report containing genomic annotations and confidence statistics
predORF                  Prediction of short open reading frames in DNA sequences
featuretypeCounts        Computes and plots read distribution for many feature types at once
featureCoverage          Computes and plots read depth coverage from many transcripts
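
To make the overLapper/vennPlot row a bit more concrete: a Venn intersect is the group of IDs (for example DEGs) that occur in exactly one particular combination of samples. The sketch below is plain Python, not systemPipeR code, and it does not claim to reproduce how overLapper works internally. The function name venn_intersects and the gene IDs are made up for illustration.

```python
def venn_intersects(sets):
    """Return {label: members} for every non-empty exclusive Venn region.

    `sets` maps a sample name to a set of IDs. Each ID lands in exactly one
    region, labelled by the sorted names of all samples that contain it.
    """
    regions = {}
    # Walk over every ID once instead of enumerating all 2^n sample combinations,
    # so the cost grows with the number of IDs, not with the number of regions.
    for item in set().union(*sets.values()):
        label = "-".join(sorted(name for name, members in sets.items() if item in members))
        regions.setdefault(label, set()).add(item)
    return regions


# Hypothetical DEG lists from three comparisons (IDs are invented).
deg_sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}

for label, members in sorted(venn_intersects(deg_sets).items()):
    print(label, sorted(members))
```

Because only non-empty regions are stored, the same approach stays practical for 20 or more samples, where a full Venn diagram can no longer be drawn and only the intersect table is useful.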


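The package has also been published in a BMC journal: https://www.ncbi.nlm.nih.gov/pubmed/27650223
The paper describes its strengths and weaknesses in detail, so go and read it yourself.

It is worth noting that the authors spend a good deal of space explaining why they built the workflows as an R package: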
(i) R is currently one of the most popular statistical data analysis and programming environments in bioinformatics.
(ii) Its external language bindings support the implementation of computationally time-consuming analysis steps in high-performance languages such as C/C++.
(iii) It supports advanced parallel computation on multi-core machines and computer clusters.
(iv) A well-developed infrastructure interfaces R with several other popular programming languages such as Python.
(v) R provides advanced graphical and visualization utilities for scientific computing.
(vi) It offers access to a vast landscape of statistical and machine learning tools.
(vii) Its integration with the Bioconductor project promotes reusability of genomics software components, while also making efficient use of a large number of existing NGS packages that are well tested and widely used by the community.

Posted on 2017-4-21 14:10:58
Learned something new~~
Posted on 2017-4-21 21:43:41
Thanks, moderator. Bookmarking this for now; I'll study it properly when I have time.