[mRNA-seq] Analyzing transcript data with rnaSeqFPro

Posted on 2017-2-25 20:57:15
Last edited by qin_qinyang on 2017-3-7 18:23

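I recently needed to analyze RNA-seq data for a set of samples, so I am recording the workflow here:
The overall procedure follows https://github.com/milospjanic/rnaSeqFPro
1. Create a working folder and put the fastq data in it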
mkdir work.folder
cd work.folder
ls -lh

2. Install FastQC
wget http://www.bioinformatics.babrah ... /fastqc_v0.11.5.zip
unzip fastqc_v0.11.5.zip
chmod 755 ./FastQC/fastqc
3. Install STAR
mkdir STAR
wget -O STAR/master.zip https://codeload.github.com/alexdobin/STAR/zip/master
unzip STAR/master.zip -d STAR
./STAR/STAR-master/bin/Linux_x86_64/STAR

3. Prepare the reference genome
mkdir -p reference_genomes/hg38
cd reference_genomes/hg38

4. Build the index with STAR
nohup /media/qin/software/work.folder/STAR/STAR-master/bin/Linux_x86_64/STAR  --runMode genomeGenerate --runThreadN 8 --genomeDir /media/qin/software/work.folder/reference_genomes/hg19 --genomeFastaFiles hg19.fa --sjdbGTFfile Homo_sapiens.GRCh37.75.gtf &
My computer's hardware is limited, so this took a long time.
Feb 24 10:51:37 ..... Started STAR run
Feb 24 10:51:37 ... Starting to generate Genome files
Feb 24 10:52:42 ... starting to sort  Suffix Array. This may take a long time...
Feb 24 10:52:54 ... sorting Suffix Array chunks and saving them to disk...
Feb 24 15:39:37 ... loading chunks from disk, packing SA...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
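In the end it did not succeed. Searching for the error suggests the machine probably ran out of memory.
I finally gave up on building the index myself and found a site to download a prebuilt one from: http://labshare.cshl.edu/shares/ ... E/GRCh38_Gencode25/
After a long download, the files were as follows: [attachment in the original post]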


4. Install two more programs
wget https://sourceforge.net/projects ... _64.tar.gz/download
wget https://github.com/pachterlab/ka ... inux-v0.43.0.tar.gz
5. Install some R packages
source("https://bioconductor.org/biocLite.R")
biocLite("rgsepd")
biocLite("DESeq2")
biocLite("goseq")

6. Download GENCODE and build the index
wget ftp://ftp.sanger.ac.uk/pub/genco ... 7.annotation.gtf.gz
wget ftp://ftp.sanger.ac.uk/pub/genco ... 7.transcripts.fa.gz
./kallisto_linux-v0.43.0/kallisto index -i GENCODE_transcripts_human gencode.v25lift37.transcripts.fa.gz
7. Put together the list of data you want to test. Note: the order of the entries in the list must match the order printed by ls -1
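The ordering requirement above can be checked mechanically before running anything. The following is only a sketch: it assumes a hypothetical list file (called samples.txt here, not a name from rnaSeqFPro) holding one fastq.gz file name per line, and the helper name check_order is made up for illustration.

```shell
#!/usr/bin/env bash
# check_order LIST_FILE [DIR]
# Succeeds (exit status 0) only when LIST_FILE names the *fastq.gz files
# found in DIR, one per line, in the same order that ls -1 prints them.
# LIST_FILE is a hypothetical one-name-per-line file, not an rnaSeqFPro format.
check_order() {
    list_file=$1
    dir=${2:-.}
    # Order as reported by ls -1 inside DIR
    expected=$(cd "$dir" && ls -1 -- *fastq.gz)
    # Order as written in the list file
    actual=$(cat "$list_file")
    [ "$expected" = "$actual" ]
}

# Usage in the working folder:
#   check_order samples.txt && echo "order matches ls -1"
```

If the real list carries extra columns (for example a condition label), the file name column would need to be extracted first, for instance with cut or awk.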

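8. At this point rnaSeqFPro.PE.hg19.sh could be run directly, but the author's script may not match your local environment, so we go through it step by step instead
1). Quality control with FastQC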
ls -1 *fastq.gz > commands.1
sed -i 's/^/.\/FastQC\/fastqc /g' commands.1
# This generates the following command lines:
#./FastQC/fastqc WGC039520_7721-SINC_combined_R1.fastq.gz
#./FastQC/fastqc WGC039520_7721-SINC_combined_R2.fastq.gz
#./FastQC/fastqc WGC039520_7721-SISF3B4_combined_R1.fastq.gz
#./FastQC/fastqc WGC039520_7721-SISF3B4_combined_R2.fastq.gz

# Run commands.1
source commands.1
# Move the results into the FastQC_OUTPUT folder
mkdir FastQC_OUTPUT
mv *zip FastQC_OUTPUT
mv *html FastQC_OUTPUT
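As an alternative to editing commands.1 with sed and moving the reports afterwards, the command lines can be generated by a small function that also passes FastQC's -o option, which writes the reports straight into an existing output directory. This is a sketch only: the function name build_fastqc_commands and the sample_R1/sample_R2 file names are made up for illustration, and the ./FastQC/fastqc path assumes the folder layout used in this post.

```shell
#!/usr/bin/env bash
# build_fastqc_commands OUTDIR FILE...
# Prints one FastQC command line per input file. Nothing is executed here,
# so the output can be inspected before it is run.
build_fastqc_commands() {
    outdir=$1
    shift
    for f in "$@"; do
        printf './FastQC/fastqc -o %s %s\n' "$outdir" "$f"
    done
}

# Demonstration with two hypothetical file names
build_fastqc_commands FastQC_OUTPUT sample_R1.fastq.gz sample_R2.fastq.gz

# Usage in the working folder (FastQC expects the -o directory to exist):
#   mkdir -p FastQC_OUTPUT
#   build_fastqc_commands FastQC_OUTPUT *fastq.gz > commands.1
#   source commands.1
```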

2). Alignment
/media/qin/software/work.folder/STAR/STAR-master/bin/Linux_x86_64/STAR --runThreadN 2 --genomeDir /media/qin/software/work.folder/reference_genomes/ --readFilesIn paired_WGC039520_7721-SINC_combined_R1.fastq paired_WGC039520_7721-SINC_combined_R2.fastq --outFileNamePrefix paired_WGC039520_7721-SINC_combined
Feb 26 21:37:06 ..... started STAR run
Feb 26 21:37:06 ..... loading genome

EXITING: fatal error trying to allocate genome arrays, exception thrown: std::bad_alloc
Possible cause 1: not enough RAM. Check if you have enough RAM 31679977102 bytes
Possible cause 2: not enough virtual memory allowed with ulimit. SOLUTION: run ulimit -v 31679977102

Feb 26 21:37:06 ...... FATAL ERROR, exiting
This failed because there was not enough memory, so I moved the work to a server.
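The error message above states how much memory STAR wanted (31679977102 bytes), so it may be worth comparing that figure with the machine's RAM before starting a run. The sketch below assumes a Linux system where /proc/meminfo is available; the helper name has_enough_ram is made up for illustration. It compares against total RAM, which is optimistic because other processes also use memory.

```shell
#!/usr/bin/env bash
# Bytes STAR asked for, copied from the error message in this post
required_bytes=31679977102

# has_enough_ram REQUIRED_BYTES TOTAL_KB
# Succeeds when TOTAL_KB kilobytes of RAM cover REQUIRED_BYTES.
has_enough_ram() {
    required=$1
    total_kb=$2
    [ $(( total_kb * 1024 )) -ge "$required" ]
}

# MemTotal in /proc/meminfo is reported in kB (Linux only);
# fall back to 0 if the file cannot be read.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
mem_total_kb=${mem_total_kb:-0}

if has_enough_ram "$required_bytes" "$mem_total_kb"; then
    echo "Total RAM (${mem_total_kb} kB) covers the ${required_bytes} bytes STAR asked for"
else
    echo "Total RAM (${mem_total_kb} kB) is below the ${required_bytes} bytes STAR asked for"
fi
```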
############################################# After uploading the data to the server
1. Align directly
files=(*fastq.gz);for (( i=0; i<${#files[@]} ; i+=2 )) ; do mkdir "${files[i]}.${files[i+1]}.STAR";done
cd WGC039520_7721-SINC_combined_R1.fastq.gz.WGC039520_7721-SINC_combined_R2.fastq.gz.STAR
mkdir Pass1
cd Pass1
nohup /media/DISK2TB/yangqin/My_DISK/work.folder/STAR/STAR-master/bin/Linux_x86_64/STAR --runThreadN 64 --outSAMattributes All --genomeLoad NoSharedMemory --genomeDir /media/DISK2TB/yangqin/My_DISK/work.folder/reference_genomes/ --readFilesIn /media/DISK2TB/yangqin/My_DISK/work.folder/WGC039520_7721-SINC_combined_R1.fastq.gz /media/DISK2TB/yangqin/My_DISK/work.folder/WGC039520_7721-SINC_combined_R2.fastq.gz --readFilesCommand zcat &
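The per-pair directory naming used above can be sketched as a small function that walks the file list two names at a time. It assumes that the sorted file list places each R1 file immediately before its R2 mate, as it does for the file names in this post; the function name pair_dir_names is made up for illustration.

```shell
#!/usr/bin/env bash
# pair_dir_names FILE...
# Prints one "<R1>.<R2>.STAR" directory name per consecutive pair of files.
# A trailing unpaired file is ignored.
pair_dir_names() {
    while [ "$#" -ge 2 ]; do
        printf '%s.%s.STAR\n' "$1" "$2"
        shift 2
    done
}

# Demonstration with the four file names from this post
pair_dir_names \
    WGC039520_7721-SINC_combined_R1.fastq.gz \
    WGC039520_7721-SINC_combined_R2.fastq.gz \
    WGC039520_7721-SISF3B4_combined_R1.fastq.gz \
    WGC039520_7721-SISF3B4_combined_R2.fastq.gz

# Usage in the working folder, creating each directory with its Pass1 subfolder:
#   pair_dir_names *fastq.gz | while IFS= read -r d; do mkdir -p "$d/Pass1"; done
```

The first name printed by the demonstration is the directory entered with cd in the steps above.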
