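There is no need to track down the genome sequence or the gene structure coordinates, and no need to write a custom script.
Everything can be done inside R, as follows: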
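# biocLite() was the installer when this was written; current Bioconductor releases use BiocManager instead
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("BSgenome.Hsapiens.UCSC.hg19")
BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")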
# BSgenome data package was made from the following source data files:
# chromFa.zip from http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/
library(BSgenome.Hsapiens.UCSC.hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
BSgenome.Hsapiens.UCSC.hg19
genome <- BSgenome.Hsapiens.UCSC.hg19
seqlengths(genome)
genome$chr1 # same as genome[["chr1"]]
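txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
# All genes, sorted by genomic position
gn <- sort(genes(txdb))
# 1000 bp upstream of each gene; flank() is strand-aware, so for minus-strand
# genes this is the region just past the gene's end coordinate
up1000 <- flank(gn, width=1000)
# Extract the upstream sequences from the genome as a DNAStringSet
up1000seqs <- getSeq(genome, up1000)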
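To see why flank() gives "upstream" rather than simply "to the left", here is a small sketch on made-up data. The two toy genes and their coordinates are assumptions for illustration only and are not taken from hg19; only the GenomicRanges package is needed.

```r
library(GenomicRanges)

# Two hypothetical genes on the same coordinates but on opposite strands
toy <- GRanges(seqnames = "chr1",
               ranges = IRanges(start = c(1001, 1001), end = c(2000, 2000)),
               strand = c("+", "-"))

# 100 bp flanking region on the 5' side of each gene (start = TRUE is the default)
up100 <- flank(toy, width = 100)

start(up100)  # 901 for the plus-strand gene, 2001 for the minus-strand gene
end(up100)    # 1000 for the plus-strand gene, 2100 for the minus-strand gene
```

For the plus-strand gene the flank sits before the start coordinate, while for the minus-strand gene it sits after the end coordinate, which is what you would want for promoter-type regions.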