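Please read these first:
http://www.biotrainee.com/thread-904-1-1.html (raw signal intensity data from the *.idat files of Illumina chips)
http://www.biotrainee.com/thread-905-1-1.html (Affymetrix and Illumina chips use different raw data formats)
Of course, you can also do this analysis with the R package crlmm: http://www.biotrainee.com/thread-875-1-1.html, where the relevant function is readIdatFiles.
The R package covered here, illuminaio, has essentially one function, readIDAT, which reads an IDAT file into R and returns it as a structured object (a list):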
https://bioconductor.org/package ... /doc/illuminaio.pdf
https://bioconductor.org/package ... tml/illuminaio.html
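What you do with the data afterwards is outside the scope of this package.
For example, here is what I did.
First, download the IDAT files from https://www.encodeproject.org/experiments/ENCSR112SHO/ and put them in the tmp/illumina-omin4/ directory: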
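# Install illuminaio once if needed (biocLite has been replaced by BiocManager):
#if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
#BiocManager::install("illuminaio")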
setwd('tmp/illumina-omin4/')
library(illuminaio)
idatFile='ENCFF267GLV.idat'
idat <- readIDAT(idatFile)
names(idat)
idatData <- idat$Quants
head(idatData);dim(idatData)
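The resulting idat object is a list with several named entries.
The most important one is Quants (stored here as idatData), which holds the raw intensity data from which genotypes are derived:
> head(idatData,50)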
Mean SD NBeads
1600101 3506 948 11
1600103 18868 1994 14
1600105 591 446 15
1600107 621 550 14
1600111 541 266 16
1600113 884 465 15
1600119 628 369 13
1600121 1066 730 15
1600125 791 330 12
1600131 660 434 9
1600133 1002 648 14
This gives the measured intensity values for each of the 4933285 probes.
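The three columns printed above can be explored with base R alone. The following is a minimal sketch, not part of illuminaio: it rebuilds the first five rows of the output by hand so that it runs without any IDAT file, computes a standard error for each mean intensity, and flags bead types measured on few beads. The cutoff `min_beads` is an arbitrary value chosen for illustration.

```r
# Rebuild the first rows printed above as a small matrix.
# With real data, use idat$Quants instead (assumed to have the same columns).
quants <- matrix(
  c( 3506,  948, 11,
    18868, 1994, 14,
      591,  446, 15,
      621,  550, 14,
      541,  266, 16),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("1600101", "1600103", "1600105", "1600107", "1600111"),
                  c("Mean", "SD", "NBeads"))
)

# Standard error of the mean intensity for each bead type: SD / sqrt(n).
se <- quants[, "SD"] / sqrt(quants[, "NBeads"])

# Flag bead types measured on few beads.
# The cutoff of 12 is an arbitrary illustration, not a recommended value.
min_beads <- 12
low_beads <- quants[, "NBeads"] < min_beads

summary_tab <- data.frame(quants, SE = round(se, 1), LowBeads = low_beads)
print(summary_tab)
```

On the full array the same two lines (`se` and `low_beads`) apply unchanged to all rows of `idat$Quants`.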
The package documentation explains:
For both file types the most important entry in the returned list is the item Quants.
When reading unencrypted files this contains average intensity (Mean), number of beads (NBeads) and a measure of variability (SD) for each bead type on the array.
For expression arrays in addition to these some additional information is available, including median and trimmed-mean estimates of average intensity, averaged local background intensities and the number of beads present before outliers were excluded.
So what you need to understand is how Mean, NBeads and SD are converted into the genotypes AA, AB and BB. Remember that for each sample you need to download two IDAT files: the probes are two-colour, so the genotype can only be determined by combining the data from both files.
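To make the idea of combining the two colour channels concrete, here is a rough sketch using synthetic intensities so that it runs without any files. It applies the common polar transformation (theta as the allelic angle, r as the total intensity) and then calls genotypes with fixed thresholds. Everything specific in it is an assumption for illustration: the probe names, the intensity values, the function `call_genotype`, its thresholds, and the choice of red as allele A and green as allele B (which channel carries which allele depends on the probe design). Real genotype callers normalise the intensities first and use per-SNP cluster positions rather than fixed cutoffs.

```r
# Synthetic per-probe mean intensities for the two colour channels.
# With real data these could be the "Mean" columns of readIDAT(...)$Quants
# from the red and the green file, matched by row name (bead-type address).
red   <- c(p1 = 9000, p2 = 4800, p3 = 400,  p4 = 150)
green <- c(p1 = 500,  p2 = 5200, p3 = 8700, p4 = 120)

# Polar transformation: theta lies in [0, 1], r is the total signal.
theta <- (2 / pi) * atan2(green, red)
r     <- red + green

# Toy genotype caller. All thresholds are assumptions for illustration,
# not Illumina's cluster definitions.
call_genotype <- function(theta, r, min_r = 1000, lo = 0.25, hi = 0.75) {
  gt <- ifelse(theta < lo, "AA", ifelse(theta > hi, "BB", "AB"))
  gt[r < min_r] <- NA  # too little signal: no call
  gt
}

calls <- call_genotype(theta, r)
print(data.frame(red, green, theta = round(theta, 2), r, call = calls))
```

Probes with mostly red signal end up near theta = 0, probes with mostly green signal near theta = 1, and balanced probes near 0.5, which is why the two files have to be read together.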
The official product page for this new chip describes it in the same terms:
This BeadChip array covers > 4.3 million whole-genome variants down to 1% minor allele frequency (MAF), plus novel functional exonic variants.
https://www.illumina.com/products/by-type/microarray-kits/infinium-omni5-exome.html
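How these four-million-plus probes are annotated against the human genome, however, is something you will have to work out by reading the chip's documentation carefully:
https://support.illumina.com/array/array_kits/infinium_humanomni5exome_beadchip_kit/documentation.html
For a genotyping chip, for example, see: https://support.illumina.com/con ... g_data_analysis.pdf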