搜索
查看: 13629|回复: 13

clusterprofiler>gseGO>genelist 怎么搞 多谢大家了【R语言】

[复制链接]

2

主题

16

帖子

249

积分

中级会员

Rank: 3Rank: 3

积分
249
发表于 2017-9-30 10:41:42 | 显示全部楼层 |阅读模式
本帖最后由 蔡卷子 于 2018-10-11 21:01 编辑

目的:我现在手上有一堆geneID(symbol格式),想用clusterprofiler package做kegg 和 GO 的 GSEA enrichment
打算:用geo数据库得到表达量差异的fold change的排序的genelist,之后带入package跑。不知这种想法可以不

大家好,我最近在用Y叔的神包cluster profiler想做个gsea的enrichment,其他的都好说,但是,那个genelist我搞不清楚啊

在这个里面我用的gene列表是Y叔在clusterprofiler中id转换的示例的基因,用的代码大多数也是基于也是y叔讲的例子
[Plain Text] 纯文本查看 复制代码
library(clusterProfiler)
x <- c("GPX3",  "GLRX",   "LBP",   "CRYAB", "DEFB1", "HCLS1",   "SOD2",   "HSPA2",
       "ORM1",  "IGFBP1", "PTHLH", "GPC3",  "IGFBP3","TOB1",    "MITF",   "NDRG1",
       "NR1H4", "FGFR3",  "PVR",   "IL6",   "PTPRM", "ERBB2",   "NID2",   "LAMB1",
       "COMP",  "PLS3",   "MCAM",  "SPP1",  "LAMC1", "COL4A2",  "COL4A1", "MYOC",
       "ANXA4", "TFPI2",  "CST6",  "SLPI",  "TIMP2", "CPM",     "GGT1",   "NNMT",
       "MAL",   "EEF1A2", "HGD",   "TCN2",  "CDA",   "PCCA",    "CRYM",   "PDXK",
       "STC1",  "WARS",  "HMOX1", "FXYD2", "RBP4",   "SLC6A12", "KDELR3", "ITM2B")
eg = bitr(x, fromType="SYMBOL", toType=c("ENTREZID","GO"), OrgDb="org.Hs.eg.db")
head(eg)

eg_new = eg[c("ENTREZID","GO")]


结果> head(eg_new)
  ENTREZID         GO
1     2878 GO:0000302
2     2878 GO:0004602
3     2878 GO:0004602
4     2878 GO:0005515
5     2878 GO:0005576
6     2878 GO:0005615
>


If user have GO annotation data (in data.frame format with first column of gene ID and second column of GO ID), they can use enricher() and gseGO() functions to perform over-representation test and gene set enrichment analysis.

If genes are annotated by direction annotation, it should also annotated by its ancestor GO nodes (indirect annation). If user only has direct annotation, they can pass their annotation to buildGOmap function, which will infer indirection annotation and generate a data.frame that suitable for both enricher() and  gseGO().

这里面应该讲的是如果有GO注释的数据(数据框形式,第一列是gene ID,第二列是GO ID),因为我不清楚这个geneID到底是symbol还是entreID,于是两个都试了,都失败了。而且根据文档的上下文我猜这里的geneID应该是entreID。

之后跑gseGO

[Plain Text] 纯文本查看 复制代码
ego3 <- gseGO(geneList     = eg_new,
              OrgDb        = org.Hs.eg.db,