Original poster: Jimmy

Bioinformatics programming live session, problem 6: download the latest KEGG data and parse it

Posted 2017-2-27 17:04:45
092-R-python

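I'm more familiar with R, so I worked through it myself and did learn something. The KEGG plots at the end still don't look very good, though, and need some tweaking.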

[R]
setwd("c:/Users/Administrator/Desktop/xyl/xyl/3D VS 2D/") 
rm(list=ls())

source("https://bioconductor.org/biocLite.R")
options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")
biocLite("DOSE")
biocLite("topGO")
biocLite("clusterProfiler")
biocLite("pathview")

## 
library(DOSE)
library(GO.db)
library(org.Hs.eg.db)
library(topGO)
library(GSEABase)
library(clusterProfiler)


hsa_kegg<-download_KEGG("hsa")
str(hsa_kegg)
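length(unique(hsa_kegg$KEGGPATHID2EXTID$from)) ### number of pathways
length(unique(hsa_kegg$KEGGPATHID2EXTID$to))  ### number of genes across all pathways
length(unique(hsa_kegg$KEGGPATHID2NAME$from)) ### pathway IDs that have a name; why does this differ from the first length?
length(unique(hsa_kegg$KEGGPATHID2NAME$to)) ### pathway names; do some pathways have no genes??

x<-data.frame(hsa_kegg$KEGGPATHID2EXTID)  ### two columns: pathway ID and gene ID (EXTID = external ID)
# a pathway contains many genes, and a gene can belong to many pathways

y<-data.frame(hsa_kegg$KEGGPATHID2NAME)  ### one row per pathway
### one name per pathway; do some pathways have no genes???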


###
hsa_go<-org.Hs.egGO   ### GO analysis
str(hsa_go)
head(hsa_go@L2Rchain)

mapped_Go_genes<-mappedkeys(hsa_go)
length(mapped_Go_genes)  # genes with a GO annotation (BP, CC and MF combined): 18622

hsa_kegg2<-org.Hs.egPATH
str(hsa_kegg2)
mapped_Kegg_genes<-mappedkeys(hsa_kegg2)
length(mapped_Kegg_genes)
length(unique(mapped_Kegg_genes))  
### number of genes with a KEGG PATH annotation: 5869 unique genes


###
data(geneList, package="DOSE")
str(geneList)
# a<-as.data.frame(data(geneList, package="DOSE")) ## error
# b<-data(geneList, package="DOSE") 
# ## error
c<-as.data.frame(geneList)  ## works; names are Entrez gene IDs, values are the corresponding fold changes

gene <- names(geneList)[abs(geneList) > 2]

d<-as.data.frame(gene)


kk <- enrichKEGG(gene = gene,
                 organism     = 'hsa',
                 pvalueCutoff = 0.05)
head(kk)
length(kk$ID)


barplot(kk, drop=TRUE, showCategory=12)  ## x axis: gene count, y axis: pathway, colour: p value
dotplot(kk, showCategory=12)  ## x axis: GeneRatio
enrichMap(kk)
cnetplot(kk, categorySize="pvalue", foldChange=geneList)
MM <- setReadable(kk, OrgDb = org.Hs.eg.db,keytype = "ENTREZID")
cnetplot(MM, categorySize="pvalue", foldChange=geneList)



##
library(pathview)
gene.with.fc<-geneList[abs(geneList)>2]

e<-as.data.frame(geneList)

hsa04110 <- pathview(gene.data = gene.with.fc,
                     pathway.id = "hsa04110",
                     species = "hsa", 
                     out.suffix = "fc",
                     kegg.native=T)

hsa04110.21layer<-pathview(gene.data = gene.with.fc,
                           pathway.id = "hsa04110",
                           species = "hsa", 
                           out.suffix = "fc.21layer",
                           limit=list(gene=max(abs(gene.with.fc)),cpd=1),
                           kegg.native=TRUE,same.layer=F)  ### converts some of the gene IDs to gene names

hsa04110.graphviz<-pathview(gene.data = gene.with.fc,
                            pathway.id = "hsa04110",
                            species = "hsa", 
                            out.suffix = "fc.graphviz",
                            limit=list(gene=max(abs(gene.with.fc)),cpd=1),
                            kegg.native=F,sign.pos="bottomleft")  ### changes how the edges between genes are drawn
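
The two questions left in the comments above (why the two lengths differ, and whether some pathways have no genes) can be checked by comparing the pathway IDs in the two tables. Below is a minimal Python 3 sketch, assuming each table has been exported as a list of (pathway ID, value) pairs. The IDs and names are placeholders made up for illustration, not real KEGG content, and `pathways_without_genes` is a hypothetical helper name.

```python
# Placeholder stand-ins for the two tables (not real KEGG rows).
path2gene = [("path_A", "g1"), ("path_A", "g2"), ("path_B", "g3")]
path2name = [("path_A", "Example pathway A"),
             ("path_B", "Example pathway B"),
             ("path_C", "Example pathway C")]


def pathways_without_genes(path2gene, path2name):
    """Return pathway IDs that have a name but no gene rows, in input order."""
    with_genes = {pid for pid, _gene in path2gene}
    return [pid for pid, _name in path2name if pid not in with_genes]


missing = pathways_without_genes(path2gene, path2name)
print(missing)
```

If the returned list is non-empty, the name table covers more pathways than the gene table, which would account for the two lengths being different.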



                         






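Posted 2017-3-1 20:05:39
Last edited by Aiyawq on 2017-3-1 20:23

I haven't watched the video yet, but I had a go first. There is still a small problem: pathway IDs that have no associated genes in the pathway get dropped. I haven't figured out how to fix that yet, so I'll see how it is handled after watching the video.
[Perl]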
#!/usr/bin/perl -w
use strict;
open KEGG,$ARGV[0] or die $!;
my ($path_id,$description,$gene_id,$protein);
print "path_id\tgene_id\tprotein\tdescription\n";
while(<KEGG>){
        chomp;
        # C line: pathway ID followed by the pathway name
        if (/^C\s+(\d+)\s+(.+)/) {
                $path_id = $1;
                $description = $2;      # whole name, not only its first word
        }
        # D line: gene ID followed by the first token of the gene name
        if (/^D\s+(\d+)\s+(\S+)/) {
                $gene_id = $1;
                $protein = $2;
                print "$path_id\t$gene_id\t$protein\t$description\n";
        }
}
close KEGG;


The result is shown in the attached screenshot.

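Posted 2017-3-2 12:31:34
Group owner, I have a question: why does the result downloaded with your method contain far fewer pathway-annotated gene entries than what I downloaded from http://rest.kegg.jp/link/hsa/ko ?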
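
The thread does not answer this question. One way to investigate is to compare the two gene sets directly and look at which IDs appear only in the link table. The Python 3 sketch below assumes the link download is tab-separated text with one field per line carrying an `hsa:` prefix; that layout, the helper name `genes_from_link_text` and all IDs in the sample are assumptions for illustration.

```python
def genes_from_link_text(text, prefix="hsa:"):
    """Collect gene IDs from tab-separated lines; the 'hsa:' prefix is an assumed layout."""
    genes = set()
    for line in text.splitlines():
        for field in line.split("\t"):
            if field.startswith(prefix):
                genes.add(field[len(prefix):])
    return genes


# Made-up sample in the assumed layout.
link_text = "ko:K00001\thsa:111\nko:K00002\thsa:222\nko:K00003\thsa:333\n"
# Gene IDs taken from the parsed pathway table (also made up here).
pathway_genes = {"111", "222"}

link_genes = genes_from_link_text(link_text)
only_in_link = sorted(link_genes - pathway_genes)
print(len(link_genes), len(pathway_genes), only_in_link)
```

Looking up a few of the IDs in `only_in_link` by hand should show what kind of entries the pathway table leaves out.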
Posted 2017-3-2 22:37:19
Last edited by AdaWon on 2017-3-2 22:47

147-python

Approach:
1. Inspect the structure of the file;
2. Use ordered dictionaries, nested level by level;
3. Count.

[Python]
#!/usr/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from collections import OrderedDict

args = sys.argv

slyKEGG = OrderedDict()
with open(args[1],'rt') as f:
    for line in f:
        line = line.strip('\n')
        if line.startswith('A'):
            mch = re.search('A<b>(.+)</b>',line) #A<b>Metabolism</b>
            className = mch.group(1) #eg. Metabolism
            slyKEGG[className] = OrderedDict()
            
        elif line.startswith('B'):
            if line == "B":
                continue
            else:
                mch = re.search('<b>(.+)</b>',line) #eg.  B  <b>Overview</b>
                subClass = mch.group(1) # Overview
                slyKEGG[className][subClass] = OrderedDict()
                
        elif line.startswith('C'):
                mch = re.search('(\d+)\s(.+)',line) #eg.  C    01200 Carbon metabolism [PATH:sly01200]
                # "\\d" (one digit) can also be written r"\d"; "\s" matches whitespace: [ \t\r\n\f\v]
                pathID = 'sly'+ mch.group(1)  # sly01200
                pathName = re.sub('\s\[.+\]','',mch.group(2)) #Carbon metabolism 
                # re.sub replaces the matches in a string; re.sub(pattern, repl, string, count=0, flags=0) 
                
                pathway = pathID + ':' + pathName
                slyKEGG[className][subClass][pathway] = [[],[]] 
            
        elif line.startswith('D'):
                lst = line.split(';') 
                #eg.  D      101249034 probable hexokinase-like 2 protein^IK00844 HK; hexokinase [EC:2.7.1.1]$
                geneInfo = lst[0].split('\t') # lst[0] ="D      101249034 probable hexokinase-like 2 protein^IK00844 HK"
                mch = re.match('D\s+(\d+)\s(.+)',geneInfo[0]) 
                # re.match only matches at the start of the string; it returns a match object on success, otherwise None
                geneID = mch.group(1) #101249034
                gene = mch.group(2) #probable hexokinase-like 2 protein
                
                slyKEGG[className][subClass][pathway][0].append(gene)
                slyKEGG[className][subClass][pathway][1].append(geneID)
                
with open(args[2],'wt') as f:
        for ke,val in slyKEGG.items():
                for subk,subv in val.items():
                        for ptwy,geneList in subv.items():
                                genes = ';'.join(geneList[0])
                                geneIDs = ';'.join(geneList[1])
                                f.write('\t'.join([ke,subk,ptwy,genes,geneIDs])+'\n')

pthwyNum = 0
allGenes = []
with open(args[2]) as f:
    for line in f:
        line = line.rstrip()
        lst = line.split('\t')
        if len(lst) > 3:
            pthwyNum += 1
            geneList = lst[-2].split(';')
            allGenes = allGenes + [gene for gene in geneList if gene not in allGenes]
            
print('Number of non-empty pathways: %d' %pthwyNum)
print('Number of genes in all pathways: %d' %len(allGenes))

Running the code:
$ python27 biotree_06_kegg.py sly00001.keg  x.keg
#Number of non-empty pathways: 139
#Number of genes in all pathways: 5183
$ cat x.keg |wc -l
#472
$ less -S x.keg
Metabolism      Overview        sly01200:Carbon metabolism      probable hexokinase-like 2 protein;hexokinase-3-like;Hxk2;HXK3;HXK4;h
Metabolism      Overview        sly01210:2-Oxocarboxylic acid metabolism        citrate synthase;citrate synthase;citrate synthase 3;
Metabolism      Overview        sly01212:Fatty acid metabolism  acetyl-CoA carboxylase 1-like;acetyl-coenzyme A carboxylase carboxyl
Metabolism      Overview        sly01230:Biosynthesis of amino acids    triosephosphate isomerase;triosephosphate isomerase;triosepho



# A closer look at how the nesting works (run and tested in Jupyter)
[Python]
from collections import OrderedDict

slyKEGG = OrderedDict()
slyKEGG['Metabolism'] = OrderedDict()
slyKEGG['Metabolism']['Overview'] = OrderedDict()
slyKEGG['Metabolism']['Overview']['sly01200:Carbon metabolism'] = [[],[]]
geneID = '101249034'
gene = 'probable hexokinase-like 2 protein'
slyKEGG['Metabolism']['Overview']['sly01200:Carbon metabolism'][0].append(gene)
slyKEGG['Metabolism']['Overview']['sly01200:Carbon metabolism'][1].append(geneID)
print(slyKEGG)
slyKEGG.items()


for ke,val in slyKEGG.items():
    for subk,subv in val.items():
        for ptwy,geneList in subv.items():
            genes = ';'.join(geneList[0])
            geneIDs = ';'.join(geneList[1])
            print(slyKEGG)
            print("***\t",ke)
            print(val)
            print("***\t",subk)
            print(subv)
            print("***\t",ptwy)
            print(geneList)
            print("***\t",genes)
            print("***\t",geneIDs)

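Posted 2017-3-3 14:45:28
127
Busy submitting a paper and doing follow-up experiments, so I'm only learning in bits and pieces.
I read through the code from start to finish and looked everything up; I basically get the idea now. Here is my imitation of it.
[Python]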
#coding:UTF-8
import re
import os
from collections import OrderedDict
dosaKEGG=OrderedDict()
with open (r"C:\Users\Chenjian\Desktop\dosa00001.keg",'r') as f:  # raw string so the backslashes are kept literally
    for line in f:
        line=line.rstrip()
        if line.startswith('A'):
            mch=re.search('A<b>(.+)</b>',line)
            className=mch.group(1)
            dosaKEGG[className]=OrderedDict()
        elif line.startswith('B'):
            if line=='B':
                continue
            else:
                mch=re.search('<b>(.+)</b>',line)
                subclass=mch.group(1)
                dosaKEGG[className][subclass]=OrderedDict()
        elif line.startswith('C'):
            mch=re.search('(\d+)\s(.+)',line)
            pathID='dosa'+mch.group(1)
            pathName=re.sub('\s\[.+\]','',mch.group(2))
            pathway=pathID+':'+pathName

            dosaKEGG[className][subclass][pathway]=[[],[]]

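        elif line.startswith('D'):
            lst=line.split(';')
            geneInfo=lst[0].split('\t')

            # group(1) is the numeric gene ID, group(2) is the gene name
            mch=re.match(r'D\s+(\d+)\s(.+)',geneInfo[0])
            geneID=mch.group(1)
            gene=mch.group(2)

            # index 0 collects gene names, index 1 collects gene IDs
            dosaKEGG[className][subclass][pathway][0].append(gene)
            dosaKEGG[className][subclass][pathway][1].append(geneID)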


fh = open('dosa00001.cleaned.keg','wt')
for ke,val in dosaKEGG.items():
    for subk,subv in val.items():
        for ptwy,geneList in subv.items():
            genes=';'.join(geneList[0])
            geneIDs=';'.join(geneList[1])
            fh.write('\t'.join([ke,subk,ptwy,genes,geneIDs])+'\n')

fh.close()


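There are still a few small problems here: a few lines I didn't understand, and I couldn't find the answer by searching either.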
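Posted 2017-3-3 14:48:39
鹏洛克贾路 posted on 2017-3-3 14:45
127
Busy submitting a paper and doing follow-up experiments, so I'm only learning in bits and pieces.
I read through the code from start to finish and looked everything up; I basically get the idea now. Here is my imitation of it.

One of them is this line:
 dosaKEGG[className][subclass][pathway]=[[],[]]
What is the [[],[]] at the end? I don't understand it!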
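
On the `[[],[]]` question: it is a list containing two empty lists. In the code this post imitates, the first inner list collects gene names and the second collects gene IDs for one pathway. A small Python 3 sketch, reusing the example values shown earlier in the thread (the variable names `pathways` and `slot` are only for illustration):

```python
from collections import OrderedDict

pathways = OrderedDict()

# One slot per pathway: a list holding two empty lists.
# Index 0 will collect gene names, index 1 will collect gene IDs.
pathways["sly01200:Carbon metabolism"] = [[], []]

slot = pathways["sly01200:Carbon metabolism"]
slot[0].append("probable hexokinase-like 2 protein")  # gene name
slot[1].append("101249034")                           # gene ID

print(slot)
print(";".join(slot[0]), ";".join(slot[1]))
```

Because both inner lists are appended to in the same step, the name at position i and the ID at position i always belong to the same gene.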
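Posted 2017-3-5 01:28:55
Last edited by learnyoung on 2017-3-10 17:01

I reworked the program to write the parsed result to a CSV file. Both the B-level class names and the C-level pathway names contain ',', which would be split into separate fields when written to CSV, so I used re.sub to replace it with ';'.
[Python]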
# coding=utf-8
import re
import os
import csv
from collections import OrderedDict

os.chdir('e:/practics of bioinformatics')
dit = OrderedDict()

with open('hsa00001.keg', 'r') as f:
    for line in f:
        line = line.rstrip()
        if line.startswith('A'):
            mch = re.search('A<b>(.+)</b>', line)
            class_A = mch.group(1)
            dit[class_A] = OrderedDict()

        elif line.startswith('B'):
            if not line == 'B':
                mch = re.search('B\s+<b>(.+)</b>', line)
                class_B = re.sub(',',';',mch.group(1))  # replace commas in the B name with semicolons so the CSV field is not split
                dit[class_A][class_B] = OrderedDict()

        elif line.startswith('C'):
            mch = re.match('C\s+(\d+)\s(.+)', line)
            pathID = mch.group(1)
            pathName = re.sub('\s\[.+\]', '', mch.group(2))  # [ and ] must be escaped
            pathwany_name=re.sub(',',';',pathName)  # replace commas in pathName with semicolons so the CSV field is not split
            pathway = pathID + '\t' + pathwany_name
            dit[class_A][class_B][pathway] = [[], []]

        elif line.startswith('D'):
            lst = line.split(';')
            mch = re.search('D\s+(\d+)\s(.+)', lst[0])
            geneID = mch.group(1)
            geneName = mch.group(2)
            dit[class_A][class_B][pathway][0].append(geneID)
            dit[class_A][class_B][pathway][1].append(geneName)

with open('cleaned_KEGG.csv', 'wb') as f:
    mycsv=csv.writer(f)
    for ka, va in dit.items():
        for kb, vb in va.items():
            for kc, vc in vb.items():
                geneid = ';'.join(vc[0])
                genename = ';'.join(vc[1])
                f.write(','.join([ka, kb, kc, geneid, genename])+'\n')

numpathway = 0
allgenes = []
with open('cleaned_KEGG.csv','rb') as f:  # same file name as written above (matters on case-sensitive file systems)
    mycsv=csv.reader(f)
    for line in mycsv:
        print type(line[3])
        if line[3]=='':
            continue
        ko_num=line[3].split(';')
        numpathway=numpathway+1
        allgenes=allgenes+ko_num
print numpathway, len(set(allgenes))  
            
Output: 322 7234
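
An alternative to replacing the commas, shown here only as a Python 3 sketch and not as what the post above does: let `csv.writer` write the rows. With its default settings it quotes any field that contains the delimiter, and `csv.reader` restores the field unchanged. The row below is made up for illustration.

```python
import csv
import io

# One made-up row whose second field contains a comma.
rows = [["Metabolism", "Example class, with comma", "00010", "111;222", "GENE1;GENE2"]]

buf = io.StringIO()          # in-memory stand-in for the output file
writer = csv.writer(buf)     # default dialect: quotes fields containing ','
writer.writerows(rows)
text = buf.getvalue()
print(text)

# Reading it back gives the original fields, comma included.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed)
```

This keeps the original names intact, at the cost of quoted fields in the file.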

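Posted 2017-3-6 12:11:18
Last edited by 旭日早升 on 2017-3-6 16:19

201-python-无名

Problem breakdown:
Problem statement: download the KEGG file and parse it.
File format: for the keg file, see the original poster's write-up at http://www.bio-info-trainee.com/1188.html; download address: http://www.genome.jp/kegg-bin/ge ... brite%2fhsa&length=.
Analysis: a keg file is organised as a hierarchy (class, subclass, pathway, gene) and needs to be converted into a data frame that is easier to work with.
Data structure: {pathway1{gene1,gene2,...},pathway2{gene3,gene4,...}}.
Efficiency improvements: none known yet.
This session is mainly about converting the keg data structure. Such conversions are common in data analysis; we need to parse the data into the structure we normally work with.
My attempt:
[Python]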
###import module
import re, os
from collections import OrderedDict
###change directory if necessary
os.chdir("./")
###initialization variable
pathway = OrderedDict()
###read kegg file and transform to a dictionary
with open("hsa00001.keg", 'rt') as f:
        for line in f:
                if not line.startswith(("C", "D")): # not pathway or gene line
                        continue
                if line.startswith("C"): # pathway line 
                        line = line.rstrip()
                        lst = line.split()
                        pathway_id = lst[1]
                        if pathway_id not in pathway:
                                pathway[pathway_id] = []
                        else:
                                print("Error: duplicated pathway ID")
                        continue 
                if line.startswith("D"): ## gene line
                        line = line.rstrip()
                        lst = line.split(";")
                        gene_id = lst[0].split()[1]
                        # the symbol may consist of several whitespace-separated tokens
                        gene_symbol = " ".join(lst[0].split()[2:])
                        # compare the full stored entry, otherwise the duplicate check never matches
                        entry = str(gene_id)+"\t"+gene_symbol
                        if entry not in pathway[pathway_id]:
                                 pathway[pathway_id].append(entry)
                        else:
                                print("Error: dup gene id in one pathway")
###output the result to a file
with open("KEGG2gene.txt", 'wt') as f:
        for path, genes in pathway.items():
                for gene in genes:
                        f.write(path+"\t"+gene+"\n")


Output:

After removing duplicates, the second and third columns contain different numbers of values: some gene IDs differ but share the same symbol.
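
The observation above (different gene IDs sharing one symbol) can be checked by grouping gene IDs by symbol. A minimal Python 3 sketch, assuming rows of (pathway ID, gene ID, symbol) like the three columns of KEGG2gene.txt; the helper name `symbols_with_many_ids` and the rows are made up for illustration.

```python
from collections import defaultdict


def symbols_with_many_ids(rows):
    """Map each symbol used by more than one gene ID to its sorted gene IDs."""
    ids_by_symbol = defaultdict(set)
    for _pathway, gene_id, symbol in rows:
        ids_by_symbol[symbol].add(gene_id)
    return {sym: sorted(ids) for sym, ids in ids_by_symbol.items() if len(ids) > 1}


# Made-up rows: gene IDs 111 and 222 share the symbol GENE1.
rows = [
    ("00010", "111", "GENE1"),
    ("00020", "111", "GENE1"),
    ("00020", "222", "GENE1"),
    ("00030", "333", "GENE3"),
]
shared = symbols_with_many_ids(rows)
print(shared)
```

Every symbol in the result has more than one gene ID behind it, which is why the unique count of the symbol column comes out lower than that of the ID column.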



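After watching Teacher Dong's video, I saw that he parses the file into a different structure. His handling of the gene names pointed out many details to watch for: some genes have no abbreviated symbol and are described by several words plus an annotation. At the time I only split on the semicolon ";" and overlooked the annotation that can come before it. He also uses a lot of regular-expression matching, which is a weak spot of mine that I need to work on. I re-ran his code and compared it with my own result. His gene-name list is meant to be free of duplicates, but checking it with set() shows that it still contains some, an interesting Python behaviour: the list comprehension tests membership against allGenes as it stood before the current pathway was added, so a name repeated within a single pathway is kept.
Re-running the teacher's code:
[Python]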
import os, re 
from collections import OrderedDict
##change directory 
os.chdir("./")
##read the keg file and transform each line 
hsaKEGG = OrderedDict()
with open("hsa00001.keg", 'rt') as f:
	for line in f:
		line = line.rstrip()
		if line.startswith("A"): # the Big class line
			mch = re.search("A<b>(.+)</b>",line) # match Big class name
			className = mch.group(1)
			hsaKEGG[className] = OrderedDict()
		elif line.startswith("B"): # the Small class line
			if line == "B":
				continue
			else:
				mch = re.search("<b>(.+)</b>", line) # match Small class name
				subClass = mch.group(1)
				hsaKEGG[className][subClass] = OrderedDict()
		elif line.startswith("C"): # Pathway line
			mch = re.search("(\d+)\s(.+)", line) # match pathwayID and pathwayName
			pathID = "hsa"+mch.group(1)
			pathName = re.sub("\s\[.+\]", "", mch.group(2)) # remove the annotation ID like "[PATH:hsa01200]"
			pathway = pathID + ":" + pathName
			hsaKEGG[className][subClass][pathway] = [[],[]]
		elif line.startswith("D"): # Gene line
			lst = line.split(";") # split gene and annotation  
			geneInfo = lst[0].split("\t") # split gene which has no abbreviation
			mch = re.match("D\s+(\d+)\s(.+)", geneInfo[0]) # match geneID and geneSymbol
			geneID = mch.group(1)
			gene = mch.group(2)
			hsaKEGG[className][subClass][pathway][0].append(gene)
			hsaKEGG[className][subClass][pathway][1].append(geneID)

## Write to a file
fh = open("hsa00001.cleaned.keg", 'wt')
for ke, val in hsaKEGG.items():
	for subk, subv in val.items():
		for ptwy, geneList in subv.items():
			genes = ";".join(geneList[0])
			geneIDs = ";".join(geneList[1])
			fh.write("\t".join([ke, subk, ptwy, genes, geneIDs])+"\n")

fh.close()

## count the number of pathways and genes
pthwyNum = 0
allGenes = []
with open("hsa00001.cleaned.keg", 'rt') as f:
	for line in f:
		line = line.rstrip()
		lst = line.split("\t")
		if len(lst) > 3:
			pthwyNum += 1
			geneList = lst[-2].split(";")
			allGenes = allGenes + [gene for gene in geneList if gene not in allGenes]

print("Number of non-empty pathways: %d" %pthwyNum)
print("Number of genes in all pathways: %d" %len(allGenes))
print("Number of genes in all pathways: %d" %len(set(allGenes))) # allGenes still has duplication
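
Why `allGenes` still has duplicates can be reproduced in a few lines. The sketch below (Python 3) contrasts the accumulation pattern used above with an order-preserving variant that tracks seen names in a set. The function names `add_as_in_thread` and `add_unique` are hypothetical, and the variant is one possible fix, not the teacher's.

```python
def add_as_in_thread(all_genes, gene_list):
    # Same pattern as above: membership is tested against all_genes as it was
    # BEFORE this batch, so repeats inside gene_list itself are all kept.
    return all_genes + [g for g in gene_list if g not in all_genes]


def add_unique(all_genes, gene_list):
    # Order-preserving variant: the set is updated while the batch is scanned.
    seen = set(all_genes)
    out = list(all_genes)
    for g in gene_list:
        if g not in seen:
            seen.add(g)
            out.append(g)
    return out


# A name repeated within one pathway, as with "citrate synthase" in the sly output.
batch = ["citrate synthase", "citrate synthase", "citrate synthase 3"]
kept = add_as_in_thread([], batch)
fixed = add_unique([], batch)
print(len(kept), len(fixed))
```

If only the final count matters, collecting the names into a set from the start gives the same number as `len(set(allGenes))`.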






Posted 2017-3-6 19:10:02
Last edited by y461650833y on 2017-3-6 20:35

Downloading is the easy part: just get the files straight from the KEGG website! I downloaded the human, mouse and pig data and parsed them following jimmy's Perl one-liner!
[Shell]
perl -alne '{if(/^C/){/C\s+(\d+)\s+(.*?)\s+\[(.*?)\]/;$kegg=$1,$path=$2}else{print "$kegg\t$path\t$F[1]\t$F[2]" if /^D/ and $kegg;}}' mmu00001.keg >mmu2_kegg2gene.txt
perl -alne '{if(/^C/){/C\s+(\d+)\s+(.*?)\s+\[(.*?)\]/;$kegg=$1,$path=$2}else{print "$kegg\t$path\t$F[1]\t$F[2]" if /^D/ and $kegg;}}' hsa00001.keg >hsa1_kegg2gene.txt
perl -alne '{if(/^C/){/C\s+(\d+)\s+(.*?)\s+\[(.*?)\]/;$kegg=$1,$path=$2}else{print "$kegg\t$path\t$F[1]\t$F[2]" if /^D/ and $kegg;}}' ssc00001.keg >ssc1_kegg2gene.txt
The screenshots show the parsed results for mouse and pig!


[Shell]
# Counts
# mouse
cut -f 1 mmu2_kegg2gene.txt |sort -u |wc -l
309
cut -f 3 mmu2_kegg2gene.txt |sort -u |wc -l
8039
# pig
cut -f 1 ssc1_kegg2gene.txt |sort -u |wc -l
309
cut -f 3 ssc1_kegg2gene.txt |sort -u |wc -l
7669
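
The `cut -f N file | sort -u | wc -l` counts above can also be done in Python 3 with a set. A minimal sketch, assuming tab-separated lines like those written by the one-liner; `count_unique` is a hypothetical helper and the sample lines are made up.

```python
def count_unique(lines, column):
    """Count distinct values in one tab-separated column (0-based index)."""
    return len({line.rstrip("\n").split("\t")[column]
                for line in lines if line.strip()})


# Made-up lines in the kegg / path / gene / symbol layout.
sample = [
    "00010\tExample pathway one\t111\tGENE1\n",
    "00010\tExample pathway one\t222\tGENE2\n",
    "00020\tExample pathway two\t111\tGENE1\n",
    "00020\tExample pathway two\t333\tGENE3\n",
]
n_pathways = count_unique(sample, 0)   # like: cut -f 1 | sort -u | wc -l
n_genes = count_unique(sample, 2)      # like: cut -f 3 | sort -u | wc -l
print(n_pathways, n_genes)
```

For a real file, pass the open file object instead of the list, e.g. `count_unique(open("mmu2_kegg2gene.txt"), 0)`.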


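As a Perl script: just convert the one-liner directly.
#!/usr/bin/perl -w
print "kegg\tpath\tgene\tsymbol\n";
while(<>){
 chomp;
 # C line: pathway ID, pathway name, then the [PATH:...] tag
 if (/^C/){
    /C\s+(\d+)\s+(.*?)\s+\[(.*?)\]/;
        $kegg=$1;
        $path=$2;
 }
 # D line: gene ID followed by the gene symbol
 if(/^D/){
 /D\s+(\d+)\s+(\w+)/;
 $gene=$1;
 $symbol=$2;
 print "$kegg\t$path\t$gene\t$symbol\n";
  }
}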
027-R+Perl-游戏


Posted 2017-3-8 09:45:48
chapman posted on 2017-2-26 10:00
import re
from collections import OrderedDict

number of non-empty pathways:317
number of gene in all pathways:7227