bedtools: the Swiss Army knife of genomic data analysis

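Posted on 2017-8-10 20:15:50
What is bedtools?
In short, bedtools is the Swiss Army knife of genomic data processing, and it stands out in the field of genome analysis. It is a toolset that is both efficient and broad in scope, so if you use it well you can get through a great deal of work. First, it handles many data formats, including the commonly used BAM, BED, GFF/GTF, and VCF. Second, the problems it addresses are specific yet wide-ranging: for example, the annotate command will "Annotate coverage of features from multiple files.", and the bamtobed command works on BAM data to "Convert BAM alignments to BED (& other) formats." What makes the software so pleasant to use is that each tool (command) handles exactly one problem, and a script can chain these tools together to get twice the result with half the effort. I hope everyone can learn it and put it to use in day-to-day genome analysis.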
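To make the "one tool per problem, chained by a script" idea concrete, here is a minimal sketch that pipes the sort sub-command into merge. The file names, the temporary directory, and the three demo intervals are made up for illustration, and the script skips the bedtools step if the program is not on your PATH.

```shell
#!/bin/sh
# Hypothetical demo data: three chr1 intervals, deliberately out of order.
workdir=$(mktemp -d)
printf 'chr1\t300\t400\nchr1\t100\t200\nchr1\t150\t250\n' > "$workdir/demo.bed"

if command -v bedtools >/dev/null 2>&1; then
    # sort orders records by chromosome and start position;
    # merge then collapses the overlapping 100-200 and 150-250 intervals.
    bedtools sort -i "$workdir/demo.bed" | bedtools merge -i stdin > "$workdir/merged.bed"
    cat "$workdir/merged.bed"
    # chr1	100	250
    # chr1	300	400
else
    echo "bedtools not found on PATH; skipping the demo" >&2
fi
```

merge expects sorted input, which is why sort comes first in the pipe.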
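Past and present
The history of this software can be summed up in one sentence: when the developer needed a tool of this kind, he searched through everything available, found nothing he was happy with, and so wrote one himself! (Full of positive energy.) The software was developed by the Quinlan lab at the University of Utah. The first version was released in 2009, the latest version at the time of writing is 2.26, and you can download it from GitHub and install it yourself.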
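Installing the software
Installation is very simple. I personally prefer to install with the system's own package manager. For example, I run Ubuntu and usually install software with the apt-get command, which works for this software too. The command is: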
apt-get install bedtools
On Fedora/CentOS, you can also install it with the yum command:
yum install BEDTools
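Somewhat awkwardly, though, the version in the current system repositories is 2.25, so someone like me who almost compulsively chases the newest release installs from source instead. First download the release tarball from GitHub, then run the following commands (replace the archive name with that of the version you actually downloaded):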
tar -zxvf bedtools-2.25.0.tar.gz
cd bedtools2
make
cp bin/* /usr/local/bin/
Then you are ready to use it!
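Whichever installation route you took, a quick check like the one below can confirm that the shell actually finds the program. It is only a sketch: the bedtools_status variable is a name made up for this example, and the version string printed will depend on what you installed.

```shell
#!/bin/sh
# Check whether a bedtools executable is reachable through PATH.
if command -v bedtools >/dev/null 2>&1; then
    bedtools_status=installed
    echo "bedtools found at: $(command -v bedtools)"
    # Prints the installed version.
    bedtools --version
else
    bedtools_status=missing
    echo "bedtools is not on PATH; check the installation steps"
fi
```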
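Actual usage
Personally, I think that studying a piece of software you do not actually need is pointless; at most it gives you something to talk about. If you really do want to use a tool, the best getting-started guide is, in my view, its help command: it quickly tells you what the software does, so you can decide whether it is worth learning and which of its tools you need to learn. Below is the help output of this software for your reference.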
bedtools: flexible tools for genome arithmetic and DNA sequence analysis.
usage:    bedtools <subcommand> [options]

The bedtools sub-commands include:

[ Genome arithmetic ]
    intersect     Find overlapping intervals in various ways.
    window        Find overlapping intervals within a window around an interval.
    closest       Find the closest, potentially non-overlapping interval.
    coverage      Compute the coverage over defined intervals.
    map           Apply a function to a column for each overlapping interval.
    genomecov     Compute the coverage over an entire genome.
    merge         Combine overlapping/nearby intervals into a single interval.
    cluster       Cluster (but don't merge) overlapping/nearby intervals.
    complement    Extract intervals _not_ represented by an interval file.
    shift         Adjust the position of intervals.
    subtract      Remove intervals based on overlaps b/w two files.
    slop          Adjust the size of intervals.
    flank         Create new intervals from the flanks of existing intervals.
    sort          Order the intervals in a file.
    random        Generate random intervals in a genome.
    shuffle       Randomly redistrubute intervals in a genome.
    sample        Sample random records from file using reservoir sampling.
    spacing       Report the gap lengths between intervals in a file.
    annotate      Annotate coverage of features from multiple files.

[ Multi-way file comparisons ]
    multiinter    Identifies common intervals among multiple interval files.
    unionbedg     Combines coverage intervals from multiple BEDGRAPH files.

[ Paired-end manipulation ]
    pairtobed     Find pairs that overlap intervals in various ways.
    pairtopair    Find pairs that overlap other pairs in various ways.

[ Format conversion ]
    bamtobed      Convert BAM alignments to BED (& other) formats.
    bedtobam      Convert intervals to BAM records.
    bamtofastq    Convert BAM records to FASTQ records.
    bedpetobam    Convert BEDPE intervals to BAM records.
    bed12tobed6   Breaks BED12 intervals into discrete BED6 intervals.

[ Fasta manipulation ]
    getfasta      Use intervals to extract sequences from a FASTA file.
    maskfasta     Use intervals to mask sequences from a FASTA file.
    nuc           Profile the nucleotide content of intervals in a FASTA file.

[ BAM focused tools ]
    multicov      Counts coverage from multiple BAMs at specific intervals.
    tag           Tag BAM alignments based on overlaps with interval files.

[ Statistical relationships ]
    jaccard       Calculate the Jaccard statistic b/w two sets of intervals.
    reldist       Calculate the distribution of relative distances b/w two files.
    fisher        Calculate Fisher statistic b/w two feature files.

[ Miscellaneous tools ]
    overlap       Computes the amount of overlap from two intervals.
    igv           Create an IGV snapshot batch script.
    links         Create a HTML page of links to UCSC locations.
    makewindows   Make interval "windows" across a genome.
    groupby       Group by common cols. & summarize oth. cols. (~ SQL "groupBy")
    expand        Replicate lines based on lists of values in columns.
    split         Split a file into multiple files with equal records or base pairs.

[ General help ]
    --help        Print this help menu.
    --version     What version of bedtools are you using?.
    --contact     Feature requests, bugs, mailing lists, etc.
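As one way to go from this list to a first real command, the sketch below tries intersect, the first sub-command in the menu, on two tiny BED files. The file names and intervals are invented for illustration, and the bedtools steps are skipped if the program is not installed. Each sub-command also has its own help text, which the last command peeks at.

```shell
#!/bin/sh
# Hypothetical demo files: a.bed holds three features, b.bed holds one region.
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t300\t400\nchr1\t500\t600\n' > "$workdir/a.bed"
printf 'chr1\t150\t350\n' > "$workdir/b.bed"

if command -v bedtools >/dev/null 2>&1; then
    # Default mode: report the overlapping portion of each A feature.
    bedtools intersect -a "$workdir/a.bed" -b "$workdir/b.bed" > "$workdir/overlap.bed"
    cat "$workdir/overlap.bed"
    # chr1	150	200
    # chr1	300	350

    # -v inverts the test: keep only A features with no overlap in B.
    bedtools intersect -v -a "$workdir/a.bed" -b "$workdir/b.bed" > "$workdir/no_overlap.bed"
    cat "$workdir/no_overlap.bed"
    # chr1	500	600

    # Show the first lines of the sub-command's own help text.
    bedtools intersect -h 2>&1 | head -n 5
else
    echo "bedtools not found on PATH; skipping the demo" >&2
fi
```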
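Thank you for reading today's article; let's grow into better versions of ourselves together!
You are welcome to follow my WeChat account so we can all learn bioinformatics together~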