Published March 31, 2024 (author: )

Manual Reference Pages - bwa (1)

NAME

bwa - Burrows-Wheeler Alignment Tool

CONTENTS

Synopsis

Description

Commands And Options

Sam Alignment Format

Notes On Short-read Alignment

Alignment Accuracy

Estimating Insert Size Distribution

Memory Requirement

Speed

Changes In Bwa-0.6

See Also

Author

License And Citation

History

SYNOPSIS

bwa index (build the index)

bwa mem > (single-end reads)

bwa mem > (paired-end reads)

bwa aln short_ > aln_

bwa samse aln_ short_ >

bwa sampe aln_ aln_ >

bwa bwasw long_ >

DESCRIPTION

BWA is a software package for mapping low-divergent sequences against a large

reference genome, such as the human genome. It consists of three algorithms:

BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina

sequence reads up to 100bp, while the other two are for longer sequences ranging from

70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support

and split alignment, but BWA-MEM, which is the latest, is generally recommended for

high-quality queries as it is faster and more accurate. BWA-MEM also has better

performance than BWA-backtrack for 70-100bp Illumina reads.

For all the algorithms, BWA first needs to construct the FM-index for the reference

genome (the index command). Alignment algorithms are invoked with different

sub-commands: aln/samse/sampe for BWA-backtrack, bwasw for BWA-SW and mem for

the BWA-MEM algorithm.
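
To make the role of the FM-index more concrete, the following is a minimal Python sketch of how an exact-match query can be answered from a Burrows-Wheeler transform. It is a toy illustration, not BWA's implementation: the function names build_fm_index and find_exact are hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored in full, all of which would be impractical for a real genome.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for `text`."""
    text += "$"  # sentinel; assumed to sort before every other symbol
    # Naive suffix array: sort suffix start positions (toy inputs only).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch] = number of symbols in text that sort strictly before ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def find_exact(pattern, sa, c_table, occ):
    """Backward search: return the sorted start positions of exact matches."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, c_table, occ = build_fm_index("GATTACAGATTACA")
print(find_exact("TTA", sa, c_table, occ))  # [2, 9]
```

The search consumes the pattern from right to left and narrows a range of suffix-array rows at each step, so the work per query depends on the pattern length rather than on the reference length.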

COMMANDS AND OPTIONS

index bwa index [-p prefix] [-a algoType] <>

Index database sequences in the FASTA format.

OPTIONS:

-p STR Prefix of the output database [same as db filename]

-a STR Algorithm for constructing BWT index. Available options are:

is (default)

IS linear-time algorithm for constructing the suffix array. It

requires 5.37N memory, where N is the size of the database.

IS is moderately fast, but does not work with databases

larger than 2GB. IS is the default algorithm due to its

simplicity. The current code for the IS algorithm was

reimplemented by Yuta Mori.

bwtsw Algorithm implemented in BWT-SW. This method works

with the whole human genome.
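
The two constraints stated for the IS algorithm (5.37N memory, no databases larger than 2GB) can be turned into a small back-of-the-envelope helper. The sketch below is illustrative only: suggest_index_algorithm is a hypothetical function, not part of BWA, and it assumes that N is measured in bytes and that "2GB" means 2 GiB.

```python
IS_MEMORY_FACTOR = 5.37          # from the text: IS requires 5.37N memory
IS_MAX_DB_BYTES = 2 * 1024 ** 3  # assumption: "2GB" is read as 2 GiB


def suggest_index_algorithm(db_size_bytes):
    """Return (value for -a, estimated IS memory in bytes or None).

    Hypothetical helper: picks "is" when the database fits under the
    stated limit, otherwise "bwtsw". No memory estimate is given for
    bwtsw because the text does not state one.
    """
    if db_size_bytes > IS_MAX_DB_BYTES:
        return "bwtsw", None
    return "is", IS_MEMORY_FACTOR * db_size_bytes


algo, mem = suggest_index_algorithm(100_000_000)
print(algo, round(mem / 1e6), "MB")
```

For a 100 Mbp database this gives roughly 537 MB for IS, while a database of about 3.1 Gbp (the scale of the human genome) exceeds the limit and falls through to bwtsw.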

mem bwa mem [-aCHMpP] [-t nThreads] [-k minSeedLen] [-w bandWidth] [-d

zDropoff] [-r seedSplitRatio] [-c maxOcc] [-A matchScore] [-B mmPenalty] [-O

gapOpenPen] [-E gapExtPen] [-L clipPen] [-U unpairPen] [-R RGline] [-v

verboseLevel] []

Align 70bp-1Mbp query sequences with the BWA-MEM algorithm. Briefly,

the algorithm works by seeding alignments with maximal exact matches (MEMs)

and then extending seeds with the affine-gap Smith-Waterman algorithm (SW).
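
As an illustration of the extension step, here is a minimal Python sketch of local alignment scoring with affine gap penalties (the Gotoh formulation of Smith-Waterman). It is not BWA-MEM's code: it omits seeding, banding and the z-dropoff heuristic, and the function name sw_affine_score is hypothetical. The parameters are named after the -A, -B, -O and -E options in the synopsis; the default values used here, and the convention that a gap of length k costs gap_open + k * gap_ext, are assumptions of this sketch.

```python
def sw_affine_score(query, target, match=1, mismatch=4, gap_open=6, gap_ext=1):
    """Return the best local alignment score with affine gap penalties.

    A gap of length k is charged gap_open + k * gap_ext (assumed convention).
    """
    neg = float("-inf")
    cols = len(target) + 1
    h_prev = [0] * cols    # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg] * cols  # F: best score ending with a gap that consumes query
    best = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * cols
        f_cur = [neg] * cols
        e = neg            # E: best score ending with a gap that consumes target
        for j in range(1, cols):
            # Extend an existing gap, or open a new one from H.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f_cur[j] = max(f_prev[j] - gap_ext, h_prev[j] - gap_open - gap_ext)
            s = match if query[i - 1] == target[j - 1] else -mismatch
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(sw_affine_score("ACGT", "ACGT"))            # 4
print(sw_affine_score("AAAACCCC", "AAAAGCCCC"))   # 4
```

In the second call the two sequences differ by a single inserted base, but with the assumed penalties bridging that gap would cost 7 and yield only 8 - 7 = 1, so the best local alignment is one of the ungapped 4-base blocks.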

If the mate file is absent and option -p is not set, this command treats the input

reads as single-end. If the mate file is present, this command assumes that the i-th

read in the read file and the i-th read in the mate file constitute a read pair. If -p

is used, the command assumes that the 2i-th and the (2i+1)-th reads in the read file

constitute a read pair (such an input file is said to be interleaved). In this case, the

mate file is ignored. In the paired-end mode, the mem command will infer the read

orientation and the insert size distribution from a batch of reads.
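
The pairing rules described above can be summarised in a short Python sketch. The function pair_reads is hypothetical and operates on in-memory lists rather than FASTQ files. Raising an error on an odd number of interleaved reads or on files of unequal length is an assumption of this sketch, not documented behaviour of the mem command.

```python
def pair_reads(reads, mates=None, interleaved=False):
    """Return a list of (read, mate) tuples; mate is None for single-end input."""
    if interleaved:
        # With -p: reads 2i and 2i+1 of the read file form a pair,
        # and any mate file is ignored.
        if len(reads) % 2:
            raise ValueError("interleaved input needs an even number of reads")
        return [(reads[i], reads[i + 1]) for i in range(0, len(reads), 2)]
    if mates is None:
        # No mate file and no -p: single-end mode.
        return [(read, None) for read in reads]
    if len(reads) != len(mates):
        raise ValueError("read file and mate file differ in length")
    # Separate files: the i-th read pairs with the i-th mate.
    return list(zip(reads, mates))


print(pair_reads(["r1", "r2"], mates=["m1", "m2"]))
print(pair_reads(["r1/1", "r1/2", "r2/1", "r2/2"], interleaved=True))
```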