Posted December 7, 2023 (author: )

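SnpEff: annotating SNPs in a VCF file

The SnpEff paper

Title: A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff

In brief: annotates SNPs and predicts their effects

Journal: Fly (Austin)

Year: 2012

Citations: 6162 (Google Scholar, 2021-11-19)

Link

Installation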

wget -c /versions/snpEff_latest_

unzip snpEff_latest_

route="/your_route/hutongyuan/softwares/snpEff"

java -jar $route/ -h

SnpEff databases

java -jar $route/ databases >

The listing covers 46,041 genomes, each with its database download URL.
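With tens of thousands of entries, the listing is easier to use when searched rather than read. The sketch below assumes the output of the `databases` command was saved to a text file; the function name `find_genome` and the file name `snpeff_databases.txt` in the usage line are hypothetical, since the original file name is not shown above.

```shell
# Case-insensitive, fixed-string search of a saved database listing.
# $1 = organism or genome name fragment, $2 = path to the saved listing.
find_genome() {
  grep -i -F -- "$1" "$2"
}

# Hypothetical usage (file name is an assumption):
# find_genome "escherichia" snpeff_databases.txt
```

Matching lines are printed unchanged, so the genome identifier and download URL on each line stay visible.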

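Custom database

cd snpEff/ # enter the directory that contains the jar file

mkdir -p ./data/bacteria/

cp my_ref_ ./data/bacteria/ # copy your genome's GFF (with the sequence embedded)

echo " : a_name" >> # append the genome entry to the config

java -jar $route/ build -gff3 -v bacteria # build the database

Partial log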

00:00:00 SnpEff version SnpEff 5.0e (build 2021-03-09 06:01), by Pablo Cingolani

00:00:00 Command: 'build'

00:00:00 Building database for 'bacteria'

00:00:00 Reading configuration file ''. Genome: 'bacteria'

00:00:00 Reading config file: /hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/

00:00:00 done

Reading GFF3 data file : '/hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/./data/bacteria/'

#-----------------------------------------------

# Genome name : 'a_name'

# Genome version : 'bacteria'

# Genome ID : 'bacteria[0]'

# Has protein coding info : true

# Has Tr. Support Level info : true

# Genes : 4196

# Protein coding genes : 4196

#-----------------------------------------------

00:00:03 Done

00:00:03 Logging

00:00:08 Checking

00:00:11 Done.

Result

Annotating SNPs

mkdir result/

java -jar $route/ -v bacteria

> ./result/parsnp_

mv snpEff_ ./result/

mv snpEff_ ./result/

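Parameters

-v verbose mode

-no-intergenic omit intergenic annotations

-no-downstream omit downstream annotations

-no-upstream omit upstream annotations

-no-intron omit intron annotations

-no-utr omit UTR annotations

Partial log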

00:00:00 SnpEff version SnpEff 5.0e (build 2021-03-09 06:01), by Pablo Cingolani

00:00:00 Command: 'ann'

00:00:00 Reading configuration file ''. Genome: 'bacteria'

00:00:00 Reading config file: /hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/

00:00:00 done

00:00:00 Reading database for genome version 'bacteria' from file '/hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/./data/bacteria/snpEffectPredic

00:00:01 done

#-----------------------------------------------

# Genome name : 'a_name'

# Genome version : 'bacteria'

# Genome ID : 'bacteria[0]'

# Has protein coding info : true

# Has Tr. Support Level info : true

# Genes : 4196

# Protein coding genes : 4196

#-----------------------------------------------

00:00:15 Creating summary file: snpEff_

00:00:15 Creating genes file: snpEff_

00:00:15 done.

00:00:15 Logging

00:00:20 Checking

00:00:23 Done.

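Results

Results: file

1 Chromosome

2 Variant position

3 Flanking sequence; the variant position is immediately to the right of the "."

4 Reference base

5 Alternate (mutated) base

6 Whether the variant passed filtering

7 Variant type, amino acid change, and upstream, downstream, intergenic, or intron context

8 The reference, which is 0

9/10 0 = not mutated, 1 = mutated

Details of column 7, shown for three arbitrary variants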
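The annotation column can also be unpacked on the command line. The sketch below is an assumption-laden example rather than part of the original workflow: it assumes a standard VCF layout in which the annotation sits in the INFO field, the eighth tab-separated column per the VCF specification (change `$8` if your file is laid out differently), and that SnpEff wrote its usual `ANN=` key, whose sub-fields are separated by `|` in the order Allele, Annotation, Impact, Gene name, and so on, with HGVS.p in position 11. The function name `first_annotation` is hypothetical.

```shell
# Print CHROM, POS, REF, ALT, effect, impact, gene name and protein change
# taken from the first ANN entry of every variant.
# Reads the file(s) given as arguments, or stdin when none are given.
first_annotation() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
    /^#/ { next }                        # skip VCF header lines
    {
      n = split($8, info, ";")           # INFO is a ;-separated key=value list
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^ANN=/) {
          ann = substr(info[i], 5)       # drop the "ANN=" prefix
          split(ann, entries, ",")       # entries are comma-separated
          split(entries[1], f, "[|]")    # sub-fields are pipe-separated
          print $1, $2, $4, $5, f[2], f[3], f[4], f[11]
        }
      }
    }' "$@"
}
```

Only the first entry is reported per variant; a variant that affects several genes or transcripts carries additional comma-separated entries that this sketch ignores.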

Results: HTML file

Report modules

Summary

Variant rate by chromosome

Variants by type

Number of variants by impact

Number of variants by functional class

Number of variants by effect

Quality histogram

InDel length histogram

Base variant table

Transition vs transversions (ts/tv)

Allele frequency

Allele Count

Codon change table

Amino acid change table

Chromosome variants plots

Details by gene

Statistics by variant effect class

Statistics by variant type and region

Base change statistics

Amino acid change statistics

Variant position statistics
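A rough per-impact tally can also be computed directly from the annotated VCF, for example as a sanity check next to the HTML report. This is a sketch under assumptions: it expects the `ANN=` key in the INFO field (column 8 of a standard VCF), reads the impact from the third pipe-separated sub-field, and counts only the first annotation entry of each variant, so its numbers are not guaranteed to match the report's own counting. The function name `impact_counts` is hypothetical.

```shell
# Count variants per impact level (e.g. HIGH, MODERATE, LOW, MODIFIER),
# using the first ANN entry of each variant.
impact_counts() {
  awk -F'\t' '
    /^#/ { next }                              # skip VCF header lines
    match($8, /ANN=[^;]*/) {
      ann = substr($8, RSTART + 4, RLENGTH - 4) # text after "ANN="
      split(ann, entries, ",")                  # comma-separated entries
      split(entries[1], f, "[|]")               # pipe-separated sub-fields
      count[f[3]]++                             # f[3] is the impact
    }
    END { for (k in count) print k "\t" count[k] }' "$@" | sort
}
```

The output is one line per impact level, tab-separated from its count and sorted by impact name.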
