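Posted December 7, 2023
SnpEff: annotating SNPs in a VCF file
The SnpEff paper
Title: A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff
Summary: annotates SNPs and predicts their effects
Journal: Fly (Austin)
Year: 2012
Citations: 6162 (Google Scholar, 2021-11-19)
Address
Installation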
wget -c /versions/snpEff_latest_
unzip snpEff_latest_
route="/your_route/hutongyuan/softwares/snpEff"
java -jar $route/ -h
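Before moving on, it can help to confirm that the install directory is usable. The helper below is only a sketch: `check_snpeff` is a hypothetical name, and it assumes the jar is called `snpEff.jar`, as in the standard distribution.

```shell
# Hypothetical helper: check that a SnpEff install directory looks usable.
# Assumption: the jar is named snpEff.jar, as in the standard distribution.
check_snpeff() {
    local dir="$1"
    if [ ! -f "$dir/snpEff.jar" ]; then
        echo "missing: $dir/snpEff.jar"
        return 1
    fi
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH"
        return 1
    fi
    echo "ok: $dir/snpEff.jar"
}

# Usage (with the install path variable defined above):
# check_snpeff "$route"
```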
SnpEff databases
java -jar $route/ databases >
The list covers 46,041 different genomes and includes a download address for each.
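With that many entries, searching the saved list is easier than reading it. The sketch below assumes the list is tab-separated with the genome ID in the first column. `find_genome` is a hypothetical helper, and the two sample lines are made-up stand-ins for the real list.

```shell
# Print the genome ID (first column) of every line containing a search term.
# Assumption: tab-separated listing whose first column is the genome ID.
find_genome() {
    awk -F'\t' -v term="$1" '
        BEGIN { term = tolower(term) }
        index(tolower($0), term) {        # case-insensitive literal match
            id = $1
            gsub(/^[ ]+|[ ]+$/, "", id)   # strip column padding
            print id
        }' "$2"
}

# Made-up two-line listing standing in for the saved database list.
list=$(mktemp)
printf 'GRCh38.99 \tHomo_sapiens\n' >  "$list"
printf 'Escherichia_coli_k12 \tEscherichia coli K-12\n' >> "$list"

find_genome "coli" "$list"
```

The ID printed by the search is the value to pass to SnpEff as the genome version.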
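Custom database
cd snpEff/ # go to the directory containing the jar file
mkdir -p ./data/bacteria/
cp my_ref_ ./data/bacteria/ # copy the GFF for my genome (with the sequence included)
echo " : a_name" >> # add the genome entry to the config
java -jar $route/ build -gff3 -v bacteria # build the database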
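The staging steps above can be wrapped so that re-running them does not append a duplicate config entry. This is a sketch under assumptions: it relies on SnpEff's default layout (`data/<genome_id>/genes.gff` next to `snpEff.config`) and on config entries of the form `<genome_id>.genome : <name>`. `register_genome` and the demo paths are hypothetical.

```shell
# Stage a GFF3 file and register a genome in snpEff.config.
# Assumptions: default layout data/<genome_id>/genes.gff next to
# snpEff.config, and a config entry "<genome_id>.genome : <name>".
register_genome() {
    local dir="$1" id="$2" name="$3" gff="$4"
    mkdir -p "$dir/data/$id"
    cp "$gff" "$dir/data/$id/genes.gff"
    # Append the entry only once, so re-running does not duplicate it.
    if ! grep -q "^$id\.genome" "$dir/snpEff.config" 2>/dev/null; then
        echo "$id.genome : $name" >> "$dir/snpEff.config"
    fi
}

# Demo in a scratch directory with a placeholder GFF.
demo=$(mktemp -d)
printf '##gff-version 3\n' > "$demo/ref.gff"
register_genome "$demo/snpEff" bacteria a_name "$demo/ref.gff"
register_genome "$demo/snpEff" bacteria a_name "$demo/ref.gff"   # no duplicate
cat "$demo/snpEff/snpEff.config"
```

After this, the build command is run against the same genome ID (`bacteria` here).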
Partial log output
00:00:00 SnpEff version SnpEff 5.0e (build 2021-03-09 06:01), by Pablo Cingolani
00:00:00 Command: 'build'
00:00:00 Building database for 'bacteria'
00:00:00 Reading configuration file ''. Genome: 'bacteria'
00:00:00 Reading config file: /hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/
00:00:00 done
Reading GFF3 data file : '/hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/./data/bacteria/'
#-----------------------------------------------
# Genome name : 'a_name'
# Genome version : 'bacteria'
# Genome ID : 'bacteria[0]'
# Has protein coding info : true
# Has Tr. Support Level info : true
# Genes : 4196
# Protein coding genes : 4196
#-----------------------------------------------
00:00:03 Done
00:00:03 Logging
00:00:08 Checking
00:00:11 Done.
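A quick sanity check on a build is that the summary reports a non-zero number of genes. The sketch below pulls those values out of a saved copy of the log. `log_value` is a hypothetical helper, and the sample file only reproduces summary lines from the log above.

```shell
# Print the value of one "# key : value" line from a saved SnpEff log.
log_value() {
    awk -F' : ' -v key="$1" '
        { k = $1; sub(/^# */, "", k); gsub(/ +$/, "", k) }  # normalise the key
        k == key { print $2; exit }' "$2"
}

# Summary lines copied from the build log shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
# Genome name : 'a_name'
# Genes : 4196
# Protein coding genes : 4196
EOF

genes=$(log_value "Genes" "$log")
coding=$(log_value "Protein coding genes" "$log")
if [ "$genes" -gt 0 ] && [ "$coding" -gt 0 ]; then
    echo "database looks populated: $genes genes, $coding protein coding"
fi
```

A count of 0 protein coding genes usually means the GFF was not parsed as expected, so it is worth checking before annotating.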
Result
Annotating SNPs
mkdir result/
java -jar $route/ -v bacteria
> ./result/parsnp_
mv snpEff_ ./result/
mv snpEff_ ./result/
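Options
-v verbose mode
-no-intergenic omit intergenic annotations
-no-downstream omit downstream annotations
-no-upstream omit upstream annotations
-no-intron omit intron annotations
-no-utr omit UTR annotations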
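The filters above can be combined in one call. The sketch below only assembles and prints the command, so it can be inspected before running. `snpeff_cmd`, `snpEff.jar`, and `input.vcf` are placeholder names, not taken from the commands above.

```shell
# Print (do not run) an annotation command with the chosen region filters.
# The jar path, genome ID, and VCF name are placeholders.
snpeff_cmd() {
    local jar="$1" genome="$2" vcf="$3" flags="" region
    shift 3
    for region in "$@"; do
        case "$region" in
            intergenic|downstream|upstream|intron|utr)
                flags="$flags -no-$region" ;;
            *)
                echo "unknown region: $region" >&2
                return 1 ;;
        esac
    done
    echo "java -jar $jar ann -v$flags $genome $vcf"
}

snpeff_cmd snpEff.jar bacteria input.vcf intergenic upstream downstream
```

For a compact bacterial genome, dropping the intergenic, upstream, and downstream annotations tends to shorten the output considerably, though whether those records matter depends on the analysis.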
Partial log output
00:00:00 SnpEff version SnpEff 5.0e (build 2021-03-09 06:01), by Pablo Cingolani
00:00:00 Command: 'ann'
00:00:00 Reading configuration file ''. Genome: 'bacteria'
00:00:00 Reading config file: /hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/
00:00:00 done
00:00:00 Reading database for genome version 'bacteria' from file '/hwfssz1/ST_HEALTH/P18Z10200N0423/hutongyuan/softwares/snpEff/./data/bacteria/snpEffectPredic
00:00:01 done
#-----------------------------------------------
# Genome name : 'a_name'
# Genome version : 'bacteria'
# Genome ID : 'bacteria[0]'
# Has protein coding info : true
# Has Tr. Support Level info : true
# Genes : 4196
# Protein coding genes : 4196
#-----------------------------------------------
00:00:15 Creating summary file: snpEff_
00:00:15 Creating genes file: snpEff_
00:00:15 done.
00:00:15 Logging
00:00:20 Checking
00:00:23 Done.
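Result
Result: output file
1 Chromosome
2 Variant position
3 Sequence flanking the variant; the variant position is immediately to the right of the "."
4 Reference base
5 Variant base
6 Whether the variant passed filtering
7 Annotation: variant type, amino acid change, and upstream, downstream, intergenic, or intron effects
8 The reference, which is always 0
9/10 Per-sample genotype: 0 = no variant, 1 = variant
Details of column 7, for three arbitrarily chosen variants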
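The annotation can also be read programmatically. In a standard VCF, SnpEff writes it to the INFO field (the eighth tab-separated column) under the key `ANN`, as comma-separated annotations whose sub-fields are separated by `|`. The sketch below prints a few sub-fields of the first annotation per variant. `ann_summary` is a hypothetical helper and the sample record is made up.

```shell
# Print position, effect, impact, gene, and protein change for the first
# annotation of each variant in a SnpEff-annotated VCF.
ann_summary() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        {
            n = split($8, info, ";")           # INFO is the eighth VCF column
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^ANN=/) {
                    sub(/^ANN=/, "", info[i])
                    split(info[i], anns, ",")  # one entry per annotation
                    split(anns[1], f, "|")     # sub-fields of the first one
                    # f[2]=effect, f[3]=impact, f[4]=gene name, f[11]=HGVS.p
                    print $1 ":" $2, f[2], f[3], f[4], f[11]
                }
            }
        }' "$1"
}

# Made-up record for illustration only.
vcf=$(mktemp)
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$vcf"
printf 'chr1\t1234\t.\tA\tG\t40\tPASS\tANN=G|missense_variant|MODERATE|geneA|geneA|transcript|geneA.1|protein_coding|1/1|c.100A>G|p.Thr34Ala|100/900|100/900|34/299||\n' >> "$vcf"

ann_summary "$vcf"
```

Only the first annotation is reported here; a variant that overlaps several features carries more than one entry in `ANN`.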
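Result: HTML file
Report sections:
Summary
Variant rate by chromosome
Variants by type
Number of variants by impact
Number of variants by functional class
Number of variants by effect
Quality histogram
InDel length histogram
Base variant table
Transition vs transversions (ts/tv)
Allele frequency
Allele Count
Codon change table
Amino acid change table
Chromosome variants plots
Details by gene
[Figure: variant counts by effect category]
[Figure: variant counts by type and region]
[Figure: base change counts]
[Figure: amino acid change counts]
[Figure: variant position counts]
Predicting the effects of variants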
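A rough command-line counterpart to the "Number of variants by impact" section is to tally the impact of the first annotation of each variant. This is a sketch: `impact_counts` is a hypothetical helper, the sample records are made up, and because it reads only one annotation per variant its totals need not match the HTML report.

```shell
# Count variants by the impact of their first ANN entry.
impact_counts() {
    awk -F'\t' '
        /^#/ { next }                               # skip header lines
        match($8, /ANN=[^;]*/) {
            ann = substr($8, RSTART + 4, RLENGTH - 4)   # text after "ANN="
            split(ann, anns, ",")
            split(anns[1], f, "|")
            count[f[3]]++                           # third sub-field is the impact
        }
        END { for (k in count) print k, count[k] }' "$1" | sort
}

# Made-up records; the annotations are trimmed after the gene name for brevity.
# printf repeats the format once per group of three arguments.
vcf2=$(mktemp)
printf 'chr1\t%s\t.\tA\tG\t40\tPASS\tANN=G|%s|%s|geneA\n' \
    100 missense_variant MODERATE \
    200 synonymous_variant LOW \
    300 missense_variant MODERATE > "$vcf2"

impact_counts "$vcf2"
```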

