taffish/vcftools

vcftools

TAFFISH wrapper for VCFtools, a C++ and Perl toolkit for working with VCF and BCF variant files.

Package identity

  • name: vcftools
  • command: taf-vcftools
  • TAFFISH version: 0.1.17-r1
  • kind: tool
  • container image: ghcr.io/taffish/vcftools:0.1.17-r1
  • upstream: VCFtools v0.1.17
  • runtime version: VCFtools (0.1.17)
  • upstream license: LGPL-3.0-only
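
The identity fields above are related: the image tag repeats the TAFFISH version, which appears to be the upstream version followed by a packaging revision. The sketch below assumes that `<upstream>-r<N>` layout (inferred from the listed values, not a documented rule) and uses hypothetical helper names to split the version and to read the version out of a runtime banner such as `VCFtools (0.1.17)`.

```shell
# Assumption: TAFFISH versions look like "<upstream>-r<revision>", e.g. 0.1.17-r1.
taffish_version="0.1.17-r1"

# Hypothetical helpers built on POSIX parameter expansion.
upstream_version() { printf '%s\n' "${1%-r*}"; }   # strip the trailing "-rN"
package_revision() { printf '%s\n' "${1##*-r}"; }  # keep only "N"

# Extract "0.1.17" from a runtime banner such as "VCFtools (0.1.17)".
runtime_version() {
  printf '%s\n' "$1" | sed -n 's/^VCFtools (\(.*\))$/\1/p'
}

# Image reference as listed above: registry path plus the TAFFISH version as tag.
image_ref="ghcr.io/taffish/vcftools:${taffish_version}"
```

With the wrapper installed, the output of `taf-vcftools -- --version` could be passed to `runtime_version` and compared with `upstream_version "$taffish_version"`; this assumes the banner arrives as a single line on standard output.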

What is packaged

This app builds VCFtools from the official vcftools/vcftools v0.1.17 tag. The image includes the main vcftools binary and the upstream-installed Perl helper commands such as vcf-validator, vcf-stats, vcf-query, vcf-sort, vcf-compare, vcf-merge, vcf-isec, vcf-consensus, fill-an-ac, fill-aa, and fill-ref-md5.

The build enables the optional LAPACK-backed PCA feature. The image also ships the runtime tools that the helper scripts call: Perl, gzip/gunzip, GNU sort, bgzip/tabix, samtools (for samtools faidx), md5sum, and gnuplot.

Usage

taf-vcftools -- --help
taf-vcftools -- --version
taf-vcftools -- --vcf input.vcf --freq --out sample
taf-vcftools -- --gzvcf input.vcf.gz --recode --recode-INFO-all --out filtered
taf-vcftools vcftools --vcf input.vcf --site-mean-depth --out depth
taf-vcftools vcf-validator input.vcf
taf-vcftools vcf-query -f '%CHROM\t%POS\t%REF\t%ALT\n' input.vcf

taf-vcftools runs vcftools as the default upstream command. Because most VCFtools options start with --, put a -- separator in front of them (taf-vcftools -- ...) when passing upstream options directly to the default command. To run a helper command instead, name it as the first argument (command mode):

taf-vcftools vcf-sort input.vcf > sorted.vcf
taf-vcftools bgzip -c sorted.vcf > sorted.vcf.gz
taf-vcftools tabix -p vcf sorted.vcf.gz
taf-vcftools vcf-compare sorted.vcf.gz other.vcf.gz > compare.txt
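
The dispatch rule described above can be captured in a small shell function. This is a sketch of a personal convenience wrapper, not part of TAFFISH: the function name `tafvcf` and the `TAF_VCFTOOLS` variable are hypothetical. It inserts the `--` separator only when the first argument starts with `--`, and otherwise passes the arguments through as a named command.

```shell
# Command to invoke; overridable so the function can be exercised without
# the real wrapper installed (TAF_VCFTOOLS is a hypothetical variable).
TAF_VCFTOOLS="${TAF_VCFTOOLS:-taf-vcftools}"

tafvcf() {
  case "${1-}" in
    --)  "$TAF_VCFTOOLS" "$@" ;;      # separator already present
    --*) "$TAF_VCFTOOLS" -- "$@" ;;   # upstream option: use the default command
    *)   "$TAF_VCFTOOLS" "$@" ;;      # named command, e.g. vcf-sort
  esac
}
```

For example, `tafvcf --vcf input.vcf --freq --out sample` would run `taf-vcftools -- --vcf input.vcf --freq --out sample`, while `tafvcf vcf-sort input.vcf` is passed through unchanged.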

Common workflows

# Allele counts and frequencies
taf-vcftools -- --vcf cohort.vcf --freq --out cohort.freq
taf-vcftools -- --vcf cohort.vcf --counts --out cohort.counts

# Recode a filtered VCF while preserving INFO annotations
taf-vcftools -- --gzvcf cohort.vcf.gz --remove-indels --recode --recode-INFO-all --out snps

# Missingness, depth, and genotype matrix summaries
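# VCFtools 0.1.17 accepts one output function per run, so each statistic is a separate call
taf-vcftools -- --vcf cohort.vcf --missing-site --out qc
taf-vcftools -- --vcf cohort.vcf --site-mean-depth --out qc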
taf-vcftools -- --vcf cohort.vcf --012 --out cohort.matrix

# PCA, enabled in this image with LAPACK
taf-vcftools -- --vcf cohort.vcf --pca --out cohort.pca

# Validate, sort, index, compare, and query through helper commands
taf-vcftools vcf-validator cohort.vcf
taf-vcftools vcf-sort cohort.vcf > cohort.sorted.vcf
taf-vcftools bgzip -c cohort.sorted.vcf > cohort.sorted.vcf.gz
taf-vcftools tabix -p vcf cohort.sorted.vcf.gz
taf-vcftools vcf-query -f '%CHROM\t%POS\t%REF\t%ALT\n' cohort.sorted.vcf.gz
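
The validate, sort, compress, and index steps above can be chained so that a failure stops the sequence. The following is a minimal sketch, assuming an uncompressed input whose name ends in `.vcf`; the function name `sort_compress_index` and the `TAF_VCFTOOLS` override variable are hypothetical, and the function relies only on each command's exit status.

```shell
# Command to invoke; overridable for dry runs (hypothetical variable).
TAF_VCFTOOLS="${TAF_VCFTOOLS:-taf-vcftools}"

# Validate, sort, compress, and index one uncompressed VCF.
# Prints the path of the indexed .vcf.gz on success and returns non-zero
# as soon as any step exits with a non-zero status.
sort_compress_index() {
  in_vcf=$1
  sorted="${in_vcf%.vcf}.sorted.vcf"

  # Validator messages go to stderr so stdout carries only the result path.
  "$TAF_VCFTOOLS" vcf-validator "$in_vcf" >&2        || return 1
  "$TAF_VCFTOOLS" vcf-sort "$in_vcf" > "$sorted"     || return 1
  "$TAF_VCFTOOLS" bgzip -c "$sorted" > "$sorted.gz"  || return 1
  "$TAF_VCFTOOLS" tabix -p vcf "$sorted.gz"          || return 1
  printf '%s\n' "$sorted.gz"
}
```

Called as `sort_compress_index cohort.vcf`, this would leave `cohort.sorted.vcf`, `cohort.sorted.vcf.gz`, and the tabix index next to the input, and print `cohort.sorted.vcf.gz`.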

Notes and boundaries

  • VCFtools writes most result files using the --out prefix. Without --out, upstream defaults to out.* in the current working directory.
  • Compressed VCF input uses --gzvcf. Region-aware helper scripts usually need bgzipped and tabix-indexed files.
  • Some helper commands are deprecated upstream in favor of BCFtools equivalents, especially query-style utilities. For BCF/VCF manipulation not covered here, use the separate bcftools app.
  • This image includes samtools only to support VCFtools helper scripts that call samtools faidx; it is not intended to replace the dedicated samtools app.
  • Reference genomes, population panels, annotation databases, PLINK, and BCFtools are not bundled.
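
The first two notes can be turned into small helpers for scripts that need to pick the input option and predict where results land. This is a sketch with hypothetical function names: `--vcf` and `--gzvcf` come from the examples in this README, `--bcf` is the upstream option for BCF input, and the suffix table covers only the output options used above, with the suffixes VCFtools 0.1.x writes for them.

```shell
# Pick the VCFtools input option from the file name.
vcf_input_flag() {
  case "$1" in
    *.vcf.gz) printf '%s\n' --gzvcf ;;
    *.bcf)    printf '%s\n' --bcf ;;
    *.vcf)    printf '%s\n' --vcf ;;
    *)        echo "unrecognized input: $1" >&2; return 1 ;;
  esac
}

# Main result file for a given --out prefix and output option.
# An empty prefix falls back to "out", the upstream default.
vcftools_output_path() {
  prefix=${1:-out}
  case "$2" in
    --freq)            suffix=frq ;;
    --counts)          suffix=frq.count ;;
    --missing-site)    suffix=lmiss ;;
    --site-mean-depth) suffix=ldepth.mean ;;
    --012)             suffix=012 ;;   # .012.indv and .012.pos are written too
    --recode)          suffix=recode.vcf ;;
    *)                 echo "no suffix known for: $2" >&2; return 1 ;;
  esac
  printf '%s.%s\n' "$prefix" "$suffix"
}
```

For instance, `vcf_input_flag cohort.vcf.gz` prints `--gzvcf`, and `vcftools_output_path snps --recode` prints `snps.recode.vcf`, the file produced by the recode workflow shown earlier. Each run also writes a log file next to the results.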

Upstream

  • source: vcftools/vcftools, tag v0.1.17
  • license: LGPL-3.0-only
