Key Features
VCF Processing: Direct conversion from VCF files to diem format
Genome Polarization: Automated polarization of genetic markers using EM algorithm
Thresholding: Tools to help choose the minimum diagnostic index a marker must reach to be retained
Kernel Smoothing: Spatial smoothing of genomic data along chromosomes
Obtaining Tract Length: Detection and analysis of genomic regions with consistent ancestry
Parallel Processing: Multi-core support for computationally intensive operations
Flexible I/O: Support for various input/output formats including BED-like files
Core Functionality
The package provides several main analysis workflows:
Data Import and Processing
Convert VCF files to diem format using vcf2diem
Read diem BED format files with read_diem_bed
Handle masking of individuals and sites
Handle ploidy correctly, e.g. for sex chromosomes
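To make the import step concrete, here is a minimal sketch of how one biallelic VCF record could be reduced to a string of per-individual states. It does not call vcf2diem and is not the package's implementation. The helper names encode_genotype and encode_site are hypothetical, and the state alphabet (0, 1, 2 for the alternate-allele count, _ for unknown) is an assumption made for illustration. Haploid calls are simply treated as unknown here, so ploidy handling is not reproduced.

```python
def encode_genotype(gt: str) -> str:
    """Map one VCF GT value at a biallelic site to a single state character."""
    # Phased ("|") and unphased ("/") calls are treated the same way.
    alleles = gt.replace("|", "/").split("/")
    # Assumption: anything that is not a diploid 0/1 call (missing, haploid,
    # multiallelic) is encoded as unknown in this sketch.
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return "_"
    # State = number of alternate alleles: 0, 1, or 2.
    return str(alleles.count("1"))


def encode_site(vcf_line: str) -> str:
    """Turn one tab-separated VCF data line into a string of states."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; sample columns start at index 9.
    gt_index = fields[8].split(":").index("GT")
    return "".join(
        encode_genotype(sample.split(":")[gt_index]) for sample in fields[9:]
    )


example = "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:12\t0|1:9\t1/1:7\t./.:0"
print(encode_site(example))
```

Each character in the result corresponds to one individual, in the order of the sample columns.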
Polarization Analysis
Initialize polarization using a random null
Run EM algorithm to optimize marker polarization
Calculate diagnostic indices and support values
Parallel and linear processing options available
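The idea of polarizing a marker can be sketched in a few lines. The example below is a toy: it starts from a random polarity and then runs a simple greedy consensus loop in place of the EM algorithm, so it is not the package's method. All function names are hypothetical. It assumes markers are strings of states (0, 1, 2, or _ for unknown, one character per individual), and that flipping a marker means swapping 0 and 2.

```python
import random

# Swapping 0 and 2 reverses which side each homozygote is assigned to;
# heterozygotes (1) and unknowns (_) are unchanged.
_FLIP = str.maketrans("02", "20")


def flip_marker(states):
    """Return the marker with the opposite polarity."""
    return states.translate(_FLIP)


def random_null_polarity(n_markers, seed=0):
    """Draw a reproducible random flip/keep decision for every marker."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in range(n_markers)]


def apply_polarity(markers, polarity):
    """Flip exactly those markers whose polarity flag is True."""
    return [flip_marker(m) if flip else m for m, flip in zip(markers, polarity)]


def hybrid_index(markers, i):
    """Mean state / 2 for individual i, ignoring unknown calls."""
    calls = [int(m[i]) for m in markers if m[i] in "012"]
    return sum(calls) / (2 * len(calls)) if calls else 0.5


def greedy_polarize(markers, max_iter=20):
    """Toy stand-in for the EM step: flip markers that oppose the consensus."""
    current = list(markers)
    if not current:
        return current
    for _ in range(max_iter):
        h = [hybrid_index(current, i) for i in range(len(current[0]))]
        changed = False
        for k, m in enumerate(current):
            # Positive score: the marker agrees with the individuals' hybrid
            # indices; flipping the marker negates the score.
            score = sum(
                (int(c) - 1) * (hi - 0.5) for c, hi in zip(m, h) if c in "012"
            )
            if score < 0:
                current[k] = flip_marker(m)
                changed = True
        if not changed:
            break
    return current


start = apply_polarity(["0022", "0022", "0022"], [False, True, False])
print(greedy_polarize(start))
```

Flipping every marker at once gives an equally consistent result, so which side ends up labelled 0 depends on the starting polarity.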
Post-Polarization Analysis
Compute hybrid indices for individuals
Apply thresholding to filter less informative markers
Perform kernel smoothing across genomic windows
Generate tracts of contiguous ancestry and store them as contigs
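The last three steps can be illustrated with a short sketch. None of it uses the package's API: the function names are hypothetical, the Gaussian kernel is chosen only for illustration (the package's kernel and windowing may differ), and tracts are defined here simply as runs of identical states along a chromosome.

```python
import math


def filter_by_threshold(markers, diagnostic_index, threshold):
    """Keep markers whose diagnostic index is at least the threshold."""
    return [m for m, d in zip(markers, diagnostic_index) if d >= threshold]


def kernel_smooth(positions, values, bandwidth):
    """Gaussian-weighted average of values around each position."""
    smoothed = []
    for p in positions:
        weights = [math.exp(-0.5 * ((p - q) / bandwidth) ** 2) for q in positions]
        total = sum(w * v for w, v in zip(weights, values))
        smoothed.append(total / sum(weights))
    return smoothed


def tracts(positions, states):
    """Collapse consecutive identical states into (state, start, end) tuples."""
    out = []
    start = 0
    for i in range(1, len(states) + 1):
        # Close the current run at the end of the data or on a state change.
        if i == len(states) or states[i] != states[start]:
            out.append((states[start], positions[start], positions[i - 1]))
            start = i
    return out


positions = [10, 20, 30, 40, 50]
print(filter_by_threshold(["a", "b", "c"], [-5.0, -1.0, -3.0], -3.0))
print(kernel_smooth(positions, [0.0, 0.0, 1.0, 1.0, 1.0], bandwidth=10.0))
print(tracts(positions, "00221"))
```

In this sketch the tract length is the difference between the end and start positions of each tuple. A tract covering a single marker therefore has length zero.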