Gmin Details

Thanks for visiting the details page for our SMBE 2014 poster:

Poster PDF [figshare] | Gmin Manuscript [arXiv]

 


Definition

Gmin is a simple sequence measure intended to rapidly scan population genomic datasets to identify putative regions of gene flow. It is calculated as the ratio of the minimum genetic distance between two individuals, one drawn from each of two populations, min(dxy), to the average genetic distance over all between-population pairs, dxy:

G_{\mathrm{min}} = \frac{\min \left( d_{xy} \right)}{\bar{d}_{xy}}
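The ratio above can be sketched in a few lines of Python. This is a minimal illustration rather than the msmove or POPBAM implementation: the function names are hypothetical, genetic distance is assumed to be the per-site proportion of differences between two aligned haplotypes of equal length, and missing data are not handled.

```python
from itertools import product
import math


def pairwise_distance(seq_a, seq_b):
    """Per-site proportion of differences between two aligned sequences.

    Assumption: both sequences are non-empty, have equal length, and
    contain no missing data.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def gmin(pop_x, pop_y):
    """Ratio of the minimum to the mean between-population distance."""
    distances = [pairwise_distance(x, y) for x, y in product(pop_x, pop_y)]
    mean_dxy = sum(distances) / len(distances)
    if mean_dxy == 0:
        return math.nan  # no between-population differences: ratio undefined
    return min(distances) / mean_dxy


# Toy data: two haplotypes sampled from each population.
pop_x = ["AAAAAAAAAA", "AAAAAAAAAT"]
pop_y = ["AAAAAAAATT", "TTTTTAAATT"]
print(gmin(pop_x, pop_y))  # about 0.25 (min 0.1 / mean 0.4)
```

In the toy data, one between-population pair is much more similar than the average pair, which pulls the ratio well below 1.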


Performance

We created msmove [doi link], a modified version of Hudson’s popular coalescent simulation software ms. It calculates Gmin for simulated genealogies, allows finer control of the timing and probability of simulated migration events, and keeps track of genealogies that have experienced migration. We then simulated over a wide range of parameters to test the sensitivity and specificity of Gmin against a commonly used alternative, FST. Gmin is consistently more sensitive and specific than FST and performs best when migration is rare to intermediate and has occurred relatively recently in the history of a pair of populations. See our arXiv manuscript for more details on the design and results of these simulations.


Use

Gmin is intended to serve as a summary measure that identifies candidate gene flow regions, which can then be examined in more detail. It is therefore useful to calculate Gmin in sliding windows and then identify outliers visually or statistically. Because the Gmin measure is so simple, it can easily be calculated from any aligned sequence dataset drawn from two or more populations. The ability to calculate Gmin from BAM files has also been added to the population genomic software package POPBAM, using the haplo function.

popbam haplo -f <fasta_file> -o 2 -w <window_size> [bam_file] > <outfile>

This command calculates min(dxy) and dxy in <window_size> kb windows for each pair of populations in a dataset. See the POPBAM documentation for more details on running analyses with this software.
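For aligned sequences that are already in memory, the sliding-window scan described above might look like the following sketch. It is independent of POPBAM. All function names, the window and step parameters, and the 5% empirical cutoff are illustrative assumptions. Low values are flagged on the reasoning that a recently migrated haplotype lowers min(dxy) relative to the mean.

```python
from itertools import product
import math


def gmin_window(pop_x, pop_y, start, end):
    """Gmin over alignment columns [start, end); NaN if the mean distance is 0."""
    width = end - start
    distances = [
        sum(a != b for a, b in zip(x[start:end], y[start:end])) / width
        for x, y in product(pop_x, pop_y)
    ]
    mean_dxy = sum(distances) / len(distances)
    return min(distances) / mean_dxy if mean_dxy > 0 else math.nan


def sliding_gmin(pop_x, pop_y, window, step):
    """Return (window_start, Gmin) for each full window along the alignment."""
    length = len(pop_x[0])
    return [
        (start, gmin_window(pop_x, pop_y, start, start + window))
        for start in range(0, length - window + 1, step)
    ]


def low_outliers(scan, quantile=0.05):
    """Windows whose Gmin is at or below an empirical quantile (assumed cutoff)."""
    values = sorted(v for _, v in scan if not math.isnan(v))
    if not values:
        return []
    cutoff = values[int(quantile * (len(values) - 1))]
    return [(s, v) for s, v in scan if not math.isnan(v) and v <= cutoff]


# Toy alignment of 20 sites: only the second 10-site window contains a
# between-population pair that is unusually similar.
pop_x = ["AAAAAAAAAAAAAAAAAAAA", "AAAAAAAAAAAAAAAAAAAT"]
pop_y = ["TTTTAAAAAAAAAAAAAATT", "TTTTAAAAAATTTTTAAATT"]
scan = sliding_gmin(pop_x, pop_y, window=10, step=10)
for start, value in low_outliers(scan):
    print(start, round(value, 3))
```

With real data, the windows would span kilobases rather than 10 sites, and an empirical quantile is only one of several reasonable ways to define an outlier.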


For additional information, please contact:
The Garrigan lab @ the University of Rochester
Anthony Geneva [PhD candidate – Garrigan lab]
