The ABBA-BABA test was developed to try to determine whether there was any admixture between modern human and Neandertal populations.  Since then it has been used as evidence for or against hybridization in a number of studies.  The test requires four taxa: three ingroup species, the most basal of which is the one we think may be hybridizing, and one outgroup.  Furthermore, the sites we want to look at are a special class of biallelic sites: those that show the pattern ABBA or BABA.  The four letters give the allele carried by the first ingroup, the second ingroup, the basal ingroup, and the outgroup, in that order, with A being the outgroup (ancestral) allele and B the derived allele.




So what can produce these patterns?  To me the most obvious explanation is simply a series of mutations; either pattern requires two mutations on the species tree.  All things being equal, we have no reason to think that mutation would make one of these patterns more common than the other.  What else?  Well, a single mutation could have occurred after the first node, and the pattern might then be the result of simple lineage sorting of that one mutation.  If this process produces such patterns, they should again be equally likely.  This sets up the expectation that, without hybridization, the two counts should be equal (ABBA sites = BABA sites).  The statistic that was developed to test this is called Patterson's D statistic, and it is defined as D = (ABBA - BABA) / (ABBA + BABA), where ABBA and BABA are the numbers of sites showing each pattern.  D is expected to be zero when there has been no gene flow.
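
To make the counting concrete, here is a minimal Python sketch of the idea: tally ABBA and BABA sites across four aligned sequences and compute D as (ABBA - BABA) / (ABBA + BABA).  This is not the evobiR implementation; the function names `count_abba_baba` and `patterson_d` and the toy sequences are made up for illustration, and I am assuming that any site with a gap, N, or ambiguity code is simply dropped.

```python
def count_abba_baba(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites in four aligned sequences.

    p1 and p2 are the sister ingroups, p3 is the basal ingroup,
    and the outgroup allele is treated as ancestral (A).
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        # Assumption: drop sites with gaps, Ns, or ambiguity codes.
        if not {a, b, c, o} <= valid:
            continue
        # The basal ingroup must carry the derived allele (B).
        if c == o:
            continue
        if a == o and b == c:
            abba += 1  # P1 ancestral, P2 shares the derived allele with P3
        elif b == o and a == c:
            baba += 1  # P2 ancestral, P1 shares the derived allele with P3
    return abba, baba


def patterson_d(abba, baba):
    """Patterson's D from the two site counts."""
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / (abba + baba)


# Toy alignment (hypothetical data, seven sites).
p1 = "ACTGANA"
p2 = "GTCGCTG"
p3 = "GTTGGTG"
out = "ACCGAAA"

abba, baba = count_abba_baba(p1, p2, p3, out)
print(abba, baba, patterson_d(abba, baba))  # -> 3 1 0.5
```

In this toy example there are three ABBA sites and one BABA site, so D is positive, which is the direction expected if the basal ingroup shares more derived alleles with the second ingroup than with the first.  A real analysis would also need a significance test, which this sketch leaves out.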

I have implemented two versions of the ABBA-BABA test; both are available in my R package evobiR.  I recently wrote a short post about extending these two functions to support IUPAC ambiguity codes.


2 comments

  1. Thank you for helpful explanation!

  2. Hi,
    I find your calcD.loop function very useful, particularly because it considers ambiguities in sequence data. I know that you have probably moved on from this, but would you consider adding another function to estimate fd (Martin et al., 2014; Mol. Biol. Evol. 32(1):244–257)? I know it is a lot to ask, but I thought I would give it a try. For people who code as you do (elegant, beautiful code) it should not be as difficult as it is for mere mortals... Anyway, thank you for sharing your calcD.loop function!
    Best,
    Rita
