The transition distribution of a sample from a Wright-Fisher diffusion with general small mutation rates (Q2007701)
From MaRDI portal
scientific article
Statements
22 November 2019
Using a coalescent approach, the authors find the sampling distribution at time \(t \geq 0\) in the Wright-Fisher diffusion with general mutation rates, under the assumption that the rates are small. The probability distribution of a gene tree at time \(t\) is found in a similar way; a gene tree is equivalent to a sample of sequences in the infinitely-many-sites model. The sampling distribution leads to an approximation for the transition density in the general mutation model, which is of interest as a solution to the diffusion process, in population genetics and, more widely, as an approximate solution of a differential equation. Expressions are given for the sampling distribution of a sample of \(n\) genes taken at time \(t\) from a Wright-Fisher population that follows a diffusion model. The overall mutation rate \(\theta\) is taken to be small, and mutation rates between types are general. An extension to gene trees is described. The formulae rest on the result that, to order \(\theta\), it is only necessary to consider at most one mutation in the sample lineages up to coalescence. A sample then consists of families of genes descended from founder lineages, which may be of different types, and at most one family of types descended from a mutation in the coalescent tree. If the population at time zero consists of just one type, the expressions for the sampling probability are much simpler.

There are several reasons to consider large sample sizes by taking \(n\to\infty\). The first is that current sample sizes can be large. The second is that the approximate sampling distributions are obtained to \(o(\theta)\) in probability for fixed sample size \(n\), so the error could be large when \(n\to\infty\). It is shown that the error can still be small if \(n\to\infty\) and \(\theta \to 0\) such that \(n\theta \to \alpha\), where \(\alpha \ll 1\).
The third reason is that ideally one could use the coalescent approach to find an approximate transition density in the Wright-Fisher diffusion for small mutation rates. The approach taken for a finite sample cannot be used directly in an infinite-leaf coalescent tree because, as \(\theta \to 0\), the probability of more than one mutation in the coalescent tree is 1, not \(O(\theta^{2})\). Nevertheless, a large-sample-size model can be obtained by approximating a discrete Wright-Fisher model with an effective population size of \(N\). It is then appropriate to consider sampling formulae when \(n\to \infty\) and \(\theta \to 0\) such that \(n\theta \to \alpha \ll 1\) by thinking of \(N = n\). The condition \(\alpha \ll 1\) is likely to be satisfied in applications when this is thought of as a population limit. Mutation rates at neutral genomic sites are typically less than \(10^{-7}\) per generation per base and effective population sizes are \(N\sim O(10^{4})\), so \(\theta\) is typically \(O(10^{-3})\) and \(\alpha\) is typically less than 0.01. This gives the asymptotic probability of more than one mutation in the infinite coalescent tree limit as \(Q_{1} \sim 1-e^{-\alpha}(1 + \alpha) < 0.00005\).
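
The closing bound can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it evaluates \(1-e^{-\alpha}(1+\alpha)\) at the boundary value \(\alpha = 0.01\) and compares it with the leading term \(\alpha^{2}/2\) of its Taylor expansion. The expression has the same form as the probability that a Poisson variable with mean \(\alpha\) exceeds one. The function name `prob_more_than_one_mutation` is a hypothetical helper introduced here.

```python
import math


def prob_more_than_one_mutation(alpha: float) -> float:
    """Evaluate 1 - exp(-alpha) * (1 + alpha).

    Hypothetical helper (not from the paper). The expression equals
    P(X > 1) for X ~ Poisson(alpha). It is written with expm1 so that
    the subtraction stays accurate when alpha is small.
    """
    # 1 - e^{-a}(1 + a) = (1 - e^{-a}) - a * e^{-a}
    return -math.expm1(-alpha) - alpha * math.exp(-alpha)


alpha = 0.01  # boundary value quoted in the review
q1 = prob_more_than_one_mutation(alpha)
leading_term = alpha**2 / 2  # first term of the Taylor expansion in alpha

print(f"Q_1         = {q1:.4e}")
print(f"alpha^2 / 2 = {leading_term:.4e}")
print(f"Q_1 < 0.00005: {q1 < 0.00005}")
```

Since the next term of the expansion, \(-\alpha^{3}/3\), is negative, the value stays just below \(\alpha^{2}/2 = 0.00005\), and smaller values of \(\alpha\) give a smaller \(Q_{1}\).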
coalescent tree
small mutation rates
Wright-Fisher diffusion