Gaussian asymptotic limits for the \({\alpha}\)-transformation in the analysis of compositional data (Q2316994)

From MaRDI portal
scientific article

    Statements

    Gaussian asymptotic limits for the \({\alpha}\)-transformation in the analysis of compositional data (English)
    7 August 2019
    Consider compositional data, consisting of vectors of proportions whose elements sum to one, which lie in the standard \((D-1)\)-dimensional simplex in \(\mathbb{R}^D\). When analysing such data, there is a choice to be made about the metric assumed on this simplex; common approaches apply the standard Euclidean norm either to the data themselves or to the log-transformed data. A more recent alternative has been to define a family of metrics, parameterised by \(\alpha\), and to estimate the optimal \(\alpha\) alongside other parameters of the model for the data, for example by maximum likelihood techniques. Some potential consequences of adopting such an approach are investigated both theoretically and numerically in the present paper. Given the data \(\mathbf{x}=(x_1,\ldots,x_D)\) on the \((D-1)\)-dimensional simplex in \(\mathbb{R}^D\), define the transformed data \[ \mathbf{u}_\alpha(\mathbf{x})=\left(\frac{x_1^\alpha}{\sum_{k=1}^Dx_k^\alpha},\ldots,\frac{x_D^\alpha}{\sum_{k=1}^Dx_k^\alpha}\right)^T\,, \] for \(0\leq\alpha\leq1\), where the case \(\alpha=0\) is defined by the limit \(\alpha\rightarrow0\). Up to centring and scaling by \(1/\alpha\), this interpolates between log-ratio-transformed data (as \(\alpha\rightarrow0\)) and the original data (at \(\alpha=1\)). For each choice of \(\alpha\), a metric is induced by taking the Euclidean norm on the \(\alpha\)-transformed data. In the present paper, the authors consider the case where the underlying data are modelled by a Dirichlet distribution, and maximum likelihood is used to find an appropriate value of \(\alpha\) while simultaneously estimating the parameters of the Dirichlet model. In particular, they investigate limiting Gaussian behaviour in the case \(\alpha\rightarrow0\), both theoretically and numerically (using real and simulated data). They observe that in cases where the estimated value of \(\alpha\) is close to zero, other parameter estimates tend to be large.
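    The behaviour of the transformation at the two ends of the range of \(\alpha\) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the example composition are hypothetical, and the centring and scaling \((D\,\mathbf{u}_\alpha(\mathbf{x})-\mathbf{1})/\alpha\) is an assumption, since the review does not state which normalisation is used to obtain the log-ratio limit. Under that assumption, a first-order expansion of \(x_i^\alpha\) gives the centred log-ratio \(\log x_i-\frac{1}{D}\sum_k\log x_k\) as \(\alpha\rightarrow0\), whereas the unscaled \(\mathbf{u}_\alpha(\mathbf{x})\) tends to the uniform composition \((1/D,\ldots,1/D)\).

```python
import numpy as np


def power_transform(x, alpha):
    """u_alpha(x): raise each part to the power alpha, then renormalise to sum to one."""
    x = np.asarray(x, dtype=float)
    powered = x ** alpha
    return powered / powered.sum()


def centred_log_ratio(x):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = np.log(np.asarray(x, dtype=float))
    return logs - logs.mean()


def scaled_power_transform(x, alpha):
    """(D * u_alpha(x) - 1) / alpha.

    This centring and scaling is an assumption made for illustration;
    it is not spelled out in the text above.
    """
    x = np.asarray(x, dtype=float)
    D = x.size
    return (D * power_transform(x, alpha) - 1.0) / alpha


# Hypothetical composition with strictly positive parts summing to one.
x = np.array([0.2, 0.3, 0.5])

u_one = power_transform(x, 1.0)            # alpha = 1: the original data
u_small = power_transform(x, 1e-6)         # alpha near 0: close to (1/D, ..., 1/D)
z_small = scaled_power_transform(x, 1e-6)  # alpha near 0, after centring and scaling
clr = centred_log_ratio(x)                 # limiting value expected for z_small

print("u_1(x)            :", u_one)
print("u_alpha, alpha~0  :", u_small)
print("scaled, alpha~0   :", z_small)
print("centred log-ratio :", clr)
```

    The sketch requires strictly positive parts, because the logarithm is undefined at zero; how zeros are handled is outside what the review describes.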
    Dirichlet distribution
    log-ratio transformation
    manifold
    metric
    power transformation

    Identifiers