Genetic linkage analysis: An irregular statistical problem (Q1126821)

From MaRDI portal
scientific article

    Statements

    5 August 1998
    This expository paper broadly surveys the background and mathematical results for a particular class of statistical models connected with the scientific problem of mapping genetic loci related to a phenotypic trait, such as a disease, using detailed genetic data on identity by descent of genetic markers for many independent pairs of relatives, such as half-siblings, who exhibit the trait. The models, derived via central-limit asymptotics in the number of relative pairs, reduce the score statistic for the data, taken with respect to a parameter measuring the strength of association between the trait and a single locus on a single chromosome, to a set of independent Gaussian processes, one for each chromosome, with variance 1 and covariance functions \(R(s,t)\) assumed stationary in location along chromosomes and the same for all chromosomes. The means of these processes are 0 except on the chromosome(s) where a gene locus related to the trait resides. The author and various co-authors have developed asymptotic size and power formulas for the score-statistic test of the null hypothesis that the trait is unrelated to any gene locus. Related work of the author, also summarized in the article, provides a confidence region, based upon the same type of identity-by-descent data, for the location of the (assumed single) gene locus associated with the trait of interest. The statistical problem treated is 'irregular', as in the article's title, for two reasons: the location of the single locus assumed responsible for the trait is not identifiable in the limiting case where the strength of the relationship between the trait and that locus vanishes, and the log-likelihood is not a smooth function of the parameter giving the location of the gene locus causing the trait.
This statistical problem shares many features, including these statistical 'irregularities', with the 'change-point problem' of estimating or testing for the presence of a change in the underlying distribution of (usually independent) time-sequence data. The article is interestingly written, at a descriptive level without formal treatment of assumptions and results, but it assumes that the reader is familiar with basic genetics, the central limit theorem, and mathematical statistics. The main example treated is the case of complete identity-by-descent data on a large sample of half-siblings for a trait governed by a single gene locus. The author sketches extensions of the models and statistical applications that often have greater genetic realism: to genetic traits influenced by multiple gene loci, to genetic mapping data that are partially missing or uninformative, and to traits that are quantitative rather than categorical.
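To make the limiting null model concrete, the following is a minimal Monte Carlo sketch, not the analytic size formulas the review refers to. It assumes, purely for illustration, an exponential covariance \(R(s,t) = e^{-\beta |s-t|}\), which is one stationary choice among many; the value of `beta`, the chromosome lengths, the grid spacing, and all function names are hypothetical and do not come from the article. Each chromosome gets an independent mean-zero, unit-variance Gaussian process, and the genome-wide critical value is estimated as an upper quantile of the maximum over all chromosomes.

```python
import math
import random


def simulate_path(length_cm, step_cm, beta, rng):
    """Simulate a stationary, unit-variance Gaussian process on a grid.

    Assumed covariance (illustrative only): R(s, t) = exp(-beta * |s - t|).
    On an equally spaced grid this process is exactly an AR(1) sequence.
    """
    n_points = int(length_cm / step_cm) + 1
    rho = math.exp(-beta * step_cm)          # correlation between neighbours
    innov_sd = math.sqrt(1.0 - rho * rho)    # keeps the marginal variance at 1
    z = rng.gauss(0.0, 1.0)
    path = [z]
    for _ in range(n_points - 1):
        z = rho * z + innov_sd * rng.gauss(0.0, 1.0)
        path.append(z)
    return path


def genome_max(chrom_lengths_cm, step_cm, beta, rng):
    """Maximum of independent null processes, one per chromosome."""
    return max(
        max(simulate_path(length, step_cm, beta, rng))
        for length in chrom_lengths_cm
    )


def null_threshold(chrom_lengths_cm, step_cm, beta, alpha, n_rep, seed=0):
    """Estimate the level-alpha critical value for the genome-wide maximum."""
    rng = random.Random(seed)
    maxima = sorted(
        genome_max(chrom_lengths_cm, step_cm, beta, rng) for _ in range(n_rep)
    )
    index = max(0, min(n_rep - 1, math.ceil((1.0 - alpha) * n_rep) - 1))
    return maxima[index]


if __name__ == "__main__":
    # Hypothetical genome: three chromosomes of 100 cM, markers every 1 cM.
    threshold = null_threshold([100.0, 100.0, 100.0], 1.0, 0.04, 0.05, 200)
    print(f"estimated genome-wide 5% threshold: {threshold:.2f}")
```

Because the maximum is taken over a grid, this sketch approximates the maximum of the continuous process from below, so a finer grid gives a slightly larger threshold. The estimated value should exceed the single-location one-sided 5% cutoff of about 1.645, which illustrates why scanning the whole genome calls for a dedicated size formula.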
    change point problem
    Gaussian processes
    gene mapping
    genetic markers
    genome error rate
    identity by descent
    score statistic