• Title of article

    Two-Sample Tests for Comparing Intra-Individual Genetic Sequence Diversity between Populations

  • Author/Authors

    Gilbert, Peter B.; Rossini, A. J.; Shankarappa, Raj

  • Issue Information
    Journal with serial number, year 2005
  • From page
    106
  • Abstract
    Consider a study of two groups of individuals in which each person is infected with a genetically related, heterogeneous population of viruses and multiple viral sequences are sampled from each person. Based on estimates of genetic distances between pairs of aligned viral sequences within individuals, we develop four new tests to compare intra-individual genetic sequence diversity between the two groups. This problem is complicated by two levels of dependency in the data structure: (i) within an individual, any pairwise distances that share a common sequence are positively correlated; and (ii) for any two pairings of individuals that share a person, the two differences in intra-individual distances between the paired individuals are positively correlated. The first proposed test is based on the difference in mean intra-individual pairwise distances pooled over all individuals in each group, standardized by a variance estimate that corrects for the correlation structure using U-statistic theory. The second procedure is a nonparametric rank-based analog of the first test, and the third test contrasts the set of subject-specific average intra-individual pairwise distances between the groups. These tests are very easy to use and solve correlation problem (i). The fourth procedure is based on a linear combination of all possible U-statistics calculated on independent, identically distributed sequence subdatasets, over the two levels (i) and (ii) of dependencies in the data; it is more complicated than the other tests but can be more powerful. Although the proposed methods are empirical and do not fully utilize knowledge from population genetics, the tests reflect biology through the evolutionary models used to derive the pairwise sequence distances. The new tests are evaluated theoretically and in a simulation study, and are applied to a dataset of 200 HIV sequences sampled from 21 children.
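    As a rough illustration of the idea behind the third test, the sketch below reduces each subject to one average intra-individual pairwise distance and then contrasts the two sets of subject-level averages. It is not the authors' procedure: the abstract does not state the exact statistic, so the Welch-type t statistic used here is an assumption, and the simple proportion-of-differing-sites distance stands in for the evolutionary-model distances the abstract mentions. All function names and the toy sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ.

    Assumption: a stand-in for a model-based evolutionary distance.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def subject_mean_distance(seqs):
    """Average distance over all pairs of one subject's sequences."""
    return mean(p_distance(a, b) for a, b in combinations(seqs, 2))


def welch_statistic(group1, group2):
    """Welch-type t statistic on subject-level averages (assumed choice).

    Each subject contributes a single value, so pairwise distances that
    share a sequence within a subject never enter the variance separately.
    """
    d1 = [subject_mean_distance(s) for s in group1]
    d2 = [subject_mean_distance(s) for s in group2]
    se = sqrt(variance(d1) / len(d1) + variance(d2) / len(d2))
    return (mean(d1) - mean(d2)) / se


# Toy, made-up data: each inner list holds one subject's aligned sequences.
group_a = [["ACGT", "ACGA", "ACTA"], ["ACGT", "ACGT", "TCGT"], ["AAGT", "ACGT", "ACGA"]]
group_b = [["ACGT", "ACGT", "ACGT"], ["ACGT", "ACGA", "ACGT"], ["ACGT", "ACGT", "ACGC"]]
t = welch_statistic(group_a, group_b)
```

    Summarizing each subject by one number is what makes this kind of contrast easy to apply: the within-subject correlation described in (i) is absorbed into the subject-level average, at the cost of ignoring how many sequences each subject contributed.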
  • Keywords
    Wilcoxon test, U-statistic, HIV genetic diversity, Median test, CTL epitope, Nonparametric statistics, hypothesis testing, Correlated data, two-sample test
  • Journal title
    BIOMETRICS (BIOMETRIC SOCIETY)
  • Serial Year
    2005
  • Record number

    83976