MRC PhD students
Level of study: MSc
Title: The Statistical Theory Underlying Human Genetic Linkage and Association Analysis based on Quantitative Data from Extended Families.
Traditionally, human genetic linkage analysis considered only dichotomous traits, such as disease/no disease, together with genotype data from family trios (for example, mother, father, and child) or sib-pairs. In such cases, the affected or unaffected child was the focus of interest, and the parents were included because they are the source of the child’s genes. The statistical methods used to analyse these data are based on contingency tables and other methods of categorical data analysis, and are thus relatively uncomplicated.
Recently, however, there have been two very important developments in genetics. First, it became clear that if the disease status and genotypes of several generations of a family are known, researchers can pinpoint which genotypes are linked to the disease or trait. Second, it became evident that a numerical or quantitative trait, like blood pressure or viral load, gives much more statistical power than a dichotomous trait for the same sample size.
This led to the development of statistical mixed models that can incorporate all the features of the data, including the degree of relationship between each pair of family members. This is necessary because a parent-child pair always shares one allele of a gene, whereas two siblings may share none, one, or both alleles with one another.
However, the statistical methods involved have been developed by geneticists for specific special cases, so there does not seem to be a unified and general theory of the methods.
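The difference in allele sharing between relative pairs can be illustrated with a small enumeration. The sketch below is not part of the project itself. It assumes unrelated, non-inbred parents whose four alleles at a locus are labelled F1, F2, M1 and M2 (hypothetical labels), counts sharing in the identical-by-descent sense (copies of the same parental allele), and treats every transmission as equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Label the four parental alleles so that identity by descent can be tracked.
# Assumption: the parents are unrelated and not inbred, so all four are distinct.
FATHER = ("F1", "F2")
MOTHER = ("M1", "M2")


def child_genotypes():
    """All equally likely (paternal, maternal) allele pairs a child can inherit."""
    return list(product(FATHER, MOTHER))


def ibd_distribution(shared_counts):
    """Turn a list of equally likely sharing counts into exact probabilities."""
    counts = Counter(shared_counts)
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


def sibling_ibd():
    """Distribution of the number of alleles two full siblings share."""
    shared = [
        (a[0] == b[0]) + (a[1] == b[1])  # paternal match + maternal match
        for a, b in product(child_genotypes(), repeat=2)
    ]
    return ibd_distribution(shared)


def parent_child_ibd(parent=FATHER):
    """Distribution of the number of alleles a parent and child share."""
    shared = [len(set(child) & set(parent)) for child in child_genotypes()]
    return ibd_distribution(shared)


def expected_sharing(dist):
    """Expected proportion of alleles shared (0, 1/2 or 1 per outcome)."""
    return sum(Fraction(k, 2) * p for k, p in dist.items())


if __name__ == "__main__":
    print("siblings:", sibling_ibd())
    print("parent-child:", parent_child_ibd())
```

Under these assumptions, siblings share 0, 1 or 2 alleles with probabilities 1/4, 1/2 and 1/4, while a parent and child share exactly one allele with probability 1. Both pair types have the same expected sharing of one half, so it is the pair-specific sharing, not the average, that distinguishes them in a model.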
Specific aims of the project are:
To explain, in a unified and statistically comprehensive manner, the theory involved in the analysis of family-based genetic data. A general method will be presented which will include most existing methods as special cases. The focus will be on linkage analysis: what it is, and what it aims to do. There will be a step-by-step build-up to it, starting with an introduction to genetic epidemiology. This will include an explanation of the relevant genetic terminology, as well as a comparison of common genetic and statistical terminology. The dissertation will also include a review of the methods used in published articles and an application section in which an appropriate human genetic family dataset is analysed to illustrate the methods explained in the theory sections.
Supervisor: Dr Lize van der Merwe, MRC Unit: Biostatistics Unit
Study Institution: University of the Western Cape