Cluster analysis in severe emphysema subjects using phenotype and genotype data: an exploratory investigation

MH Cho, GR Washko, TJ Hoffmann, GJ Criner, EA Hoffman, FJ Martinez, N Laird, JJ Reilly
Respiratory Research, 2010 - Springer
Background
Numerous studies have demonstrated associations between genetic markers and COPD, but results have been inconsistent. One reason may be heterogeneity in disease definition. Unsupervised learning approaches may assist in understanding disease heterogeneity.
Methods
We selected 31 phenotypic variables and 12 SNPs from five candidate genes in 308 subjects in the National Emphysema Treatment Trial (NETT) Genetics Ancillary Study cohort. We used factor analysis to select a subset of phenotypic variables, and then used cluster analysis to identify subtypes of severe emphysema. We examined the phenotypic and genotypic characteristics of each cluster.
Results
We identified six factors accounting for 75% of the shared variability among our initial phenotypic variables. We selected four phenotypic variables from these factors for cluster analysis: 1) post-bronchodilator FEV1 percent predicted; 2) percent bronchodilator responsiveness; and quantitative CT measurements of 3) apical emphysema and 4) airway wall thickness. K-means cluster analysis revealed four clusters, though separation between clusters was modest: 1) emphysema predominant; 2) bronchodilator responsive, with higher FEV1; 3) discordant, with a lower FEV1 despite less severe emphysema and lower airway wall thickness; and 4) airway predominant. Of the genotypes examined, membership in cluster 1 (emphysema predominant) was associated with TGFB1 SNP rs1800470.
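As a rough illustration of the clustering step described above, the following is a minimal sketch of k-means with k = 4 on four standardized variables. It is not the authors' code: the abstract does not state the software, the initialization, or whether the variables were standardized, so the z-scoring, the random initialization, and all function names here are assumptions. The data are synthetic values invented for the example and are not NETT measurements.

```python
import random
import statistics

def make_synthetic(n_per_group=30, seed=1):
    """Generate made-up rows of (FEV1 % predicted, % bronchodilator response,
    apical emphysema fraction, airway wall thickness). All values are
    hypothetical and chosen only so the example has some structure."""
    rng = random.Random(seed)
    centers = [
        (25.0, 5.0, 0.60, 1.4),
        (35.0, 20.0, 0.40, 1.5),
        (22.0, 6.0, 0.30, 1.3),
        (28.0, 8.0, 0.35, 1.8),
    ]
    spreads = (3.0, 3.0, 0.05, 0.1)
    return [[rng.gauss(m, s) for m, s in zip(c, spreads)]
            for c in centers for _ in range(n_per_group)]

def zscore_columns(rows):
    """Scale each column to mean 0 and SD 1 so that no variable dominates
    the Euclidean distance (assumes no column is constant)."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the labels stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct observations as starting centroids
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels, centroids

data = make_synthetic()
scaled = zscore_columns(data)
labels, centroids = kmeans(scaled, k=4)
for j, center in enumerate(centroids):
    size = labels.count(j)
    print(f"cluster {j}: n={size}, centroid (z-scores)={[round(v, 2) for v in center]}")
```

Reading the centroids in z-score units is one way to characterize each cluster, for example a cluster whose apical emphysema coordinate is well above zero could be described as emphysema predominant. K-means depends on its starting centroids, so in practice several seeds would usually be compared.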
Conclusions
Cluster analysis may identify meaningful disease subtypes and/or groups of related phenotypic variables even in a highly selected group of severe emphysema subjects, and may be useful for genetic association studies.