Abstract
The chitinase-like protein YKL-40 mediates airway inflammation, and its serum levels are associated with asthma severity. However, asthma phenotypes associated with YKL-40 levels have not been precisely defined.
We conducted an unsupervised cluster analysis of asthma patients treated at the Yale Center for Asthma and Airways Disease (n=156) to identify subgroups according to YKL-40 level. The resulting YKL-40 clusters were cross-validated in cohorts from the Severe Asthma Research Programme (n=167) and the New York University/Bellevue Asthma Repository (n=341). A sputum transcriptome analysis revealed molecular pathways associated with YKL-40 subgroups.
Four YKL-40 clusters (C1–C4) were identified. C3 and C4 had high serum YKL-40 levels compared with C1 and C2. C3 was associated with earlier onset and longer duration of disease, severe airflow obstruction, and near-fatal asthma exacerbations. C4 had the highest serum YKL-40 levels, adult onset and less airflow obstruction, but frequent exacerbations. An airway transcriptome analysis in C3 and C4 showed activation of non-type 2 inflammatory pathways.
Elevated serum YKL-40 levels were associated with two distinct clinical asthma phenotypes: one with irreversible airway obstruction and another with severe exacerbations. The YKL-40 clusters are potentially useful for identification of individuals with severe or exacerbation-prone asthma.
Asthma with high serum YKL-40 levels is associated with severe lung function impairment and severe exacerbations http://ow.ly/wyly30elajo
Introduction
In subjects with asthma, high serum levels of the chitinase-like protein YKL-40, which is encoded by the CHI3L1 gene, are associated with severe asthma, lung function impairment and exacerbations of disease [1]. However, unsupervised learning models to more precisely define the phenotypes associated with YKL-40 have not been tested [2].
We have previously shown that the associations between asthma risk, disease severity and high serum levels of YKL-40 are partially mediated by the genetic effects of a single nucleotide polymorphism (SNP) in the CHI3L1 gene promoter and an intronic SNP [3, 4]. While human and mechanistic studies have demonstrated that YKL-40 is involved in airway remodelling [5, 6], they have not demonstrated a clear association between markers of type 2 (T2) inflammation and YKL-40 [1, 2, 7]. This suggests that YKL-40 may influence or be influenced by non-T2 inflammatory responses in asthma.
In recent years, cluster analyses have been successfully used to identify subgroups of patients characterised by similar features of asthma based on both clinical and molecular features of the disease [8–11]. In the Severe Asthma Research Programme (SARP), Moore et al. [9] used 11 clinical and physiological features of asthma to identify five clusters of patients with distinct clinical characteristics. These features have been replicated in independent cohorts, demonstrating that these clinical and physiological features are important discriminators of asthma heterogeneity [12, 13].
Given the known association of YKL-40 with severe asthma and the evidence suggesting a connection between YKL-40 and non-T2 inflammatory pathways, we used an unsupervised clustering method to characterise the clinical and physiological features of the disease and the distinct molecular mechanisms present in the airways of asthmatic subjects with high serum levels of YKL-40. The goal of these experiments was twofold. First, we tested whether the addition of serum YKL-40 levels to previously validated clinical features of asthma severity in an unsupervised clustering analysis [9] would lead to the identification of specific clinical subgroups characterised by increased expression of YKL-40. Second, we sought to identify specific gene expression profiles in sputum associated with the YKL-40 clusters.
Methods
Subjects
Study participants in the Yale Center for Asthma and Airways Disease (YCAAD) cohort, based in New Haven, CT, USA, completed a comprehensive phenotyping study visit. The inclusion and exclusion criteria and the YCAAD phenotyping protocol have been described previously [1]. Additional description of the methods is included in the supplementary material. Study participants in the SARP and New York University/Bellevue Asthma Repository (NYUBAR) cohorts completed study visits using established standard operating procedures, as described previously [9, 14]. The characteristics of these subjects have been reported in previous publications [9, 12, 14, 15]. These studies were conducted following institutional review board approval for all of the institutions involved and all participants provided informed consent.
Measurement of serum and sputum YKL-40 levels
YKL-40 levels were measured in duplicate in serum (all cohorts) and sputum (YCAAD only) supernatant specimens by ELISA (Quidel, San Diego, CA, USA), as described previously [1, 11].
Sputum induction and gene expression measurements
Sputum induction was performed in the YCAAD cohort only; additional details are described in the supplementary material.
Statistical analysis
All statistical, clustering and classifier analyses were performed using R software (R Foundation for Statistical Computing, Vienna, Austria). Results are reported as median and interquartile range (25–75%), unless otherwise specified. Continuous variables were tested using nonparametric tests, including the Wilcoxon test to compare two groups and the Kruskal–Wallis test to compare more than two groups. Categorical variables were analysed with the Chi-squared test. p-values <0.05 were considered significant. Specific details on the clustering and gene expression analyses are described in the supplementary material.
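As an illustration of the descriptive and two-group statistics described above, the following is a minimal Python sketch (the study itself used R). It reports a median with interquartile range and computes a Wilcoxon rank-sum test using a normal approximation without tie or continuity correction, which is an assumption made for brevity and may differ from the exact settings used in the study. The function names and the YKL-40 values are hypothetical toy examples, not study data.

```python
import math
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) using inclusive quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Wilcoxon rank-sum test; returns (U statistic for a, two-sided p).

    Assumption: normal approximation, no tie or continuity correction.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p


# Hypothetical serum YKL-40 values (ng/mL) for two toy groups.
low_group = [32, 48, 50, 71, 45]
high_group = [110, 206, 311, 95, 150]

print(median_iqr(low_group))
print(rank_sum_test(low_group, high_group))
```

For small samples or heavily tied data, an exact test or a tie-corrected variance would be preferable to this approximation.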
Results
Discovery and validation of YKL-40 clusters
To identify subgroups of asthma patients associated with elevated serum YKL-40 levels, we conducted an unsupervised clustering analysis on 156 individuals from the YCAAD cohort using serum YKL-40 levels and 11 clinical and physiological features of asthma, as described in Methods [9]. Figure 1 illustrates the study workflow. Four subgroups of patients with differing serum YKL-40 levels and distinct clinical and physiological characteristics of disease (clusters C1–C4) were identified (figure 2a). C1 had the lowest median YKL-40 value (48 (32–71) ng·mL−1), which was similar to the median value of C2 (50 (33–88) ng·mL−1). These median values were also similar to YKL-40 values (43 (20–184) ng·mL−1) in healthy subjects [16]. The high serum YKL-40 cluster C3 had a higher median YKL-40 level than C1 and C2 combined (74 (6–108) versus 49 (32–74) ng·mL−1; p=0.05). C4 also had a higher median level than C1 and C2 combined (206 (110–311) versus 49 (32–74) ng·mL−1; p<0.01). The median YKL-40 level of C1 and C2 combined was lower than that of C3 and C4 combined (49 (32–74) versus 110 (68–222) ng·mL−1; p<0.01) (table 1).
To validate the YKL-40 clusters and their associated clinical, physiological and biological features, a classifier algorithm was developed using recursive partitioning on the YCAAD cohort (supplementary figure E1). This classifier was applied to the SARP (n=167) and NYUBAR (n=341) cohorts (figure 2b and c, and table 1). Subgrouping of both the SARP and NYUBAR cohorts using this algorithm demonstrated clusters with YKL-40 levels similar to those of the YCAAD clusters (figure 2b and c, and table 1). C1 and C2 had lower serum YKL-40 levels compared with C3 and C4 in both the SARP cohort (46 (32–65) versus 68 (46–126) ng·mL−1; p=0.002) and the NYUBAR cohort (53 (40–81) versus 99 (59–162) ng·mL−1; p<0.001) (table 2). Combining the data from the three cohorts showed that the serum YKL-40 level in C3 was 58% higher and in C4 was 133% higher compared with C1 and C2 combined (supplementary table E2). Overall, this analysis of the three cohorts of asthma revealed consistent subgroups of individuals with either normal (C1 and C2) or elevated (C3 and C4) serum levels of YKL-40.
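To make the idea of a recursive partitioning classifier concrete, the following is a minimal Python sketch that grows a small decision tree by repeatedly choosing the split with the lowest weighted Gini impurity. It is not the classifier shown in supplementary figure E1: the choice of Gini impurity, the two features, the toy values and the resulting thresholds are all assumptions made for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (score, feature index, threshold) with the lowest weighted
    Gini impurity, or None if no split is possible.
    Rows go left when row[feature] <= threshold."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; leaves hold the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    _, f, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return (
        f,
        thr,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk the tree from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree


# Hypothetical training data: (serum YKL-40 in ng/mL, post-BD FEV1 % predicted).
train_rows = [(40, 95), (55, 90), (50, 92), (80, 45), (100, 50), (210, 75), (300, 72)]
train_labels = ["C1/C2", "C1/C2", "C1/C2", "C3", "C3", "C4", "C4"]

tree = build_tree(train_rows, train_labels)
print(predict(tree, (90, 48)))
```

A tree trained in a discovery cohort in this way can then be applied unchanged to subjects from independent cohorts, which is the role the classifier plays in the validation step described above.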
Clinical and physiological characteristics of the YKL-40 clusters
To determine the phenotype of each of the YKL-40 clusters (C1–C4), we compared their clinical characteristics (table 1). In all three cohorts, individuals in C1 (the cluster with the lowest serum YKL-40 levels) were the most prevalent (53%), the youngest (median age 42 years) and predominantly female (97%). Compared with the high serum YKL-40 clusters C3 and C4, C1 was also distinguished by a lower disease severity, low rates of exacerbations requiring hospitalisation or mechanical ventilation, normal lung function (figure 3a–c) and lower doses of medications.
YKL-40 cluster C2, which was also characterised by low serum YKL-40 levels, was the second most common cluster (27%) and was predominantly male (70%). C2 was the only cluster demonstrating bronchodilation following the administration of a short-acting β-agonist in all three cohorts (table 1). Despite local differences in IgE levels (table 2), C2 had higher IgE levels than C1 in all three cohorts (172 (66–392) versus 96 (29–252) IU·mL−1; p<0.01) (supplementary table E2) and similar levels to C3 and C4. Therefore, C1 and C2 had similar low YKL-40 levels, but differed by bronchodilation response and IgE levels.
The high serum YKL-40 cluster C3 included individuals who were older (median age 49 years) and had more severe disease than C1 and C2. Compared with the other three clusters, C3 had the earliest asthma onset (median 6 years) and the longest duration of disease (median 36 years) (supplementary table E2), as well as the highest rates of near-fatal asthma (NFA) exacerbations requiring intubation (average 44%) (figure 4) and irreversible airflow obstruction (median post-bronchodilator forced expiratory volume in 1 s (FEV1) 48% predicted) (supplementary table E2). Moreover, 96% of patients in C3 were using inhaled and/or systemic corticosteroids (figure 3). Despite the high rates of corticosteroid use, C3 also required additional controller medications (table 1).
C4 was the cluster with the highest serum YKL-40 levels (figure 2). C4 differed from C3 in several other key features, including older age of asthma onset (median 43 years) and shorter duration of disease (median 11 years) (supplementary table E2), as well as a lower rate of NFA (9%) and less-severe persistent airflow obstruction (median post-bronchodilator FEV1 74% predicted) (figures 3 and 4, and supplementary table E2). Interestingly, although statistically significant in YCAAD only, C4 had the highest rate of obesity among the clusters in YCAAD (67% versus 41%; p=0.007). A combined analysis of all three cohorts demonstrated that C3 and C4 had higher rates of obesity than C1 and C2 (57% versus 35%; p<0.01). Similar to C1, C4 was predominantly female (73%), but differed by age of onset (43 versus 15 years; p<0.01). C4 had lower bronchodilator response compared with C2 (6% versus 9%; p=0.04) (supplementary table E2).
This analysis demonstrated consistent phenotypic characteristics within YKL-40 clusters in the three asthma cohorts, including several features that were not observed in the original SARP clusters and thus were not included in the clustering algorithm (higher IgE in C2 and higher NFA rates in C3). A Chi-squared test comparing the YKL-40 clusters with the original SARP clusters demonstrated a statistically significant difference between the results of the two clustering approaches (p<0.001) (supplementary table E3). Overall, the most striking clinical findings were that C2 was characterised by elevated IgE, while C3 and C4 were characterised by severe disease, and were discriminated by high serum YKL-40 levels, severe exacerbations and worse airflow obstruction.
Sputum cell characteristics and YKL-40 protein expression associated with YKL-40 clusters
We next evaluated correlations between the YKL-40 clusters and cell and protein data from induced sputum in a YCAAD subset (n=113: C1, n=54; C2, n=17; C3, n=18; C4, n=24) (table 2). Although pairwise comparisons across clusters did not identify particular differences in sputum cell counts or YKL-40 levels (table 2), we examined whether the combination of clusters with high serum YKL-40 levels (C3 and C4) had distinctive sputum characteristics when compared with the combination of clusters with low serum YKL-40 levels (C1 and C2). The clusters with high serum YKL-40 levels (C3 and C4) demonstrated higher sputum YKL-40 protein levels (28.8 versus 9.6 ng·mL−1; p=0.03) (figure 5a) and higher neutrophil levels (45% versus 34%; p=0.03) (figure 5b) compared with the low serum YKL-40 clusters (C1 and C2). In all the subjects in this YCAAD subset, sputum YKL-40 levels were also correlated with serum YKL-40 levels (ρ=0.23; p=0.02) (data not shown). Sputum macrophage percentages were higher in C1 and C2 compared with C3 and C4 (58 versus 44%; p<0.01). An analysis of sputum cell characteristics revealed no differences in sputum cellularity or cell viability across the four YKL-40 clusters (data not shown).
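The rank correlation (ρ) between sputum and serum YKL-40 levels reported above can be illustrated with a short Python sketch of Spearman's coefficient, computed as the Pearson correlation of average ranks. The paired values are hypothetical toy data, and the function names are introduced here for illustration; the sketch does not compute a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements (ng/mL) for five toy subjects.
serum_ykl40 = [30, 45, 60, 120, 200]
sputum_ykl40 = [5, 12, 9, 30, 25]

print(spearman_rho(serum_ykl40, sputum_ykl40))
```

Because only ranks are used, the coefficient is unaffected by the skewed distribution typical of biomarker concentrations.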
Sputum T2 inflammatory gene expression associated with YKL-40 clusters
Based on the observed clinical, physiological and sputum differences among the YKL-40 clusters, we hypothesised that gene expression profiles in the airway would also differ among the clusters. Therefore, we analysed mRNA expression in the sputum of a subset of the YCAAD subjects involved in the discovery of the YKL-40 clusters (n=63: C1, n=30; C2, n=8; C3, n=13; C4, n=12) and healthy controls (n=10) (supplementary table E4). Sputum CHI3L1 mRNA expression did not differ significantly among the 63 YCAAD subjects or in pairwise comparisons between clusters and healthy controls (data not shown).
Based on the above-mentioned low YKL-40 expression and elevated IgE levels in C2, this cluster was predicted to be characterised by T2 inflammation. Consistent with this hypothesis, the T2 expression signature (the average mRNA expression of interleukin (IL)-4, IL-5 and IL-13) was highest in C2 compared with the other YKL-40 clusters (p=0.04) (figure 6). A broader transcriptome analysis revealed additional transcriptional enrichment for the IL-5 signalling pathway in C2, providing additional evidence of T2 enrichment in this cluster (supplementary tables E7 and E8). The presence of the highest T2 mRNA expression in this cluster with low YKL-40 protein levels suggests that YKL-40 is primarily associated with non-T2 inflammatory pathways [11, 17]. In contrast, the high serum YKL-40 clusters (C3 and C4) were characterised by high sputum YKL-40 protein levels and airway neutrophilia, and were not associated with sputum markers of T2 inflammation. Taken together, these results suggest that YKL-40 may be a non-T2 asthma biomarker.
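The T2 expression signature defined above (the average mRNA expression of IL-4, IL-5 and IL-13) can be sketched in Python as follows. The sketch assumes that each subject's expression values are already normalised to a comparable scale, which the text does not specify; the gene keys, function names and values are hypothetical toy examples, not study data.

```python
import statistics

# Genes entering the T2 signature, as named in the text.
T2_GENES = ("IL4", "IL5", "IL13")


def t2_signature(expression):
    """Mean expression of the three T2 genes for one subject."""
    return statistics.fmean(expression[g] for g in T2_GENES)


def signature_by_cluster(subjects):
    """Median T2 signature per cluster from (cluster, expression) pairs."""
    grouped = {}
    for cluster, expression in subjects:
        grouped.setdefault(cluster, []).append(t2_signature(expression))
    return {c: statistics.median(v) for c, v in grouped.items()}


# Hypothetical normalised expression values for four toy subjects.
toy_subjects = [
    ("C1", {"IL4": 1.0, "IL5": 2.0, "IL13": 3.0}),
    ("C1", {"IL4": 2.0, "IL5": 2.0, "IL13": 2.0}),
    ("C2", {"IL4": 4.0, "IL5": 5.0, "IL13": 6.0}),
    ("C2", {"IL4": 6.0, "IL5": 7.0, "IL13": 8.0}),
]

print(signature_by_cluster(toy_subjects))
```

Per-subject signatures computed in this way can then be compared across clusters with a nonparametric test such as the Kruskal–Wallis test.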
Sputum non-T2 immune gene expression associated with YKL-40 clusters
To further delineate the association between non-T2 inflammatory mechanisms and YKL-40, the 63 YCAAD subjects’ sputum expression profiles were compared with those of the healthy controls. C1 was enriched for genes involved in airway immunity and innate immune response to viral infection networks (supplementary tables E5 and E6).
The high serum YKL-40 clusters (C3 and C4) showed similar gene ontology enrichment for immune responses. Both C3 and C4 showed 11 genes with identical dysregulation and similar fold change magnitude (supplementary table E9). Supplementary figure E2 illustrates the expression of five genes common to both clusters that have been previously linked to airway biology and asthma: TNFAIP3 [18–20], MIR21 [21, 22], HLA-DQA1 [23, 24], IL1RAP [25] and CCL18 [26–28].
Transcriptional changes specific to C3 are presented in figure 7a and supplementary table E10. The identified transcripts included several genes involved in the innate and adaptive immune responses, such as the defensins DEFA1, DEFA1B and DEFA3, TLR2, IFITM1, and LY86 (supplementary figure E3). A pathway enrichment analysis of C3 transcripts identified several dysregulated pathways, including neutrophil extracellular trap (NET) formation (NETosis), immune response to bacterial infections in the airway, and inhibition of neutrophil migration by pro-resolving lipid mediators. This enrichment analysis suggests that among individuals in C3, there was activation of innate immunity and the NETosis pathways that are involved in sterile inflammation and autoimmunity (supplementary tables E10 and E11).
In contrast to C3, the transcriptional profile associated with C4 showed activation of several genes involved in the IL-1 and IL-18 pathways (figure 7b, supplementary figure E4, and supplementary tables E12 and E13), and no activation of NETosis.
Discussion
In this study, we identified four YKL-40 asthma clusters using unsupervised clustering based on a combination of serum YKL-40 levels and clinical features of asthma. Two of the four YKL-40 clusters (C3 and C4) were characterised by high serum YKL-40 levels compared with the other two clusters (C1 and C2). The clusters had distinct clinical, physiological and sputum inflammatory cell features. Following their discovery in the YCAAD cohort, the YKL-40 clusters were validated in two additional cohorts, i.e. SARP and NYUBAR. Importantly, the YKL-40 clusters had distinct characteristics that separated them from the original SARP clusters and resulted from the inclusion of serum YKL-40 levels in the clustering algorithm, which linked these clinical characteristics to circulating levels of the YKL-40 protein. This clustering was supported by noncluster features, including history of intubations and molecular differences in the sputum transcriptome. The known association between YKL-40 and non-T2 inflammation was supported by the discovery of T2 inflammatory enrichment in the low serum YKL-40 cluster C2 combined with the enrichment of non-T2 inflammatory genes in high serum YKL-40 clusters C3 and C4. Furthermore, the non-T2 networks observed in C3 were distinct from those in C4.
In all three cohorts, the subjects in C3 were linked by higher rates of NFA and severe airflow obstruction. These shared features may underlie the mechanisms leading to increased YKL-40 production in this group. Several molecular mechanisms may explain the association between YKL-40 and these clinical and physiological features. The first mechanism proposes a link between mechanical stress and YKL-40 in patients with NFA. Given that mechanical stress on the airway epithelium is known to lead to YKL-40 production [29], the exposure to airway stretch seen in NFA-related respiratory failure or in response to severe airflow obstruction may be responsible, at least in part, for the increased YKL-40 seen in C3.
The second mechanism posits that YKL-40 activates smooth muscle proliferation, leading to airway remodelling and severe airflow obstruction. This mechanism is based on the following observations: 1) in airway epithelial cells, exposure to YKL-40 causes increased IL-8 synthesis and smooth muscle proliferation in vitro, and 2) YKL-40 increases protease-activated receptor-2 activity, which enhances bronchial smooth muscle proliferation [5, 6].
A novel finding of our study is the presence in C3 of increases in both sputum neutrophils and the expression of genes involved in the formation of NETs. NETs act as extracellular DNA traps formed by neutrophils and eosinophils, and NETosis is a highly immunogenic cell death process associated with NET formation. NETosis has been associated with sterile inflammation, autoimmunity and chronic conditions associated with inflammation [30]. Previous reports have found NETs in the airway of individuals with allergic asthma and our study provides support for the role of NETosis in this severe asthma subgroup characterised by high YKL-40 [31].
The identification of asthma cluster C3 is relevant given the association of C3 with severe fixed airflow obstruction and NFA, as well as the potential to target NET production in the airway with inhaled recombinant DNase (in a similar fashion to cystic fibrosis) [32].
In contrast to C3, YKL-40 cluster C4 had the highest serum YKL-40 level, older (adult) disease onset and less airflow obstruction. C4 was also characterised by poor asthma control, frequent severe exacerbations requiring hospitalisation and a high rate of obesity (although statistically significant only in the YCAAD cohort). The association with obesity may explain, at least in part, the elevated YKL-40 levels in C4. It is known that visceral adipose tissue is a source of YKL-40 [33]. More importantly, the association between obesity and asthma may be mediated through a combination of elevated YKL-40 due to visceral fat accumulation and T2 inflammation in asthma [34]. Despite the similarities between C3 and C4 in terms of sputum neutrophil counts and the activation of the airway-associated genes TNFAIP3, HLA-DQA1, IL1RAP, CCL18 and MIR21, the C4 sputum gene expression profile showed distinct activation of the IL-1 and IL-18 pathways. The discrepancies between C3 and C4, including contrasting gene expression profiles and different fold differences in serum YKL-40 levels, suggest that their molecular associations with YKL-40 may be mediated by at least two different non-T2 inflammatory mechanisms. Our finding that sputum CHI3L1 mRNA expression did not differ among clusters in the YCAAD cohort suggests that the higher serum YKL-40 levels in C3 and C4 may be related to mechanisms independent of elevated CHI3L1 mRNA expression in sputum cells (i.e. production by bronchial epithelial cells, smooth muscle and visceral fat).
The cluster analysis also identified two low serum YKL-40 clusters. C1 was an overwhelmingly female subgroup characterised by nonsevere asthma and preserved lung function, while C2 was a high-T2 subgroup with IL-5 signalling pathway enrichment, similar to previously identified asthma clusters with high T2 gene expression [17, 35]. These clinical differences and the presence of significant differences in serum YKL-40 levels confirm our ability to discriminate asthma subgroups using this approach. This novel framework for characterising distinct subgroups of patients with asthma according to clinical features and serum YKL-40 will enable further studies to unveil the distinct mechanisms underlying these associations.
Our study does have some limitations, including its cross-sectional nature. We were unable to determine whether the YKL-40 clusters remained stable over time or were affected by changes in the environment or medications, including compliance with the latter. Therefore, prospective studies of these clusters are necessary to understand the interaction of these clusters with time and asthma therapy. Not all of the pairwise comparisons across the clusters in all cohorts demonstrated the same differences. This may be partially explained by local differences among the study cohorts despite their similar inclusion criteria, as well as by therapeutic effects, sample sizes and variations in classifier performance in the two validation cohorts. However, our approach revealed clusters with nearly identical physiological and clinical characteristics and comparable YKL-40 levels among the three cohorts. Selection of other asthma features and models may have yielded different clustering results; however, we sought to balance cluster discovery and the ability to validate these novel clusters in independent cohorts. Our model may evolve over time as our understanding of YKL-40 biology improves and additional asthma cohorts incorporate the sputum transcriptome in their patient characterisation.
In conclusion, using a clustering analysis of subjects with asthma based on YKL-40 levels and clinical and physiological features of the disease, we identified four distinct clusters of subjects. Two clusters were connected by elevated serum YKL-40 levels, but were differentiated by the magnitude of the YKL-40 elevation and specific clinical/physiological abnormalities. This study suggests that the use of YKL-40 levels in combination with clinical and physiological features of asthma can identify specific subgroups of patients with distinct clinical features and risk for severe airflow obstruction, NFA and frequent severe exacerbations. The high serum YKL-40 clusters (C3 and C4) may be clinically and biologically relevant for identifying patients with severe asthma with non-T2 disease.
Supplementary material
Supplementary material ERJ-00800-2017_Supplement
Supplementary figures ERJ-00800-2017_Supplementary_figures
Disclosures
G.L. Chupp ERJ-00800-2017_Chupp
L. Cohn ERJ-00800-2017_Cohn
G.M. Crisafi ERJ-00800-2017_Crisafi
J.L. Gomez ERJ-00800-2017_Gomez
N.N. Jarjour ERJ-00800-2017_Jarjour
J. Reibman ERJ-00800-2017_Reibman
Acknowledgements
New York University/Bellevue Asthma Registry: The authors acknowledge the Colton family for their continued support and Maria Elena Fernandez Beros (New York University School of Medicine, New York, NY, USA). Yale Center for Asthma and Airways Disease: The authors acknowledge Sarah Marone and Eleni Kapagiannidou (Yale School of Medicine, New Haven, CT, USA).
Footnotes
This article has supplementary material available from erj.ersjournals.com
Support statement: NIH T32HL007778-18 and T15LM007056-26. R01HL095390-03; 1K01HL125474-01, R01HL069116 and GCRC RR03186, CTSA UL1 TR000038; FAMRI Young Clinical Scientist Award 113393; Aerocrine Fellow Award 2010. Funding information for this article has been deposited with the Crossref Funder Registry.
Conflict of interest: Disclosures can be found alongside this article at erj.ersjournals.com
- Received April 17, 2017.
- Accepted July 16, 2017.
- Copyright ©ERS 2017