*Correspondence:
Yuyang Cai
caiyuyang@sjtu.edu.cn
Hai Huang
1220775601@qq.com
Ling Yang
yangling01@xinhuamed.com.cn

† These authors have contributed equally to this work

Specialty section:
This article was submitted to Infectious Diseases - Surveillance, Prevention and Treatment, a section of the journal Frontiers in Medicine

Received: 08 June 2020
Accepted: 13 October 2020
Published: 12 November 2020

Citation:
Ye W, Lu W, Tang Y, Chen G, Li X, Ji C, Hou M, Zeng G, Lan X, Wang Y, Deng X, Cai Y, Huang H and Yang L (2020) Identification of COVID-19 Clinical Phenotypes by Principal Component Analysis-Based Cluster Analysis. Front. Med. 7:570614. doi: 10.3389/fmed.2020.570614

Background: COVID-19 has been spreading quickly, making it a serious public health threat. It is important to identify phenotypes in order to predict the severity of disease and design individualized treatment.

Methods: We collected data from 213 COVID-19 patients in Wuhan Pulmonary Hospital from January 1 to March 30, 2020. Principal component analysis (PCA) and cluster analysis were used to classify patients.

Results: We identified three distinct subgroups of COVID-19. Cluster 1 was the largest group (52.6%) and was characterized by the oldest age and the lowest cellular immune function and albumin levels. Cluster 2 comprised 38.5% of subjects. Most of the lab results in Cluster 2 fell between those of Clusters 1 and 3. Cluster 3 was the smallest cluster (8.9%), characterized by the youngest age and the highest cellular immune function. The incidence of respiratory failure, acute respiratory distress syndrome (ARDS), heart failure, and usage of non-invasive mechanical ventilation in Cluster 1 was significantly higher than in the other clusters (P < 0.05). Cluster 1 had the highest death rate, 30.4% (P = 0.005). Although there were significant differences in age between Clusters 2 and 3 (P < 0.001), we found no difference in demand for medical resources.

Conclusions: We identified three distinct clusters of COVID-19 patients. The results show that age alone could not be used to assess a patient’s condition. Specifically, management of albumin and immune function is important in reducing the severity of disease.

Keywords: COVID-19, phenotype, treatment, principal component analysis, cluster analysis
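The Methods summary names PCA followed by cluster analysis but, in this excerpt, does not say which clustering algorithm or software was used. The following is therefore only a minimal sketch of that general workflow, under stated assumptions: k-means (Lloyd's algorithm) stands in for the unspecified clustering step, the data are synthetic random numbers rather than patient data, and the function names `pca_scores` and `kmeans` are hypothetical. The choice of 18 variables, six components, and three clusters simply mirrors the numbers reported in the article.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project z-scored data onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each variable
    R = np.corrcoef(Z, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest eigenvalues
    return Z @ eigvecs[:, order], eigvals[order]

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization.

    Assumption: k-means is used here only as a stand-in clustering method.
    """
    centers = [points[0]]
    for _ in range(k - 1):
        # distance of every point to its nearest already-chosen center
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in data (NOT patient data): 3 groups x 30 rows x 18 variables.
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 10.0, 20.0], 30)
X = rng.normal(size=(90, 18)) + offsets[:, None]

scores, eigvals = pca_scores(X, n_components=6)   # six components, as in Table 2
labels, _ = kmeans(scores, k=3)                   # three clusters, as in the Results
```

With real clinical data, the number of retained components and the number of clusters would need their own justification (for example, explained variance or a cluster validity index), which this sketch does not attempt.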
TABLE 2 | Correlations of the 18 original variables with the six main factors derived from the principal component analysis.
CRP, C-reactive protein; PCT, procalcitonin; NT-proBNP, N-terminal pro brain natriuretic peptide; TNI, troponin I; FIB, fibrinogen; APTT, activated partial thromboplastin time; PT, prothrombin time; WBC, white blood cell; Cr, creatinine; Alb, albumin; ALT, alanine aminotransferase; AST, aspartate aminotransferase.
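For orientation, the correlations between original variables and components that a table like Table 2 reports are commonly called loadings. The relation below is a sketch under assumptions not confirmed by this excerpt: that the PCA was performed on standardized variables (equivalently, on the correlation matrix) and that no rotation such as varimax was applied. The symbols $R$, $Z$, $v_k$, $\lambda_k$, $t_k$, and $r_{jk}$ are introduced here for illustration only.

```latex
% R: 18 x 18 correlation matrix of the standardized variables z_1, ..., z_18
% (v_k, lambda_k): k-th unit eigenvector and eigenvalue of R; t_k: k-th component score
\[
  R\,v_k = \lambda_k v_k, \qquad
  t_k = Z\,v_k, \qquad
  r_{jk} = \mathrm{corr}(z_j, t_k) = v_{jk}\sqrt{\lambda_k},
  \qquad j = 1,\dots,18,\quad k = 1,\dots,6 .
\]
```

The last equality follows because $\mathrm{cov}(z_j, t_k) = (R\,v_k)_j = \lambda_k v_{jk}$, $\mathrm{var}(z_j) = 1$, and $\mathrm{var}(t_k) = \lambda_k$. If the factors were rotated, the reported correlations would differ from this unrotated form.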
COVID-19 Clusters and Disease Severity
The disease severity of COVID-19 in the current patient population was compared across the clusters (Table 4). Differences between Clusters 2 and 3 are shown in Supplementary Table 2. The incidence of respiratory failure, acute respiratory distress syndrome (ARDS), and heart failure in Cluster 1 was significantly higher than in the other two clusters (P < 0.05). The proportion of non-invasive mechanical ventilation usage in Cluster 1 was 27.7%, which was significantly higher than in the other clusters (P = 0.017). Cluster 1 also had the highest death rate, 30.4% (P = 0.005).

DISCUSSION

COVID-19 is a novel, rapidly spreading viral illness that represents an emergent global health threat. The mortality rate is higher in elderly and intensive care unit (ICU) COVID-19 patients, reaching 17–38% in recent reports (7, 8). Progressive lymphocytopenia was often found in severe cases (9–11). In this study, we identified three distinct subgroups of COVID-19 through a cluster analysis of 213 patients. Cluster 1 was characterized by the oldest age, the highest mortality rate (30.36%), and a significantly lower lymphocyte count. This result was consistent with previous reports (7, 8).

The immune system of a host controls invading pathogens and thereby determines the prognosis of patients with any infectious disease, including pneumonia (12). As immune deficiency is closely tied to mortality, evaluating the immune condition could be an important companion to monitoring a patient’s general condition in order to estimate prognosis (13). We found that helper T lymphocyte count and cytotoxic T lymphocyte count in Cluster 1 were significantly lower than those of the other two clusters. This suggested more impaired immune function in the Cluster 1 patients. Treating the immune deficiency at the early stage of disease may reduce the risk of disease deterioration and improve patient prognosis. Therefore, elderly, severely ill patients require more attention to immune function, rather than a focus on invasive treatment only.

Low albumin can lead to hypoproteinemia, which can cause a range of conditions, such as serous effusion, pulmonary edema, and heart failure. Timely correction of hypoproteinemia could effectively prevent complications (14). Therefore, we compared albumin levels among the three clusters. Albumin in Cluster 1 was significantly lower than in the other two clusters in our study. Therefore, it is also important to pay attention to the albumin level in elderly patients.

Our cluster analysis suggests that immunological parameters (helper T lymphocyte count and cytotoxic T lymphocyte count) and serum albumin level are important in determining prognosis and the vulnerability to developing comorbidities, including respiratory failure, ARDS, and heart failure. Improving the immune status and albumin level of patients may be a potential measure to prevent disease progression.

The mortality rate was higher in elderly patients (7, 8). We found that the mortality rate of Cluster 3, which was characterized by the youngest mean age, was not significantly different from that of the middle-aged patients grouped in Cluster 2. This result drew our attention. Previous studies have mentioned that some COVID-19 patients showed immune imbalance and a cytokine storm, which could be responsible for further lung injury (15–17). Young patients in Cluster 3 had the highest T lymphocyte count and most likely had a cytokine storm. Thus, the implication for clinicians is that if a younger patient presents with COVID-19, T lymphocyte counts should be checked, because those with very high levels may be at risk of developing severe disease despite a younger age. This needs further pathological research to validate.

D-Dimer is a degradation product that is produced in the hydrolysis of fibrin (18). Studies have reported increased D-Dimer levels in patients with pneumonia, as an indication of the presence of thrombosis and a hypercoagulable state of the blood (19, 20). High D-Dimer is likely to be associated with persistent clotting disorders, microthrombus formation, pulmonary embolism, and acute myocardial infarction in long-stay patients or patients who died, which may cause refractory hypoxemia, respiratory failure, disseminated intravascular coagulation, or even death. Our previous study found that COVID-19 patients with higher initial and peak D-Dimer values tended to have a higher risk of death (21). In this study, we found that D-Dimer in Cluster 1 was significantly higher than in the other two clusters. Cluster 1 also had the highest death rate, 30.4%, which was consistent with previous studies. These patients were likely to have myocardial infarction and/or pulmonary embolism, which might also explain the differences in myocardial enzymes (TNI and AST) among the three clusters. This might suggest the importance of early anticoagulant intervention.

Neutrophil count and lymphocyte count were found to have great prognostic power in community-acquired pneumonia. The
increase of neutrophils often indicates that the patient has a bacterial infection and that the infection is worsening. A decrease in lymphocytes means that immune function is poor (22, 23). At the early stage of COVID-19, the total number of leukocytes is normal or decreased, while the lymphocyte count decreases (3). We found that Cluster 1 had the lowest lymphocyte count and the highest neutrophil count. There was no difference in neutrophil count or lymphocyte count between Clusters 2 and 3. Our previous study found that COVID-19 patients with a high neutrophil-lymphocyte count ratio might have a poor prognosis, even a risk of death (21). These findings might suggest that the condition is aggravated and the infection is difficult to control in Cluster 1.

According to our clustering results for disease severity, patients in Cluster 1 had a high incidence of respiratory failure, ARDS, and heart failure, and a high utilization rate of non-invasive mechanical ventilation. The demand for medical resources of these patients is significantly higher than that of the other clusters. Thus, we suggest that Cluster 1 needs a comprehensive treatment plan, or may even need to stay in the intensive care unit. Although there were significant differences in age between Clusters 2 and 3, we also found that there was no significant difference in demand for medical resources between these two clusters. This could be interpreted to mean that doctors should pay the same clinical attention to middle-aged and young patients. Age alone could not be used to assess a patient’s condition; we must correct the misunderstanding that young patients should always be assumed to have relatively mild disease in COVID-19.

There are some potential limitations in our study. First, this was a single-center retrospective study. All of the data were collected from patients in Wuhan Pulmonary Hospital. Most of the patients in this hospital were symptomatic, severe, or even critical. As a result, the proportion of young patients and patients with mild disease in the study was relatively low. Second, only 213 out of 413 patients were enrolled in our study. The exclusion of patients with missing clinical data might cause some bias in our analysis. Our results could be more representative if we are able to collect these data in the future. Finally, our data may be subject to recall bias and selection bias due to the nature of our study. For example, the record of patients’ comorbidities might not be accurate and complete, considering the unprecedented pressure during admission and treatment.

Further studies with more detailed and representative data are needed. In particular, long-term follow-up of the patients will allow us to further explore the differences between phenotypes.

CONCLUSIONS

We identified three distinct subclasses of COVID-19 patients in Wuhan Pulmonary Hospital. It might be necessary to improve immune function and pay attention to the underlying health conditions in elderly patients. D-Dimer, lymphocyte count, neutrophil count, NT-proBNP, T lymphocyte count, and serum albumin deserve attention. This suggests that timely correction of these abnormal lab results can be useful in preventing the corresponding complications and reducing the mortality rate. Age alone could not be used to assess a patient’s condition; cluster assessment may be more reliable.

DATA AVAILABILITY STATEMENT

The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT

The studies involving human participants were reviewed and approved by The National Health Commission of the People’s Republic of China. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article. Informed consent was exempted with the approval of the Medical Ethics Committee of Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China (No. XHEC-D-2020-052).

AUTHOR CONTRIBUTIONS

YC, HH, and LY designed the current study and revised the manuscript. YT, GC, XLi, CJ, MH, GZ, XLa, YW, and XD collected data. WY and WL wrote the manuscript and revised the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING

This work was supported by Zhejiang University special scientific research fund for COVID-19 prevention and control [grant number 2020XGZX065].

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2020.570614/full#supplementary-material
4. Tzeng CR, Chang YC, Chang YC, Wang CW, Chen CH, Hsu MI. Cluster analysis of cardiovascular and metabolic risk factors in women of reproductive age. Fertil Steril. (2014) 101:1404–10. doi: 10.1016/j.fertnstert.2014.01.023
5. Ahmad T, Pencina MJ, Schulte PJ, O’Brien E, Whellan DJ, Piña IL, et al. Clinical implications of chronic heart failure phenotypes defined by cluster analysis. J Am Coll Cardiol. (2014) 64:1765–74. doi: 10.1016/j.jacc.2014.07.979
6. Sd C, Commandeur JJ, Frank LE, Heiser WJ. Effects of group size and lack of sphericity on the recovery of clusters in K-means cluster analysis. Multivariate Behav Res. (2006) 41:127–45. doi: 10.1207/s15327906mbr4102_2
7. Wang D, Hu B, Hu C, Zhu FF, Liu X, Zhang J, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA. (2020) 323:1061–9. doi: 10.1001/jama.2020.1585
8. Chen N, Zhou M, Dong X, Qu J, Gong F, Han Y, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. (2020) 395:507–13. doi: 10.1016/S0140-6736(20)30211-7
9. Li G, Fan Y, Lai Y, Han TT, Li ZH, Zhou PW, et al. Coronavirus infections and immune responses. J Med Virol. (2020) 92:424–32. doi: 10.1002/jmv.25685
10. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. (2020) 395:497–506. doi: 10.1016/S0140-6736(20)30183-5
11. Han Q, Lin Q, Jin S, You L. Coronavirus 2019-nCoV: a brief perspective from the front line. J Infect. (2020) 80:373–7. doi: 10.1016/j.jinf.2020.02.010
12. Lee KY. Pneumonia, acute respiratory distress syndrome, and early immune-modulator therapy. Int J Mol Sci. (2017) 18:388. doi: 10.3390/ijms18020388
13. Guo L, Wei D, Zhang X, Wu Y, Li Q, Zhou M, et al. Clinical features predicting mortality risk in patients with viral pneumonia: the MuLBSTA score. Front Microbiol. (2019) 10:2752. doi: 10.3389/fmicb.2019.02752
14. Senoo T, Ishida S, Ohta K, Inaba Y, Takagi M, Yoshioka H, et al. Hypoproteinemia as an precipitating factor of congestive heart failure in hypertensive heart disease (author’s transl). Nihon Ronen Igakkai Zasshi. (1980) 17:527–32. doi: 10.3143/geriatrics.17.527
15. Mehta P, McAuley DF, Brown M, Sanchez E, Tattersall RS, Manson JJ, et al. COVID-19: consider cytokine storm syndromes and immunosuppression. Lancet. (2020) 395:1033–4. doi: 10.1016/S0140-6736(20)30628-0
16. Wu D, Yang XO. TH17 responses in cytokine storm of COVID-19: an emerging target of JAK2 inhibitor Fedratinib. J Microbiol Immunol Infect. (2020) 53:368–70. doi: 10.1016/j.jmii.2020.03.005
17. Zhang Y, Fan L, Xi R, Mao Z, Shi D, Ding D, et al. Lethal concentration of perfluoroisobutylene induces acute lung injury in mice mediated via cytokine storm, oxidative stress and apoptosis. Inhal Toxicol. (2017) 29:255–65. doi: 10.1080/08958378.2017.1357772
18. Gorjipour F, Totonchi Z, Gholampour Dehaki M, Hosseini S, Tirgarfakheri K, Mehrabanian M, et al. Serum levels of interleukin-6, interleukin-8, interleukin-10, and tumor necrosis factor-α, renal function biochemical parameters and clinical outcomes in pediatric cardiopulmonary bypass surgery. Perfusion. (2019) 34:651–9. doi: 10.1177/0267659119842470
19. Guo SC, Xu CW, Liu YQ, Wang JF, Zheng ZW. Changes in plasma levels of thrombomodulin and D-dimer in children with different types of Mycoplasma pneumoniae pneumonia. Zhongguo Dang Dai Er Ke Za Zhi. (2013) 15:619–22.
20. Inoue Arita Y, Akutsu K, Yamamoto T, Kawanaka H, Kitamura M, Murata H, et al. A fever in acute aortic dissection is caused by endogenous mediators that influence the extrinsic coagulation pathway and do not elevate procalcitonin. Intern Med. (2016) 55:1845–52. doi: 10.2169/internalmedicine.55.5924
21. Ye W, Chen G, Li X, Lan X, Ji C, Hou M, et al. Dynamic changes of D-dimer and neutrophil-lymphocyte count ratio as prognostic biomarkers in COVID-19. Respir Res. (2020) 21:169. doi: 10.1186/s12931-020-01428-7
22. Celikbilek M, Dogan S, Ozbakir O, Zararsiz G, Kücük H, Gürsoy S, et al. Neutrophil-lymphocyte ratio as a predictor of disease severity in ulcerative colitis. J Clin Lab Anal. (2013) 27:72–6. doi: 10.1002/jcla.21564
23. Huang H, Wan X, Bai Y, Bian J, Xiong J, Xu Y, et al. Preoperative neutrophil-lymphocyte and platelet-lymphocyte ratios as independent predictors of T stages in hilar cholangiocarcinoma. Cancer Manag Res. (2019) 11:5157–5162. doi: 10.2147/CMAR.S192532

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Ye, Lu, Tang, Chen, Li, Ji, Hou, Zeng, Lan, Wang, Deng, Cai, Huang and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.