Differential Diagnosis of COVID-19 and Community-acquired Pneumonia Using Different Machine Learning Methods
Abstract: Purpose: To develop an artificial intelligence model that uses deep learning and fully automated annotation of lesions on computed tomography (CT) data to distinguish coronavirus disease 2019 (COVID-19) from other community-acquired pneumonia accurately and rapidly. Methods: Data from 248 patients with COVID-19 and 347 patients with other types of pneumonia were analyzed retrospectively, and the task was framed as classifying COVID-19 versus other pneumonia. After artificial intelligence-based lung segmentation, the abnormal CT image features were extracted, reduced in dimension, and fed into several classical machine learning models, a three-dimensional convolutional neural network (3D CNN), and an attention-based multiple-instance learning (attention-MIL) deep neural network. Diagnostic performance was evaluated using receiver operating characteristic (ROC) curves, precision-recall (PR) curves, area under the curve (AUC), sensitivity, specificity, and accuracy. Results: Among the classical machine learning models, K-nearest neighbor (KNN) performed well on the external test set, with an AUC of 0.793, average precision (AP) of 0.886, F1-score of 0.7608, accuracy of 0.7512, sensitivity of 0.7754, and precision of 0.7691. The classical 3D CNN model achieved an AUC of 0.635, AP of 0.816, F1-score of 0.7144, accuracy of 0.7783, sensitivity of 0.6603, and precision of 0.6200 on the external test set. The attention-MIL model was more robust on the external test set, with an AUC of 0.851, AP of 0.935, F1-score of 0.8193, accuracy of 0.9155, sensitivity of 0.7414, and precision of 0.7646.
Conclusion: Compared with the radiomics-enhanced and 3D CNN models, the deep learning attention-MIL model performed better in the differential diagnosis of COVID-19 and other community-acquired pneumonia.
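The abstract names the attention-MIL architecture but does not describe it. As an illustration only, the sketch below shows attention-based MIL pooling in its commonly used form, where each CT scan is treated as a bag of instance embeddings h_k and the attention weights are a_k = softmax_k(w^T tanh(V h_k)). The function name, dimensions, and random weights are hypothetical and are not taken from the study.

```python
import math
import random


def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings into a single bag embedding.

    instances: list of K embeddings, each a list of D floats
    V: projection matrix as a list of L rows, each with D floats
    w: list of L floats
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Hidden representation tanh(V h), then scalar score w^T tanh(V h).
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))

    # Softmax over the instances of the bag (shifted by the max for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]

    # Bag embedding is the attention-weighted sum of the instance embeddings.
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(attn, instances)) for d in range(dim)]
    return bag, attn


# Toy bag: 5 hypothetical instance embeddings of dimension 4 (assumed values).
rng = random.Random(0)
bag_instances = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
V = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w = [rng.uniform(-1, 1) for _ in range(3)]

bag_embedding, attention = attention_mil_pool(bag_instances, V, w)
print("attention weights:", [round(a, 3) for a in attention])
```

In a trained network V and w are learned jointly with the classifier, and the attention weights indicate which instances contributed most to the scan-level prediction.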
Key words: CT; attention-MIL model; COVID-19; community-acquired pneumonia
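The abstract reports two ranking metrics, AUC and AP. The sketch below shows one standard way to compute them from per-patient scores; the labels and scores are made-up toy values, not study data, and the AP function assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AP as the mean of the precision values at the rank of each positive.

    Assumes at least one positive label and no tied scores.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits


# Toy example: 1 = COVID-19, 0 = other pneumonia (hypothetical values).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}, AP = {toy_ap:.3f}")
```

AUC is insensitive to class balance, whereas AP depends on the proportion of positive cases, which is one reason the two values can differ noticeably for the same model.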
Figure 6. Representative attention heatmaps generated with gradient-weighted class activation mapping (Grad-CAM) for patients with coronavirus disease 2019 (COVID-19) and community-acquired pneumonia
The heatmaps use the standard Jet colormap and are overlaid on the original images. Red highlights the activated regions associated with the predicted class.
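The core Grad-CAM computation behind heatmaps of this kind can be sketched as follows. This is a minimal illustration that assumes the feature maps of one convolutional layer and the gradients of the class score with respect to them are already available (in practice they come from an automatic differentiation framework). Upsampling to the image size and the Jet colormap overlay are omitted, and the toy values are hypothetical.

```python
def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations, gradients: K channels, each an H x W nested list of floats.
    Returns an H x W nested list with values in [0, 1].
    """
    n_rows = len(activations[0])
    n_cols = len(activations[0][0])

    # Channel weight = spatial average of the gradient for that channel.
    weights = [sum(sum(row) for row in g) / (n_rows * n_cols) for g in gradients]

    # Weighted sum of the feature maps followed by ReLU.
    cam = [
        [
            max(0.0, sum(wk * a[i][j] for wk, a in zip(weights, activations)))
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]

    # Scale to [0, 1] so the map can be rendered with a colormap.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example with 2 channels of size 2 x 2 (hypothetical values).
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]

heatmap = grad_cam(toy_activations, toy_gradients)
for row in heatmap:
    print(row)
```

The ReLU keeps only the regions that push the score of the predicted class upward, which is why the highlighted (red) areas are read as evidence for that class.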
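Table 1. Summary of the statistical data of patients from different hospitals

| Hospital | Group | Cases (CT scans) | Age (years) | Age range (years) | Male | Female |
|---|---|---|---|---|---|---|
| Inner Mongolia People's Hospital (内蒙古人民医院) | COVID-19 | 80 | 45±13.11 | 34-77 | 36 | 44 |
| Inner Mongolia People's Hospital (内蒙古人民医院) | CAP | 102 | 56±14.12 | 45-67 | 57 | 45 |
| Jinmen County People's Hospital (金门县人民医院) | COVID-19 | 143 (143) | 44.95±15.12 | 2-86 | 73 | 70 |
| Zhejiang Provincial People's Hospital (浙江省人民医院) | COVID-19 | 4 (4) | 43±13.13 | 26-59 | 1 | 3 |
| Zhejiang Provincial People's Hospital (浙江省人民医院) | CAP | 35 (35) | 42.08±14.95 | 10-66 | 21 | 14 |
| Sir Run Run Shaw Hospital, Zhejiang University School of Medicine (浙江大学医学院附属邵逸夫医院) | COVID-19 | 8 (8) | 42.75±6.33 | 33-51 | 4 | 4 |
| Sir Run Run Shaw Hospital, Zhejiang University School of Medicine (浙江大学医学院附属邵逸夫医院) | CAP | 210 (334) | 44.05±16.77 | 15-85 | 103 | 107 |
| Taizhou Central Hospital (台州市中心医院) | COVID-19 | 13 (13) | 47.76±14.22 | 31-74 | 6 | 7 |

CAP: community-acquired pneumonia.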
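Table 2. Performance evaluation indicators for each method on the external (independent) test set

| Method | F1-score | Accuracy | Recall | Precision |
|---|---|---|---|---|
| AdaBoost | 0.55 | 0.56 | 0.55 | 0.55 |
| Bagging | 0.66 | 0.65 | 0.68 | 0.67 |
| KNN | 0.76 | 0.75 | 0.77 | 0.77 |
| Logistic regression | 0.72 | 0.75 | 0.74 | 0.72 |
| MLP | 0.69 | 0.69 | 0.71 | 0.69 |
| NuSVC | 0.74 | 0.75 | 0.76 | 0.75 |
| SVC | 0.68 | 0.69 | 0.68 | 0.69 |
| XGBoost | 0.60 | 0.60 | 0.62 | 0.59 |

Values are reported as proportions.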
Table 3. Performance of different machine learning frameworks on COVID-19 independent test sets

| Model | Sensitivity (%) | Specificity (%) | AUC (%) | P |
|---|---|---|---|---|
| KNN | 77 | 67 | 73 | <0.001 |
| 3D CNN | 78 | 69 | 76 | <0.001 |
| Attention-MIL | 90 | 96 | 85 | <0.001 |
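The threshold-based indicators reported in the tables (sensitivity or recall, specificity, precision, accuracy, and F1-score) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions on a made-up toy example. The labels are hypothetical, and the source does not state whether the tabulated values are per-class or averaged across classes.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = COVID-19, 0 = other).

    Assumes both classes are present and at least one positive prediction,
    so that no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / len(pairs)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Toy example: 4 positive and 6 negative cases (hypothetical values).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = binary_metrics(toy_true, toy_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy weights both classes by their frequency, it can be high even when sensitivity is modest, so it is best read together with sensitivity and specificity.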