XGBoost-SHAP-based interpretable framework for the early identification of pulmonary nodules

doi:10.3969/j.issn.1005-3697.2026.04.005

Volume 41, Issue 4, 2026, pp. 422-427

CLC Number: R734.2

Abstract:

Objective: To identify pulmonary nodules early and to visually interpret the key variables using interpretable machine learning, in support of precise prevention and control, early diagnosis, and treatment of lung cancer.

Methods: Individuals at high risk of lung cancer who had completed clinical screening were enrolled, and their high-risk assessment data and imaging results were extracted. Participants were divided into high-risk and low-risk pulmonary nodule groups according to China’s Lung Cancer Screening Standard (T/CPMA 013-2020). Variables that differed between the groups in univariate analysis were used as predictors, with pulmonary nodule group as the dependent variable, to construct an interpretable XGBoost-SHAP framework for early nodule identification and visual interpretation of the results.

Results: A total of 644 high-risk individuals were included, of whom 199 (30.9%) were in the high-risk pulmonary nodule group. For nodule grouping, the XGBoost model achieved an accuracy of 0.9146, a sensitivity of 0.7587, a specificity of 0.9843, an F1-score of 0.8458, and an AUC of 0.9741. In the SHAP analysis, higher SHAP values, interpreted as an increased risk of nodule enlargement, were associated with greater smoking intensity, exposure to secondhand smoke from colleagues or family members, infrequent kitchen ventilation during cooking, excessive intake of processed foods, occupational exposure to asbestos or radon, insufficient intake of protein, fruits, and vegetables, and employment in manual labor.

Conclusion: The constructed interpretable framework performs well in the early identification of pulmonary nodules. Changes in nodule size are associated not only with traditional risk factors (e.g., smoking habits, secondhand smoke exposure, cooking fume exposure, and occupational asbestos or radon exposure) but also with participants’ dietary habits.
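To make the phrase "higher SHAP values" concrete, the sketch below computes exact Shapley attributions for a single hypothetical participant by brute-force enumeration of feature subsets. It is an illustration only and not the authors' pipeline: the abstract does not describe the implementation, and the toy scoring function, feature names, and weights are all assumptions invented for this example. For tree models such as XGBoost, SHAP tooling normally computes these attributions with a far more efficient algorithm.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating feature subsets.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only usable for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical feature names and weights, chosen only for illustration.
FEATURES = ["smoking_intensity", "secondhand_smoke", "poor_kitchen_ventilation"]
WEIGHTS = [0.8, 0.5, 0.3]


def toy_score(values):
    """Toy risk score: linear terms plus one smoking interaction term."""
    linear = sum(w * v for w, v in zip(WEIGHTS, values))
    return linear + 0.4 * values[0] * values[1]


x = [1.0, 1.0, 0.0]          # hypothetical participant
baseline = [0.0, 0.0, 0.0]   # hypothetical reference participant
phi = shapley_values(toy_score, x, baseline)

for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the participant's score and the baseline score, which is why a larger positive value for a feature can be read as that feature pushing the prediction toward the high-risk group. The interaction term is split evenly between the two smoking-related features.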

Citation:

易付良；李刚；刘昕；向茹梅；骆长玲；邓丽春；余秀莲；周厚容；高扬；邹雪娜. XGBoost-SHAP-based interpretable framework for the early identification of pulmonary nodules [J]. Journal of North Sichuan Medical College, 2026, 41(4): 422-427.


Online: May 06, 2026
