Abstract: Objective: To develop a robust and widely applicable predictive model to improve the accuracy of diagnosing the malignancy risk of pulmonary nodules. Methods: This study retrospectively collected clinical data from 1,414 patients with pulmonary nodules diagnosed and treated at the Affiliated Hospital of North Sichuan Medical College and the Guang'an People's Hospital. Meta-analysis and Least Absolute Shrinkage and Selection Operator (LASSO) regression were used to identify predictors related to the malignancy risk of pulmonary nodules. These factors were further refined by multivariable logistic regression (LR) to determine key features. Based on these features, eight machine learning models were constructed, and their performance was evaluated using Receiver Operating Characteristic (ROC) curves, calibration curves, and Decision Curve Analysis (DCA) in the training set and the internal validation set. The best-performing model was used to develop a nomogram for risk stratification of patients. Results: Through the combined screening process of meta-analysis, LASSO regression, and multivariable LR, 10 key predictive factors were identified and integrated into eight different machine learning models. Model evaluation demonstrated that the LR model performed best, achieving an Area Under the Curve (AUC) of 0.843 in the internal validation cohort. Additionally, the nomogram derived from this model exhibited strong predictive ability in the external validation cohort, with an AUC of 0.770. Risk scores calculated from the nomogram stratified patients into four risk groups, with malignancy rates ranging from 0% in the low-risk group to 100% in the very high-risk group. Conclusion: The prediction model developed in this study effectively assesses the malignancy risk of pulmonary nodules, providing a valuable risk stratification tool for clinical use.
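As an illustration of the two evaluation ideas named in the abstract, the sketch below computes an AUC from predicted risks and groups cases into four risk strata. It is a minimal sketch and not the study's implementation: the function names (`auc`, `risk_group`, `malignancy_rate_by_group`), the cut-offs in `CUTOFFS`, the group labels other than "low" and "very high", and the toy scores and labels are all assumptions made for illustration. The abstract does not report the actual thresholds or any patient-level data.

```python
from collections import defaultdict


def auc(scores, labels):
    """AUC as the probability that a randomly chosen malignant case
    receives a higher score than a randomly chosen benign case
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical cut-offs on a 0-1 predicted probability.
# The study's real thresholds are not given in the abstract.
CUTOFFS = [(0.25, "low"), (0.50, "intermediate"), (0.75, "high")]


def risk_group(p):
    """Map a predicted probability to one of four risk groups."""
    for upper, name in CUTOFFS:
        if p < upper:
            return name
    return "very high"


def malignancy_rate_by_group(scores, labels):
    """Observed fraction of malignant cases within each risk group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [malignant, total]
    for p, y in zip(scores, labels):
        group = risk_group(p)
        counts[group][0] += y
        counts[group][1] += 1
    return {g: m / t for g, (m, t) in counts.items()}


# Toy data, invented for this example only (1 = malignant, 0 = benign).
toy_scores = [0.05, 0.10, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_rates = malignancy_rate_by_group(toy_scores, toy_labels)
print(toy_auc)
print(toy_rates)
```

The pairwise definition of AUC is used here because it needs no external library and makes the interpretation explicit. For a cohort of realistic size, a rank-based routine such as the one in scikit-learn would be the more practical choice.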