The impact of deep learning on diagnostic performance in the differentiation of benign and malignant thyroid nodules
Abstract
Aims: This study aims to use deep learning (DL) to classify thyroid nodules as benign or malignant on ultrasonography (US) images. It also investigates the impact of DL on the diagnostic performance of radiologists with different levels of experience.
Material and methods: This study included 576 US images of thyroid nodules. The dataset was divided into a training set (80%) and a test set (20%). Four radiologists with different levels of experience classified the images in the test set as benign or malignant. A DL model was then trained on the training set and used to predict benign or malignant for each nodule in the test set. The DL output for each test-set nodule was then presented to the 4 radiologists, who were asked to classify the nodules as benign or malignant again, taking the DL results into account.
Results: The accuracy of the DL model was 0.9391. The accuracies of junior resident (JR) 1, JR 2, the senior resident (SR), and the senior radiologist (Srad) before DL assistance were 0.7043, 0.7826, 0.8435, and 0.8522, respectively. The accuracies of the DL-assisted classifications were 0.9130, 0.8696, 0.9304, and 0.9043 for JR 1, JR 2, SR, and Srad, respectively. DL assistance changed the decisions of less experienced radiologists more than those of more experienced radiologists.
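The reported accuracies can be compared reader by reader with a little arithmetic. The sketch below is only illustrative: the accuracy values are copied from the Results, while the test-set size of 115 images is an assumption (20% of 576, rounded) that the abstract does not state, and the helper names `accuracy_gain` and `implied_correct` are hypothetical.

```python
# Accuracies as reported in the Results: (before DL assistance, with DL assistance).
reported = {
    "JR 1": (0.7043, 0.9130),
    "JR 2": (0.7826, 0.8696),
    "SR": (0.8435, 0.9304),
    "Srad": (0.8522, 0.9043),
}
DL_MODEL_ACCURACY = 0.9391

# ASSUMPTION: the abstract does not state the test-set size;
# 20% of 576 images rounds to 115.
ASSUMED_TEST_SIZE = round(0.20 * 576)


def accuracy_gain(before, after):
    """Absolute change in accuracy, rounded to the 4 decimals used in the abstract."""
    return round(after - before, 4)


def implied_correct(accuracy, n=ASSUMED_TEST_SIZE):
    """Number of correct classifications implied by an accuracy, given the assumed n."""
    return round(accuracy * n)


gains = {reader: accuracy_gain(b, a) for reader, (b, a) in reported.items()}

for reader, (before, after) in reported.items():
    print(
        f"{reader}: {before:.4f} -> {after:.4f} "
        f"(gain {gains[reader]:+.4f}, "
        f"approx. {implied_correct(before)} -> {implied_correct(after)} "
        f"correct of {ASSUMED_TEST_SIZE})"
    )
```

On the reported figures, the largest absolute gain belongs to JR 1 (0.2087) and the smallest to Srad (0.0521), and every DL-assisted accuracy remains below the model's standalone 0.9391.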
Conclusion: The DL model classified thyroid nodules as benign or malignant on US images more accurately than radiologists at every level of experience. Additionally, all radiologists, most notably the less experienced radiology residents, improved their accuracy when assisted by DL.
Keywords
DOI: http://dx.doi.org/10.11152/mu-4432