An artificial intelligence (AI) system performed well at distinguishing four common types of polyps on digitized histopathologic slides. The findings suggest AI has the potential to improve colon cancer screening, Dartmouth-Hitchcock researchers reported in JAMA Network Open online April 23.
The researchers tested a deep neural network internally at their own institution and externally on a dataset of slides read by pathologists at 24 institutions in 13 U.S. states. The external data were drawn from a study sponsored by Dartmouth-Hitchcock Medical Center (DHMC) that assessed vitamin D supplementation, with or without calcium, for preventing colon polyps.
In the internal part of the study, the deep neural network had a mean accuracy of 93.5%, compared with 91.4% for local pathologists. On the external data, the deep neural network had an accuracy of 87%, compared with 86.6% for pathologists.
"The findings suggest that this model may assist pathologists by improving the diagnostic efficiency, reproducibility, and accuracy of colorectal cancer screenings," wrote Saeed Hassanpour, PhD, an associate professor of biomedical data science at Geisel School of Medicine at Dartmouth, and colleagues.
Automatic classification
The artificial intelligence model was developed as a means of improving screening for colorectal cancer, which involves the removal of preinvasive adenomas or serrated lesions.
"Numbers and types of polyps found can also indicate future risk for malignancies and are therefore used as the basis for screening recommendations," the researchers noted in a statement about the publication of the data.
The retrospective study explored whether AI could classify polyps automatically, with the aim of making screening more reproducible and reducing barriers to accessing pathology services, the researchers explained.
The study entailed analysis of hematoxylin and eosin-stained formalin-fixed, paraffin-embedded (FFPE) whole slide images -- 157 in the internal dataset and 238 in the external study. The deep neural network was trained to identify the following four commonly reported types of polyps:
- Tubular adenoma (TA)
- Tubulovillous adenoma (TVA)
- Hyperplastic polyp (HP)
- Sessile serrated adenoma (SSA)
On the external dataset, the deep neural network's accuracy, sensitivity, and specificity were similar to those of the local pathologists who took part in the vitamin D/calcium study, Hassanpour and colleagues reported.
"Deep neural networks may provide a generalizable approach for the classification of colorectal polyps on digitized histopathologic slides," the authors wrote.
Pathologists vs. AI in classifying colorectal polyps (external dataset)

| Polyp type | Local pathologists accuracy | Deep neural network accuracy |
|---|---|---|
| TA | 79.8% | 84.5% |
| TVA | 81.5% | 89.5% |
| HP | 91.6% | 85.3% |
| SSA | 93.3% | 88.7% |
| Mean | 86.6% | 87% |
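The mean row in the table can be checked with a short calculation. The sketch below assumes the means are unweighted averages of the four per-type accuracies; the article does not say how they were computed, and the variable and function names are illustrative only. Under that assumption, the figures come out to 86.55% for the pathologists and 87.00% for the deep neural network, which agrees with the reported 86.6% and 87% after rounding.

```python
# Per-type accuracies (%) as reported in the table above.
pathologists = {"TA": 79.8, "TVA": 81.5, "HP": 91.6, "SSA": 93.3}
deep_neural_network = {"TA": 84.5, "TVA": 89.5, "HP": 85.3, "SSA": 88.7}


def unweighted_mean(accuracies):
    """Average the per-type accuracies, giving each polyp type equal weight.

    Assumption: the reported means are unweighted; the source does not
    state whether they were weighted by the number of slides per type.
    """
    return sum(accuracies.values()) / len(accuracies)


pathologist_mean = unweighted_mean(pathologists)
network_mean = unweighted_mean(deep_neural_network)

print(f"Local pathologists:  {pathologist_mean:.2f}%")
print(f"Deep neural network: {network_mean:.2f}%")
```

The 0.4-percentage-point gap between the two means is far smaller than the differences for individual polyp types, where the model leads on TA and TVA and the pathologists lead on HP and SSA.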
The AI model was more challenged by further subclassification of adenomatous and serrated polyps, perhaps because the thresholds for the detection of tubulovillous or villous growths and sessile serrated crypts vary among pathologists, they suggested. Many of the errors made by the deep neural network were similar to mistakes made by pathologists in practice, the researchers reported.
"For example, a common mistake made by both the model and the local pathologists was distinguishing hyperplastic polyps and sessile serrated adenomas, potentially reflecting the data imbalance of the sessile serrated adenoma class in the training set," Hassanpour and colleagues wrote.
The model could be built into laboratory information systems to direct pathologists to the regions of a slide that are most likely to contain preinvasive lesions, the authors suggested. Dartmouth-Hitchcock researchers plan to test the model in a prospective trial as an aid for pathologists' classification of colorectal polyps.
"A deep learning model for colorectal polyp classification, if validated through clinical trials, has potential for widespread application in clinical settings," they concluded.