Mahmood Lab investigators at Brigham and Women’s Hospital (BWH) have developed a self-supervised deep-learning algorithm for efficient, scalable retrieval of whole-slide images (WSIs) from repositories of any size, according to a study recently published in Nature Biomedical Engineering.
Pathologists use whole-slide images in place of traditional glass slides to examine morphological features for diagnosis, and WSI repositories can store immense amounts of data. High-capacity storage is beneficial, but it also makes searches for specific WSIs time-consuming.
With existing search methods, retrieval slows as the repository grows, the investigators noted.
To address the issue, they developed an open-source, histology-image search method called Self-Supervised Image Search for Histology (SISH).
SISH recognizes morphology across WSIs to retrieve similar cases much faster than previous algorithms, the researchers said.
“SISH allows experts to search for similar cases using the pathology image -- i.e., instead of finding similar words and phrases in a pathology report or electronic medical record,” Dr. Faisal Mahmood, study coauthor, said in an email interview. “SISH finds similar images at a theoretically constant search speed ... The speed is constant regardless of the size of the pathology image database it combs through.”
The team evaluated SISH by having it retrieve 22,385 WSIs across 13 anatomic sites and 56 disease subtypes. These WSIs came from several cohorts, including The Cancer Genome Atlas (TCGA), the Clinical Proteomic Tumor Analysis Consortium (CPTAC), and BWH.
The group also evaluated the method’s ability to retrieve archival slides for rare diseases, a task generally considered challenging because few slides are available.
Based on these retrievals, “SISH has strong performance on large and diverse datasets, can generalize to independent cohorts as well as rare diseases and, finally, … can be used as a search engine not just for WSIs but also for patch retrieval,” the researchers wrote.
The method is not without limitations. The team indicated that SISH has a large memory requirement and limited context awareness within large tissue slides.
SISH also “has been developed only to search for images using a query image. In clinical practice, pathologists rely on other data such as the patient’s medical record, other imaging modalities, and molecular test results to guide diagnoses and clinical decision making,” the researchers wrote.
The team has no current plans to create a commercial product, Mahmood said, adding that it is focusing on “further refining the algorithm and building larger search databases.”
Future studies may extend SISH to accept multimodal queries, such as text or genomic data, to improve diagnoses, and other search engines may be developed for multiplex immunofluorescence and spatial transcriptomics data, the team said.
“As pathology departments around the country and the world transition from using glass slides and microscopes to digitally scanned pathology images, and large amounts of pathology data is accumulated, SISH would become increasingly powerful,” Mahmood also said.
“With a large repository of cases to comb through, we would be able to find similar cases easily, [and] this will help with rare disease diagnosis and finding cases with similar morphology, which may have a similar response to treatment,” Mahmood added.