A collaboration between two medical societies and volunteers has created the largest public brain haemorrhage image database.
The creation of the brain haemorrhage image database stems from the most recent edition of the Radiological Society of North America (RSNA) Artificial Intelligence (AI) Challenge. The two medical societies, RSNA and the American Society of Neuroradiology (ASNR), along with 60 volunteers, have created a collection of expertly annotated images.
It is expected that the dataset will help to speed up the development of machine learning algorithms that will aid with the detection and characterisation of the life-threatening condition. Accuracy in diagnosing the presence and type of intracranial haemorrhage is a vital part of effective treatment, as even a small haemorrhage can lead to death if it is in a critical location.
The report has been published in Radiology: Artificial Intelligence.
Creating datasets from scratch
Rather than using an existing dataset, the competition’s organisers set out to create one from scratch, compiling the images from Stanford University in Palo Alto, California, Universidade Federal de São Paulo in São Paulo, Brazil, and Thomas Jefferson University Hospital in Philadelphia, Pennsylvania.
Lead author, Dr Adam Flanders, neuroradiologist and professor at Thomas Jefferson University Hospital, commented: “The value of this challenge is to create a dataset that might lead to a generalisable solution, and the best way to do that is to train a model from data originating from multiple institutions that use a variety of CT scanners from various manufacturers, scanning protocols and a heterogeneous patient population.
“In this case, we had data from three institutions and international participation. The dataset is unique, not only in terms of the volume of abnormal images but also the heterogeneity of where they all came from.”
A total of 60 volunteers were selected to annotate 874,035 brain haemorrhage CT images in 25,312 unique exams. The volunteers marked each image as normal or abnormal. For the abnormal images, they indicated the haemorrhage subtype.
“It was a nail-biter all the way along,” Dr Flanders said of the process. “We were building the airplane while it was in flight. When you consider the number of images that we had to de-identify locally, consume, curate, label, cross-check and then organise into just the right datasets to release to the contestants, there was a lot of work involved by the volunteer workforce, the RSNA Machine Learning Subcommittee, data scientists, contractors and RSNA staff.
“The 10 top solutions came from all over the world. Some of the winners had absolutely no background in medical imaging.”
A pathway for future collaborations
The dataset was released under a non-commercial licence, meaning it is freely available to the AI research community for non-commercial use and further enhancement.
Dr Flanders said that engaging with a subspecialty society to leverage its unique expertise in developing a high-quality dataset is an effective and useful pathway to follow for future collaborations.
“I was really impressed by the huge volunteer effort and the tremendous worldwide interest in this project,” Dr Flanders said. “The dataset we created for this challenge will endure as a valuable ML research resource for years to come.”
The organisers will be using the same collaborative model for this year’s competition, a partnership with the Society of Thoracic Radiology that seeks to improve the detection and characterisation of pulmonary embolism on chest CT.