May 17 2024

UvA Researchers Created an Evaluation Checklist for Datasets

AI can analyze medical data quickly and accurately, potentially improving diagnoses and treatment plans. However, bias remains a persistent problem, and medical AI solutions are not yet effective for every population. Researchers from the UvA Informatics Institute are unpacking the bias problem and suggesting improvements.

As the use of AI in healthcare grows, driven by an aging population and rising healthcare costs, the need for reliable AI solutions becomes increasingly apparent. Maria Galanty, a PhD student at the UvA, emphasizes the importance of well-documented datasets in addressing bias in medical AI. "Bias in machine learning models can lead to systematic errors, especially in classifying patient groups," says Galanty. This problem arises when datasets do not adequately represent the target population.

Galanty, who earned her bachelor's degree in mathematics and cognitive sciences at the University of Warsaw, moved to the Netherlands to work and live in an environment with high educational quality and good work-life balance. After completing her master's in artificial intelligence at Utrecht University, she began her PhD research on bias in medical data.

Together with Dieuwertje Luitse, also a PhD student, Galanty conducted a study on the documentation of publicly available medical datasets. These datasets are often reused by machine learning engineers who were not involved in creating them and therefore have no knowledge of the data beyond what is documented. The study covered different types of medical data, such as MRI scans, color fundus photographs of the eye, and electrocardiograms.

They created an evaluation tool, a checklist, to assess the completeness of dataset documentation. "There is a lot of variety in dataset documentation. In some cases, it only states that annotations were performed by a medical professional, while others describe the process in detail," explains Galanty. She advocates for guidelines on how to prepare good documentation, so that all relevant information is included. Good documentation allows data users to be aware of potential biases and to reduce them.

Galanty's ambition is to contribute to more robust medical AI solutions. "The step from developing machine learning tools at university to application in hospitals is still quite large. If we want a tool to actually be used for patients, it needs to be very well tested and meet all ethical and legal conditions."
