Speaker
Prof. Felix Bießmann
Description
Over the last decade, many initiatives have focused on releasing data sets for research. One of many reasons to publish data is to enable researchers to develop novel applications that include Machine Learning (ML) components. In this context, data quality is a key factor. The translation of ML research innovations into real-world applications is often hindered by recurring challenges related to data quality and by a lack of automation in the data pipelines upstream of ML components. In this talk, I will highlight some research on data quality for ML, specifically on monitoring data quality, cleaning data, and predicting the impact of data quality on downstream ML components.