In this research proposal, we aim to assess the data quality of a knowledge graph (KG) created from the MIMIC dataset, which contains intensive care patient information in text form.
The proposed methodology combines two primary approaches: rule mining and validation with the Shapes Constraint Language (SHACL).
First, rule mining will be used to find patterns and correlations within the data; facts that violate otherwise well-supported rules will point us to possible inconsistencies, redundancies, or errors.
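To illustrate the idea, the following is a minimal sketch of rule mining over triples, assuming simple rules of the form "an entity with fact A also has fact B" scored by support and confidence. The toy triples, the thresholds, and the function name `mine_rules` are hypothetical and are not drawn from MIMIC or from any particular rule mining tool.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy triples (subject, predicate, object); not taken from MIMIC.
triples = [
    ("patient1", "admittedTo", "ICU"), ("patient1", "hasMonitoring", "continuous"),
    ("patient2", "admittedTo", "ICU"), ("patient2", "hasMonitoring", "continuous"),
    ("patient3", "admittedTo", "ICU"), ("patient3", "hasMonitoring", "continuous"),
    ("patient4", "admittedTo", "ICU"),
]


def mine_rules(triples, min_support=2, min_confidence=0.7):
    """Mine rules 'body fact => head fact' and list the entities that break them."""
    # Collect the (predicate, object) facts attached to each subject.
    facts = defaultdict(set)
    for s, p, o in triples:
        facts[s].add((p, o))

    # Invert the index: which subjects hold each fact?
    holders = defaultdict(set)
    for s, subject_facts in facts.items():
        for fact in subject_facts:
            holders[fact].add(s)

    rules = []
    for body, head in permutations(holders, 2):
        both = holders[body] & holders[head]
        support = len(both)                         # entities satisfying body and head
        confidence = support / len(holders[body])   # share of body entities with head
        if support >= min_support and confidence >= min_confidence:
            # Entities matching the body but lacking the head are only
            # candidates for a quality issue, not confirmed errors.
            violators = sorted(holders[body] - holders[head])
            rules.append((body, head, support, confidence, violators))
    return rules


for body, head, support, confidence, violators in mine_rules(triples):
    print(body, "=>", head, support, confidence, violators)
```

In this toy data, the rule from ICU admission to continuous monitoring holds for three of four patients, so `patient4` is reported as a candidate for review; whether it reflects a missing fact or a legitimate exception would still need to be judged against the source data.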
Second, SHACL shapes will be applied to validate the KG against predefined constraints, checking that it adheres to the relevant standards and data quality requirements.
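The kind of constraint we have in mind can be sketched as follows. The example is a plain-Python stand-in, not a SHACL engine: it hand-codes a single cardinality check that corresponds roughly to a node shape with `sh:targetClass` and a property shape with `sh:minCount 1` and `sh:maxCount 1`. The class `Patient`, the property `gender`, the toy triples, and the function `check_cardinality` are all assumptions made for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy graph; the class and property names are illustrative only.
triples = [
    ("patient1", RDF_TYPE, "Patient"), ("patient1", "gender", "F"),
    ("patient2", RDF_TYPE, "Patient"),
    ("patient3", RDF_TYPE, "Patient"), ("patient3", "gender", "M"),
    ("patient3", "gender", "F"),
]


def check_cardinality(triples, target_class, path, min_count=1, max_count=1):
    """Report target-class nodes whose number of distinct values for `path`
    falls outside [min_count, max_count]."""
    targets = set()
    values = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE and o == target_class:
            targets.add(s)          # node is in scope of the constraint
        elif p == path:
            values[s].add(o)        # distinct values for the constrained property

    violations = []
    for node in sorted(targets):
        count = len(values.get(node, ()))
        if count < min_count:
            violations.append((node, f"{path}: {count} value(s), expected at least {min_count}"))
        elif count > max_count:
            violations.append((node, f"{path}: {count} value(s), expected at most {max_count}"))
    return violations


for node, message in check_cardinality(triples, "Patient", "gender"):
    print(node, message)
```

Here `patient2` is flagged for a missing value and `patient3` for conflicting values. Expressing such checks declaratively as SHACL shapes, rather than hand-coding each one as above, is what makes them reusable across the whole KG.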
Before either approach is applied, the knowledge graph will be constructed from the tabular representation of the dataset in CSV form.
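A minimal sketch of this construction step is given below, assuming that each CSV row describes one entity and each non-empty cell becomes one triple. The column names, the `http://example.org/` namespace, and the function `csv_to_triples` are hypothetical and do not reflect the actual MIMIC schema or our final vocabulary.

```python
import csv
import io

# Hypothetical CSV excerpt; the columns are assumptions, not the MIMIC schema.
CSV_TEXT = """subject_id,gender,admission_type
10,F,EMERGENCY
11,M,
"""

BASE = "http://example.org/"  # placeholder namespace


def csv_to_triples(text, id_column="subject_id", entity_class="Patient"):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{BASE}patient/{row[id_column]}"
        # One typing triple per row.
        triples.append((subject, "rdf:type", f"{BASE}{entity_class}"))
        for column, value in row.items():
            # Skip the identifier column and empty cells, so missing data
            # stays absent from the graph instead of becoming an empty literal.
            if column == id_column or not value:
                continue
            triples.append((subject, f"{BASE}{column}", value))
    return triples


for triple in csv_to_triples(CSV_TEXT):
    print(triple)
```

Leaving empty cells out of the graph is a deliberate choice in this sketch: it lets a later cardinality constraint or mined rule surface the gap as a missing fact.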