Document type:
Journal article: Original article
DOI:
PMID:
Permanent URL:
Title:
Definition of a practical taxonomy for referencing data quality problems in healthcare databases.
Author(s):
Quindroit, Paul [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Fruchart, Mathilde [Author]
Evaluation des technologies de santé et des pratiques médicales - ULR 2694 [METRICS]
Degoul, Samuel [Author]
Groupe hospitalier de la région de Mulhouse et Sud-Alsace
Périchon, Renaud [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Martignène, N. [Author]
Institut de formation interhopitalier Théodore-Simon [IFITS]
Soula, Julien [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Marcilly, Romaric [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Lamer, Antoine [Author]
METRICS : Evaluation des technologies de santé et des pratiques médicales - ULR 2694
Institut de formation interhopitalier Théodore-Simon [IFITS]
Journal title:
Methods of Information in Medicine
Journal short name:
Methods Inf Med
Publication date:
2022-11-15
ISSN:
2511-705X
English keyword(s):
data quality
database
dirty data
taxonomy
data reuse
HAL discipline(s):
Life Sciences [q-bio]
English abstract: [en]
Introduction: Health care information systems can generate and/or record huge volumes of data, some of which may be reused for research, clinical trials, or teaching. However, these databases can be affected by data quality problems; hence, an important step in the data reuse process consists in detecting and rectifying these issues. With a view to facilitating the assessment of data quality, we developed a taxonomy of data quality problems in operational databases.
Material: We searched the literature for publications that mentioned “data quality problems,” “data quality taxonomy,” “data quality assessment,” or “dirty data.” The publications were then reviewed, compared, summarized, and structured using a bottom-up approach, to provide an operational taxonomy of data quality problems. The latter were illustrated with fictional examples (though based on reality) from clinical databases.
Results: Twelve publications were selected, and 286 instances of data quality problems were identified and were classified according to six distinct levels of granularity. We used the classification defined by Oliveira et al. to structure our taxonomy. The extracted items were grouped into 53 data quality problems.
Discussion: This taxonomy facilitated the systematic assessment of data quality in databases by presenting the data's quality according to their granularity. The definition of this taxonomy is the first step in the data cleaning process. The subsequent steps include the definition of associated quality assessment methods and data cleaning methods.
Conclusion: Our new taxonomy enabled the classification and illustration of 53 data quality problems found in hospital databases.
Language:
English
Audience:
International
Popular science:
No
Institution(s):
Université de Lille
CHU Lille
Deposit date:
2023-11-15T03:02:04Z
2024-01-16T13:22:11Z