
Creation of high-quality hospital data by harmonizing different data sources into one common data model.
Objective
The hospital’s data network currently comprises multiple data sources. Harmonizing these sources into a unified data model is essential, as it reduces data complexity and facilitates efficient data analysis. This project pursues that harmonization by implementing the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM).
Methodology
The process of transforming hospital data into the OMOP CDM involves several steps, supported by in-house tools. The main tool for the process is RiaB®, which manages the Extract-Transform-Load (ETL) process to convert source data into the OMOP CDM format.
The OMOP CDM represents data using standard vocabularies, so the hospital data must be mapped to these vocabularies. Two tools support this step: RabGin and Keun®. RabGin is used to manage vocabulary maintenance and to create custom concepts when no suitable concept exists in the standard vocabularies. Keun® is then used to map the source data either to standard vocabulary concepts or to the custom concepts created through RabGin.
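The lookup order described above (standard concept first, custom concept as fallback) can be illustrated with a minimal Python sketch. It does not reproduce how RabGin or Keun® work internally; the function names, the source codes and the concept IDs are hypothetical placeholders. The only conventions assumed are the OHDSI ones that locally defined concepts use concept_id values above 2 billion and that concept_id 0 marks an unmapped source value.

```python
from typing import Dict, Tuple

# OHDSI convention: locally defined concepts use concept_id values above 2 billion.
CUSTOM_CONCEPT_ID_FLOOR = 2_000_000_000
# OMOP convention: concept_id 0 marks a source value with no matching concept.
UNMAPPED_CONCEPT_ID = 0

SourceKey = Tuple[str, str]  # (source vocabulary, source code)


def create_custom_concept(custom: Dict[SourceKey, int], key: SourceKey) -> int:
    """Register a custom concept for `key` (hypothetical helper) and return its ID."""
    if key in custom:
        return custom[key]
    # Next free ID above the 2-billion floor.
    next_id = max(custom.values(), default=CUSTOM_CONCEPT_ID_FLOOR) + 1
    custom[key] = next_id
    return next_id


def resolve_concept(
    key: SourceKey,
    standard: Dict[SourceKey, int],
    custom: Dict[SourceKey, int],
) -> int:
    """Prefer a standard concept, fall back to a custom one, else 0 (unmapped)."""
    if key in standard:
        return standard[key]
    return custom.get(key, UNMAPPED_CONCEPT_ID)


# Illustrative placeholder data, not real vocabulary content.
standard = {("LOCAL_LAB", "GLU"): 1001}
custom: Dict[SourceKey, int] = {}
create_custom_concept(custom, ("LOCAL_LAB", "WARD_SCORE"))
```

Keeping the two mapping sources separate makes it easy to report how many source codes rely on custom concepts and how many remain unmapped.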
RiaB® populates the OMOP CDM tables using ETL queries and the mappings generated by the Keun® tool. To ensure the delivery of a stable, secure, and high-quality OMOP CDM, a deployment pipeline is employed across three environments: development, staging, and production. ETL queries and mappings are developed and tested in the development environment before being moved to staging for further testing and validation. Once validated, the final data is deployed to the production environment.
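As a rough illustration of an ETL query that combines source data with a mapping table, the sketch below loads a simplified condition_occurrence table in an in-memory SQLite database. It is not RiaB® code: the source table, the mapping table, the helper functions and the 5% threshold used as a promotion check are all assumptions made for the example, and the target table holds only a subset of the real OMOP columns.

```python
import sqlite3

# Hypothetical source and mapping tables plus a reduced OMOP target table.
SCHEMA = """
CREATE TABLE source_diagnosis (patient_id INTEGER, diagnosis_code TEXT, diagnosis_date TEXT);
CREATE TABLE concept_mapping (source_code TEXT PRIMARY KEY, target_concept_id INTEGER);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT,
    condition_source_value TEXT
);
"""

# Unmapped source codes receive concept_id 0, following the OMOP convention.
ETL_QUERY = """
INSERT INTO condition_occurrence
    (person_id, condition_concept_id, condition_start_date, condition_source_value)
SELECT s.patient_id,
       COALESCE(m.target_concept_id, 0),
       s.diagnosis_date,
       s.diagnosis_code
FROM source_diagnosis AS s
LEFT JOIN concept_mapping AS m ON m.source_code = s.diagnosis_code
ORDER BY s.patient_id, s.diagnosis_date
"""


def run_etl(conn: sqlite3.Connection) -> int:
    """Run the ETL query and return the number of rows in the target table."""
    conn.execute(ETL_QUERY)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM condition_occurrence").fetchone()[0]


def unmapped_fraction(conn: sqlite3.Connection) -> float:
    """Share of loaded rows whose source code had no mapping."""
    total = conn.execute("SELECT COUNT(*) FROM condition_occurrence").fetchone()[0]
    unmapped = conn.execute(
        "SELECT COUNT(*) FROM condition_occurrence WHERE condition_concept_id = 0"
    ).fetchone()[0]
    return unmapped / total if total else 0.0


def ready_for_promotion(conn: sqlite3.Connection, max_unmapped: float = 0.05) -> bool:
    """Hypothetical quality gate before moving a load to the next environment."""
    return unmapped_fraction(conn) <= max_unmapped


# Small demonstration with placeholder codes.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO source_diagnosis VALUES (?, ?, ?)",
    [(1, "DX_A", "2024-01-05"), (2, "DX_B", "2024-02-10")],
)
conn.execute("INSERT INTO concept_mapping VALUES (?, ?)", ("DX_A", 1001))
loaded = run_etl(conn)
```

In this demonstration one of the two source codes has no mapping, so the check would block promotion until the mapping is completed.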
Once the data is organized within the OMOP CDM in the production environment, it becomes available for data extractions, analysis applications and collaborations. One such application is linking the pathology trajectories to the OMOP dataset. For each pathology trajectory, all relevant vocabulary concepts are defined in a reference set (refset). Using this refset together with the OMOP episode table, the clinical data points for a specific pathology can be found and linked to the corresponding step in the trajectory.
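One plausible reading of this linkage is sketched below in Python: a clinical data point belongs to a trajectory step when it concerns the same person, its concept is in the refset, and its date falls within the episode window. The actual linking rules used in the project are not described here, so this logic is an assumption. The Episode fields follow the OMOP episode table; ClinicalEvent, events_for_episode and all IDs are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Episode:
    """Subset of the OMOP episode table columns."""
    episode_id: int
    person_id: int
    episode_concept_id: int
    episode_start_date: date
    episode_end_date: Optional[date]  # nullable in the OMOP CDM


@dataclass(frozen=True)
class ClinicalEvent:
    """Hypothetical flattened view of a row from any OMOP clinical table."""
    person_id: int
    concept_id: int
    event_date: date


def events_for_episode(
    episode: Episode,
    events: Iterable[ClinicalEvent],
    refset: Set[int],
) -> List[ClinicalEvent]:
    """Return events of the episode's person that are in the refset and in the window."""
    matched = []
    for event in events:
        if event.person_id != episode.person_id:
            continue
        if event.concept_id not in refset:
            continue
        if event.event_date < episode.episode_start_date:
            continue
        # An open-ended episode (no end date) accepts any later event.
        if episode.episode_end_date is not None and event.event_date > episode.episode_end_date:
            continue
        matched.append(event)
    return matched


# Placeholder data: concept IDs are illustrative only.
refset = {1001, 1002}
step = Episode(1, 42, 9001, date(2024, 1, 1), date(2024, 1, 31))
events = [
    ClinicalEvent(42, 1001, date(2024, 1, 10)),  # in refset, in window
    ClinicalEvent(42, 5555, date(2024, 1, 12)),  # not in refset
    ClinicalEvent(42, 1002, date(2024, 3, 1)),   # outside window
    ClinicalEvent(7, 1001, date(2024, 1, 10)),   # other person
]
linked = events_for_episode(step, events, refset)
```

In a database setting the same filter would typically be written as a join between the episode table, the refset and the clinical tables.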
Impact and future directions
Data harmonization simplifies the process of addressing research and data-driven questions, as it streamlines data extraction and analysis. This, in turn, enables more effective and efficient use of the hospital’s data resources.
General info and contact
Keywords (#): RabGin, Keun®, RiaB®, OMOP CDM, Healthcare Data, Data harmonization
Contact: radar@azdelta.be
RADar authors: Ir. Hanne Vanluchene
