UC Davis Health Standardized Data Pipeline

Source

Description

Clarity

Providers manage care within Epic Hyperspace. Data is extracted nightly into a relational database called Clarity.

Caboodle

Clarity data is further refined into Caboodle, a star-schema data warehouse provided by Epic. Caboodle data is optimized for retrieval and is used primarily for operational analytics and self-service queries using SlicerDicer.

UC Davis Health OMOP

The UC Davis Health (UCDH) Standardized Data Warehouse, also referred to as the Davis Standard Data Warehouse (DSDW), is built on the OMOP Common Data Model and updated monthly with data from Clarity. It organizes data into medically meaningful domains. Data in the DSDW are mapped to standard vocabularies in collaboration with UC Health, UCLA, UCSD, UCI and UCSF.

Data Partnership for Advancing Technologies and Healthcare (DataPATH)

A legally anonymized extract from the Davis Standard Data Warehouse (DSDW). Sensitive fields are removed or obfuscated. DataPATH is protected under P3 sensitivity designations (UCOP Security Policy) but does not require Institutional Review Board (IRB) approval for use.

UC Health Data Warehouse

A central repository of the combined standardized OMOP instances from the UC academic medical centers, including:

UCLA Health, UCSD Health, UCSF Health, UCI Health, UCR Health

UC Davis data in the UCHDW is identical to the data in the UC Davis Standardized Warehouse.

UC Health Research

A legally anonymized extract from the UCHDW. Sensitive fields are removed or obfuscated.

UC Health COVID Research Data Set (CORDS)

A limited data set containing EHR records from the UCHDW for patients tested for COVID-19. The data has been stripped of many identifiers; however, it still constitutes protected health information subject to HIPAA and must be protected as such.

The data set may be accessed for research and public health purposes only.

A UC CORDS dataset is available locally. No IRB approval is required.

Data Source Details


EMR (Clarity)

A nightly extract of operational information from Epic Chronicles. The data model is a highly complex relational structure that allows flexible manipulation. It is a granular, detailed, and comprehensive portrait of work as recorded by the EHR. Data is tightly coupled to operations and workflows. Clarity is best for analyzing complex health system operations.


Data Warehouse (Caboodle)

A nightly extract of operational information from Clarity and other systems, organized into a star schema whose concepts and dimensions are based on a conceptual business model. This data warehouse simplifies operational reporting by gathering related business concepts into consolidated structures. Data is still tightly coupled to operations and workflows. Caboodle is best used for general analysis of health system operations.
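To illustrate the star-schema idea described above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`encounter_fact`, `department_dim`, `date_dim`) and the sample rows are hypothetical and are not Caboodle's actual schema; the point is only that one central fact table joins to small dimension tables, which keeps reporting queries short.

```python
import sqlite3


def build_star_schema_demo():
    """Create a tiny in-memory star schema: one fact table, two dimensions.

    All table names, column names, and rows are invented for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE department_dim (
            department_key INTEGER PRIMARY KEY,
            name TEXT
        );
        CREATE TABLE date_dim (
            date_key INTEGER PRIMARY KEY,
            month TEXT
        );
        CREATE TABLE encounter_fact (
            encounter_key INTEGER PRIMARY KEY,
            department_key INTEGER REFERENCES department_dim(department_key),
            date_key INTEGER REFERENCES date_dim(date_key),
            length_of_stay_days INTEGER
        );
        INSERT INTO department_dim VALUES (1, 'Cardiology'), (2, 'Oncology');
        INSERT INTO date_dim VALUES (20240101, '2024-01'), (20240201, '2024-02');
        INSERT INTO encounter_fact VALUES
            (1, 1, 20240101, 3),
            (2, 1, 20240201, 5),
            (3, 2, 20240101, 2);
    """)
    return conn


def encounters_by_department(conn):
    """Count encounters and total length of stay per department."""
    return conn.execute("""
        SELECT d.name, COUNT(*), SUM(f.length_of_stay_days)
        FROM encounter_fact AS f
        JOIN department_dim AS d ON d.department_key = f.department_key
        GROUP BY d.name
        ORDER BY d.name
    """).fetchall()


if __name__ == "__main__":
    for row in encounters_by_department(build_star_schema_demo()):
        print(row)
```

A typical operational question ("how many encounters per department?") needs a single join from the fact table to one dimension, which is the kind of simplification a star schema is meant to provide.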


UC Davis Health OMOP

Common Data Models (CDMs) use standardized data structures and medical terminologies to harmonize disparate, heterogeneous, and locally defined healthcare data. The OMOP Common Data Model was chosen to standardize EHR data among the five UC Health campuses. UC Davis works with UC Health and the five UC Health campuses to develop the processes used to populate and harmonize the data. This data mart is refreshed monthly from Clarity.
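As a rough illustration of what harmonizing locally defined data to a standard vocabulary involves, the sketch below maps a source diagnosis code to a standard concept ID while preserving the original code. The lookup table and its concept IDs are placeholders invented for this example, not real OMOP vocabulary entries, and the helper `to_condition_row` is hypothetical. The field names `condition_concept_id` and `condition_source_value`, and the use of 0 for an unmapped concept, follow OMOP CDM conventions.

```python
from dataclasses import dataclass

# OMOP convention: concept_id 0 means "no matching standard concept".
UNMAPPED_CONCEPT_ID = 0

# Hypothetical (vocabulary, source code) -> standard concept ID lookup.
# The IDs are placeholders, not real OMOP concept IDs.
LOCAL_TO_STANDARD = {
    ("ICD10CM", "E11.9"): 900001,
    ("ICD10CM", "I10"): 900002,
}


@dataclass
class ConditionRow:
    """A simplified subset of an OMOP condition_occurrence row."""
    person_id: int
    condition_concept_id: int
    condition_source_value: str


def to_condition_row(person_id, vocabulary, source_code):
    """Map a source code to a standard concept, keeping the source value."""
    concept_id = LOCAL_TO_STANDARD.get(
        (vocabulary, source_code), UNMAPPED_CONCEPT_ID
    )
    return ConditionRow(person_id, concept_id, source_code)


if __name__ == "__main__":
    print(to_condition_row(1, "ICD10CM", "E11.9"))
    print(to_condition_row(2, "LOCAL", "DX-42"))  # unmapped local code
```

Keeping the source value alongside the standard concept lets analysts query across sites by concept while still tracing each row back to what was originally recorded.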


UC Davis DataPATH (De-identified Data)

A legally de-identified extract from the UC Davis Standardized Warehouse. Sensitive fields are removed or obfuscated. Analysts may perform computations only within a secured environment (e.g., data cannot be copied or extracted). DataPATH is best for researchers who want to evaluate potential cohorts and explore preliminary hypotheses on UC Davis data. No IRB approval is required.
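The phrase "removed or obfuscated" can be pictured with a small sketch. The field names and the specific rules below (dropping direct identifiers, reducing a birth date to its year, truncating a ZIP code to three digits) are assumptions chosen for illustration; they are not a description of how DataPATH is actually produced, and real de-identification involves considerably more than this.

```python
# Hypothetical direct identifiers that are dropped entirely.
REMOVED_FIELDS = {"name", "mrn", "phone"}


def deidentify(record):
    """Return a copy of record with identifiers removed or generalized.

    Illustrative only: field names and rules are assumptions.
    """
    out = {k: v for k, v in record.items() if k not in REMOVED_FIELDS}
    if "birth_date" in out:
        # Obfuscate "YYYY-MM-DD" by keeping only the year.
        out["birth_year"] = out.pop("birth_date")[:4]
    if "zip" in out:
        # Obfuscate a five-digit ZIP by keeping only the first three digits.
        out["zip3"] = out.pop("zip")[:3]
    return out


if __name__ == "__main__":
    patient = {
        "name": "Example Patient",
        "mrn": "123",
        "birth_date": "1980-05-17",
        "zip": "95817",
        "diagnosis": "I10",
    }
    print(deidentify(patient))
```

The clinical content (here, the diagnosis code) passes through unchanged, which is what keeps such an extract useful for cohort evaluation.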

UC Health Data Warehouse (UCHDW)

A central repository of the combined standardized OMOP instances from the UC academic medical centers, including: UCLA Health, UCSD Health, UCSF Health, UCI Health, UCR Health. UC Davis data in the UCHDW is identical to the data in the UC Davis Standardized Warehouse. The UC Health collaboration is led by the Center for Data-driven Insights and Innovations (CDI2).

The UCHDW does not currently allow direct access. To access UCHDW data, researchers are advised to develop a cohort of interest within one of the local databases above and then contact CDI2 for proxy searches on the central system.


UC Health Research

A legally anonymized extract from the UCHDW. Sensitive fields are removed or obfuscated. 


UC Health COVID Research Data Set (CORDS)

A limited data set containing EHR records from the UCHDW for patients tested for COVID-19. This data mart follows the N3C COVID Cohort Phenotype. For every COVID-positive result there are two COVID-negative controls, limited to patients of similar age and gender. The data has been stripped of many identifiers; however, it still constitutes protected health information subject to HIPAA and must be protected as such.

The data set may be accessed for research and public health purposes only. A UC CORDS dataset is available locally. No IRB approval is required.
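The two-controls-per-positive design described for CORDS can be sketched as a simple greedy match. This is not the N3C phenotype's actual procedure; the `Patient` class, the `match_controls` function, and the `max_age_gap` threshold that stands in for "similar age" are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: int
    age: int
    gender: str
    covid_positive: bool


def match_controls(patients, controls_per_case=2, max_age_gap=5):
    """Greedily assign negative controls to each positive case.

    A control must share the case's gender and be within max_age_gap
    years of age; each control is used at most once. Returns a dict of
    case patient_id -> list of matched control patient_ids.
    """
    cases = [p for p in patients if p.covid_positive]
    pool = [p for p in patients if not p.covid_positive]
    used = set()
    matches = {}
    for case in cases:
        candidates = sorted(
            (
                c for c in pool
                if c.patient_id not in used
                and c.gender == case.gender
                and abs(c.age - case.age) <= max_age_gap
            ),
            # Prefer the closest ages; break ties by patient_id.
            key=lambda c: (abs(c.age - case.age), c.patient_id),
        )
        chosen = candidates[:controls_per_case]
        used.update(c.patient_id for c in chosen)
        matches[case.patient_id] = [c.patient_id for c in chosen]
    return matches


if __name__ == "__main__":
    cohort = [
        Patient(1, 40, "F", True),
        Patient(2, 41, "F", False),
        Patient(3, 38, "F", False),
        Patient(4, 40, "M", False),
        Patient(5, 60, "F", False),
        Patient(6, 44, "F", False),
    ]
    print(match_controls(cohort))
```

A greedy match is easy to follow but depends on the order in which cases are processed; a case handled late may receive fewer than two controls if earlier cases have already used the closest candidates.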