2 Distributed Program Package Workflow
This chapter discusses the relationship between the multiple distributed program packages used in the Sentinel data quality assurance process. Section 2.1 provides an overview of the interaction between the various distributed Sentinel Data Quality Review and Characterization Programs. Section 2.2 discusses programs embedded within the Quality Assurance Program Package (QA Package) and Mother-Infant Linkage Quality Assurance Program Package (QA MIL Package).
2.1 Interaction between Distributed Program Packages
The two primary Sentinel Data Quality Review and Characterization Distributed Programs are the QA Package and the QA MIL Package. The workflow for these programs and the interaction between them are discussed below. Please note that this workflow focuses on the interaction between the distributed packages and is not a full representation of the entire Data Partner Refresh process.
2.1.1 Phase A
- The QA Package is distributed and run first by the Data Partner (DP) against their local SCDM tables. The QA Package generates a set of flags and datasets that are used to identify potential data quality issues in the SCDM tables.
- If the Data Partner populates the Mother-Infant Linkage (MIL) table, the MIR module executes. This module creates datasets of the infants and the mothers with deliveries identified within the Data Partner’s data. Following Phase A approval, the Data Partner uses these datasets to identify linked mothers and infants and create the MIL table.
- Immediately following the completion of the QA Package checks and MIR module, the master program calls the SCDM Snapshot Program. This program generates additional tables characterizing the Data Partner’s SCDM tables.
- Finally, the master program calls Sentinel Common Components. This program creates tables specifying file paths, DP-specified table names, and a variety of metadata. It also leaves behind code and datasets used to simplify set-up of the QA MIL Package and Query Request Package (QRP) by setting macro variables and file paths.
Figure 2.1: QA Package Workflow
2.1.2 Phase B
- Following a successful Phase A Data Refresh and approval, SOC will distribute the QA MIL Package to those Data Partners creating the MIL table. This package is run by the Data Partner against their local SCDM tables to generate a set of flags and datasets that are used to identify potential data quality issues in the MIL table. It references the output from the Sentinel Common Components program executed in Phase A to ensure that macro variables and file paths are set correctly.
- Immediately following the completion of the QA MIL Package checks, the master program calls the SCDM Snapshot Program. This program generates an additional table characterizing the Data Partner’s MIL table.
- Finally, the master program calls Sentinel Common Components. This program creates tables specifying file paths, DP-specified table names—now including the MIL table—and leaves behind code and datasets used to simplify set-up of the QRP by setting macro variables and file paths.
Figure 2.2: QA MIL Package Workflow
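The ordering of steps in Phases A and B can be summarized in a short sketch. This is only an illustration of the sequence described above, written in Python rather than in the packages' own code; the function names and step labels are hypothetical and do not correspond to actual program or macro names.

```python
from __future__ import annotations


def phase_a_steps(populates_mil: bool) -> list[str]:
    """Return the Phase A steps in the order the master program runs them.

    `populates_mil` is a hypothetical flag standing in for whether the
    Data Partner populates the MIL table.
    """
    steps = ["QA Package checks"]
    if populates_mil:
        # The MIR module runs only when the Data Partner populates the MIL table.
        steps.append("MIR module")
    # The snapshot and Common Components always run last, in this order.
    steps.append("SCDM Snapshot Program")
    steps.append("Sentinel Common Components")
    return steps


def phase_b_steps(phase_a_approved: bool, populates_mil: bool) -> list[str]:
    """Return the Phase B steps, or an empty list when Phase B does not apply."""
    if not (phase_a_approved and populates_mil):
        # The QA MIL Package is distributed only after Phase A approval,
        # and only to Data Partners creating the MIL table.
        return []
    return [
        "QA MIL Package checks",
        "SCDM Snapshot Program",
        "Sentinel Common Components",
    ]
```

For a Data Partner that populates the MIL table, `phase_a_steps(True)` lists four steps with the MIR module second; for one that does not, the MIR module is omitted and `phase_b_steps` returns an empty list.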
2.1.3 Role of Sentinel Common Components within the Sentinel System
In each phase, after confirmation that the Sentinel Common Components package is pointing to the correct ETL and phase, production queries are directed to the Common Components request associated with the relevant ETL. For example, MIL table queries should be run on the most recent approved Phase B ETL. If a newer approved ETL phase exists at that site, standard non-MIL requests should be run on that newer ETL, as illustrated below.
Figure 2.3: Example scenario of multiple approved production ETLs at a DP
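The routing rule in this scenario can be expressed as a small selection function. The sketch below is an assumption-laden illustration in Python: the `ApprovedEtl` record, its fields, and the convention that a higher ETL number means a newer ETL are hypothetical and are not part of Common Components.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ApprovedEtl:
    etl_number: int  # hypothetical identifier; assumed to increase with each refresh
    phase: str       # most recent approved phase for this ETL: "A" or "B"


def select_etl(approved: Sequence[ApprovedEtl], needs_mil: bool) -> Optional[ApprovedEtl]:
    """Pick the approved ETL a production query should run against.

    MIL queries use the newest ETL approved through Phase B; all other
    queries use the newest approved ETL regardless of phase.
    """
    if needs_mil:
        candidates = [etl for etl in approved if etl.phase == "B"]
    else:
        candidates = list(approved)
    # Returns None when no approved ETL satisfies the request.
    return max(candidates, key=lambda etl: etl.etl_number, default=None)
```

With ETL 10 approved through Phase B and ETL 11 approved through Phase A only, a MIL query is routed to ETL 10 while a standard request is routed to ETL 11.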
Common Components provides both parameter testing and utility functionality and works for both phases of the ETL approval process. These functions include:
- Execution of ETL metadata checks
- Execution of logic to ensure the MIL table in all processes is available for requests
- Execution of housekeeping macros used to standardize and validate paths
- Execution of the SCDM Snapshot Program
- Creation of a set of macro variables, using metadata about each SCDM table, to be used later by distributed queries
These functions are described in more detail in Sections 17.1 and 17.2.
2.2 Embedded Programs
2.2.1 Mother-Infant Identification Program
The Mother-Infant Identification Program (Mother-Infant ID) identifies deliveries of live born infants to mothers and identifies infants. It creates files that Sentinel Data Partners use to link deliveries and infants in order to create a Sentinel Mother-Infant Linkage (MIL) table. This package works both as a standalone program and as an embedded module within the Sentinel Quality Assurance (QA) Package, where it runs as the Mother-Infant Request (MIR) module when the package’s Execute_MI parameter is set to Y. The program supports both partitioned and unpartitioned data environments.
If Execute_MI is set to Y in the QA Package master program file, the Mother-Infant ID package will execute after the QA Program has completed. Running this package is a prerequisite for the creation of the Mother-Infant Linkage table.[^Specifications for the Mother-Infant Identification Program can be found on the Sentinel website and a more extensive discussion of QA checks for the Mother-Infant Linkage (MIL) table can be found in Chapter 12.] Information on the datasets created by this package can be found in Chapter 8: QA Package Output Datasets.
Definition 2.1 (Live Birth Delivery) Live births are identified using the pregnancy algorithm in QRP, which selects pregnancy outcomes where CODECAT = “PO” and CODE is LB, LBO, or UNC. LB and UNC deliveries are identified in inpatient (IP) settings, while LBO deliveries are identified in outpatient settings (ED, AV, or OA). Live birth deliveries are eligible for the MIL table when they meet the following additional criteria:
- the individual has an assigned value of Female in the Sentinel Common Data Model demographic table variable Sex;
- the individual was between 10 and 54.999 years of age (inclusive) as of the admission date of the delivery encounter;
- the first encounter with an LB, LBO, or UNC outcome satisfies the minimum incidence window relative to a preceding pregnancy outcome of the same or higher priority, as implemented by the pregnancy algorithm; and
- no minimum maternal enrollment is required.
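The code, setting, sex, and age criteria in Definition 2.1 can be sketched as follows. This is an illustrative Python approximation, not the QRP pregnancy algorithm: the function names are hypothetical, the Sex value "F" for Female and the age calculation (days divided by 365.25) are assumptions, and the incidence-window criterion is left to the pregnancy algorithm and is not modeled here.

```python
from datetime import date

LIVE_BIRTH_CODES = {"LB", "LBO", "UNC"}
INPATIENT_CODES = {"LB", "UNC"}          # identified in inpatient (IP) settings
OUTPATIENT_ENCTYPES = {"ED", "AV", "OA"}  # settings where LBO is identified


def is_live_birth_outcome(codecat: str, code: str, enctype: str) -> bool:
    """Check the outcome code and care setting for a live birth delivery."""
    if codecat != "PO" or code not in LIVE_BIRTH_CODES:
        return False
    if code in INPATIENT_CODES:
        return enctype == "IP"
    return enctype in OUTPATIENT_ENCTYPES


def age_in_years(birth_date: date, as_of: date) -> float:
    """Approximate age in years (assumption: 365.25 days per year)."""
    return (as_of - birth_date).days / 365.25


def is_mil_eligible_mother(sex: str, birth_date: date, admit_date: date) -> bool:
    """Check the sex and age criteria as of the delivery admission date.

    No minimum maternal enrollment is required, so enrollment is not checked.
    """
    age = age_in_years(birth_date, admit_date)
    return sex == "F" and 10 <= age <= 54.999
```

A delivery is a candidate for the MIL table only when both checks pass and the pregnancy algorithm's incidence window is also satisfied.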
When delivery codes for LBO pregnancy outcomes on the same admission date appear in more than one encounter type, the program applies an encounter type hierarchy (IP > ED > AV > OA) to assign a single encounter type.
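A minimal sketch of this hierarchy, assuming the encounter types observed for one patient and admission date have already been collected (the function name and input shape are hypothetical):

```python
from typing import Iterable

# Highest priority first: IP > ED > AV > OA.
ENCTYPE_PRIORITY = ("IP", "ED", "AV", "OA")


def assign_enctype(enctypes: Iterable[str]) -> str:
    """Return the highest-priority encounter type among those observed."""
    present = set(enctypes)
    for enctype in ENCTYPE_PRIORITY:
        if enctype in present:
            return enctype
    raise ValueError("no recognized encounter type among: %r" % sorted(present))
```

For example, delivery codes appearing in both an OA and an ED encounter on the same admission date would be assigned the ED encounter type.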
Infant records with at least one day of enrollment with medical coverage during the first three years of life are eligible for inclusion in the MIL table.
2.2.2 Sentinel SCDM Snapshot Program
The Sentinel SCDM Snapshot Program (SCDM Snapshot) queries approved SCDM tables to produce aggregate datasets with key metrics characterizing the Sentinel Distributed Database. This program is embedded in both the QA Package and QA MIL Package. The results of this program are output to the subdirectory scdm_snapshot found within the QA Package and QA MIL Package msoc directory.
Information on the datasets created by this package can be found in the chapters on the QA Package and QA MIL Package output (Chapters 8 and 14, respectively).
2.2.2.1 Method for spanning Enrollment records by PatID
The SCDM Snapshot uses different logic for spanning enrollment records by PatID than is used in the QA Package. Whereas the QA Package expects that enrollment periods separated by more than one day should not be bridged, the SCDM Snapshot collapses enrollment periods that share the same PatID, Enr_Start, Enr_End, MedCov, DrugCov, and Chart and are separated by a gap of 45 days or less.
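The effect of the two gap tolerances can be illustrated with a generic span-bridging routine. This is a sketch under stated assumptions, not the SCDM Snapshot or QA Package code: it operates on the spans of a single PatID that already share the same coverage values, and it assumes the gap is measured as the number of days from one period's Enr_End to the next period's Enr_Start. The function name is hypothetical.

```python
from datetime import date
from typing import List, Sequence, Tuple

Span = Tuple[date, date]  # (Enr_Start, Enr_End) for one PatID


def bridge_spans(spans: Sequence[Span], max_gap_days: int) -> List[Span]:
    """Collapse spans whose gap to the previous span is within the tolerance."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and (start - merged[-1][1]).days <= max_gap_days:
            # Gap is within tolerance: extend the previous span.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap exceeds tolerance (or this is the first span): start a new one.
            merged.append((start, end))
    return merged


spans = [
    (date(2021, 1, 1), date(2021, 1, 31)),
    (date(2021, 2, 1), date(2021, 2, 28)),    # starts the day after the prior end
    (date(2021, 3, 20), date(2021, 4, 30)),   # 20 days after the prior end
    (date(2021, 7, 1), date(2021, 12, 31)),   # 62 days after the prior end
]
qa_style = bridge_spans(spans, max_gap_days=1)
snapshot_style = bridge_spans(spans, max_gap_days=45)
```

Under these assumptions, the one-day tolerance yields three spans and the 45-day tolerance yields two, because only the 62-day gap exceeds 45 days.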
2.2.3 Sentinel Common Components
For more information on Common Components, please refer to Part 15.