QRP
ms_processinputfiles.sas File Reference

This macro processes the variables in the program's input files.

Detailed Description

This macro processes the variables in the program's input files.

  • Change infolder access to read.
  • Initialize helper variables to determine study left censor date.
  • Initialize abort check variables.
  • RiskscoreCodes and RiskScoreFile checks:
    • When requesting generalized riskscores, both RiskscoreCodes and RiskScoreFile input files must be provided.
    • Make sure the following do not occur:
      • Some risk scores in RISKSCOREFILE are not represented in the RISKSCORECODES file.
      • RISKFROMANCHOR or RISKTOANCHOR are blank and analysis type=4.
      • RISKFROMANCHOR or RISKTOANCHOR are not blank and analysis type is not 4.
      • RISKFROMANCHOR = EPISODEENDDT and RISKTOANCHOR=INDEXDT.
    • Check the following in RiskScoreCodes:
      • Make sure weight exists for every entry.
      • Make sure code is populated if condid is not IN (intercept).
      • Make sure codecat is a valid value.
      • If specified, make sure sex values are valid.
      • If specified, make sure age group values are valid.
  • Determine length of code/CaresettingPrincipal in inputfiles.
  • Determine length of DX, PX, and NDC code.
  • Diagnosis, Procedure, and Dispensing codes:
    • Assign type.
    • Cohortcodes:
      • Determine if pregnancy outcome codes will be evaluated from MIL table.
      • Strength and unit can only be set for RX codes and must be in list of valid values.
      • Strength and unit should both be populated if one is populated.
      • When strength and unit are populated, check the sameday value for dose calculations.
      • If T4HOIMETHOD=binary and if indexcriteria=FUT for any code, then abort.
      • Allow pregnancy outcome codes to be separated by a space. Create obs with single code.
      • Determine max length of mois to be used throughout QRP.
        • If MIL and ("LIVE" or "MIX") outcomes are requested in the same QRP run, then abort.
      • Ensure UNIT has the same value within group/stockgroup for type 2 and within group for type 5.
      • If cohortcodes contains cause of death defining codes (CODECAT=CD), CODETYPE must be set to 10.
    • InclusionCodes:
      • Do not set study start date if any condfrom exclusion window is '.'.
      • Strength and unit can only be set for RX codes.
      • Strength and unit should both be populated if one is populated.
      • When strength and unit are populated, check the sameday value for dose calculations.
      • If mincumdose is populated, then both strength and unit must be populated AND condfrom/condto cannot be missing.
      • If there is an aFDD or cFDD requirement, cannot use CODESUPPLY, strength/unit must be populated.
      • Ensure MinRXDays has same value within subcondlevel and only specified for codecat=RX.
      • Ensure UNIT has same value within subcondlevel.
      • Ensure mincfdd/maxcfdd has same value within subcondlevel.
      • Set MinRXDays to 1 where the above tests failed and continue processing.
      • Ensure only 1 codedays value per group condlevel subcondlevel.
      • Create numeric Condlevel and SubCondlevel variables for looping.
    • Append inclusion and exclusion codes to work.&COHORTCODES.
  • For Type 4, if not all 3 pregnancy lookup files are specified, then abort.
  • Pregnancy Outcomes lookup file:
    • If the same code is used for multiple outcomes in PREGNANCYCODES, then abort.
    • If there are duplicate codes found, then abort.
  • Type 4: Pregnancy Duration codes lookup file:
    • Read in pregnancy duration file, combine mpriority and cpriority variables.
    • If duplicate codes found, then abort.
  • Type 4: Pregnancy Metadata File:
    • Read in pregnancy metadata lookup file which contains pregnancy outcomes and incidence day values for applying washout.
      • As live birth outcomes can only be selected using (LIVE and MIX) or MIL, remove record and corresponding incidence day value of unused outcome if both are present in the file.
    • Sort by preg_outcomenum defensively as preg_outcomenum will be re-sequenced depending upon selected source of live birth records.
    • If MIL table is specified, check for ("LIVE" or "MIX") outcome and mark position of value within incidencedays.
    • If MIL table is not specified, check for a MIL outcome and mark position of value within incidencedays.
    • Loop through _incidencedays values and remove unused values where applicable.
    • Confirm incidencedays contains same number of incidence values as outcomes.
    • If outcomes not defined in both PREGNANCYCODES and PREGNANCYMETA, then abort.
    • If an outcome selected in COHORTCODES is not in the PREGNANCYMETA, then abort.
  • Type 4: Medication of interest/drug use codes and check for ENRDAYS.
  • Mother-Infant Linkage Analysis:
    • Create migroupnames dataset for use in t4cida summary table creation and PS/COV analysis.
      • Make sure expmp variable is not missing.
    • If T4HOIMETHOD = timetoevent, outcomepop can only be M, otherwise abort.
  • Monitoring File:
    • Limiting to periodid set by user.
    • Determine if query period is data driven (in the case of fixed risk window sequential analyses).
    • If query period is data driven, then monitoringfile processing will occur in ms_setdatadrivenqueryperiod.
    • If query period is not data driven and fixed across all scenarios:
      • Determine end of query period and assign last date of follow-up if censoring on query or DP max date.
        • Add QA DP Max Date to file.
        • Turn off CDPEND if using SURVEILLANCEMODE.
        • Assign index assessment period end date.
          • FUPDRIVEN - query period ends at the earliest of FUPENDATE/CDPEND.
          • Check to see if dp_maxdate is prior to startdate.
        • Assign censordate.
      • Take minimum StartDate and maximum EndDate.
      • Create non-overlapping looks dates.
      • Assign calendar based macro variables.
    • Data driven: select start of surveillance.
  • Surveillance mode processing:
    • Prior data existence for surveillance.
  • Prepare raw extraction files:
    • Adapt for different lengths to avoid warnings later.
    • Compute CondIdNum to handle gaps in CondId.
    • Select covariates codes (not combination of covariates).
    • To prevent warning when covariatecodes contains >11 length for CC codes, reset length of code.
    • Checking for the presence of optional lab variables.
    • Strength and unit can only be set for RX codes.
    • Strength and unit should both be populated if one is populated.
    • When strength and unit are populated, check the sameday value for dose calculations.
    • If there is an aFDD requirement, cannot use CODESUPPLY, strength/unit must be populated.
    • Do not set study start date if any covfrom window is '.'.
    • Codedays variable should be specified at the covariate level.
      • Run a check to ensure value is unique within covariate.
    • Covfrom and covto should be the same value for RX codes within a stockgroup.
    • Ensure MinRXDays has same value within covarnum.
    • Ensure MinRXDays only specified for codecat = RX.
    • Ensure UNIT has same value within covarnum.
    • Ensure lab code types are only numeric/character per covariate.
    • Check that codepop is consistently populated within each covariate for type 4.
      • Ensure codepop has same value within covariate.
    • Set MinRXDays to 1 where the above tests failed and continue processing.
    • List of covarnum-studyname for labels.
      • Add dummy covariates if covarnum are not sequential.
    • Count Covars and ComboCovars.
    • Confirm there are not any missing covarnums.
      • Covarnums must be consecutive without gaps.
    • Confirm valid covarnums for Combo Covars.
    • If there are combo covars that do not meet specification, then abort.
    • Check consistency with PROFILE parameter.
    • Check if some covariate windows are anchored on INDEXDT_EXP and if there are unexposed reference cohorts.
    • If COVARIATE windows are anchored to INDEXDT_EXP, then abort.
    • Combine COHORTCODES, RiskScoreCodes and CovarCodes codes to pull data from CDM only once.
      • DH codes always pulled in ms_cidanum, so a _dth dataset is not created.
    • Check for the presence of optional lab variables.
    • Rename some variables if the combo file is used.
    • List of encounters to extract in ms_cidanum.
  • Creating the number of groups to create cohorts:
    • To prevent unnecessary looping, separate count of groups with different age stratification.
    • Determine distinct number of enrollment/demographic/death cohorts.
    • Process basecohort for type 2:
      • Make sure groups with the same basecohort are grouped together, and assess their order to process them correctly.
      • Check whether variables are the same across same basecohort.
  • Stratifications:
    • ZipFile.
    • USERSTRATA.
      • Add year when month is requested.
      • Add year when quarter is requested.
      • Add agegroupnum to strata that include age group.
      • Sort levelvars applicable to denominators.
      • If MOINAME specified, check if cohortcodes has at least one MP.
      • Check if MICOHORTFILE exists when requesting t4cida table.
      • Determine if age group and/or geog vars are output.
      • If geographic strata requested, zipfile must also be specified.
      • Determine if censor table should be produced.
      • Type 4: determine if control cohort should be produced.
  • ITSFile:
    • Compute age group denominators.
    • Create lookup file for its levels to add potential missing strata and prevent duplicate computation by ms_cidadenom.
    • Separate t2cida levels from t2its. If the same levelvars is requested for both with different levelid, only keep t2cida for efficiency and assign mapping in lookup file.
    • Get all required levels.
    • Keep unique levelvars per table type (prevalent or not).
  • Never Exposed Cohort:
    • Utilization File and RiskScoreFile.
    • To prevent unnecessary looping for prevalent data, separate count of groups with different age stratification.
  • Secondary Episode Analysis:
    • Determine the highest adherenceid value across all analysisgrps.
    • List of primary episode groups and analysisgrps when prev cohort is requested.
    • For type 5:
      • Confirm dose categories are defined when t5dose or t5disp with cfdd_output_cat requested on userstrata.
      • Confirm the same number of categories are requested per group.
      • Check if cfdd is requested for at least one group.
    • Must meet both strength/unit and levelvars conditions.
  • Concomitant Episode Analysis.
  • Switchinclusion file:
    • Check to see if the age strata are the same within analysis group values.
  • Check if DRUGCLASSFILE is specified when MFUFILE is also specified and contains RX codes.
  • Determine if PS or Covariate Stratification analysis to be performed.
  • HDPS Integrity Check:
    • Get groups that are not compared with a predefined model.
    • Create dataset to id which group for HDPS.
    • Check if the same group is there more than once (then it is associated with more than one HDPS window).
    • If HDPS window is anchored to INDEXDT_EXP, reference cohort must be exposed.
  • Tree Lookup File:
    • If T4HOIMETHOD timetoevent and tree analysis both specified, then abort.
    • Check if groups specified do not exist in treefile.
    • Check if denominator is missing for PSMATCH/T3.
    • Check if weight is specified for PSSTRAT/IPTW.
    • Check if denominator is specified for PSSTRAT/IPTW.
    • Check if groups meet outcome anchoring criteria.
    • Check if poisson dataset is requested in USERSTRATA.
    • Outcome evaluation for infants (CODEPOP=I/MI) must begin on or after delivery date for type 4.
    • Identify groups for ps_match in treefile and check denominator values.
    • Poisson analysis:
      • Identify groups for ps_stratification in treefile and check denominator values.
    • PSMATCHFILE, STRATIFICATIONFILE or IPTWFILE required for SI.
    • Check if required conditions are filled if the query is not DATADRIVEN.
    • For type 4:
      • Check if denominator=person is specified in treefile for PSSTRAT/IPTW SI analyses and verify if ReqDaysAftPreg is sufficient.
  • Determine study start date for claim extraction - left censor date.
    • Take the maximum value of all possible lookback parameters to subtract from &STARTDATE.
    • If any parameter has an "any" lookback, do not set left censor date.
    • Check remaining parameters for "any" lookback if &NOSTUDYSTARTDATE.=N, then take minimum value.
    • Exclusion and Covariate lookbacks have already been checked.
    • When cohort includes MIL deliveries then linkagestatus cannot be blank.
    • Look for max lookback exclusion and covariate codes.
    • Compute studystartdate:
      • For types 1, 2, 3, 5, 6: determine maximum lookback date.
        • For type 6: select minimum product dates so that the product date is after studystartdate.
      • For type 4: query period binds the delivery date, however need to consider pregnancy duration.
        • If all condfromanchor values are set to episodeenddt, no need to add _maxduration.
  • Stockpilingfile:
    • Determine which dataset to use and set group/analysis group variable.
    • Expand stockpiling groupings to reference and exposure groups.
    • In the case of stockgroups only being in covariatecodes file, need to create group/analysisgrp stockpiling row.
    • Since there is no key variable to link this to, a cross join where combinations of all requested baseline tables and stockgroup will be created.
    • Remove duplicate stockgroup values prior to cross join.
    • De-duplicate rows to correctly assign stocknumber based on group.
    • Join stocknumber back onto original dataset.
    • Set defaults and merge in information from original stockpiling file.
    • De-duplicate rows to correctly assign stocknumber based on group.
    • Join stocknumber back onto original dataset.
    • List of stockgroups used for dose calculations.
  • CIDACOV pre-processing:
    • Rename group as primary to make treatmentpathways compliant with ms_cidacov analysisgrp logic.
    • Set createbaseline to Y when a baseline table is requested for any switch within an analysis group.
    • Confirm CREATEBASELINE = Y for all groups in level 2 and 3 analyses.
      • List all EOI/REF groups.
    • Mark any groups to update CREATEBASELINE and write warning to the log.
  • If strength or unit parameters do not meet requirements, then abort.
  • INCLUSIONCODES CONDFROMANCHOR and CONDTOANCHOR:
    • Joining on &COHORTFILE. to ignore potential MIL groups.
  • Perform additional checks on inputfile variables.
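The left censor date logic described in the list above (subtract the largest lookback from &STARTDATE., and set no date when any lookback is "any") can be illustrated with a minimal Python sketch. This is not the macro's SAS code: the function name `study_start_date`, the use of `None` to stand in for an "any" lookback, and the sample lookback values are assumptions made for illustration. The type 4 pregnancy duration and type 6 product date adjustments are left out.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def study_start_date(start_date: date,
                     lookbacks: Iterable[Optional[int]]) -> Optional[date]:
    """Return start_date minus the largest lookback, in days.

    A None entry is a hypothetical stand-in for an "any" lookback.
    If any entry is None, no left censor date is set and None is returned.
    """
    values = list(lookbacks)
    if any(v is None for v in values):
        return None  # unbounded lookback: do not set a left censor date
    if not values:
        return start_date  # nothing to look back over
    return start_date - timedelta(days=max(values))


# Hypothetical lookbacks (days) for washout, exclusion, and covariate windows.
bounded = study_start_date(date(2015, 1, 1), [183, 365, 90])
unbounded = study_start_date(date(2015, 1, 1), [183, None, 90])
```

With the sample values, `bounded` is one year before the start date because the 365-day window dominates, while `unbounded` is `None` because one window has no limit.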
Program inputs
  • dplprior.&RUNID._adjusted_eval(&PERIODIDSTART.-1) (Dataset for covariate stratification and propensity score estimations, ps stratum weighting analysis, inverse probability of treatment weights (IPTW) analysis, and matching strategies and time period.)
  • dplprior.&RUNID._mstr (Dataset with one record per individual per index date for every GROUP.)
  • dplprior.&RUNID._signature (Dataset containing metadata associated with the request, including request identifiers, program identifiers, database version, run time metrics, and SAS environment information.)
  • indata.&DISTABLE. (Dataset with dispensing data.)
  • indata.&LABTABLE. (Dataset with lab result data.)
  • indata.&PROCTABLE. (Dataset with procedure data.)
  • infolder.&COHORTCODES. (Dataset with codes used to define exposures and outcomes of interest.)
  • infolder.&COHORTFILE. (Dataset used to define enrollment and demographic requirements, type of cohort identification strategy.)
  • infolder.&COMBOFILE. (Dataset that specifies how combination items should be created.)
  • infolder.&COMBOFILE.codes (Dataset with codes used to specify combination items.)
  • infolder.&CONCFILE. (Dataset to specify GROUP values from a Type 2 analysis and perform additional analyses.)
  • infolder.&COVARIATECODES. (Dataset for request of covariates presence and analytic adjustment.)
  • infolder.&COVSTRATFILE. (Dataset describing parameters for a covariate stratification analysis.)
  • infolder.&INCLUSIONCODES. (Dataset with codes used to define additional cohort inclusion/exclusion criteria.)
  • infolder.&IPTWFILE. (Dataset specifying the parameters for an IPTW analysis.)
  • infolder.&ITSFILE. (Dataset with ITS analysis specifications.)
  • infolder.&MFUFILE. (Dataset for most frequent utilization (MFU) assessment.)
  • infolder.&MICOHORTFILE. (Dataset for Type 4 analysis with pregnancy cohorts among pregnancies matched to an infant specification.)
  • infolder.&MONITORINGFILE. (Dataset with monitoring period(s) for descriptive, inferential and sequential analyses.)
  • infolder.&MULTEVENTFILE. (Dataset to specify COHORTGRP values from a Type 2 analysis and perform additional analyses.)
  • infolder.&MULTEVENTFILE_ADHERE. (Dataset to specify multiple criteria to determine overall adherence for a Type 2 multiple events analysis.)
  • infolder.&OVERLAPFILE. (Dataset characterizing an overlap of primary and secondary treatment episodes during the observation window.)
  • infolder.&OVERLAPFILE_ADHERE. (Dataset to specify multiple criteria to determine overall adherence for a concomitant use analysis.)
  • infolder.&PREGNANCYCODES. (Dataset for Type 4 cohort identification strategy.)
  • infolder.&PREGNANCYDURATION. (Dataset for Type 4 cohort identification strategy.)
  • infolder.&PREGNANCYMETA. (Dataset for Type 4 cohort identification strategy.)
  • infolder.&PSCSSUBGROUPFILE. (Dataset with all subgroups and subgroup levels for each analysis group.)
  • infolder.&PSESTIMATIONFILE. (Dataset with the parameters for estimating a PS model and is required for PS-based analyses.)
  • infolder.&PSMATCHFILE. (Dataset with parameters for a PS matching analysis.)
  • infolder.&RISKSCORECODES. (Dataset for risk score calculation if requested.)
  • infolder.&RISKSCOREFILE. (Dataset required for calculating one or more risk scores.)
  • infolder.&STOCKPILINGFILE. (Dataset with valid dispensings selection used by the stockpiling algorithm to create exposure episodes.)
  • infolder.&STRATIFICATIONFILE. (Dataset with parameters for a PS stratification analysis and required for PS-based analyses.)
  • infolder.&TREATMENTPATHWAYS. (Dataset with identification and computation of switch pattern episodes.)
  • infolder.&TREEFILE. (Dataset with parameters required to execute multiple SI analyses from a basic QRP execution.)
  • infolder.&TREELOOKUP. (Dataset with hierarchical tree of codes that are eligible to be HOI.)
  • infolder.&TYPE1FILE. (Dataset required for a background rate calculation cohort identification strategy.)
  • infolder.&TYPE2FILE. (Dataset required for an exposures and follow-up time cohort identification strategy.)
  • infolder.&TYPE3FILE. (Dataset required for a self-controlled risk interval design cohort identification strategy.)
  • infolder.&TYPE4FILE. (Dataset required for a pregnancy episodes identification strategy.)
  • infolder.&TYPE5FILE. (Dataset required for medical product utilization cohort identification strategy.)
  • infolder.&TYPE6FILE. (Dataset required for evaluating manufacturer-level product utilization and switching patterns.)
  • infolder.&USERSTRATA. (Dataset with output tables and stratifications specification.)
  • infolder.&UTILFILE. (Dataset with medical or drug utilization metrics specification.)
  • infolder.&ZIPFILE. (Lookup table required if a request requires stratification of results by geographic location.)
Program outputs
  • &COHORTCODES. (Dataset that includes the codes for outcomes.)
  • &COHORTCODES._po (Dataset that includes the codes for pregnancy outcomes.)
  • &INCLUSIONCODES. (Dataset containing inclusion/exclusion codes for the cohort.)
  • &MONITORINGFILE. (Dataset that defines the relevant query periods for execution.)
  • &PREGNANCYCODES. (Dataset that includes the codes for pregnancy outcomes.)
  • &PREGNANCYDURATION. (Dataset for type4 with pregnancy duration codes.)
  • &PREGNANCYMETA. (Dataset for Type 4 analysis containing pregnancy outcome metadata.)
  • msoc.&RUNID._runtimes (Dataset containing metrics on total run time, cohort creation run time, and run time for several program processes.)
  • work._&COVARIATECODES. (Dataset containing covariates codes excluding combination of covariates.)
  • work._aniv (Dataset containing a list of age anniversaries to extract in ms_cidanum.)
  • work._checklink (Dataset of cohort(s) that do not have linkagestatus specified.)
  • work._cod (Dataset containing a list of causes of death to extract in ms_cidanum.)
  • work._codedays_check (Dataset containing multiple codedays values per group condlevel subcondlevel.)
  • work._cohortfile (Dataset with separate count of groups with different age strat.)
  • work._cohorttype (Dataset for never exposed cohort.)
  • work._diag (Dataset containing diagnosis codes to extract.)
  • work._dte (Dataset containing a list of death to extract in ms_cidanum.)
  • work._enc (Dataset containing a list of encounters to extract in ms_cidanum.)
  • work._incorrect_combo_covars (Dataset containing combo covars that do not meet specification.)
  • work._lab (Dataset containing lab result codes to extract.)
  • work._mil (Dataset containing a list of mother-infant linkage to extract in ms_cidanum.)
  • work._ndc (Dataset containing dispensing codes to extract.)
  • work._proc (Dataset containing procedure codes to extract.)
  • work._tmpdupcode (Dataset containing the same code for multiple outcomes in PREGNANCYCODES.)
  • work.adherencefile (Dataset to determine overall adherence for a Type 2 multiple events analysis.)
  • work.cohortfile (Dataset containing cohort group.)
  • work.cohortfile_for_enr (Dataset in which groups with the same basecohort are grouped together and ordered so they are processed correctly.)
  • work.covarcodes (Dataset with covariate code list.)
  • work.dummycovars (Dataset containing dummy covariates if covarnum are not sequential.)
  • work.hdps (Dataset containing combination items that behave like DX &DPLOCALLIB.&outname._diag and worktemp._diagextract.)
  • work.hdpssettings (Dataset to id which group for HDPS.)
  • work.labcodetype_check (Dataset containing lab code types that are used to define a single covariate.)
  • work.looks (Dataset containing non-overlapping looks dates.)
  • work.migroupnames (Dataset for PS/COV analysis.)
  • work.monitoringfile_agg (Dataset containing minimum startdate and maximum enddate.)
  • work.riskscorefile (Dataset containing unique risk scores.)
  • work.riskscorescodes (Dataset containing risk score codes for raw extraction.)
  • work.secondaryinputfile (Dataset for secondary episode analysis.)
  • work.stockpile_covar (Dataset containing variables to stockpile (where stockgroups are only in covariatecodes file).)
  • work.stockpile_noncovar (Dataset containing variables to stockpile.)
  • work.studynames (List of covarnum-studyname for labels.)
  • work.type2file (Dataset for never exposed cohort.)
  • work.type6groups_forbaseline (Dataset with identification and computation of switch pattern episodes sorted by analysisgrp and descending createbaseline.)
  • work.unit_check (Dataset containing subcondlevels with differing unit values.)
  • work.userstrata (Master list of all strata.)
  • work.userstrata_denom (Dataset containing unique levelvars per table type (prevalent or not).)
  • work.userstrata_its_lookup (Dataset containing levelvars per table type prevalent or not.)
  • work.utilfile (Dataset with medical or drug utilization metrics specification.)
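Several of the check datasets listed above (for example work.unit_check and work._codedays_check) record groups in which a value that should be constant takes more than one value. The following Python sketch shows that kind of within-group consistency check under stated assumptions. It is not the macro's SAS code: the function name `inconsistent_groups`, the row layout, and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Sequence, Tuple


def inconsistent_groups(rows: Iterable[Mapping[str, object]],
                        key_fields: Sequence[str],
                        value_field: str) -> Dict[Tuple, List]:
    """Map each key that has more than one distinct value to those values."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        seen[key].add(row[value_field])
    # Keep only the groups where the value is not constant.
    return {k: sorted(v, key=str) for k, v in seen.items() if len(v) > 1}


# Hypothetical inclusion-code rows: UNIT should be constant per subcondlevel.
rows = [
    {"group": "A", "subcondlevel": 1, "unit": "MG"},
    {"group": "A", "subcondlevel": 1, "unit": "ML"},
    {"group": "A", "subcondlevel": 2, "unit": "MG"},
]
bad = inconsistent_groups(rows, ("group", "subcondlevel"), "unit")
```

Here `bad` holds only the first subcondlevel of group A, since it is the one group with two different unit values. An empty result would mean the check passed.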

Usage

%ms_processinputfiles();
Parameters
None.

SAS Macros Dependencies

Author
Sentinel Coordinating Center (info@sentinelsystem.org)