Lecture 4: Data Sources and study planning Flashcards Preview

Flashcards in Lecture 4: Data Sources and study planning Deck (29):

Causal Contrast

Compares the disease risk in a group of exposed individuals with the disease risk that would have occurred if these individuals had not been exposed
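
The counterfactual risk (what would have happened to the same exposed people had they not been exposed) cannot be observed directly, so studies substitute the risk in an unexposed comparison group. A minimal sketch of the contrast as a risk difference and a risk ratio follows; the function name and all numbers are hypothetical and not from the lecture.

```python
def causal_contrast(risk_exposed, risk_if_unexposed):
    """Return (risk difference, risk ratio) for two risks between 0 and 1."""
    risk_difference = risk_exposed - risk_if_unexposed
    risk_ratio = risk_exposed / risk_if_unexposed
    return risk_difference, risk_ratio


# Hypothetical numbers: 30 cases among 1,000 exposed people, versus 10 cases
# expected had the same 1,000 people not been exposed (estimated in practice
# from an unexposed comparison group).
rd, rr = causal_contrast(30 / 1000, 10 / 1000)
print(f"risk difference = {rd:.3f}, risk ratio = {rr:.1f}")
```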


Problems with doing RCTs in pharmacoepi

1) Unethical to randomize patients to a drug strongly suspected of causing a serious adverse event
2) Lose real-world setting
3) Expensive and difficult to study rare outcomes or outcomes that require lengthy FU


Types of Data sources used in pharmacoepi

Primary data sources and secondary data sources


Primary Data source

data directly collected from study participants for the purposes of the study (informed consent required)


Secondary Data source

data collected from existing healthcare databases or medical records, where all the events have already occurred before the data are queried [not collected with the intention of doing a scientific study] (informed consent varies)


Types of primary data sources

1) Protocol-required assessments (clinical and/or lab measurements, e.g., blood pressure, depression scale, AND interview/questionnaire)
These assessments should be part of routine clinical care


Advantages of primary data sources

Data collection is tailored to study objectives, so you can focus on measuring confounders
Lab data are available
Indication for medication use is more explicit
Can obtain information from assessments that are needed for valid measurement but are not universally performed as standard of care
Can randomize subjects to treatment (e.g., LST)


Disadvantages of primary data sources

Expensive and time-intensive
Infeasible for studies requiring large sample sizes or long FU
Many operational considerations (getting informed consent, identification, initiation and management of study sites, data monitoring)


Types of Secondary data sources (unstructured data)

Data do not already exist in a structured database
Information from individual patient medical records must be abstracted and converted into structured data for study purposes


Types of secondary data sources (structured data)

Data already exist in a structured (coded) database with de-identified patients
Some sources are:
1) Administrative claims and non-claims databases for insurance companies, programs and health plans
2) EMR, health care registries and record linkage systems [population-based registries] (many in EU)
3) National health surveys, existing cohorts like Framingham


Types of secondary data sources (hybrid data)

Data exist in structured (coded) database, but are supplemented by unstructured data.
Ex. text fields (physician notes are reviewed, coded and added to the structured db)


Some examples of administrative databases

Kaiser Permanente, Veterans Affairs, PharMetrics, UnitedHealthcare


Some examples of EMR databases and registries

General practitioner db (THIN, GPRD), Healthcare registries (Sweden & Denmark)


Examples of coding systems

International Classification of Diseases (ICD) [ICD-9: coding diagnoses and procedures; ICD-10: cause of death], CPT, ATC...


Advantages of secondary data sources

1) Study can be done rapidly and inexpensively
2) Can be used for studies with large sample size or long FU
3) Reduced operational issues
4) Pharmacy information is more accurate than self-report and medical records
5) Data linkage with other dbs to obtain additional information (like death, cancer, etc.)
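
Data linkage can be pictured as a lookup on a shared patient identifier. The sketch below links pharmacy claims to a death registry; every record, field name, and the `link_deaths` helper are hypothetical, and real linkage often relies on probabilistic matching rather than a clean shared key.

```python
# Hypothetical de-identified pharmacy claims and a hypothetical death registry.
pharmacy_claims = [
    {"patient_id": "A1", "drug": "drug_x"},
    {"patient_id": "B2", "drug": "drug_y"},
    {"patient_id": "C3", "drug": "drug_x"},
]
death_registry = {"B2": "2015-03-01"}  # patient_id -> date of death


def link_deaths(claims, deaths):
    """Attach a death date to each claim; None means no registry match."""
    linked = []
    for claim in claims:
        row = dict(claim)  # copy so the source records stay untouched
        row["death_date"] = deaths.get(claim["patient_id"])
        linked.append(row)
    return linked


linked = link_deaths(pharmacy_claims, death_registry)
unlinked = [row["patient_id"] for row in linked if row["death_date"] is None]
```

A missing match is ambiguous: the patient may be alive, or the linkage may simply be incomplete, which is one of the bias sources listed for administrative dbs.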


Disadvantages of secondary data sources

1) Diagnoses may not be valid (ex. recording of rule-out diagnoses)
2) Data on important confounders are often unavailable
3) Data on over-the-counter and inpatient drug use are not available
4) Information is truncated if the db has high patient turnover


Potential sources of bias with administrative db

1) Low-SES people may not seek coverage and won't be represented in the db
2) Incomplete documentation of clinical status
3) Miscoding of drug, strength, dose
4) Incomplete record keeping
5) Miscoding of primary and secondary diagnoses
6) Incomplete linkage


Feasibility assessments

1) ensure scientific and operational integrity of study
2) ideal study is not wholly feasible
3) purpose is to characterize circumstances in which it is feasible to address research question and identify trade-offs between scientific and operational considerations
4) study objectives drive all aspects of feasibility assessment


Identifying key data elements

1) Identify exposure and outcome variables of interest, as well as potential confounders and effect modifiers (interaction)
2) Identify data elements that can measure these variables
3) List atypical and typical medications of interest
4) List cardiovascular outcomes of interest
5) List variables that contribute to confounding by indication
6) List variables that may modify the effect


Choosing between primary and secondary data collection

Rank data sources for capturing required data elements (sufficient number of patients, recording of lab data for valid measurement of outcome, routine conduct of clinical assessments for valid measurement of confounding diagnoses)
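
One way to picture the ranking is to score each candidate source by how many required data elements it captures. This is only an illustrative sketch: the source names, capability flags, and equal weighting are assumptions, not part of the lecture.

```python
# Required data elements, taken from the card above.
REQUIRED = ["enough_patients", "lab_data_recorded", "routine_clinical_assessments"]

# Hypothetical candidate sources and what each one captures.
candidates = {
    "Claims database (hypothetical)": {
        "enough_patients": True,
        "lab_data_recorded": False,
        "routine_clinical_assessments": False,
    },
    "EMR database (hypothetical)": {
        "enough_patients": True,
        "lab_data_recorded": True,
        "routine_clinical_assessments": False,
    },
    "Primary data collection (hypothetical)": {
        "enough_patients": False,
        "lab_data_recorded": True,
        "routine_clinical_assessments": True,
    },
}


def rank_sources(sources, required):
    """Return (score, name) pairs, best first; ties are broken by name."""
    scored = [
        (sum(bool(caps.get(element, False)) for element in required), name)
        for name, caps in sources.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


ranking = rank_sources(candidates, REQUIRED)
```

Two sources tie here because each misses a different element, which is the kind of scientific versus operational trade-off a feasibility assessment has to weigh.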


Study Design (Subjects selected according to exposure)

Cohort study


Study Design (Subjects selected according to outcome)

Case-control study

Study Design (Subjects selected according to neither exposure nor outcome)

Cross-sectional study

Study Design (Subjects randomized to exposure with observational FU)

Large simple trial (LST)


Descriptive studies

Drug utilization study
Safety surveillance study


Analytic studies

Groups are statistically compared to address pre-specified hypothesis


Sample size estimation

Analytic studies: depends on design, prevalence of exposure/outcome (E/O), effect size, etc.
Descriptive studies: hypothesis-based sample size estimation is not relevant, but estimating the sample size required for a certain level of precision may be appropriate
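
Both cases can be sketched with standard normal-approximation formulas. These are textbook formulas swapped in for illustration, not formulas given in the lecture; the function names and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_precision(expected_proportion, half_width, confidence=0.95):
    """Descriptive study: n needed so the CI half-width is about half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    return math.ceil(z**2 * p * (1 - p) / half_width**2)


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Analytic study: n per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical inputs: estimate a 10% prevalence to within +/- 2 points,
# and detect an outcome risk of 2% versus 1% with 80% power.
n_descriptive = n_for_precision(0.10, 0.02)
n_analytic = n_per_group_two_proportions(0.01, 0.02)
```

The analytic calculation shows why rare outcomes push studies toward large secondary data sources: halving the difference between the two risks roughly quadruples the required sample size.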


Other operational considerations (secondary data sources)

IRB requirements (informed consent typically not required)
Time/funding needed for validation and pilot studies
Are medical records accessible, can sites identify eligible patients, will data abstraction be done by site or study staff?


Operational conditions (primary data collection)

1) Protocol-level assessment: identify target countries, sites (academic vs community), data sources, and IRBs. Informs scientific aspects of the study protocol
2) Site-level assessment: evaluate potential sites for interest in, and capability for, study participation (generate a list of potential sites and survey them)