Partners Healthcare - REPEAT

www.repeatinitiative.org

Disclosures

This work was funded by the Laura and John Arnold Foundation.
At the time that this work was conducted, Dr. Wang was principal investigator on other grants from:
  • Agency for Healthcare Research and Quality
  • National Institute on Aging
  • Laura and John Arnold Foundation
  • FDA Sentinel Initiative
  • Investigator-initiated grants to Brigham and Women's Hospital from Novartis, J & J, and Boehringer Ingelheim for unrelated work
She is a consultant to Aetion Inc. for unrelated work.

Objective

To increase the confidence of decision makers in using evidence from healthcare databases by producing empirically based recommendations on how to transparently report on study implementation and achieve reproducible and robust findings.

Aim 1. To quantify the current state of healthcare database study reproducibility via direct replication

1. Systematic search using Google Scholar: top h-5 clinical and epidemiology journals; published after Jan 1, 2011; cohort + claims + database name.
2. Apply exclusion criteria (CONSORT-style diagram): include descriptive and comparative safety/effectiveness cohort studies; exclude if data source mismatch, PDF unavailable, methods study, etc. Random sample of 250 studies.
3. Evaluate transparency considering all publicly available information: standardized extraction form based on the ISPE/ISPOR catalogue; measure/describe how often specific parameter decisions were unclear.
4. Replicate 150 studies, 80% comparative (blind to original results): metrics to quantify replicability include absolute difference (Abs. Diff), standardized difference (Std. Diff), calibration, etc.
5. Contact original authors to discuss assumptions and understand differences.
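The replicability metrics named under Aim 1 (absolute difference, standardized difference) can be illustrated with a minimal sketch. The slides do not give formulas, so the pooled-variance form of the standardized difference for proportions is an assumption here, and the function names and example prevalences are hypothetical.

```python
import math


def abs_diff(p_original, p_replication):
    """Absolute difference between two proportions, in percentage points."""
    return abs(p_original - p_replication) * 100.0


def std_diff(p_original, p_replication):
    """Standardized difference between two proportions.

    Assumption: the common pooled-variance form is used; the slides
    do not state which formula the REPEAT team applies.
    """
    pooled_var = (
        p_original * (1 - p_original) + p_replication * (1 - p_replication)
    ) / 2.0
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1; undefined unless they are equal.
        return 0.0 if p_original == p_replication else math.inf
    return (p_original - p_replication) / math.sqrt(pooled_var)


def share_within(pairs, threshold_points):
    """Fraction of (original, replication) pairs within a percentage-point threshold."""
    within = sum(1 for o, r in pairs if abs_diff(o, r) <= threshold_points)
    return within / len(pairs)


# Hypothetical prevalences of four baseline characteristics: (original, replication).
pairs = [(0.40, 0.42), (0.10, 0.25), (0.75, 0.30), (0.55, 0.50)]

print(share_within(pairs, 10))  # share of characteristics within 10 percentage points
print(std_diff(0.40, 0.42))     # standardized difference for the first characteristic
```

A summary such as `share_within(pairs, 10)` is the kind of figure reported as "within 10% points" for replicated baseline characteristics.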

Aim 2. To evaluate the robustness of evidence currently found in healthcare database studies

1. Identify random sample of 50 comparative studies: closely replicated; noted design/analysis issue; implementation parameters intended question?
2. Conduct numerous sensitivity analyses: plausible alternative parameters; address design/analysis issues; assay sensitivity, e.g. negative control outcomes.
3. Conduct external adjustment under varying assumptions: quantitative bias adjustment (misclassification); residual confounding.
4. Evaluate robustness of evidence: vibration ratio; involve original investigators.

[Figure: Null; Original; Robustness checks]
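The "vibration ratio" named under Aim 2 is not defined on the slides. As a rough sketch, it is taken here to be the ratio of the largest to the smallest effect estimate across the original analysis and its robustness checks; percentile-based variants also exist. That definition, the function names, and the hazard ratios below are all assumptions for illustration.

```python
def vibration_ratio(estimates):
    """Ratio of the largest to the smallest ratio-scale effect estimate.

    Assumption: max/min is used; the slides do not specify the definition.
    """
    if not estimates or any(e <= 0 for e in estimates):
        raise ValueError("estimates must be a non-empty list of positive ratios")
    return max(estimates) / min(estimates)


def crosses_null(estimates, null=1.0):
    """True if the estimates fall on both sides of the null value."""
    return min(estimates) < null < max(estimates)


# Hypothetical hazard ratios: one original analysis plus four robustness checks.
hazard_ratios = [1.20, 1.05, 1.32, 0.96, 1.10]

print(vibration_ratio(hazard_ratios))  # spread of estimates across specifications
print(crosses_null(hazard_ratios))     # whether any check lands on the other side of null
```

A ratio close to 1 would suggest that plausible alternative parameters barely move the estimate; a large ratio, or estimates on both sides of the null, would flag fragile evidence.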

Current progress

Random sample of peer-reviewed, published database studies. Stages tracked: transparency evaluation (of 250), replication (of 150), robustness (of 50), and author contacts (of 150; only 10 attempted contacts).
[Figure: progress bar for each stage]

INTERIM RESULTS: Relative sample size of replication versus original (Nreplication/Noriginal)

No codes, unclear temporality for exposure and exclusion criteria: replication team made many assumptions.
Average relative sample size close, but many up to 2x as large or half the size.
[Figure: relative sample size by study, comparative and descriptive; regions labeled Nreplication larger, same sample size, Noriginal larger]

INTERIM RESULTS: Difference in baseline characteristics* of cohort (% original - % replication)

86% of baseline characteristics were within 10% points.
95% of baseline characteristics were within 25% points.
5% of baseline characteristics differed by more than 25% points.
* binary/categorical
[Figure: difference by study ID, comparative and descriptive; reference line at no difference; covariate codes reported versus not reported]

INTERIM RESULTS: Why did the replication differ so much from the original for some baseline characteristics?

Authors provided citation to comorbidity score.
All patients in replication had score ≥ 2 because tumor/malignancy was part of inclusion.
> 75% in original had score = 0.

INTERIM RESULTS: Calibration of effect estimates* for original versus replication

Estimates follow diagonal (replication = original).
* Hazard, odds, risk ratio
[Figure: original versus replication effect estimates]
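A point on the calibration diagonal corresponds to log(original) - log(replication) = 0, so distance from the diagonal can be summarized on the log scale. The sketch below is one way to do that, not the REPEAT team's code. The first two pairs are the hazard ratios for bleeding quoted in the deck (1.9 vs 3.4 and 1.2 vs 0.8); the other two pairs, the function names, and the 0.1 tolerance default are assumptions for illustration.

```python
import math


def log_difference(original, replication):
    """Difference in ratio-scale effect estimates on the log scale."""
    return math.log(original) - math.log(replication)


def same_side_of_null(original, replication, null=1.0):
    """True if both point estimates lie strictly on the same side of the null."""
    return (original - null) * (replication - null) > 0


def summarize(pairs, tolerance=0.1):
    """Summarize agreement for a list of (original, replication) estimates."""
    diffs = [log_difference(o, r) for o, r in pairs]
    n = len(pairs)
    return {
        "mean_log_diff": sum(diffs) / n,
        "min_log_diff": min(diffs),
        "max_log_diff": max(diffs),
        "share_within_tolerance": sum(abs(d) <= tolerance for d in diffs) / n,
        "share_same_side": sum(same_side_of_null(o, r) for o, r in pairs) / n,
    }


# First two pairs are quoted in the deck; the last two are hypothetical.
pairs = [(1.9, 3.4), (1.2, 0.8), (0.70, 0.72), (1.50, 1.45)]

for name, value in summarize(pairs).items():
    print(f"{name}: {value:.2f}")
```

Working on the log scale treats a doubling and a halving of a ratio estimate as equally large departures from the diagonal.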

INTERIM RESULTS: Effect estimate agreement between original and replication

Same side of null?
84% of effect estimates* were on the same side of null; 16% were not.
52% of effect estimates and confidence intervals were on the same side of null.
[Figure: original (O) versus replication (R) estimates; quadrants: both above null; O above null, R below null; both below null; O below null, R above null]

Difference in effect estimate, log(original) - log(replication):
  • Mean: 0.0
  • 29% within 0.1
  • Range: -0.6, 0.4
* Hazard, odds, risk ratio

INTERIM RESULTS: Why is the replication estimate substantially larger?

Hazard ratio for bleeding. Original: 1.9. Replication: 3.4.
Notes from replication team:
  • Discrepancies between exclusions in manuscript text versus attrition table.
  • Made assumptions regarding algorithms for exclusion, covariates. Codes? Care setting? Dx position? Day 0 in assessment window?
  • Outcome algorithm was in appendix.
Sample size and characteristics:
  • Replication cohort was 10% larger.
  • Most replicated baseline characteristics within 10% points.
  • Reported outcome rate in original and replication very different (P vs S?).

INTERIM RESULTS: Why are the effect estimates on opposite sides of null?

Hazard ratio for bleeding. Original: 1.2 (95% CI excludes null). Replication: 0.8 (95% CI includes null).
Notes from replication team:
  • Assumptions regarding algorithms for exclusion, covariates. Codes? Care setting? Dx position? Day 0 in assessment window?
  • Outcome algorithm provided.
  • Assumptions about follow up: censoring criteria, exposure stockpiling, bridging, extension.
Sample size and characteristics:
  • Replication cohort was 30% larger.
  • Over half of baseline characteristics differed by more than 10% points.

Work in progress

Empirical evaluation
  • Describe frequency of reporting, impact of transparency of specific study parameters.
  • Prioritize reporting on parameters with demonstrable influence on replicability or robustness.
  • Hard to replicate analysis results if unable to replicate base cohort.
  • Majority of internal debate over vague prose on temporality (slower timeline for replication).
  • Exclusion criteria not detailed, selection of study entry date before or after applying exclusions.
  • How much do assumptions matter? Context dependent, robustness next.

Shared terminology and structured reporting templates
  • Simplify reporting: terminology used for the same concepts varies.
  • Visualization of study design implementation.
  • Reporting on research using unstructured data.

REPEAT Core Team (alphabetical)
7 groups working in parallel on different studies (1+ faculty, 2+ research staff)

Adrian Ortiz Santiago BS; Ajinkya Pawar PhD MS; Elisabetta Patorno MD DrPH; Elizabeth M. Garry PhD MPH; Emma Payne BS; Jessica Franklin PhD; Joshua Gagne PharmD ScD; Krista Huybrechts PhD MS; Kristina Stefanini BA; Lily Bessette BS; Mimi Zakarian BS; Monica L. Gierrada MPH; Mufaddal Mahresi MD MPH; Nileesa Gautam BS; Sebastian Schneeweiss MD ScD; Shirley V Wang PhD ScM; Sushama Kattinakere MBBS MSPH; Yinzhu Jin MS MPH

Scientific Advisory Board (alphabetical)
Regulators, HTA, delivery systems, patients, payers, industry, journals, research societies

Jeffrey Brown PhD; Alison Bourke MSc FRPharm.S; Amr Makady PharmD PhD; Andrew Bate PhD; Brian Bradbury DSc; Brian Nosek PhD; Christine Laine MD MPH FACP; David Martin MD MPH; Deborah Zarin MD; Dick Willke PhD; Dorothee Bartels MSc PhD; Elizabeth Loder MD MPH; Frank de Vries PharmD PhD; Hans-Georg Eichler MD MSc; Henrik Toft Sørensen MD PhD; Javier Jimenez MD MPH; Jesper Hallas MD PhD; Joanne Waldstreicher MD; John Ioannidis MD DSc; John Seeger PharmD DrPH; K. Arnold Chan MD ScD; Karen Burnett MBA MS; Kris Kahler PhD; Laura Happe PharmD MPH; Liam Smeeth PhD; Lisa Freeman; Michael Nguyen MD; Nam-Kyong Choi B. Pharm PhD; Páll Jónsson PhD MRes; Peter Arlett BSc MBBS MRCP FFPM; Peter Tugwell MSc MD FRCPC; Richard Platt MD MSc; Sarah Priddy Alwardt PhD; Sean Hennessy PharmD PhD; Troyen Brennan MD; Will Shrank MD; Wolfgang Winkelmayer MD MPH ScD FASN; Yoshiaki Uyama PhD
