EPI-820 Evidence-Based Medicine
LECTURE 4: DIAGNOSIS II
Mat Reeves BVSc, PhD

Objectives:
1. Understand the derivation and use of Bayes' Theorem.
2. Understand "Fundamental Medical Fact #1".
3. Define the likelihood ratio (LR) and calculate LR+ and LR- from a 2 x 2 table.
4. Understand the odds-likelihood form of Bayes' Theorem.
5. Understand what clinical conditions maximize the value of test results.
6. Understand the biases and limitations of published test performance measures.

I. Bayes' Theorem - Its Derivation and Use

Bayes' theorem = a unifying methodology, based on conditional probabilities, for interpreting clinical test results.

$$\mathrm{PVP} = \frac{Se \cdot Prev}{Se \cdot Prev + (1 - Sp) \cdot (1 - Prev)}$$

$$\mathrm{PVN} = \frac{Sp \cdot (1 - Prev)}{Sp \cdot (1 - Prev) + (1 - Se) \cdot Prev}$$

A. Derivation of Bayes' Equations

1) 2 x 2 Table Approach:

                        DISEASE
            PRESENT (D+)        ABSENT (D-)
Test +      Se . Prev           (1 - Sp) . (1 - Prev)
Test -      (1 - Se) . Prev     Sp . (1 - Prev)
Total       Prev                1 - Prev

Reading across the Test + row gives PVP, and across the Test - row gives PVN:

$$\mathrm{PVP} = \frac{Se \cdot Prev}{Se \cdot Prev + (1 - Sp) \cdot (1 - Prev)} \qquad \mathrm{PVN} = \frac{Sp \cdot (1 - Prev)}{Sp \cdot (1 - Prev) + (1 - Se) \cdot Prev}$$

2) First Principles Approach Using Joint and Conditional Probabilities:

Example - PVP:

Step 1: Specify the prior probability of disease, P(D+), and of non-disease, P(D-).

Step 2: Calculate the joint probability of disease and a positive test result:

$$P(T^{+}, D^{+}) = P(T^{+} \mid D^{+}) \cdot P(D^{+})$$

N.B. This equation is derived from the definition of joint probability.
= cell a, or Se multiplied by prevalence.

Step 3: Calculate the joint probability of non-disease and a positive test result:

$$P(T^{+}, D^{-}) = P(T^{+} \mid D^{-}) \cdot P(D^{-})$$

= cell b, or the FP rate multiplied by the probability of non-disease.

Step 4: Calculate the probability of a positive test result, i.e., P(T+):

$$P(T^{+}) = P(T^{+}, D^{+}) + P(T^{+}, D^{-})$$

= the sum of cells a and b. Substituting the formulas from Steps 2 and 3 we get:

$$P(T^{+}) = P(T^{+} \mid D^{+}) \cdot P(D^{+}) + P(T^{+} \mid D^{-}) \cdot P(D^{-})$$

Step 5: Calculate the probability of disease given a positive test result, i.e., PVP:

$$P(D^{+} \mid T^{+}) = \frac{P(T^{+}, D^{+})}{P(T^{+})}$$

= cell a divided by the sum of cells a and b (derived from the formal definition of joint probability in Step 2).

So:

$$P(D^{+} \mid T^{+}) = \frac{P(T^{+} \mid D^{+}) \cdot P(D^{+})}{P(T^{+} \mid D^{+}) \cdot P(D^{+}) + P(T^{+} \mid D^{-}) \cdot P(D^{-})}$$

$$\mathrm{PVP} = \frac{Se \cdot Prev}{Se \cdot Prev + (1 - Sp) \cdot (1 - Prev)}$$

B. Example of the Use of Bayes' Theorem

IP & I-125 FS for DVT (Se = 90%, Sp = 95%), prevalence of disease = 15%:

$$\mathrm{PVP} = \frac{0.90 \cdot 0.15}{0.90 \cdot 0.15 + (1 - 0.95) \cdot (1 - 0.15)} = \frac{0.135}{0.135 + 0.0425} = 0.76 \text{ or } 76\%$$

$$\mathrm{PVN} = \frac{0.95 \cdot (1 - 0.15)}{0.95 \cdot (1 - 0.15) + (1 - 0.90) \cdot 0.15} = \frac{0.8075}{0.8075 + 0.015} = 0.982 \text{ or } 98.2\%$$

- Consistent with Fig. 9 in Lecture 3 (allowing for rounding of PVP).
- Are the equations cumbersome? Difficult to remember?
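For readers who find the equations cumbersome, the same calculation can be sketched in a few lines of Python. This is only an illustration using the stated DVT values (Se = 90%, Sp = 95%, prevalence = 15%); the function name `predictive_values` is made up for this sketch.

```python
def predictive_values(se, sp, prev):
    """Return (PVP, PVN) from sensitivity, specificity and prevalence."""
    true_pos = se * prev                # cell a: Se . Prev
    false_pos = (1 - sp) * (1 - prev)   # cell b: (1 - Sp) . (1 - Prev)
    false_neg = (1 - se) * prev         # cell c: (1 - Se) . Prev
    true_neg = sp * (1 - prev)          # cell d: Sp . (1 - Prev)
    pvp = true_pos / (true_pos + false_pos)
    pvn = true_neg / (true_neg + false_neg)
    return pvp, pvn


# IP & I-125 FS for DVT, values as stated in the example
pvp, pvn = predictive_values(se=0.90, sp=0.95, prev=0.15)
print(f"PVP = {pvp:.3f}, PVN = {pvn:.3f}")  # PVP = 0.761, PVN = 0.982
```

Keeping the four cells as named intermediate values mirrors the 2 x 2 table derivation, so each line can be checked against the table by eye.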

Prevalence

The influence of prevalence on the interpretation of predictive values, and on the whole framework of clinical practice, cannot be overstated.

Prevalence represents the clinician's best guess or opinion (expressed as a probability) prior to ordering an actual test.

Table 4.1 Relationship Between History of Chest Pain and Prevalence of Coronary Artery Disease (CAD) (Weiner et al., NEJM, 1979).

                                  Prevalence of CAD
Type of History             All men     Men with abnormal ECG
Typical angina              0.89        0.96 (gain = 0.07)
Atypical angina             0.60        0.87 (gain = 0.27)
Non-anginal chest pain      0.22        0.39 (gain = 0.17)

Conclusions:
a) The probability of CAD depends on the patient's history.
b) The probability of CAD after an abnormal test depends on the patient's history.
c) The value of the test information also depends on the patient's history.

Fundamental medical fact #1:
The interpretation of test results depends on the probability of disease before the test was run (= prior probability or prior belief).

The Prior Probability of Disease:
- represents what the clinician believes (prior belief or clinical suspicion)
- is set by considering the practice environment, patient's history, physical examination findings, experience, judgment, etc.
- is constantly revised in light of new information (= Bayes' theorem).

[Fig 4.1 Use of Bayes' Theorem to Revise Disease Estimates in Light of New Test Information. Scale: probability of disease (0, 0.5, 1.0). Labels: after a negative test, before the test, after a positive test; Bayes' formula is used to move between them.]

[Figure 4.3 Advantages of a Pre-test Probability of 40-60% (Test Has Se = 75%, Sp = 85%, LR+ = 5 and LR- = 0.3). Y-axis: post-test probability (0 to 1). X-axis: prevalence or prior probability (0 to 1). Curves: PV+ = P(D+|T+) and (1 - PV-) = P(D+|T-), with the maximum gain marked.]

II. Generalizability of Published Test Performance Measures (Se, Sp, LRs)

- Test performance measures are frequently assumed to be intrinsic characteristics of diagnostic tests, independent of the underlying prevalence.
- It is assumed that Se and Sp (or LRs) are fixed and that valid post-test probabilities can be computed by simply varying the prior probability.
- Problems?
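Before turning to the problems, it may help to see what "simply varying the prior probability" looks like in practice. The sketch below holds Se and Sp fixed at the Figure 4.3 values (Se = 75%, Sp = 85%) and sweeps the prior from 1% to 99%. Treating the vertical distance between the two post-test curves as the "gain" is an assumption made for this illustration, and the function and variable names are hypothetical.

```python
def post_test_probs(prior, se, sp):
    """Return (P(D+|T+), P(D+|T-)) for one prior, assuming fixed Se and Sp."""
    p_test_pos = se * prior + (1 - sp) * (1 - prior)   # P(T+)
    p_test_neg = (1 - se) * prior + sp * (1 - prior)   # P(T-)
    return se * prior / p_test_pos, (1 - se) * prior / p_test_neg


SE, SP = 0.75, 0.85  # test characteristics quoted for Figure 4.3

rows = []
for i in range(1, 100):
    prior = i / 100
    after_pos, after_neg = post_test_probs(prior, SE, SP)
    # "spread" = distance between the two curves (assumed meaning of gain)
    rows.append((prior, after_pos, after_neg, after_pos - after_neg))

# Print every 20th row, starting at prior = 0.10
for prior, after_pos, after_neg, spread in rows[9::20]:
    print(f"prior={prior:.2f}  P(D+|T+)={after_pos:.2f}  "
          f"P(D+|T-)={after_neg:.2f}  spread={spread:.2f}")

best = max(rows, key=lambda row: row[3])
print(f"largest spread at prior = {best[0]:.2f}")
```

Under these assumptions the largest spread falls inside the 40-60% band named in the figure title; a different definition of gain (for example PV+ minus the prior) would peak elsewhere.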

Potential Problems

#1. Disease severity affects diagnostic test performance: the more severe the disease, the higher the sensitivity (easier to detect).
- Test performance depends on the spectrum of disease severity in the source (test) population.

#2. Test characteristics are in fact dynamic. They can therefore:
- change due to alterations in underlying prevalence.
- be different among subgroups defined by age, gender, etc.

#3. Published Se and Sp estimates can be highly biased (Ransohoff and Feinstein, 1986).
- Spectrum bias = the difference in both the spectrum and severity of disease between the study population and the clinically relevant population.

A. Selection Bias During Phase I Evaluations
- Initial evaluation of a new diagnostic test is typically undertaken at referral centers.
- Determine if the test will be positive among patients with severe disease, i.e., the sickest of the sick. Result: Se is overestimated (advanced disease is easy to detect).
- Determine if the test is usually negative in normal (healthy, young) volunteers, i.e., the wellest of the well. Result: Sp is overestimated (population unlikely to have diseases that cause FP results).
- Sometimes the test is applied only to the sickest of the sick, with the result that Sp is underestimated (FP results are inflated, because very sick patients have conditions that tend to make the index test positive).

B. Selection Bias During Phase II Evaluations (Test-Referral Bias)
- Test-referral bias occurs when the index test itself is used as a criterion to select which patients receive the definitive (gold standard) diagnostic procedure.
- Test-negative subjects don't go on to get the gold standard, which results in:
  - over-estimation of Se (number of FN underestimated)
  - under-estimation of Sp (number of TN underestimated).

Test-referral bias: Example (from Cox)
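The arithmetic behind test-referral bias can be sketched as follows. The sketch assumes index-population rates of TPR = 0.6 and FPR = 0.3 (one reading of the example's values) and assumes that referral for the gold standard depends only on the index test result. The verification fractions (100% of test-positives, 35% of test-negatives) are hypothetical numbers chosen for illustration, and `apparent_rates` is an illustrative name, not something taken from Cox.

```python
def apparent_rates(tpr, fpr, verify_pos, verify_neg):
    """Apparent TPR and FPR among subjects who received the gold standard.

    tpr, fpr    : true rates in the index population
    verify_pos  : fraction of test-positive subjects who are verified
    verify_neg  : fraction of test-negative subjects who are verified
    """
    # Diseased subjects reaching the study population, by test result
    diseased_pos = tpr * verify_pos
    diseased_neg = (1 - tpr) * verify_neg        # FN are under-represented
    # Non-diseased subjects reaching the study population, by test result
    healthy_pos = fpr * verify_pos
    healthy_neg = (1 - fpr) * verify_neg         # TN are under-represented
    apparent_tpr = diseased_pos / (diseased_pos + diseased_neg)
    apparent_fpr = healthy_pos / (healthy_pos + healthy_neg)
    return apparent_tpr, apparent_fpr


# Hypothetical referral pattern: all T+ verified, 35% of T- verified
tpr_obs, fpr_obs = apparent_rates(tpr=0.6, fpr=0.3,
                                  verify_pos=1.0, verify_neg=0.35)
print(f"apparent TPR = {tpr_obs:.2f}, apparent FPR = {fpr_obs:.2f}")
# apparent TPR = 0.81, apparent FPR = 0.55
```

Because fewer test-negative subjects are verified, both FN and TN shrink, so apparent Se rises and apparent Sp (1 - FPR) falls, which is the direction of bias described above.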

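[Figure: Test-referral bias example. Panels show the index population and the study population, each split by disease status into T+ and T-. Rates shown: TPR = 0.6 and 0.8; FPR = 0.3 and 0.55. The higher values (TPR = 0.8, FPR = 0.55) are consistent with the study population, where Se is over-estimated and Sp under-estimated.]

Net Result of Spectrum Bias:
- Se is over-estimated.
- Sp?? Depends on the relative balance of the possible biases.

Summary: Published Se/Sp/LR Values and Bayes' Theorem: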

- Published estimates of Se/Sp or LRs should be considered average values for a particular (sub)population.
- There's a great deal to be learnt from mastering Bayes' Theorem, but it's not without potential pitfalls and errors. Be careful!

III. The Likelihood Ratio (LR) - Definition

- An alternative way of describing diagnostic test performance.
- For dichotomous test results it summarizes exactly the same information as Se/Sp.
- The LR for a particular value of a diagnostic test is defined as the probability of observing the test result (X) in the presence of disease divided by the probability of observing the test result in the absence of disease.

$$\mathrm{LR}(X) = \frac{P(X \mid D^{+})}{P(X \mid D^{-})}$$

- It is the odds that a given test result (X) occurs in a diseased individual compared to a non-diseased individual.

Figure 4.2 The Likelihood Ratio for a Dichotomous Test (Positive or Negative)

Example: IP and I-125 FS testing.

                                      DVT
TEST                        PRESENT (D+)    ABSENT (D-)
Either or both POS (T+)     a = 103         b = 8
Both NEG (T-)               c = 11          d = 152

$$\mathrm{LR}^{+} = \frac{a/(a+c)}{b/(b+d)} \qquad \mathrm{LR}^{-} = \frac{c/(a+c)}{d/(b+d)}$$

LR+ = 18, LR- = 0.105 (using the rounded values Se = 0.90 and Sp = 0.95)
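As a quick check, the LRs can be computed directly from the four cell counts. This is a minimal sketch; `likelihood_ratios` is an illustrative name and the cell labels a, b, c, d follow the 2 x 2 layout above.

```python
def likelihood_ratios(a, b, c, d):
    """Return (LR+, LR-) from 2 x 2 cell counts.

    a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-
    """
    tpr = a / (a + c)   # sensitivity
    fpr = b / (b + d)   # 1 - specificity
    fnr = c / (a + c)   # 1 - sensitivity
    tnr = d / (b + d)   # specificity
    return tpr / fpr, fnr / tnr


# IP and I-125 FS testing for DVT
lr_pos, lr_neg = likelihood_ratios(a=103, b=8, c=11, d=152)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")  # LR+ = 18.1, LR- = 0.102
```

The unrounded counts give Se = 103/114 (about 0.904), so the results differ slightly from the quoted figures; with the rounded values Se = 0.90 and Sp = 0.95 the same formulas give 18.0 and 0.105.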

Interpretation of LRs

- LR+ indicates that a positive result (IP and/or I-125 FS) is 18 times more likely to occur in the presence of DVT than in the absence of it.
- LR- indicates that a negative result is 0.105 times as likely to occur in the presence of DVT as in the absence of it (or, for every negative result in a patient with DVT, expect about 9.5 negative results in patients without DVT, since 1/0.105 = 9.5).

Note, for dichotomous tests:

$$\mathrm{LR}^{+} = \frac{\text{Sensitivity}}{1 - \text{Specificity}} \quad \text{or} \quad \frac{\text{True-positive rate}}{\text{False-positive rate}}$$

$$\mathrm{LR}^{-} = \frac{1 - \text{Sensitivity}}{\text{Specificity}} \quad \text{or} \quad \frac{\text{False-negative rate}}{\text{True-negative rate}}$$

Also:
- ROC curve = Se versus 1 - Sp
- LR+ = the slope of the ROC curve.

Table 4.2 Likelihood Ratios for Common Tests

Disease/Condition               Test and Result                     LR
Alcohol dependency              CAGE questions:
                                  Yes to 3 or more                  250
                                  Yes to any 2                      7
                                  Yes to any 1                      1.3
                                  No to all 4                       0.2
>75% coronary artery stenosis   Symptoms of typical angina:
                                  Yes                               115 (men), 120 (women)
Pancreatic cancer               CT scan:
                                  Definitely abnormal               26
                                  Probably abnormal                 4.8
                                  Possibly abnormal                 0.35
                                  Definitely normal                 0.11
Breast cancer                   Fine needle aspirate:
                                  Definitely malignant, Suspicious,
                                  Benign, Unsatisfactory            4.8, 0.11, 0.41 (only three LR values appear for these four results)
TB (culture)                    Sputum smear:
                                  Positive                          31
                                  Negative                          0.79

IV. Advantages of Using LRs

A. Can Apply Bayes' Theorem Easily Using the Odds-Likelihood Ratio Form of Bayes' Theorem:

$$\text{Pre-test odds} \times \mathrm{LR} = \text{Post-test odds}$$

where:

$$\text{Pre-test odds} = \frac{Prev}{1 - Prev} \qquad \text{Post-test probability} = \frac{\text{Post-test odds}}{1 + \text{Post-test odds}}$$

What this equation is telling us:
- The environment (indicated by the pre-test odds) is as important as the information provided by the test (indicated by the LR).
- Must know the prevalence or prior probability.
- The LR is easily obtained from reference books; the clinician's job is to provide an estimate of the pre-test odds!

Example: IP and I-125 FS Tests and DVT (positive result)

Prevalence = 15%
LR+ = Se/(1 - Sp) = 0.90/(1 - 0.95) = 18.0

$$\frac{0.15}{1 - 0.15} \times 18 = 3.176 \text{ (post-test odds)}$$

$$\text{Post-test probability} = \frac{3.176}{1 + 3.176} = 0.76 \text{ or } 76\%$$

Example: IP and I-125 FS Tests and DVT (negative result)

Prevalence = 15%
LR- = (1 - Se)/Sp = 0.10/0.95 = 0.105

$$\frac{0.15}{1 - 0.15} \times 0.105 = 0.0186 \text{ (post-test odds)}$$

$$\text{Post-test probability} = \frac{0.0186}{1 + 0.0186} = 0.0182 \text{ or } 1.8\%$$

Note this is P(D+|T-), the complement of PVN, i.e., 1 - P(D-|T-). PVN is therefore 1 - 0.0182 = 0.982 or 98.2%.

B. Can Calculate LRs for Several Levels of Test Results

- Disease is more likely in the presence of a very abnormal test result than a marginal one.
- LRs can be calculated for any range of test results, thereby preserving clinical information.
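The odds-likelihood calculation used in the two DVT examples of section IV.A can be sketched as a single function. This is an illustration only; `post_test_probability` is a made-up name, and the inputs are the prevalence (15%) and the LRs (18 and 0.105) quoted in those examples.

```python
def post_test_probability(pre_test_prob, lr):
    """Odds-likelihood form of Bayes' theorem.

    Converts the pre-test probability to odds, multiplies by the LR,
    and converts the post-test odds back to a probability.
    """
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * lr
    return post_test_odds / (1 + post_test_odds)


after_positive = post_test_probability(0.15, 18.0)    # P(D+|T+)
after_negative = post_test_probability(0.15, 0.105)   # P(D+|T-)

print(f"P(D+|T+) = {after_positive:.3f}")       # 0.761
print(f"P(D+|T-) = {after_negative:.4f}")       # 0.0182
print(f"PVN      = {1 - after_negative:.3f}")   # 0.982
```

The same function works unchanged for an LR attached to any level of a multi-level test, which is what makes the odds form convenient when results are not simply positive or negative.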

Table 4.3 Likelihood Ratios of MI for Six Levels of CK

CK Result    MI - Yes    MI - No    LR
> 400        50          1          (50/230)/(1/130) = 28.3
320-400      34          1          (34/230)/(1/130) = 19.2
160-319      71          4          (71/230)/(4/130) = 10.0
80-159       60          10         (60/230)/(10/130) = 3.4
40-79        13          26         (13/230)/(26/130) = 0.28
0-39         2           88         (2/230)/(88/130) = 0.01
Total        230         130

LRs and Multiple Levels

- When CK results were dichotomized, the LR range was 0.07 to 7.6 (a 108-fold difference).
- When data are presented for six levels, the LR range is now 0.01 to 28.3 (a 2830-fold difference).
- LRs preserve the natural degree of severity in the original data. DON'T LUMP!

C. Other Advantages of LRs

i) Robustness

- Test performance measures (Se/Sp or LRs) are assumed to be independent of the underlying prevalence of disease.
- But Se and Sp may change in response to changes in prevalence.
- LRs are theoretically less susceptible to these changes because they are calculated from smaller slices of data.
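Returning to Table 4.3, the level-specific LRs and the dichotomized LRs can be reproduced from the raw counts. This is a sketch only: the cut-point of CK >= 80 for the dichotomy is an assumption (it is not stated in the lecture, but it reproduces the quoted range of 0.07 to 7.6), and the variable names are illustrative.

```python
# (CK range, number with MI, number without MI), from Table 4.3
ck_levels = [
    ("> 400", 50, 1),
    ("320-400", 34, 1),
    ("160-319", 71, 4),
    ("80-159", 60, 10),
    ("40-79", 13, 26),
    ("0-39", 2, 88),
]

total_mi = sum(mi for _, mi, _ in ck_levels)      # 230
total_no_mi = sum(no for _, _, no in ck_levels)   # 130

# LR for each level: P(result | MI) / P(result | no MI)
level_lrs = {}
for label, mi, no in ck_levels:
    level_lrs[label] = (mi / total_mi) / (no / total_no_mi)
    print(f"{label:>8}  LR = {level_lrs[label]:.2f}")

# Dichotomize at an ASSUMED cut-point of CK >= 80 (first four levels positive)
pos_mi = sum(mi for _, mi, _ in ck_levels[:4])
pos_no_mi = sum(no for _, _, no in ck_levels[:4])
lr_pos = (pos_mi / total_mi) / (pos_no_mi / total_no_mi)
lr_neg = ((total_mi - pos_mi) / total_mi) / ((total_no_mi - pos_no_mi) / total_no_mi)
print(f"dichotomized: LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
# dichotomized: LR+ = 7.6, LR- = 0.07
```

For the 80-159 level the expression (60/230)/(10/130) evaluates to about 3.4. Lumping the top four levels together replaces LRs ranging from about 3.4 to 28.3 with a single value of 7.6, which is the information loss the "don't lump" advice warns against.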

ii) Multivariable Modeling

- The LR is an exponential function of a linear combination of test data:

$$\mathrm{LR} = \exp(a_0 + \beta_1 X_1 + \dots + \beta_k X_k)$$

- Logistic regression models calculate the probability of an event (here, disease) given explanatory variables X_1, ..., X_k:

$$P(D^{+} \mid \mathbf{X}) = \frac{e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k}}$$

- With some minor adjustments, the parameters of the logistic regression model can be used to estimate the LR.

Advantages?

- Can combine the discriminatory power of several variables (or tests) into a single LR (each variable is now independent), e.g., IP and I-125 FS!
- Can adjust estimates for important covariates (e.g., age or gender).
- Examples: Equine colic (Reeves), BreastAid (Osuch)
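The lecture does not spell out the "minor adjustments" that turn logistic regression parameters into an LR. One standard route, offered here as an assumption rather than as the author's method, follows from the odds-likelihood form: the model gives the post-test odds, so dividing by the pre-test odds of the sample used to fit the model leaves the LR. Equivalently, a0 = B0 - ln(pre-test odds). All coefficients and the sample prevalence below are hypothetical.

```python
import math


def lr_from_logistic(beta0, betas, x, sample_prevalence):
    """LR for covariate pattern x, assuming a fitted logistic model.

    Post-test odds = exp(beta0 + sum(beta_i * x_i)); dividing by the
    pre-test odds of the fitting sample leaves the likelihood ratio.
    """
    linear_predictor = beta0 + sum(b * xi for b, xi in zip(betas, x))
    post_test_odds = math.exp(linear_predictor)
    pre_test_odds = sample_prevalence / (1 - sample_prevalence)
    return post_test_odds / pre_test_odds


# Hypothetical model: two binary tests, fitted on a sample with 25% prevalence
beta0, betas, prevalence = -2.0, [1.2, 0.8], 0.25

for pattern in ([0, 0], [1, 0], [0, 1], [1, 1]):
    lr = lr_from_logistic(beta0, betas, pattern, prevalence)
    print(f"tests {pattern}: LR = {lr:.2f}")

# Same result via the adjusted intercept a0 = beta0 - ln(pre-test odds)
a0 = beta0 - math.log(prevalence / (1 - prevalence))
lr_both_positive = math.exp(a0 + betas[0] * 1 + betas[1] * 1)
```

The sketch only holds if the prevalence in the fitting sample is known; a model fitted on a case-control sample would need the sampling fractions accounted for before its intercept could be used this way.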