One-way Between-Groups Analysis of Variance
Psy 320, Cal State Northridge (Ainsworth)

Major Points
- The problem with t-tests and multiple groups
- The logic behind ANOVA
- Calculations
- Multiple comparisons
- Assumptions of analysis of variance
- Effect size for ANOVA

T-tests
So far, we have made comparisons between a single group and a population, 2 related samples, and 2 independent samples. What if we want to compare more than 2 groups? One solution: multiple t-tests.

With 3 groups, you would perform 3 t-tests. Not so bad, but what if you had 10 groups? You would need 45 comparisons to analyze all pairs. That's right: 45!

The Danger of Multiple t-Tests

Each time you conduct a t-test on a single set of data, what is the probability of rejecting a true null hypothesis? Assume that H0 is true and you are conducting 45 tests on the same set of data. How many rejections will you have? Roughly 2 or 3 false rejections! So, multiple t-tests on the same set of data artificially inflate the Type I error rate.

Summary: The Problems With Multiple t-Tests
- Inefficient: too many comparisons when we have even modest numbers of groups.
- Imprecise: cannot discern patterns or trends of differences in subsets of groups.
- Inaccurate: multiple tests on the same set of data artificially inflate the Type I error rate.
What is needed: a single test for the overall difference among all means, e.g. ANOVA.

LOGIC OF THE ANALYSIS OF VARIANCE

Null hypothesis, H0: the population means are equal (μ1 = μ2 = μ3 = μ4).
Alternative hypothesis, H1: not all population means are equal.
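The multiple-t-test inflation described above can be quantified directly. The short sketch below (not part of the original slides; it assumes α = .05 and, for the familywise rate, independent tests) counts the pairwise comparisons and the chance of at least one false rejection:

```python
from math import comb

alpha = 0.05
for g in (3, 10):
    n_tests = comb(g, 2)                      # number of pairwise comparisons
    expected_false = n_tests * alpha          # expected false rejections if H0 is true
    familywise = 1 - (1 - alpha) ** n_tests   # P(at least one Type I error),
                                              # assuming independent tests
    print(g, n_tests, round(expected_false, 2), round(familywise, 2))
```

For 10 groups this prints 45 comparisons, about 2.25 expected false rejections (the "roughly 2 or 3" above), and a familywise error rate of about .90.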

Logic
- Create a measure of variability among group means: MS_BetweenGroups (AKA s^2_BetweenGroups).
- Create a measure of variability within groups: MS_WithinGroups (AKA s^2_WithinGroups).
- Form the ratio MS_BetweenGroups / MS_WithinGroups:
  - The ratio is approximately 1 if the null is true.
  - The ratio is significantly larger than 1 if the null is false.
  - "Approximately 1" can actually be as high as 2 or 3, but not much higher.

So, why is it called "analysis of variance" anyway? Aren't we interested in mean differences?

Variance revisited. The basic variance formula:

    s^2 = Σ (X_i − X̄)^2 / (n − 1) = SS / df
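The claim that the ratio hovers near 1 under a true null can be checked with a small simulation (a sketch, not from the slides: 3 groups of n = 5 are repeatedly drawn from one normal population, so H0 holds by construction):

```python
# Under a true H0, the F-ratio MS_BG / MS_WG averages close to 1.
import random

random.seed(0)
g, n, sims = 3, 5, 2000
f_ratios = []
for _ in range(sims):
    groups = [[random.gauss(0, 1) for _ in range(n)] for _ in range(g)]
    scores = [y for grp in groups for y in grp]
    grand_mean = sum(scores) / len(scores)
    ss_bg = sum(n * ((sum(grp) / n) - grand_mean) ** 2 for grp in groups)
    ss_wg = sum((y - sum(grp) / n) ** 2 for grp in groups for y in grp)
    f_ratios.append((ss_bg / (g - 1)) / (ss_wg / (g * (n - 1))))

print(round(sum(f_ratios) / sims, 2))  # close to 1 (the theoretical mean of F(2,12) is 1.2)
```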

Why is it called analysis of variance anyway?
What if the data come from groups? We can compute different sums of squares:

    SS_1 = Σ (Y_i − Ȳ_GM)^2
    SS_2 = Σ (Y_i − Ȳ_j)^2
    SS_3 = Σ n_j (Ȳ_j − Ȳ_GM)^2

where i indexes the individual, j indexes the group, and GM represents the ungrouped (grand) mean.

Logic of ANOVA
[Figure: individual scores (e.g. John's score) plotted around the grand (ungrouped) mean, with the three group means Ȳ_Group1, Ȳ_Group2, Ȳ_Group3 marked along the X-axis.]

CALCULATIONS

Sums of Squares
The total variability can be partitioned into between-groups variability and within-groups variability:

    Σ (Y_i − Ȳ_GM)^2 = Σ n_j (Ȳ_j − Ȳ_GM)^2 + Σ (Y_i − Ȳ_j)^2

    SS_Total = SS_BetweenGroups + SS_WithinGroups
    SS_T = SS_BG + SS_WG
    SS_T = SS_Effect + SS_Error

Degrees of Freedom (df)
Degrees of freedom = the number of observations free to vary.
- df_T = N − 1 (variability of N observations)
- df_BG = g − 1 (variability of g means)
- df_WG = g(n − 1), or N − g (n observations in each group: n − 1 df times g groups)
- df_T = df_BG + df_WG

Mean Square (i.e. Variance)

    MS_T = s^2_T = Σ (Y_i − Ȳ_GM)^2 / (N − 1)
    MS_BG = s^2_BG = Σ n_j (Ȳ_j − Ȳ_GM)^2 / (#groups − 1)
    MS_WG = s^2_WG = Σ (Y_i − Ȳ_j)^2 / (#groups × (n − 1))

F-test
MS_WG contains random sampling variation among the participants. MS_BG also contains random sampling variation, but it can also contain systematic (real) variation between the groups (either naturally occurring or manipulated).

F-test

    F-ratio = (Systematic BG variance + Random BG variance) / Random WG variance

And if no real difference exists between groups:

    F-ratio = Random BG variance / Random WG variance ≈ 1

[Figure: the three group means clustered near the grand (ungrouped) mean.]
The F-test is a ratio, MS_BG / MS_WG; if the group differences are just random, the ratio will be approximately 1 (e.g. random/random).

[Figure: the three group means Ȳ_Group1, Ȳ_Group2, Ȳ_Group3 spread away from the grand (ungrouped) mean.]
If there are real differences between the groups, the ratio will be larger than 1, and we can calculate the probability and carry out the hypothesis test.

F distribution
[Figure: the F distribution, with the value 1 marked on the F-ratio axis.]
There is a separate F distribution for every df. Like t, but we need both df_BG and df_WG to find the critical value F_CV from the F table (Table D.3 for alpha = .05, Table D.4 for alpha = .01).

1-WAY BETWEEN-GROUPS ANOVA EXAMPLE

Example: A researcher is interested in knowing which brand of baby food babies prefer: Beechnut, Del Monte, or Gerber. He randomly selects 15 babies and assigns each to try strained peas from one of the three brands.

Liking is measured by the number of spoonfuls the baby takes before getting upset (e.g. crying, screaming, throwing the food, etc.).

Hypothesis Testing
1. H0: μ_Beechnut = μ_DelMonte = μ_Gerber
2. H1: at least 2 μs are different
3. α = .05
4. More than 2 groups → ANOVA (F)
5. For F_CV you need both df_BG = 3 − 1 = 2 and df_WG = g(n − 1) = 3(5 − 1) = 12. From Table D.3, F_CV(2,12) = 3.89; if F_obtained > 3.89, reject the null hypothesis.

Step 6: Calculate the F-test
Start with the sums of squares (SS). We need SS_T, SS_BG, and SS_WG. Then use the SS and df to compute the mean squares and F.

Data and deviation computations (Y = spoonfuls; grand mean Ȳ.. = 6.333):

Brand        Baby   Y    (Y_ij − Ȳ..)^2   (Y_ij − Ȳ.j)^2   n_j (Ȳ.j − Ȳ..)^2
Gerber         1    3       11.111           2.56
(Ȳ.j = 4.6)    2    4        5.443           0.36
               3    4        5.443           0.36
               4    4        5.443           0.36
               5    8        2.779          11.56          5(4.6 − 6.333)^2 = 15.015
Del Monte      6    7        0.445           1.00
(Ȳ.j = 6)      7    4        5.443           4.00
               8    8        2.779           4.00
               9    6        0.111           0.00
              10    5        1.777           1.00          5(6 − 6.333)^2 = 0.555
Beechnut      11    9        7.113           0.36
(Ȳ.j = 8.4)   12    6        0.111           5.76
              13   10       13.447           2.56
              14    8        2.779           0.16
              15    9        7.113           0.36          5(8.4 − 6.333)^2 = 21.36
Sum                         71.335          34.4           36.93

ANOVA summary table and Step 7

Source    SS        df    MS    F
BG        36.93     __    __    __
WG        34.4      __    __
Total     71.335    __

Remember: MS = SS/df and F = MS_BG / MS_WG.

Step 7: Since ______ > 3.89, reject the null hypothesis.

Conclusions
- The F for groups is significant: we would obtain an F of this size, when H0 is true, less than 5% of the time.
- The difference in group means cannot be explained by random error.
- The baby food brands were rated differently by the sample of babies.
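The whole Step 6 computation can be reproduced in a few lines (a sketch, not part of the original slides; the group data are taken from the example):

```python
# One-way ANOVA on the baby-food example, computed from the definitions.
groups = {
    "Gerber":    [3, 4, 4, 4, 8],
    "Del Monte": [7, 4, 8, 6, 5],
    "Beechnut":  [9, 6, 10, 8, 9],
}

all_scores = [y for ys in groups.values() for y in ys]
N = len(all_scores)
grand_mean = sum(all_scores) / N

ss_t = sum((y - grand_mean) ** 2 for y in all_scores)
ss_bg = sum(len(ys) * ((sum(ys) / len(ys)) - grand_mean) ** 2
            for ys in groups.values())
ss_wg = sum((y - sum(ys) / len(ys)) ** 2
            for ys in groups.values() for y in ys)

df_bg = len(groups) - 1   # g - 1 = 2
df_wg = N - len(groups)   # N - g = 12
F = (ss_bg / df_bg) / (ss_wg / df_wg)

print(round(ss_t, 3), round(ss_bg, 2), round(ss_wg, 1))  # 71.333 36.93 34.4
print(round(F, 2))                                       # 6.44, which is > 3.89
```

The computed F of about 6.44 exceeds F_CV = 3.89, matching the Step 7 decision to reject H0.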

ALTERNATIVE COMPUTATIONAL APPROACH

Computational equations for the SS:

    SS_T = ΣY^2 − (ΣY)^2 / N
    SS_BG = Σ (T_j^2 / n_j) − T^2 / N
    SS_WG = ΣY^2 − Σ (T_j^2 / n_j)

where T = ΣY is the grand total and T_j is the total for group j. Under each part of the equations, you divide by the number of scores it took to get the number in the numerator.
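These raw-score formulas can be sketched in code (an illustration using the baby-food data, not part of the original slides):

```python
# Computational (raw-score) formulas for the sums of squares.
groups = [[3, 4, 4, 4, 8],    # Gerber
          [7, 4, 8, 6, 5],    # Del Monte
          [9, 6, 10, 8, 9]]   # Beechnut

scores = [y for g in groups for y in g]
N = len(scores)
T = sum(scores)                        # grand total: 95
sum_y_sq = sum(y * y for y in scores)  # sum of squared scores: 673

ss_t = sum_y_sq - T ** 2 / N                                  # 673 - 95^2/15
ss_bg = sum(sum(g) ** 2 / len(g) for g in groups) - T ** 2 / N
ss_wg = sum_y_sq - sum(sum(g) ** 2 / len(g) for g in groups)

print(round(ss_t, 2), round(ss_bg, 2), round(ss_wg, 1))  # 71.33 36.93 34.4
```

The results match the definitional (deviation-score) method, as the slides note.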

Computational Approach Example

    SS_T = ΣY^2 − T^2/N = ___ − _____ = 71.33,  with N = 15

Brand        Baby   Spoonfuls (Y)
Gerber         1    3
               2    4
               3    4
               4    4
               5    8     (sum = 23)
Del Monte      6    7
               7    4
               8    8
               9    6
              10    5     (sum = 30)
Beechnut      11    9
              12    6
              13   10
              14    8
              15    9     (sum = 42)
Total              95     (ΣY^2 = 673)

    SS_BG = Σ(T_j^2/n_j) − T^2/N = (___^2 + ___^2 + ___^2)/5 − ___^2/15 = _____ − _____ = 36.93

    SS_WG = ΣY^2 − Σ(T_j^2/n_j) = ____ − (___^2 + ___^2 + ___^2)/5 = ____ − ____ = 34.4

Note: You get the same SS using this method.

Unequal Sample Sizes
- With a one-way design, no particular problem: multiply the mean deviations by the appropriate n_j as you go.
- The problem is more complex with more complex designs, as shown in the next chapter.
- Equal samples only simplify the equation, because when n_1 = n_2 = … = n_g,

    Σ n_j (Ȳ_j − Ȳ_GM)^2 = n Σ (Ȳ_j − Ȳ_GM)^2

MULTIPLE COMPARISONS

A significant F only shows that not all groups are equal; we want to know which groups are different. Multiple-comparison procedures are designed to control the familywise error rate: the probability of making at least one Type I error across the whole family of tests, as opposed to the per-comparison error rate, the probability of a Type I error on any single test.

More on Error Rates
- Most such tests reduce the significance level (α) for each t-test.
- The more tests we run, the more likely we are to make a Type I error.
- This is a good reason to hold down the number of tests.

Tukey's Honestly Significant Difference
The honestly significant difference (HSD) controls for all possible pairwise comparisons. The critical difference (CD) is computed using the HSD approach:

    CD = q √(MS_error / n_A)

where q is the studentized range statistic (from the table), MS_error is from the ANOVA, and n_A is the (equal) n for both groups. With unequal n:

    CD = q √( (MS_error / 2) (1/n_i + 1/n_j) )

Tukey: Comparing Beechnut and Gerber
To compute the CD value, we first need the value of q. q depends on alpha, the total number of groups, and the df for error. We have 3 total groups, alpha = .05, and df for error = 12, so q = 3.77.

With q = 3.77, just plug it into the formula:

    CD = q √(MS_error / n_A) = 3.77 √(2.867 / 5) = 2.86

This gives us the minimum mean difference that counts as significant. The difference between Beechnut and Gerber is 8.4 − 4.6 = 3.8, which exceeds 2.86, so the difference is significant.

Fisher's LSD Procedure
- Requires a significant overall F, or no tests are run.
- Run standard t-tests between pairs of groups. Often we replace s^2_pooled with MS_error from the overall analysis.
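The Tukey comparison above can be sketched in a few lines (q = 3.77 is read from the studentized-range table for 3 groups, α = .05, df_error = 12, as on the slide):

```python
# Tukey HSD critical difference for the baby-food example.
import math

q = 3.77          # studentized range statistic (from the table)
ms_error = 2.867  # MS_WG from the ANOVA
n_per_group = 5

cd = q * math.sqrt(ms_error / n_per_group)
print(round(cd, 2))     # about 2.85 (the slide rounds to 2.86)

diff = 8.4 - 4.6        # Beechnut mean minus Gerber mean
print(diff > cd)        # True: the difference is significant
```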

It is really just a pooled error term, but with more degrees of freedom (pooled across all treatment groups).

Fisher's LSD Procedure: Comparing Beechnut and Gerber

    s_(X̄_Beechnut − X̄_Gerber) = √(MS_WG/n_Beechnut + MS_WG/n_Gerber) = √(2.867/5 + 2.867/5) = 1.071

    t = (X̄_Beechnut − X̄_Gerber) / s_(X̄_Beechnut − X̄_Gerber) = (8.4 − 4.6) / 1.071 = 3.55

    t_cv(5 + 5 − 2 = 8), α = .05: 1.860

Since 3.55 > 1.860, the 2 groups are significantly different.

Bonferroni t-Test
Run t-tests between pairs of groups, as usual.
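The LSD comparison above can likewise be verified numerically (a sketch, using MS_WG from the ANOVA as the pooled error term):

```python
# Fisher's LSD comparison of Beechnut vs Gerber.
import math

ms_wg = 2.867
n_beechnut, n_gerber = 5, 5
mean_beechnut, mean_gerber = 8.4, 4.6

se = math.sqrt(ms_wg / n_beechnut + ms_wg / n_gerber)
t = (mean_beechnut - mean_gerber) / se
print(round(se, 3), round(t, 2))  # 1.071 3.55

t_cv = 1.860  # critical t used on the slide
print(t > t_cv)                   # True: significantly different
```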

- Hold down the number of t-tests.
- Reject if t exceeds the critical value in the Bonferroni table.
- Works by using a stricter value of α for each comparison.

Bonferroni t
The critical value of α for each test is set at .05/c, where c = the number of tests run (assuming familywise α = .05). E.g., with 3 tests, each t must be significant at the .05/3 = .0167 level. With computer printout, just make sure the calculated probability is < .05/c.

ASSUMPTIONS

Assumptions for Analysis of Variance
- Observations are normally distributed within each population.
- Population variances are equal (homogeneity of variance, or homoscedasticity).
- Observations are independent.

Analysis of variance is generally robust to violations of the first two assumptions. A robust test is one that is not greatly affected by violations of its assumptions.

EFFECT SIZE

Magnitude of Effect: Eta squared (η^2)

- Easy to calculate
- Somewhat biased on the high side
- Formula:

    η^2 = SS_BG / SS_Total

- The percent of variation in the data that can be attributed to treatment differences.

Magnitude of Effect: Omega squared (ω^2)
- Much less biased than η^2, but not as intuitive
- We adjust both numerator and denominator with MS_error
- Formula:

    ω^2 = (SS_BG − (k − 1) MS_WG) / (SS_T + MS_WG)

η^2 and ω^2 for the Baby Food Example

    η^2 = SS_BG / SS_T = 36.93 / 71.335 = .518

    ω^2 = (SS_BG − (k − 1) MS_WG) / (SS_T + MS_WG) = (36.93 − 2(2.867)) / (71.335 + 2.867) = .420

η^2 = .52: 52% of the variability in preference can be accounted for by brand of baby food.
ω^2 = .42: this is a less biased estimate; note that it is about 20% smaller.
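Both effect sizes follow directly from the SS and MS values computed earlier (a sketch, not part of the original slides):

```python
# Effect sizes for the baby-food ANOVA.
ss_bg, ss_t, ms_wg = 36.93, 71.335, 2.867
k = 3  # number of groups

eta_sq = ss_bg / ss_t
omega_sq = (ss_bg - (k - 1) * ms_wg) / (ss_t + ms_wg)
print(round(eta_sq, 3), round(omega_sq, 3))  # 0.518 0.42
```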

Other Measures of Effect Size
We can use the same kinds of measures we talked about with t-tests (e.g. d and d-hat). It usually makes the most sense to talk about 2 groups at a time, or the effect size between the largest and smallest groups, etc. There are also methods for converting η^2 to d and vice versa.