# Chapter 9: Deciding on the Sampling Strategy

## Introduction

- Introduction to Sampling
- Types of Samples: Random and Non-Random
- How Confident and Precise Do You Need to Be?
- How Large a Sample Do You Need?
- Where to Find a Sampling Statistician?

## Sampling

- Is it possible to collect data from the entire population (a census)?
- If so, we can talk about what is true for the entire population.
- Often we cannot, because of time or cost.

- If not, we can use a smaller subset: a sample.

## Concepts

- Population: the total set of units
- Sample: a subset of the population
- Sampling frame: the list from which you select your sample

## More Sampling Concepts

- Sample design: the method of sampling (probability or non-probability)
- Parameter: a characteristic of the population
- Statistic: a characteristic of a sample

## Random Sample

- A random sample allows us to make estimates about the larger population based on what we learn from the subset.
- It works like a lottery: everyone has an equal chance of being selected.
- Advantages:
  - eliminates selection bias
  - makes it possible to generalize to the population
  - is cost-effective

## Types of Random Samples

- Simple random sample
- Random interval sample
- Stratified random sample
- Random cluster sample
- Multi-stage random sample
- Combination random sample

## Simple Random Sample

- The simplest type of random sample
- Establish a sample size, then randomly select units until that size is reached
- Uses a random number table to select units

## Random Interval Sample

- Used when there is a sequential population that is not already enumerated and would be difficult or time-consuming to enumerate
- Uses a random number table to select intervals

## Stratified Random Sample

- Use when specific groups must be included that might otherwise be missed by a simple random sample
- These groups usually make up a small proportion of the population

[Diagram: the total population is divided into sub-populations, and a simple random sample is drawn from each sub-population.]

## Random Cluster Sample

- Another form of random sampling
- A cluster is any naturally occurring aggregate of the units that are to be sampled
- Used when:
  - you do not have a complete list of everyone in the population of interest but do have a list of the clusters in which they occur, or
  - you have a complete list of everyone, but they are so widely distributed that sending data collectors out to a simple random sample would be too time-consuming and expensive

## Multi-stage Random Sample

- Combines two or more forms of random sampling
- Most commonly, it begins with random cluster sampling and then applies simple random sampling or stratified random sampling

## Combination Random Samples

- More than one random sampling technique is used

## Drawback of Random Cluster and Multi-stage Random Sampling

- May not yield an accurate representation of the population
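The stratified and multi-stage designs described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the urban/rural strata, and the village clusters are all hypothetical, and Python's `random` module stands in for a random number table.

```python
import random


def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of per_stratum units from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, min(per_stratum, len(units)))
        for name, units in strata.items()
    }


def two_stage_cluster_sample(clusters, n_clusters, per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly sample units in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical strata: the rural group is a small share of the population
# and could easily be underrepresented in a simple random sample.
strata = {
    "urban": [f"urban-{i}" for i in range(900)],
    "rural": [f"rural-{i}" for i in range(100)],
}

# Hypothetical clusters: 20 villages with 40 households each.
villages = {
    f"village-{v}": [f"v{v}-household-{h}" for h in range(40)]
    for v in range(20)
}

by_stratum = stratified_sample(strata, per_stratum=30, seed=7)
by_cluster = two_stage_cluster_sample(villages, n_clusters=5, per_cluster=8, seed=7)

print({name: len(units) for name, units in by_stratum.items()})
print({name: len(units) for name, units in by_cluster.items()})
```

Note that the two-stage version only needs a list of villages up front; household lists are required only for the villages that are actually selected.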

## Summary of Random Sampling Process

| Step | Process |
|------|---------|
| 1 | Obtain a complete listing of the entire population |
| 2 | Assign each case a number |
| 3 | Randomly select the sample using a random numbers table |
| 4 | When no numbered listing exists or it is not practical to create one: take a random start, then select every nth case |
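The steps in this summary can be sketched as follows. This is an illustrative example under stated assumptions: the frame of 500 numbered cases and both function names are hypothetical, and a seeded pseudo-random generator replaces the random numbers table. The second function implements only the fallback in step 4 (a random start followed by every nth case).

```python
import random


def simple_random_sample(units, sample_size, seed=None):
    """Steps 1-3: select sample_size units so each has an equal chance."""
    rng = random.Random(seed)  # stands in for the random numbers table
    return rng.sample(units, sample_size)


def every_nth_sample(units, n, seed=None):
    """Step 4: take a random start within the first n cases, then every nth case."""
    rng = random.Random(seed)
    start = rng.randrange(n)  # index 0 .. n - 1
    return units[start::n]


# Hypothetical numbered listing: cases 1 through 500.
frame = list(range(1, 501))

srs = simple_random_sample(frame, 50, seed=1)
nth = every_nth_sample(frame, 10, seed=1)

print(len(srs), len(nth))
print(nth[:5])
```

Both approaches return 50 cases from this frame; the every-nth version needs only a sequence to walk through, not a pre-numbered list.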

## Non-Random Samples

- Can be more focused
- Can make sure a small sample is representative
- Cannot make inferences to a larger population

## Types of Non-random Samples

- Convenience: whoever is easiest to contact or whatever is easiest to observe
- Snowball: ask people who else you should interview
- Purposeful (judgment): set criteria to achieve a specific mix of participants

## Forms of Purposeful Samples

- Typical cases (median)
- Maximum variation (heterogeneity)
- Quota
- Extreme case
- Confirming and disconfirming cases

## Bias and Non-random Sampling

- Were people selected in a biased way?
- Are they substantially different from the rest of the population?
- Collect some data to show that the people selected are fairly similar to the larger population (e.g., demographics)

## Combinations: Random and Non-Random

- Example:
  - Non-randomly select two schools from the poorest communities and two from the wealthiest communities
  - Select a random sample of students from these four schools

## Possibility of Error

- Is the sample different from the population?
- Statistics: data derived from random samples
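A small simulation can make the possibility of error concrete: a statistic computed from a random sample rarely equals the population parameter exactly. The sketch below is illustrative only; the population of 10,000 test scores is invented, and the sample sizes and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical population: 10,000 test scores between 0 and 100.
population = [rng.randint(0, 100) for _ in range(10_000)]
parameter = mean(population)  # characteristic of the population


def sampling_errors(sample_size, repetitions=200):
    """Absolute gap between each sample mean (statistic) and the parameter."""
    return [
        abs(mean(rng.sample(population, sample_size)) - parameter)
        for _ in range(repetitions)
    ]


small = sampling_errors(25)
large = sampling_errors(400)

print(f"population mean:      {parameter:.2f}")
print(f"average error, n=25:  {mean(small):.2f}")
print(f"average error, n=400: {mean(large):.2f}")
```

Each run of `sampling_errors` draws 200 independent random samples, so the two averages show how far a typical sample mean lands from the true value at each sample size.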

## How Confident Do You Wish to Be?

- Confidence level: e.g., 90% (you are 90% certain that your sample results are an estimate of the population as a whole)
- The higher the confidence level, the larger the sample needed

## Confidence Standard

- The standard is 95%
- 19 of 20 samples would have found similar results
- We are 95% certain that the population parameter lies between the lower and upper limits of the confidence interval calculated from the sample

## Confidence Interval

- Sometimes called sampling error, margin of error, or precision
- Example: a poll reports 48% for and 52% against, with a margin of +/- 3%
- This actually means 45% to 51% for and 49% to 55% against

## Sample Size

- By increasing sample size, you increase accuracy and decrease margin of error
- The larger the margin of error, the less precise your results will be
- The smaller the population, the smaller the needed sample size for a given confidence level and margin of error, but the larger the needed ratio of the sample size to the population size
- Aim for a 95% confidence level and a margin of error of +/- 5%

## Sample Sizes for Large Populations

| Precision | 99% confidence | 95% confidence | 90% confidence |
|-----------|----------------|----------------|----------------|
| +/- 1% | 16,576 | 9,604 | 6,765 |
| +/- 2% | 4,144 | 2,401 | 1,691 |
| +/- 3% | 1,848 | 1,067 | 752 |
| +/- 5% | 666 | 384 | 271 |

## Summary of Sampling Size

- Accuracy and precision can be improved by increasing the sample size
- The standard to aim for is a 95% confidence level and a margin of error of +/- 5%
- The larger the margin of error, the less precise the results will be
- The smaller the population, the larger the needed ratio of the sample size to the population size
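The relationships summarized above can be explored with a short calculation. The slides do not state a formula, so this sketch assumes the common textbook formula for estimating a proportion, n0 = z² · p(1 − p) / e², with the conservative choice p = 0.5, plus the standard finite population correction. All function names are hypothetical. Rounded to the nearest whole number, it reproduces the 95% column of the large-population table; the other columns may differ somewhat depending on how the z-value was rounded.

```python
import math
from statistics import NormalDist


def z_score(confidence):
    """Two-sided critical value of the standard normal (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_large_population(margin, confidence=0.95, p=0.5):
    """Assumed formula: n0 = z^2 * p * (1 - p) / margin^2 (unrounded)."""
    return z_score(confidence) ** 2 * p * (1 - p) / margin ** 2


def sample_size_finite_population(population_size, margin, confidence=0.95, p=0.5):
    """Apply the finite population correction: n = n0 / (1 + (n0 - 1) / N)."""
    n0 = sample_size_large_population(margin, confidence, p)
    return n0 / (1 + (n0 - 1) / population_size)


def margin_of_error(sample_size, confidence=0.95, p=0.5):
    """Margin of error for a proportion, ignoring any finite population correction."""
    return z_score(confidence) * math.sqrt(p * (1 - p) / sample_size)


# Large populations at the 95% confidence level.
for margin in (0.01, 0.02, 0.03, 0.05):
    print(f"+/- {margin:.0%}: {round(sample_size_large_population(margin))}")

# Smaller populations need fewer cases but a larger share of the population.
for size in (100, 1_000, 100_000):
    n = sample_size_finite_population(size, margin=0.05)
    print(f"N={size}: n is about {math.ceil(n)} ({n / size:.1%} of the population)")
```

The second loop illustrates the last point in the summary: as the population shrinks, the required sample gets smaller in absolute terms while its ratio to the population grows.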

## Where to Find a Sampling Statistician

- American Statistical Association (ASA) directory of statistical consultants: http://www.amstat.org/consultantdirectory/index.cfm
- Alliance of Statistics Consultants: http://www.statisticstutors.com/#statistical-analysis
- HyperStat Online: http://davidmlane.com/hyperstat/consultants.html