Tool 5: Introduction to Surveys & Sampling

Suppose that you want to work out what the population of a town thinks about the cleanliness of the streets there. You could ask everyone who lives in the town if they are satisfied or dissatisfied with the state of the streets. If they all replied, you would have some very reliable data indeed. If a large number of them didn't, and you didn't take this into account, you might end up with skewed results.

For example, suppose the population is 10,000 and of these, 9,000 think the streets are clean enough. Now suppose that the only people who reply to your survey are the 1,000 who think the streets are dirty. Your survey would show that 100% of people are dissatisfied, when in fact only 10% are.
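
The arithmetic behind this example can be sketched in a few lines of Python. The function name and its arguments are made up for illustration; the figures are the ones from the example above.

```python
def observed_dissatisfaction(satisfied, dissatisfied,
                             satisfied_response_rate,
                             dissatisfied_response_rate):
    """Return the share of survey replies that come from dissatisfied people."""
    satisfied_replies = satisfied * satisfied_response_rate
    dissatisfied_replies = dissatisfied * dissatisfied_response_rate
    total_replies = satisfied_replies + dissatisfied_replies
    if total_replies == 0:
        raise ValueError("nobody replied, so there is no result to report")
    return dissatisfied_replies / total_replies


# 10,000 residents: 9,000 satisfied, 1,000 dissatisfied.
true_share = 1_000 / 10_000

# Only the dissatisfied residents reply (response rates of 0% and 100%).
survey_share = observed_dissatisfaction(9_000, 1_000, 0.0, 1.0)

print(f"True: {true_share:.0%}, survey: {survey_share:.0%}")
# prints "True: 10%, survey: 100%"
```

If both groups replied at the same rate, whatever that rate was, the survey share would match the true 10%. It is the difference in response rates between the groups that skews the result.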

Another problem with surveying everyone is that it can be very expensive. Just think of the cost of surveying every person in your area - it could cost hundreds of thousands of pounds. Sampling is a scientific way of getting results from a smaller number of people that you can then use to work out what the overall population thinks.


Designing a survey using sampling

If your survey involves thousands of people, you'd be well advised to get a good market research firm to do this sort of work for you. However, you still need to know how to brief them properly.

Three things to think about when designing a sample survey:

1. Decide exactly who you want to survey

Is it all adults in a geographical area? Is it children in the schools of the area? Is it people who only work in the area? Are there significant numbers of people from equality and diversity groups in the area?

2. Give every member of that population an equal chance of being chosen and being able to respond

This means putting together a sampling frame, such as the electoral roll, from which names are randomly chosen. Pure random sampling from a complete sampling frame is almost impossible to achieve in real life. A good market research firm will advise you on how best to get close to it. If significant numbers of people from equality and diversity groups live in the area, you should think about how to encourage them to take part in the survey so that responses are as representative as possible.
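
As a rough illustration of what "randomly chosen" means in practice, here is a minimal sketch of drawing a simple random sample from a sampling frame. The frame is a made-up list of placeholder names standing in for something like the electoral roll, and the function name is hypothetical. It assumes the frame is complete and has no duplicate entries, which real frames rarely manage.

```python
import random


def draw_simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size entries so that every entry is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(sampling_frame, sample_size)  # sampling without replacement


# Hypothetical frame standing in for 10,000 entries on a register.
frame = [f"resident-{i:05d}" for i in range(1, 10_001)]

sample = draw_simple_random_sample(frame, 1_000, seed=42)
print(len(sample), len(set(sample)))  # prints "1000 1000": nobody is picked twice
```

Being chosen is only half the story: the people you pick also need to be able to respond, which a random draw cannot guarantee.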

3. Plan a survey method that reduces response bias

It's usually thought that face-to-face surveys are best for being representative because there is more control over choosing who to involve. You need to guard more against bias in responses to postal surveys, where you have less control over who chooses to reply. In any case you should always ask people in your sample about their age, gender, ethnicity, or any disability, and where relevant you might want to ask about religion & belief or sexual orientation, so that you can compare your sample with the census data for the area. If the two sets of figures are very different, e.g. if older female house owners make up a far bigger share of your respondents than they do of the census population, you might want to weight the results to compensate for this.

Weighting:

This is done to correct for any biases in the sample. You can calculate weights for different groups of respondents, defined by e.g. gender, age group, ethnicity, disability, postcode or type of residence, using local demographic statistics. The weights are used to scale down the influence of over-represented groups and to increase the influence of under-represented groups on the overall survey results - thus better reflecting the local population.
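
One common way of doing this (an assumption here, since your research firm may use a different scheme) is to give each group a weight equal to its share of the local population divided by its share of the sample. The sketch below uses invented figures for two age groups, and the function names are made up for illustration. It assumes every group has at least one respondent.

```python
def group_weights(population_shares, sample_counts):
    """Weight for each group = population share / sample share."""
    total_respondents = sum(sample_counts.values())
    weights = {}
    for group, population_share in population_shares.items():
        sample_share = sample_counts[group] / total_respondents
        weights[group] = population_share / sample_share
    return weights


def weighted_satisfaction(weights, sample_counts, satisfied_counts):
    """Overall share satisfied after applying the group weights."""
    weighted_satisfied = sum(weights[g] * satisfied_counts[g] for g in weights)
    weighted_total = sum(weights[g] * sample_counts[g] for g in weights)
    return weighted_satisfied / weighted_total


# Invented figures: older residents are over-represented among respondents.
population_shares = {"under 50": 0.60, "50 and over": 0.40}
sample_counts = {"under 50": 400, "50 and over": 600}
satisfied_counts = {"under 50": 200, "50 and over": 480}

weights = group_weights(population_shares, sample_counts)
raw = sum(satisfied_counts.values()) / sum(sample_counts.values())
weighted = weighted_satisfaction(weights, sample_counts, satisfied_counts)

print(f"Raw: {raw:.0%}, weighted: {weighted:.0%}")
# prints "Raw: 68%, weighted: 62%"
```

In this made-up case the under 50s get a weight of 1.5 and the older group about 0.67, so the more satisfied older respondents count for less and the overall figure drops from 68% to 62%.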

Several different weights might be combined in a single factor. If you are unclear about how your data should be weighted, discuss this with your market research company. You might wish to ask why the particular weighting scheme they propose is likely to increase the accuracy of the results. If you use raw survey data yourself in SPSS or another statistical package, don't forget to check whether the weighting is on or off before starting your analysis. In SPSS you can see this under Weight Cases on the Data menu.

Sample Size:

Actually, there's a fourth thing to think about - sample size - that's important for getting accurate results. We'll look at this in the next section.


Working out the population's views from your results

So once you've got your survey results, what do they tell you about the views of the population?

Three things to think about:

1. The Confidence Interval

This is a statistical term. Roughly speaking, it is about the chance that your survey will not give you rogue results. Most people do the sums at a 95% confidence level: if you ran the same survey many times, about 95 times in 100 the range you calculate would contain the true figure for the population as a whole.
You could do the sums so that the confidence level rises to 99.9%. However, this comes at a statistical price. If you increase the confidence level, you increase the margin of error as well.

2. The Margin of Error

Suppose that you have a sample of 1,000 people of whom 70% say that they're happy with the cleanliness of the streets. If you know how, you can work out the margins of error on this result for the population as a whole. We'll just show you the results here:

* 95% Confidence Interval: the result if you surveyed every member of the population would lie between 67.16% and 72.84% who are happy

* 99% Confidence Interval: the result if you surveyed every member of the population would lie between 66.26% and 73.74% who are happy


* 99.9% Confidence Interval: the result if you surveyed every member of the population would lie between 65.23% and 74.77% who are happy
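
If you'd like to check sums like these yourself, here is a minimal sketch. It assumes the figures are worked out with the usual normal approximation for a percentage, which is our assumption rather than something stated above. The function name is made up, and `NormalDist` needs Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(proportion, sample_size, confidence=0.95):
    """Margin of error for a sample proportion, using the normal approximation."""
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error


# 1,000 people sampled, 70% of them happy with the streets.
for confidence in (0.95, 0.99, 0.999):
    moe = margin_of_error(0.70, 1_000, confidence)
    print(f"{confidence:.1%}: {0.70 - moe:.2%} to {0.70 + moe:.2%}")
```

For the 95% case this gives roughly plus or minus 2.84 percentage points. The last decimal place can differ slightly from hand-worked figures, depending on how the multiplier is rounded (1.96, 2.58 and so on).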

3. Reducing the Margin of Error by increasing the Sample Size

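The table below is a handy reckoner for doing this at a 95% confidence interval. For example, if you increased the sample size from 1,000 to 2,000 you would reduce the margin of error on your 70% satisfied-with-the-streets result from plus or minus 2.84% points to plus or minus 2.01% points. That narrows the full range from nearly six points wide to about four. Notice that doubling the sample doesn't halve the margin of error: it shrinks with the square root of the sample size, so you would need four times the sample to halve it.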

Notice that the margin of error also depends on how people have answered the question. If sample opinion is split right down the middle, 50% satisfied and 50% not, people's answers vary as much as they possibly can, so results bounce around more from one sample to the next and the margin of error is at its biggest. But if only 10% are satisfied (or 90%), answers vary less and so the margin of error is smaller. This applies at any confidence interval and sample size.
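
A ready reckoner of this kind can be produced with a short script. This is only a sketch: the sample sizes and percentages chosen here are our own and not necessarily those of the original table, and it assumes the usual normal approximation with a multiplier of 1.96 for 95%.

```python
from math import sqrt

Z_95 = 1.96  # multiplier for a 95% confidence level


def margin_points(percent, sample_size):
    """Margin of error in percentage points for a result of `percent`%."""
    p = percent / 100
    return 100 * Z_95 * sqrt(p * (1 - p) / sample_size)


sample_sizes = (100, 500, 1_000, 2_000)  # illustrative choices
percents = (10, 30, 50, 70, 90)          # illustrative choices

# Header row, then one row of margins per sample size.
print("sample".rjust(8) + "".join(f"{p:>7}%" for p in percents))
for n in sample_sizes:
    row = "".join(f"{margin_points(p, n):>8.2f}" for p in percents)
    print(f"{n:>8}{row}")
```

Reading across any row, the margin is widest at 50% and narrower towards 10% and 90%. Reading down any column, it shrinks as the sample grows.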

Table of Error Margins at 95% 

 

One last thing

Many people assume that sample size needs to be proportional to an area's population. This isn't strictly necessary. The margins of error in the table don't depend on population size at all, provided the sample is only a small fraction of the population. They stem purely from the mathematics of what is known as the 'normal distribution'. This is a bell-shaped curve that describes how the results of repeated samples would scatter around the true figure: most samples land close to it, with only a few falling out at the extremes.

 

However, you do need to be careful about defining your population. Two examples can illustrate some of the issues:

County Results calculated at District Level

In a survey of Essex, if 70% of people say they are satisfied with the cleanliness of the streets, we can't assume the same of respondents from just one of the districts. The result from that district, even within the same survey, might be different. Suppose 50% of Colchester respondents are satisfied. The part of the sample that comes from that district will be much smaller than for Essex as a whole, say, 100 compared with 1,100. This gives a much bigger margin of error: for a particular district in a county survey it can be a whacking great +/-10% points or so, i.e. the true figure for Colchester could be anywhere between 40% and 60%, whereas the 70% for the county as a whole is good to within about 3% points either way. Therefore if you want to be able to disaggregate your results from county to district level you will need to increase the sample size and make sure response sizes are adequate from each district.

Accounting for different social groups

You need to think about social groups as well. If in Colchester district you do a survey of all adult residents and 70% say they are satisfied with street cleanliness, can you then say that 70% of ethnic minority residents are satisfied with this too? The answer is that you can't. You might only have 50 people in your sample from ethnic minorities. That could give you a margin of error for that set of people of nearly plus or minus 14% points. If you want to find out the views of ethnic minority residents with any reliability, you need either to design a survey purely for them or boost the sample for such residents. The same applies to any other set of people, e.g. older residents, people with a disability, the under 30s.
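
The two examples above can be checked, and turned around to ask how many respondents a subgroup needs, with a sketch like the one below. It assumes a 95% confidence level (multiplier 1.96), the normal approximation, and the worst case of a 50/50 split. The function names and the plus or minus 5 point target are our own, for illustration only.

```python
from math import ceil, sqrt


def subgroup_margin_points(percent, respondents, z=1.96):
    """Margin of error in percentage points for a subgroup of the sample."""
    p = percent / 100
    return 100 * z * sqrt(p * (1 - p) / respondents)


def respondents_needed(percent, target_margin_points, z=1.96):
    """Smallest subgroup size that brings the margin down to the target."""
    p = percent / 100
    e = target_margin_points / 100
    return ceil(z ** 2 * p * (1 - p) / e ** 2)


# 100 respondents from one district, 50 from one social group (worst case 50%).
print(f"{subgroup_margin_points(50, 100):.1f}")  # prints "9.8"
print(f"{subgroup_margin_points(50, 50):.1f}")   # prints "13.9"

# Respondents needed in each subgroup for plus or minus 5 points at worst.
print(respondents_needed(50, 5))  # prints "385"
```

The target applies to each district or group you want to report on separately, which is why breaking results down quickly pushes up the total sample you need.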

 
