**Sampling Error**
There are about 250 million
adults in America, of every imaginable background and circumstance. So how can
a survey of only 800 or 1,000 adults reflect what the entire country is
thinking? How can a thousand voices
speak for them all?
Marketing researchers liken
it to making a big pot of soup — to taste-test the soup, you don't have to eat
the whole pot, or even a whole bowl's worth. You only have to try a spoonful or
two. The same is true of marketing research. You don't have to ask every single
person in America
to find out what Americans think; you only need to ask a few to get the flavor
of the population’s opinion.
This fact is reflected by a survey's standard error of the mean. Specifically, the standard error of the mean is an index of the amount of error that results when a single sample mean is used to estimate the population mean; in other words, it is an index of **sampling error**. The standard error of the mean equals the standard deviation of the population of raw scores divided by the square root of the size of the sample on which the means are based.
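
The definition above can be worked through in a few lines. This is only a sketch: the function name `standard_error` and the population standard deviation of 15 are assumptions made up for illustration, not figures from any real survey.

```python
import math

def standard_error(population_sd: float, sample_size: int) -> float:
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_sd / math.sqrt(sample_size)

# Hypothetical population standard deviation (an assumption for this example).
POPULATION_SD = 15.0

for n in (100, 400, 1600):
    print(f"n = {n:>4}: standard error = {standard_error(POPULATION_SD, n):.3f}")
```

Because the sample size sits under a square root, quadrupling the sample only halves the standard error.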
A subject closely related to sampling error and standard error is the “margin of error.” The larger the sample, the lower the margin of error, and the more accurately the views of those surveyed match those of the entire population.
When marketing researchers
report the margin of error for their surveys (usually expressed as something
like "plus or minus 3 percent") they are stating their confidence in
the data they have collected.
**Confidence Interval**
You must also remember that every margin of error has a "**confidence interval**," usually stated at the 95 percent level. That means that if you conducted the same survey 100 different times, in about 95 of them the results would be within 3 percentage points of the true population value. Of course, this means that the other five times, you may get answers that fall outside that range.
For example, if 50% of a sample of 1,000 randomly selected Americans said they are satisfied with their bank, we can be 95% confident that the share of the entire U.S. population that would have given the same response, had they been asked, is within 3 percentage points of that figure (i.e., the true proportion is somewhere between 47% and 53%).
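
One way to see what the 95 percent figure means is to simulate it. The sketch below is illustrative only. It assumes a hypothetical population in which exactly 50% are satisfied with their bank, draws 2,000 simulated surveys of 1,000 people each, and counts how often a survey lands within 3 points of the true value. The function name, the seed, and the number of repetitions are assumptions chosen for the example.

```python
import random

def simulate_coverage(true_share=0.50, sample_size=1000, margin=0.03,
                      repetitions=2000, seed=12345):
    """Fraction of simulated surveys that land within `margin` of the true share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        # Each simulated respondent is "satisfied" with probability true_share.
        satisfied = sum(rng.random() < true_share for _ in range(sample_size))
        observed = satisfied / sample_size
        # Small tolerance guards against floating-point error at the boundary.
        if abs(observed - true_share) <= margin + 1e-9:
            hits += 1
    return hits / repetitions

coverage = simulate_coverage()
print(f"Surveys within 3 points of the true value: {coverage:.1%}")
```

Under these assumptions the printed share should come out close to 95 percent, with the remaining surveys missing by more than the stated margin.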
The bigger the sample, the
smaller the margin of error, but once you get past a certain point -- say, a
sample size of 800 or 1,000 -- the improvement is very small. The results of a
survey of 300 people will likely be correct within 6 percentage points, while a
survey of 1,000 will be correct within 3 percentage points, a lower margin of
error. But that is where the dramatic differences end -- when a sample is
increased to 2,000 respondents, the margin of error drops only slightly, to 2
percentage points.
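
The figures above can be reproduced with the usual textbook approximation for a proportion from a simple random sample. The sketch assumes a 95% confidence level (z = 1.96) and the worst-case proportion of 50%. The function name `margin_of_error` is made up for this example.

```python
import math

Z_95 = 1.96  # z-value for a 95% confidence level

def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = Z_95) -> float:
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (300, 800, 1000, 2000, 4000):
    print(f"n = {n:>5}: +/- {margin_of_error(n) * 100:.1f} points")
```

Going from 300 to 1,000 respondents cuts the margin from roughly 5.7 to 3.1 points, while doubling again to 2,000 only brings it down to about 2.2.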
Despite this, some surveys
have sample sizes much larger than 1,000 people. But why ask two or three
thousand respondents when 800 will do? Well, it sounds more impressive, but
that's hardly worth the cost of interviewing all those additional people.
Usually when a study has a large sample, it is so that certain subgroups can be isolated
and compared to other subgroups or to the total sample. If you want to compare
retired people to the general public, for instance, a sample of 1,000 might
yield only one or two hundred people who are no longer working, which may not
be enough to get a solid grasp on the views of that group. A sample of 2,000,
however, will probably yield a larger group of retired Americans and provide a
more accurate picture of their views, which can be compared to non-retirees’
opinions.
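
To make the subgroup problem concrete, the sketch below estimates how many retirees each sample would contain and the margin of error that applies to that subgroup alone. The 15% retiree share is a hypothetical figure chosen only so that a sample of 1,000 yields 150 retirees. The function names are likewise assumptions, and the margin uses the common 95% approximation for a proportion.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

def subgroup_margin(total_sample, subgroup_share):
    """Expected subgroup size and the margin of error within that subgroup."""
    subgroup_size = round(total_sample * subgroup_share)
    return subgroup_size, margin_of_error(subgroup_size)

RETIRED_SHARE = 0.15  # hypothetical share of retirees in the sample

for total in (1000, 2000):
    size, moe = subgroup_margin(total, RETIRED_SHARE)
    print(f"Total sample {total}: about {size} retirees, +/- {moe * 100:.1f} points")
```

With these assumed numbers, results for retirees in the 1,000-person survey carry a margin of about 8 points, far wider than the 3 points that apply to the full sample.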
Sometimes increasing the overall sample size is not enough if the subgroup you are examining is rare or particularly hard to find. Affluent households, for example, make up only a small percentage of the U.S. population. In a standard random sample, you would have to interview an enormous number of people before you had a large enough subgroup of affluent households. In this instance, you would take an "oversample," purposely seeking out members of the “high-net-worth” group you are interested in and comparing their results to those of the main sample.
Of course, in
both general samples and oversamples, who is asked is as important as how many
are asked. Reputable survey organizations go to great lengths to make sure
their interview sample is random and representative of whomever they are
surveying, be it retired, affluent, or all Americans.

**Statistical Significance**

Sometimes, even the best researchers misuse and abuse the concept of significance. Many in research pore over reams of cross-tabulations and perform a multitude of analyses to find significant differences, then base their decisions on **statistical significance**. They tend to associate statistical significance with the magnitude of the result. Their reasoning is something like this: “The more statistically significant a result, the bigger the difference between two numbers.” In other words, the fact that one proportion is significantly different from another is taken to mean that there is a big difference between the two proportions; statistical significance is equated with the “bigness” of a result.

People often think that if the difference between two numbers is significant, it must be large and therefore must be considered in the analysis. When comparing numbers, two types of significance should be considered: *statistical significance* and *practical significance*. By understanding the difference between statistical and practical significance, we can avoid a pitfall that many in the research industry fall into.

What does statistical significance mean? Testing at a confidence level of, say, 95% merely means accepting a 5% risk of concluding that something is true based on the sample when it is actually false in the population. The statistical significance of an observed difference depends on two main factors: the sample size and the magnitude of the difference observed in the samples.

For example,
let’s say we do a significance test between two groups of people who are
exposed to a product concept and find a 20-point difference between Group A
(65% acceptance) and Group B (45%). Is
the difference statistically significant?
Despite the large magnitude of the difference (20 points), its
statistical significance will depend on the sample size. According to statistical theory, we need a
sample size of about 50 or more people in each of the groups for the difference
to be statistically significant at the 95% level of confidence. If we meet the sample size requirement, then
the difference of 20 points will be statistically significant at the 95% level
of confidence.
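
The sample-size threshold can be checked with a standard two-proportion z-test. The sketch below uses the pooled form of the test and the normal approximation. The function names are assumptions made for this example, and other tests (such as a chi-square or an exact test) may give slightly different cutoffs.

```python
import math

def two_proportion_z(p_a, p_b, n_a, n_b):
    """Pooled z statistic for the difference between two sample proportions."""
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Group A: 65% acceptance, Group B: 45% acceptance, equal group sizes.
for n in (30, 40, 50, 100):
    z = two_proportion_z(0.65, 0.45, n, n)
    p = two_sided_p_value(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>3} per group: z = {z:.2f}, p = {p:.3f} -> {verdict} at 95%")
```

Under this test the same 20-point gap falls short of significance with 30 or 40 people per group and crosses the 95% threshold at around 50 per group.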

What does this really mean? Many marketers will look at this result and conclude that since there is a 20-point difference and the difference is statistically significant, there must be a big difference between Groups A and B. In reality, if we had done a census (i.e., surveyed the entire population) instead of surveying a sample, the difference between Group A and Group B might have turned out to be smaller. In other words, what this result tells us is merely this: if the proportions for Group A and Group B were in fact the same in the population represented by this sample, a difference this large would turn up in samples of this size no more than 5% of the time. That’s all!

Statistical significance does not tell us anything about how big the difference is. It only tells us how much risk we run of finding a difference in the sample that does not exist in the population. Thus, for this case, statistical significance allows us to conclude that the proportion of Group A favoring the product is higher than that of Group B in the population, while taking only a 5% risk of concluding a difference exists when there is no such difference. If this difference were significant at the 99% level of confidence, it would not have become larger. It would only mean that we are taking a 1% risk of seeing in the sample a difference that does not exist in the population.
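
A short numerical comparison may help separate significance from size. The sketch below applies a pooled two-proportion z-test to two made-up scenarios: a 20-point gap measured on 30 people per group, and a 1-point gap measured on 50,000 people per group. The scenarios and the function name are assumptions chosen for illustration.

```python
import math

def z_and_p(p_a, p_b, n_per_group):
    """Pooled two-proportion z statistic and two-sided p-value (equal group sizes)."""
    pooled = (p_a + p_b) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (p_a - p_b) / se
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical scenarios: (Group A share, Group B share, people per group)
scenarios = {
    "20-point gap, 30 per group": (0.65, 0.45, 30),
    "1-point gap, 50,000 per group": (0.51, 0.50, 50_000),
}

for label, (p_a, p_b, n) in scenarios.items():
    z, p = z_and_p(p_a, p_b, n)
    if p < 0.01:
        verdict = "significant at 99%"
    elif p < 0.05:
        verdict = "significant at 95%"
    else:
        verdict = "not significant"
    print(f"{label}: z = {z:.2f}, p = {p:.4f} -> {verdict}")
```

In this sketch the tiny 1-point gap clears even the 99% bar because the sample is huge, while the large 20-point gap does not reach 95% because the sample is small. The level of significance tracks the risk being taken, not the size of the difference.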


**Practical Significance**
From a marketing perspective, the statistically significant difference of 20 points may be meaningful or meaningless. It all depends on our research objectives and resources. If it costs millions of dollars to reach each additional percentage point of the market, we may decide to funnel resources toward Group A since it has a higher acceptance rate. In this case, the difference may be termed a “big” difference because (a) we are reasonably sure (95% or 99% sure) that the difference observed in our sample also exists in the population and (b) each percentage point of difference is worth millions of dollars to the client. Thus, statistical significance should not be used to decide how big a difference is but merely to ascertain our confidence in generalizing the results from our sample to the population.

In another situation, the same difference may be ignored despite the fact that it is statistically significant. For instance, if the marketing costs are so low that it makes sense to market to both groups, we can ignore the difference and treat both groups as if they had similar acceptance rates (even though our statistical test was significant).

The logic is the following: although we can be 95% sure that the difference observed exists in the population, given the marketing scenario, the difference is not meaningful. Thus, the relevance of a statistically significant difference should be determined based on practical criteria, including the absolute size of the difference, marketing objectives, strategy, and so forth. The mere presence of statistical significance does not necessarily imply that the difference is large or that it is of noteworthy importance.