Imagine you are designing a study for a client who wants “readable” base sizes for certain key demographic groups in a survey, e.g., by race and ethnicity. To accommodate this, you configure the sample so that Caucasians, African-Americans, Hispanics, and Asians each have a base size of 100 completes, for a total sample size of n=400.
So far, so good. But then your client wants you to test each group’s differences against the total sample for significance. Everything would be fine if these groups were equal in size in the population. Of course, they are not, which means you can’t simply roll up the 400 respondents into one group and make straightforward comparisons to the separate groups. To solve this, you decide to weight the data using the population proportion of each group according to the latest available census data.
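To make this concrete, here is a minimal sketch in Python of how such weights might be computed. The census proportions below are illustrative placeholders, not figures from any actual census release.

```python
# Minimal sketch: a post-stratification weight for each group is the
# group's population proportion divided by its sample proportion.
# The population proportions are illustrative placeholders only.
sample_counts = {"Caucasian": 100, "African-American": 100,
                 "Hispanic": 100, "Asian": 100}
population_props = {"Caucasian": 0.60, "African-American": 0.13,
                    "Hispanic": 0.18, "Asian": 0.09}

total_n = sum(sample_counts.values())  # n = 400

weights = {group: population_props[group] / (count / total_n)
           for group, count in sample_counts.items()}

for group, w in weights.items():
    print(f"{group}: weight = {w:.2f}")
```

Notice that with an equal-allocation design like this one, the largest population group ends up with a weight of 2.40 and the smallest with 0.36, which foreshadows the trouble discussed below.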
In essence, weighting data is like
pulling taffy. For some groups, you only
need to pull the taffy a little bit because their proportion in the sample is
close to the population. For other groups, you will need to stretch the taffy further, because they are under-represented in the sample relative to the population.
However, all kinds of trouble can occur at this stage of your otherwise well-designed study. You can apply weights that are far too large or far too small. You can assign the wrong population proportion to one of the subgroups. And you can apply the weights correctly but forget to check that your crosstabs actually say “Weighted Data.” When using weights, be warned that trouble is lurking around the corner unless you are careful and check your work before publishing the results to your client.
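A cheap guard against the mis-assigned-proportion error is to recompute each group’s weighted share and compare it to the target. Continuing the hypothetical figures from the sketch above:

```python
# Sanity check: the weighted share of each group should reproduce the
# target population proportion (within rounding).
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)

for group in sample_counts:
    weighted_share = sample_counts[group] * weights[group] / weighted_total
    target = population_props[group]
    status = "OK" if abs(weighted_share - target) < 0.001 else "MISMATCH"
    print(f"{group}: weighted share {weighted_share:.3f} vs target {target:.3f} [{status}]")
```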
To begin, examine each individual
weight being applied to each respondent’s data.
If the weight being applied is greater than 2.0, you may be trying to
pull that taffy too far, and it may snap.
If the weight is close to 0.0, you are essentially eliminating that
respondent’s data, since anything multiplied by zero is zero. If you can stay within the range of 0.5 to 1.5, you are in good shape, and the taffy will be just right.
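A small helper along these lines can flag out-of-range weights before any crosstabs are run. The thresholds are the ones suggested above; the function name and the cutoff for “near zero” are our own choices.

```python
def check_weight_ranges(weights, near_zero=0.1, soft_min=0.5,
                        soft_max=1.5, hard_max=2.0):
    """Flag weights past 2.0, near zero, or outside the 0.5-1.5 comfort zone."""
    for group, w in weights.items():
        if w > hard_max:
            print(f"WARNING: {group} weight {w:.2f} exceeds {hard_max} -- the taffy may snap")
        elif w < near_zero:
            print(f"WARNING: {group} weight {w:.2f} is near zero -- these respondents are all but erased")
        elif not soft_min <= w <= soft_max:
            print(f"Note: {group} weight {w:.2f} is outside the {soft_min}-{soft_max} comfort zone")

# The weights computed in the first sketch:
check_weight_ranges({"Caucasian": 2.40, "African-American": 0.52,
                     "Hispanic": 0.72, "Asian": 0.36})
```

Run against the hypothetical equal-allocation example, this flags the 2.40 weight as a snap risk and the 0.36 weight as outside the comfort zone, exactly the kind of red flag worth catching before the report goes out.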
Whoever is handling your data processing, whether it is a crack technician who has been running Quantum to produce crosstabs for years and years or you are doing it yourself, double-check the work. Believe us, these errors get made because they are so easy to overlook.
The worst error of all is posting unweighted data to your report. Again, it is easy to do, but extremely costly to overcome. Your client will be hard-pressed to process your invoice, and will probably never call you for another study. Check and double-check your work. Better yet, have someone else check it: most researchers we know can tell a story about staring at something for so long that they could no longer see the errors sitting right under their nose.
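One safeguard against that worst-case error is to print weighted and unweighted figures side by side before anything goes into the report; if the two columns match to the decimal while the weights clearly vary, the weight variable probably never made it into the tabulation. A sketch with simulated respondent-level data (the rating variable is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
ratings = rng.integers(1, 6, size=400).astype(float)      # hypothetical 1-5 ratings
resp_weights = np.repeat([2.40, 0.52, 0.72, 0.36], 100)   # group weights from above

print(f"Unweighted mean: {ratings.mean():.2f}")
print(f"Weighted mean:   {np.average(ratings, weights=resp_weights):.2f}")
# If these agree exactly while the weights differ this much, check that the
# tab software actually picked up the weight variable.
```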
Weighting data is surely the
Achilles’ heel of market research. So,
when you find yourself in a study in which applying weights is necessary,
please be careful, stretch first, and don’t pull a muscle.