A Question of Wording

Authors: Professor Philip Gendall and Janet Hoek (Massey University)
Introduction

One of the paradoxes of questionnaire design is that the more you know, the more you realise how little you know. Questionnaire design is one vocation where ignorance is bliss. Many questionnaire designers take for granted that the questions they ask are the same as those their respondents answer and that all respondents answer the same question, yet Belson (1986) has clearly shown the fallacy of these assumptions.

Most developments in survey research over the last twenty years have been concerned with increasingly powerful analytical techniques and their application to larger and larger databases. Consequently factor analysis, discriminant analysis, cluster analysis and various types of multidimensional scaling are now routinely applied to survey data. But these developments have often lost sight of the fact that the quality of data analysis depends on the quality of the data analysed. No amount of multivariate analysis can improve the quality of the original survey responses.

Over 40 years ago Gallup (1947) concluded that survey results were influenced more by question design than by sample design. Since then, despite the efforts of Belson, Converse, Kalton, Schuman, Presser and others, the development of knowledge about question wording, and particularly of an underlying theory of question wording, has been very slow. Generalisable rules about question design have been difficult to establish, and, although some authors have offered general rules for practitioners, such rules are often contradictory and are rarely based on exhaustive empirical research.

However, the observation that some question wording variations have little impact on the stability of survey results suggests that it may eventually be possible to develop a theory of question wording. Labaw (1980), for example, cites the results of a series of six polls concerned with American foreign policy in Panama (see Figure 1). These polls were conducted over a period of two years and each poll involved a differently worded question, but the results were strikingly similar. The obvious implication is that these were essentially the same question.

Figure 1. Stability of poll results for differently-worded questions

The effect of changing the question is illustrated by the results of another poll conducted during the period in question. Although the question is similarly worded to those in Figure 1, there is an additional emotional element which apparently turns it into a different question (see Figure 2).

Figure 2. The effect of changing the question
From these results, Labaw concludes that question wording variations as such have little impact on the stability of survey results. Question wording variations only become significant, according to Labaw, when the variations introduce or tap a different concept, reality or emotional level surrounding an issue. This is reassuring for questionnaire designers, because their task would be impossible if every change in question wording, no matter how small, produced a different question. But it still does not establish how researchers can ensure that the questions their respondents answer are the ones they meant to ask.

The only way to develop a comprehensive and reliable set of rules for question design is to systematically test alternative question forms, sequences and wording variations in a variety of situations, until those results which are generalisable can be differentiated from those which are situation-specific. Inevitably this will be a long, slow process, but the benefits of the knowledge gained would be far more than peace of mind for questionnaire designers. It would guarantee that the information collected in surveys was as reliable as the techniques used to analyse and interpret it imply.

This paper presents the results of four small pieces of research in the area of question wording: a comparison of an open and a closed question on the same topic, a comparison of responses to positive and negative versions of the same attitude statements, a test of different question wording variations, and a test of the effect of one question on another.

Method

The vehicle for this research was the 1989 Palmerston North Household Omnibus, which is conducted annually by students from the Marketing Department of Massey University. The survey area covers households within the Palmerston North city boundary, and the sample is based on clusters of four interviews (two with males, two with females, 15 years of age or older) around randomly-selected starting points. Substitutions are made for households where an interview is refused or where no contact can be made after three attempts.

Two versions of the Omnibus questionnaire were used to allow the four pieces of research, each of which employed a split sample technique, to be conducted. The responses to each version of the questionnaire were weighted so that the age-sex distributions of the two subsamples - those who answered Version 1 and those who answered Version 2 - were the same. The subsample sizes were 347 for Version 1 and 311 for Version 2.

Results and Discussion

Open Versus Closed Questions

Version 1 of the questionnaire included the following open question:
In Version 2, this question was replaced with a closed alternative:
INFLATION
HIGH EXCHANGE RATE
LAW AND ORDER
INTEREST RATES
UNEMPLOYMENT
THE ECONOMY IN GENERAL
RACIAL PROBLEMS

The open version of the question and the response categories for the closed version were taken from the regular Heylen omnibus. An additional response category, "AIDS", was added at the top of the list to test the effect of item order and the presence of an item not generated by open questioning. Responses to the open question were coded into the "closed" categories of the alternative question wherever possible, and the resulting distribution compared with that for the closed question. Table 1 shows the results of these comparisons.

As expected, the pattern of responses to the open and closed question differed, although the general picture which emerged from both was similar (for example, unemployment was clearly the most important issue among all respondents at the time). Some response categories generated by the open question were not included in the prespecified list of responses to the closed question, and AIDS was never given as a response to the open question, despite the fact that 6% of those asked the closed question gave this as their response.

Analysis of the open responses was, of course, subjective and a better list of closed alternatives may have been generated by some qualitative research preceding the design of this question. However, this exercise supports earlier findings (Belson, 1986; Converse & Presser, 1986; Kalton & Schuman, 1982; Schuman & Presser, 1981) that:

1. the patterns of responses to open and closed versions of the same question are likely to differ, and
2. the pattern of responses to closed questions is heavily influenced by the choices presented to respondents.

Table 1. Comparison of open and closed responses
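The comparison described above (open answers coded into the closed categories, then the two percentage distributions set side by side) can be sketched in a few lines of Python. This is only an illustration: the category list is the one given in the paper, but the function name `percent_distribution`, the pooled "OTHER" bucket and the handful of coded answers are assumptions made up for the example and are not the survey data.

```python
from collections import Counter

# Closed-question categories as listed in the paper, with "AIDS" added first.
CATEGORIES = [
    "AIDS", "INFLATION", "HIGH EXCHANGE RATE", "LAW AND ORDER",
    "INTEREST RATES", "UNEMPLOYMENT", "THE ECONOMY IN GENERAL",
    "RACIAL PROBLEMS",
]
ROWS = CATEGORIES + ["OTHER"]  # "OTHER" is an assumed catch-all bucket


def percent_distribution(coded_responses):
    """Return {category: percent}; answers outside CATEGORIES count as OTHER."""
    counts = Counter(r if r in CATEGORIES else "OTHER" for r in coded_responses)
    total = sum(counts.values())
    return {row: 100.0 * counts.get(row, 0) / total for row in ROWS}


# Made-up coded answers, for illustration only (NOT the survey data).
open_coded = ["UNEMPLOYMENT", "UNEMPLOYMENT", "EDUCATION", "INFLATION"]
closed = ["UNEMPLOYMENT", "AIDS", "INFLATION", "UNEMPLOYMENT"]

open_pct = percent_distribution(open_coded)
closed_pct = percent_distribution(closed)

# Side-by-side table: category, open %, closed %.
for row in ROWS:
    print(f"{row:<24}{open_pct[row]:6.1f}{closed_pct[row]:8.1f}")
```

Laying the two distributions out this way makes both effects discussed above visible at a glance: categories that only the open question generates end up in the catch-all row, while a category offered only by the closed list shows up with no open-question counterpart.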
Positive Versus Negative Attitude Statements

Both versions of the questionnaire included the following question:

The rationale for this design was that a "positive" or "negative" set of statements might influence respondents' answers to the statements. In other words, a set of positive statements might produce higher agreement than the level of disagreement for a set of negative statements.

The results of this exercise are shown in Table 2, where the proportions of respondents who had an opinion are reported. The percentage of "don't knows" ranged from 8% to 78%, and was generally of the order of 30% to 40%.

Table 2. Respondents' reactions to "positive" and "negative" statements
Although some of the differences between the equivalent responses to the different versions were quite large (up to 15%), there was no discernible pattern to these differences and no evidence to support the hypothesis that a positive or negative set of statements has a predictable effect on respondents' answers. Despite the fact that only the largest of the observed differences is statistically significant at the 5% level, the results suggest that some words or statements which appear to be "opposites" are not perceived by respondents as such. This issue is discussed in more detail in the following section.

Question Wording Variations

Respondents were presented with a set of attitude statements covering a wide range of social issues and asked to agree or disagree with each statement. About a quarter of the statements were identical in both questionnaires; the rest contained various question wording variations.

Table 3 shows the responses to the nine statements which were identical in Version 1 and Version 2. For all but two of these statements the pattern of responses for each questionnaire is virtually identical. This suggests that the weighting procedure described previously was successful in producing two balanced subsamples, and, furthermore, that any differences in the responses to other statements were the result of differences in question wording rather than the result of differences in the composition of the subsamples presented with each version of the questionnaire.

Table 3. Responses to identical attitude statements
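The paper does not spell out how the age-sex weighting was computed, so the following is a minimal sketch of one common approach (cell weighting, or post-stratification) rather than the authors' actual procedure. The age bands, the cell counts and the choice of the pooled sample as the target distribution are all assumptions; only the subsample totals of 347 and 311 come from the paper.

```python
def poststratification_weights(sample_counts, target_shares):
    """Weight for each cell = target share / observed share in the subsample."""
    n = sum(sample_counts.values())
    return {cell: target_shares[cell] / (count / n)
            for cell, count in sample_counts.items()}


def weighted_shares(sample_counts, weights):
    """Age-sex distribution of a subsample after applying the cell weights."""
    weighted = {cell: sample_counts[cell] * weights[cell] for cell in sample_counts}
    total = sum(weighted.values())
    return {cell: value / total for cell, value in weighted.items()}


# Hypothetical age-sex cell counts (NOT the survey data); they sum to the
# subsample sizes reported in the paper, 347 and 311.
version1 = {("male", "15-39"): 90, ("male", "40+"): 80,
            ("female", "15-39"): 95, ("female", "40+"): 82}
version2 = {("male", "15-39"): 70, ("male", "40+"): 85,
            ("female", "15-39"): 76, ("female", "40+"): 80}

# Assumed target: the age-sex distribution of the two subsamples pooled.
pooled_total = sum(version1.values()) + sum(version2.values())
target = {cell: (version1[cell] + version2[cell]) / pooled_total
          for cell in version1}

w1 = poststratification_weights(version1, target)
w2 = poststratification_weights(version2, target)

for cell in target:
    print(cell, round(w1[cell], 3), round(w2[cell], 3))
```

After weighting, both subsamples reproduce the target distribution exactly while keeping their original totals, which is the sense in which the two versions become "balanced". The sketch assumes every cell is non-empty in both subsamples; an empty cell would need to be merged with a neighbouring one first.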
Introducing a New Concept

The pair of statements shown in Table 4 was designed to test the effect of introducing a new concept into a question.

Table 4. The effect of introducing a new concept into a question
In this case, the introduction of the qualifying phrase, "even if this results in higher unemployment", influenced the pattern of responses in a predictable way (less agreement, more disagreement, more uncertainty). Thus, while it is clear that most respondents supported the concept of "equal pay" for women, it is equally clear that the level of support depends on the trade-off presented.

"Reversible" Statements

The two statements shown in Table 5 are identical in all respects except that the order of two words, "men" and "women", is reversed in the second statement. Although these statements are linguistically reversible, it is clear that this change affected the way respondents interpreted them. When "women" became the object of the question there was a dramatic increase in the proportion of "don't knows", and, among those with an opinion, a higher level of agreement that women are better suited emotionally for politics than men.

Table 5. Comparison of responses to "reversible" statements
In this case, shifting the focus of the question from men to women changed the emotional content of the second statement. Even though both statements contained the same words, the different results generated by changing their order suggest that respondents neither perceived nor treated them as logical opposites. Question designers face the problem that both are equally plausible statements, yet the pattern of responses produced by each is quite different.

"Mirror Image" Statements

Eight of the attitude statements used in the 1989 omnibus were constructed in such a way that a "mirror image" of each statement appeared on each version of the questionnaire. If these statements were true "mirror images", the proportion of respondents agreeing with one version would be the same as the proportion disagreeing with the other version, and vice versa. The actual results are shown in Table 6.

Table 6. Responses to "mirror image" attitude statements
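The "mirror image" expectation stated above (agreement with one version should equal disagreement with the other) can be checked with a standard test for the difference between two independent proportions. The sketch below is not taken from the paper: the function name `two_proportion_z_test` and the counts are hypothetical, and a simple pooled z-test ignores the clustering and weighting of the omnibus sample, so it would tend to overstate significance for data of this kind.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled z-test for H0: the two underlying proportions are equal.

    Returns the z statistic and the two-sided p-value.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_error
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (NOT the survey data): respondents with an opinion who
# AGREED with a statement on one version, and who DISAGREED with its mirror
# image on the other version.
agree_version_a, base_a = 180, 300
disagree_version_b, base_b = 150, 280

z, p_value = two_proportion_z_test(agree_version_a, base_a,
                                   disagree_version_b, base_b)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
print("Differs at the 5% level" if p_value < 0.05 else "No difference at the 5% level")
```

Note that the bases here are respondents who expressed an opinion, not the full subsamples; with "don't know" rates as high as those reported in this study, the effective sample for any one statement can be considerably smaller than 347 or 311.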
Differences between the equivalent responses to the eight "mirror image" statements varied from 4% to 18%, suggesting that most, if not all, of the statements were not in fact interpreted as logical opposites. Furthermore, there was no discernible pattern to the differences between the two sets of responses, nor was there an obvious explanation for them.

Previous research has shown that the substitution of the words "forbid" and "allow", which are logical opposites, has a predictable effect on the pattern of responses (Converse & Presser, 1986; Hippler & Schwarz, 1986; Kalton, Collins & Brook, 1978; Kalton & Schuman, 1982; Schuman & Presser, 1981). More people are willing to "not allow" something than are willing to "forbid" it. This phenomenon was confirmed by our research (83% of respondents disagreed that the law should allow public speeches which promote racism, but only 66% agreed that the law should forbid such speeches).

However, it is difficult to generalise beyond this particular pair of antonyms. The terms "in favour of" and "opposed to" produced a pattern of responses similar to that for the allow/forbid statements, but "allowed to" and "prevented from" did not. Similarly, the responses to the statements concerning abortion and trade unions would have produced positive differences in Table 6 if the proposition that respondents are more willing to "not allow" something than they are to forbid or prevent it were generalisable. The most that can be said on the basis of the results in Table 6 is that changes in question wording can produce differences in respondents' responses, but not necessarily, and that the effect is often unpredictable.

Influence of One Question on Another

The final piece of research involved a small experiment designed to test whether the response to one item influences respondents' answers to a subsequent item. Each version of the questionnaire contained a "mirror image" attitude statement, one suggesting that people with AIDS have themselves to blame, the other suggesting that they are not to blame. For both versions the next statement asked respondents to judge whether people with AIDS received too little sympathy from society. The hypothesis was that respondents presented with the "favourable" initial statement might, as a result, respond more charitably to the next statement. Such behaviour would be supported by the phenomenon of acquiescence, the tendency of respondents to agree with any statement regardless of its content (Kalton & Schuman, 1982; Wright, 1976). If a respondent "acquiesced" with the favourable version of the first statement, logically he or she should disagree with the second statement, and vice versa. The outcome of this test is shown in Table 7.

Despite the different responses to the questions which differed in each version of the questionnaire, the distributions of responses to the question common to versions 1 and 2 were virtually identical. In other words, there was no support for the hypothesis that response to this question was influenced by the response to the preceding question. This result conflicts with evidence from a number of studies (Converse & Presser, 1986; Crespi & Morris, 1984; Kalton & Schuman, 1982; Schuman & Presser, 1981) which clearly showed that the order of questions in a survey does affect the answers to them. However, these studies also showed that this does not always happen.

Table 7. The influence of one question on another
Conclusions

From this research there was no evidence to support the proposition that a "positive" or "negative" set of attitude statements could influence respondents' answers in a predictable way. Nor did our results support the notion that the content of a particular question could influence the answer to another (though this effect is well-established by previous research). However, the results did confirm that different question forms (in this case an open and a closed question) produce different responses, and that question wording effects are potentially important, but often unpredictable.

On the first issue, open versus closed questions, the best advice is to use closed questions where possible. Then at least the context of the question is the same for all respondents. However, it is important to remember that the distribution of responses for a closed question is critically dependent on the answer set presented to respondents. If researchers omit an important answer, the inclusion of "other" as a final category will not compensate for this deficiency. Similarly, if they include an unimportant answer, its importance is likely to be overestimated.

The knowledge that any one question may be beset with a host of problems suggests that researchers should use a number of questions to study a particular issue. In this way they might avoid the worst implications of individual question wording effects. In other words, the use of multiple questions on a single issue gives some protection against the unpredictable effects of question wording.

"Agree-disagree" statements appear to be particularly prone to question wording effects, and in most cases forced-choice questions are better, if only because they are more likely to encourage a considered response. By itself "The Government should see to it that everyone receives adequate medical care" seems plausible, but so does "Everyone should be responsible for their own medical care", yet the two are contradictory. Thus for most purposes the better question is: "Should the Government see to it that everyone receives adequate medical care, or should everyone be responsible for their own medical care?"

What seems clear from this study and other research on question wording is that it is possible to write the same question in a number of different ways with no effect on respondents' understanding of it. On the other hand, simply changing one word may change the whole meaning of a question. The problem for questionnaire designers is knowing when wording variations have changed a question and when they have not. Only by systematically researching the effect of question wording variations will this distinction become apparent.

References

Belson, W.A. Validity in Survey Research. London: Gower Publishing Co, 1986.
Bishop, G.F., Oldendick, R.W., & Tuchfarber, A. What must my interest in politics be if I just told you "I don't know"? Public Opinion Quarterly, 1984, 48, 510-519.
Converse, J.M., & Presser, S. Survey Questions: Handcrafting the Standardised Questionnaire. New Delhi, India: Sage Publications, 1986.
Crespi, C., & Morris, D. Question order effect and the measurement of candidate preference in the 1982 Connecticut elections. Public Opinion Quarterly, 1984, 48, 578-591.
Gallup, G. The quintamensional plan of question design. Public Opinion Quarterly, 1947, 3, 385-393.
Hippler, H.J., & Schwarz, N. Not forbidding isn't allowing. The cognitive basis of the forbid-allow asymmetry. Public Opinion Quarterly, 1986, 50, 87-96.
Kalton, G., Collins, M., & Brook, L. Experiments in wording opinion questions. Applied Statistics, 1978, 27 (2), 149-161.
Kalton, G., & Schuman, H. The effect of the question on survey responses: a review. Journal of the Royal Statistical Society, 1982, 145 (1), 42-57.
Labaw, P.J. Advanced Questionnaire Design. Cambridge, Massachusetts: Art Books, 1980.
Schuman, H., & Presser, S. Questions and Answers in Attitude Surveys. New York: Academic Press, 1981.
Wright, J.D. Does acquiescence bias the index of political efficacy? Public Opinion Quarterly, 1976, 39, 219-226.

Philip Gendall is Professor and Head of Department, and Janet Hoek is a Lecturer, in the Department of Marketing, Massey University.