The Theories, Methods and New Trends of Questionnaire Design

Zhang Chan (张婵)

Questionnaires are an important tool for collecting survey data and are widely used in many areas of social life. To most people, writing a questionnaire does not look difficult. It seems to require little expertise, and online survey platforms make it easy to build a questionnaire and distribute it. Even among researchers in specialized fields, many wonder what could be so hard about designing one. Yet after spending time and effort fielding the questionnaire and collecting the data, they often find that things are not as simple as they had imagined: oversights in questionnaire design can leave the resulting data unfit for the research. Addressing this problem for researchers across fields requires a grasp of the theory and methods of questionnaire design.

Designing a questionnaire is not difficult, but designing a good one is not easy. What makes a questionnaire good? For the measurement of a single item, the first requirement is good validity. Validity refers to the extent to which what a question actually measures, as a measurement instrument, matches what the researcher intends to measure. Validity is closely tied to question wording: seemingly trivial changes in wording can sometimes have unexpected effects on survey results. Yet even a question with good validity does not guarantee that respondents' answers reflect their true situation. Respondent characteristics, questionnaire design, and the interaction between the two (some designs affect certain types of respondents more than others) often produce measurement error, that is, a gap between a respondent's answer and the true value. For the quality of the questionnaire as a whole, design must also consider the context in which a question appears, along with the potential effects of features such as question content and questionnaire length on response quality. In addition, a large body of research shows that responses may be affected by the survey mode, or data collection mode: the characteristics of face-to-face interviews, telephone surveys, mail surveys, and web surveys can all affect the quality of the resulting data. Respondents from different sociocultural backgrounds may also exhibit different response styles.

Designing a good questionnaire for scientific research depends on theoretical support and empirical testing. On the theoretical side, only by understanding how respondents answer survey questions can researchers design questionnaires in a targeted way and obtain data that are as reliable and accurate as possible. Chapter 1 therefore reviews three theories of the response process from the survey methodology literature: the four-step model of answering survey questions, satisficing theory, and interpretive heuristics. On the empirical side, whether a given design has the expected effect on the data must be tested against real-world evidence. Chapters 2 and 3 systematically introduce the indicators and methods currently used to evaluate questionnaire design. These methods come from different fields, such as psychology and survey methodology; apply at different stages before and after a survey is fielded; serve different research purposes, from single independent studies to meta-analyses; and provide different types of evidence, including qualitative and quantitative data, eye-tracking data, and response behavior. After introducing these basic theories and methods, Chapter 4 reviews some important empirical findings on questionnaire design. Given the vast empirical literature in this area, the book does not attempt an exhaustive summary of decades of research. Instead, it concentrates on three aspects (question wording, response options, and question format) and on the design decisions that arise most often in practice, such as whether to use a 5-point or a 7-point response scale, how screening (filter) questions may affect response behavior, and how to design open-ended questions, and it systematically reviews and summarizes previous research on them.

Focused English-language monographs on questionnaire design are not rare; Chinese-language works in this area are comparatively scarce. Rooted in the specialized practice of questionnaire design, this book aims to introduce Chinese readers to the field's most important current theories, methods, and frontier findings. In writing it, the author drew on several classic English-language works in survey methodology, especially The Psychology of Survey Response by Roger Tourangeau, Lance J. Rips, and Kenneth Rasinski, and The Science of Web Surveys by Roger Tourangeau, Frederick G. Conrad, and Mick P. Couper. The most influential ideas and findings on questionnaire design from these works are woven into the text. The book also covers recent advances in survey technology and the frontier research questions they raise for questionnaire design. Chapters 5 to 8 discuss, respectively, questionnaire design for mobile web surveys, new survey methods that combine active and passive data, ecological momentary assessment (EMA) for intensive longitudinal data, and the impact of the rise of big data on questionnaire surveys. In addition, studies of questionnaire design in the Chinese sociocultural context are included as important reference cases.

Some notes on the contents of this book follow.

First, the book's treatment of questionnaire design methods emphasizes general issues that may arise, to a greater or lesser degree, across survey modes and cultural contexts. Web surveys account for a considerable share of recent empirical research on questionnaire design. One reason is that web surveys cost less and place a smaller financial burden on those who use them. Another is that, with the development of mobile devices and the internet, new modes such as mobile web surveys (Chapter 5) and smartphone-based collection of active and passive data (Chapter 6) have emerged and posed new challenges for questionnaire design. In addition, most current research in survey methodology has been conducted by European and American scholars in Western linguistic and cultural contexts. How far these findings generalize to other languages and cultures must be judged with caution, and many of the conclusions still await further testing in future research.

Second, questionnaire design and the measurement error associated with it are only one stage of a survey and one source of error. According to the theory of total survey error, survey error includes, besides measurement error, coverage error, sampling error, nonresponse error, and processing error, among other sources. Each type of error constitutes its own research area. Evaluating the overall quality of survey data requires considering the survey design as a whole, not just the questionnaire, and the different types of error, not just measurement error. Among these sources, this book focuses on questionnaire design and the measurement error related to it. Note that questionnaire design may also affect other sources of error; for example, it may influence people's willingness to participate in a survey and thereby contribute to nonresponse error.

Third, space is limited, and the book deals mainly with the theories, methods, and empirical results related to questionnaire design. Drawing on the author's own research in survey methodology and practical experience in a variety of survey projects, it tries, within that limited space, to answer the common questionnaire design questions that readers may face in different survey settings. If there are other questionnaire design issues you care about, or if you have questions or doubts while reading, your suggestions and comments are welcome. Thank you!

Contents

Chapter 1 Theories of How Respondents Answer Survey Questions
Section 1 The Four-Step Model of Answering Survey Questions
Section 2 "Satisficing" Theory
Section 3 Interpretive Heuristics
Section 4 The Relationship Among the Three Theories

Chapter 2 Indicators for Evaluating Questionnaire Design
Section 1 Measurement Error
Section 2 Response Quality Indicators
Section 3 The Relationship Between Response Quality Indicators and Response Quality
Section 4 Testing the Reliability and Validity of Scales
Section 5 The Relationship Between Scale Reliability and Response Quality Indicators

Chapter 3 Methods for Evaluating Questionnaire Design
Section 1 Cognitive Interviews
Section 2 Survey Experiments
Section 3 Systematic Reviews
Section 4 Eye Tracking

Chapter 4 Empirical Research and Findings on Questionnaire Design
Section 1 Question Wording
Section 2 Response Option Effects
Section 3 Question Format

Chapter 5 Questionnaire Design for Mobile Web Surveys
Section 1 Mobile Web Surveys as a Trend
Section 2 Basic Design Principles and Specific Design Issues
Section 3 The Effects of Mobile Web Surveys

Chapter 6 New Survey Methods Combining Active and Passive Data
Section 1 Characteristics of Methods Combining Active and Passive Data
Section 2 Challenges Facing Methods Combining Active and Passive Data

Chapter 7 Intensive Longitudinal Surveys: Ecological Momentary Assessment
Section 1 The Basics of Ecological Momentary Assessment
Section 2 Data Quality Characteristics in EMA
Section 3 EMA Case Studies

Chapter 8 The Development of Questionnaire Surveys and the Rise of Big Data
Section 1 The Evolution of Survey Modes
Section 2 Basic Characteristics of Big Data
Section 3 Comparing and Integrating Big Data and Traditional Surveys

References

Chapter 1 Theories of How Respondents Answer Survey Questions

This chapter introduces three important theories of how respondents answer survey questions: the four-step model of answering survey questions (the four-step model for short), satisficing theory, and interpretive heuristics.

Section 1 The Four-Step Model of Answering Survey Questions

Tourangeau (1984) proposed that the process of answering a question can be divided into four steps: (1) comprehending the question; (2) searching for and retrieving relevant information; (3) integrating that information to form a judgment; and (4) selecting an appropriate option (for single-choice or multiple-choice questions) or giving an answer (for fill-in-the-blank or open-ended questions), as the question requires. The four steps can be summarized as comprehension, retrieval, judgment, and response. Figure 1-1 shows the four-step model. The solid arrows represent the order of the cognitive processes a respondent goes through under ideal conditions; the dashed arrows indicate that in some cases a respondent may skip one or more steps, or return from a later step to an earlier one.

Figure 1-1 The four-step model of answering survey questions. Source: Tourangeau (1984).
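To make the arrows of the four-step model concrete, the following is a minimal Python sketch, not part of Tourangeau's model itself. It treats a respondent's path through the four steps as a sequence and labels each transition as moving forward (a solid arrow), skipping a step, or returning to an earlier step (the dashed arrows). The names `Step`, `is_ideal_path`, and `describe_path` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Step(Enum):
    """The four steps of the model, numbered in their ideal order."""
    COMPREHENSION = 1
    RETRIEVAL = 2
    JUDGMENT = 3
    RESPONSE = 4


def is_ideal_path(path):
    """Return True if the path visits all four steps once, in order."""
    return list(path) == list(Step)


def describe_path(path):
    """Label each transition as 'forward', 'skip', or 'return'."""
    labels = []
    for prev, cur in zip(path, path[1:]):
        diff = cur.value - prev.value
        if diff == 1:
            labels.append("forward")   # next step in the ideal order
        elif diff > 1:
            labels.append("skip")      # one or more steps left out
        else:
            labels.append("return")    # back to the same or an earlier step
    return labels


# A respondent who reads a sensitive question and answers "don't know" at once.
shortcut = [Step.COMPREHENSION, Step.RESPONSE]
print(is_ideal_path(shortcut), describe_path(shortcut))
```

A path such as comprehension, retrieval, judgment, retrieval, judgment, response would contain one "return" transition, which corresponds to a respondent who searches memory again after judging the first retrieval to be inaccurate.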

Comprehending the question is usually the first step in answering it. Obscure technical terms, complex sentence structures, and double-barreled questions, which pack more than one question into a single item, can all make a question hard to understand. Even seemingly simple concepts can be misunderstood. In its study of the quality of the 2011 census data, the UK's national statistics office found that for the question on how many rooms a household has, many respondents could not tell which rooms to count, even though the questionnaire gave instructions on which rooms to include and exclude (Teague, 2017). Seemingly minor differences in wording can also have unexpected effects on survey results. "Global warming" and "climate change" appear to mean much the same thing, yet the choice between them affects the results: a US study showed that Republican respondents agreed more with the statement that "climate change is happening" than with the statement that "global warming is happening" (Schuldt et al., 2015). In research on a large-scale survey in China, the author found that respondents understood "believing in Buddha" (信佛) and "believing in Buddhism" (信佛教) differently: a higher proportion chose "believing in Buddha" than "believing in Buddhism", so a difference of a single character can change the results substantially (Zhang et al., 2022).

In the retrieval and judgment steps, the cognitive processes respondents go through and the strategies they adopt depend on the type of question. For factual questions about personal experiences, behaviors, and related events, how accurately people can retrieve information from memory depends on the structure of autobiographical memory. Drawing on different theories of autobiographical memory, Tourangeau et al. (2000) grouped the factors that affect recall accuracy into two categories: characteristics of the event, such as how long ago it occurred and how distinctive and important it was; and characteristics of the question, such as the retrieval cues it provides, the order of recall, and the pace of the survey. A common type of factual question asks how often a behavior (such as going out to see a movie or listening to music at home) occurred during a reference period (such as the past month or the past year). Tourangeau et al. (2000) summarized the strategies respondents use to answer frequency questions and the factors that influence them. The strategies range from carefully recalling each specific episode and adding them up to rough estimation and guessing. The factors that influence the choice of strategy include the total number and regularity of the events, the distinctiveness of each event, and the length of the reference period. For attitude questions, early theories held that attitudes are stable: when a survey asks about an attitude toward an object, people need only locate that attitude and report it, much like finding the right file in a drawer. This view, however, cannot explain why attitudes are so easily affected by survey design factors such as question wording, question order, and context. Tourangeau et al. (2000) distinguish three sources of information that respondents may use to answer attitude questions: impressions, general values and positions, and specific beliefs and feelings about the target object. Overall, in the retrieval and judgment steps, respondents can follow a variety of response strategies for both factual and attitude questions, each with its own implications for response quality.

The last part of the four-step model is giving a response. After comprehending the question, retrieving information from memory, and forming a judgment, selecting an option or giving an answer may seem easy. However, a large body of research in survey methodology shows that this last step is also an important source of measurement error. To better analyze how measurement error arises at this step, Tourangeau et al. (2000) further decomposed it into two processes: first, mapping, in which the respondent selects an appropriate option or gives an answer based on the results of comprehension, retrieval, and judgment; and second, editing, in which the respondent adjusts that selection or answer. At the mapping stage, respondents may not be sure which option to choose, because there is usually no clear boundary between neighboring options such as "often" and "frequently" or "strongly agree" and "somewhat agree". Many studies have found that differences in scale design, such as the number of options, their direction, and their verbal labels, can affect respondents' answers (see Chapter 4, Section 2). At the editing stage, many studies have found that for some questions respondents may not report their true situation because of social desirability effects, concerns about the risk of information disclosure, or other worries (Tourangeau et al., 2007a).

The description of the four-step model speaks of "steps" or "stages". Tourangeau et al. (2000) emphasize, however, that respondents do not always think through a question in the order of comprehension, retrieval, judgment, and response. The cognitive processes respondents actually go through are varied and may combine one or more of the four steps in many different ways. The steps may occur simultaneously and overlap: while reading the question, a respondent may already have begun retrieving relevant information from memory. Respondents may also move back and forth between steps. When answering a factual question, for example, they may judge the accuracy of what they retrieved after an initial memory search and continue searching if it seems insufficiently accurate. Respondents may also skip one or more steps. For a question they consider sensitive, they may choose "don't know" or "refuse to answer" as soon as they have understood the question.

The four-step model provides an important theoretical framework for research on measurement error in survey methodology. Within this framework, research on measurement error can focus on one or more steps of the response process, which helps researchers analyze the sources of measurement error in greater detail and depth and points to clearer strategies for improving response quality. In The Psychology of Survey Response, Tourangeau et al. (2000) drew on the cognitive psychology and survey methodology theories of the time, together with related empirical research, to set out in detail the specific cognitive processes respondents may go through and the response strategies they may adopt at each of the four steps: for example, how people answer attitude questions, which response strategies they may use, and what factors influence them. Research on measurement error over the past two decades has mainly applied the four-step model to assess empirically how different survey designs affect responses. By comparison, the specific cognitive processes respondents go through when answering questions have received little further study. The analysis and discussion of how respondents answer questions in The Psychology of Survey Response therefore remain an important guide for current research.

Section 2 "Satisficing" Theory

Among theories of how respondents answer, satisficing theory has received a great deal of attention. According to the Merriam-Webster dictionary, "satisfice" is a blend of "satisfy" and "suffice" and expresses the idea of doing just enough to be satisfactory. For convenience, this book renders "satisfice" in Chinese as "满意" (satisfaction), placed in quotation marks to remind readers that "satisfice" and "satisfy" differ in meaning.

In the 1950s and 1960s, Herbert A. Simon published a number of works on bounded (limited) rationality in the field of decision theory. Bounded rationality takes into account the limits of an individual's capacity to process information, which distinguishes it from the traditional assumption of classical economic man. Under bounded rationality, a rational decision made within the constraints of the external environment and one's own abilities does not consist in fully evaluating every possibility and picking the one with the best outcome. Instead, it consists in making enough effort to choose among a limited set of possibilities, with the aim of reaching a satisfactory solution (Simon, 1955, 1956).
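The contrast between the two decision rules can be sketched in a few lines of Python. This is an illustrative simplification rather than Simon's formal model: the optimizer evaluates every option and keeps the best, while the satisficer stops at the first option that reaches an aspiration level. The function names, the `aspiration` parameter, and the numeric utilities are all assumptions made for the example.

```python
def optimize(options, utility):
    """Evaluate every option; return the best one and how many were evaluated."""
    best = max(options, key=utility)
    return best, len(options)


def satisfice(options, utility, aspiration):
    """Return the first option that is good enough, and how many were evaluated."""
    for evaluated, option in enumerate(options, start=1):
        if utility(option) >= aspiration:
            return option, evaluated
    # No option reaches the aspiration level after a full search.
    return None, len(options)


# Hypothetical options whose utility is simply their own value.
options = [3, 7, 5, 9, 8]
print(optimize(options, lambda x: x))                   # searches all five options
print(satisfice(options, lambda x: x, aspiration=6))    # stops at the second option
```

In this sketch the satisficer accepts a lower-valued option than the optimizer but examines fewer options, which is the trade-off between effort and outcome that the concept describes.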

Simon's theory of bounded rationality has had a major influence on economics, management, cognitive psychology, and many other fields, where the core concept of satisficing underlies much of the discussion. In survey methodology, attention to and use of the concept began mainly with Krosnick (1991). Krosnick (1991) used "satisficing" to describe respondents who, when answering survey questions, do not make every effort to give an optimal answer but instead expend just enough effort to arrive at an answer they find satisfactory. In survey methodology, this way of answering is called a satisficing strategy, and the account of respondents' response strategies built on the concept is called satisficing theory.

Building on the four-step model of answering survey questions proposed by Tourangeau (1984), Krosnick (1991) further defined optimizing and satisficing response strategies. According to the four-step model, the cognitive processing involved in answering a survey question can be divided into four parts: interpreting the meaning of the question, searching memory for relevant information, integrating that information into an overall judgment, and answering the question on the basis of that judgment (see Section 1 of this chapter). An optimizing strategy is one in which the respondent carries out all four steps carefully and thoroughly. If the respondent carries out the steps carelessly, or skips one or more of them altogether, the response follows a satisficing strategy. Take the question "In the past month, how many times in total did you order takeout?" A respondent who orders takeout often would have to retrieve a large amount of information from memory (or from phone records) to answer accurately. The respondent may be unwilling to go to this trouble and may instead give an estimate based on a general impression. This way of answering is not optimizing; it is satisficing.
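The takeout example can be expressed as a small Python sketch that contrasts the two ways of arriving at an answer. It is a hypothetical illustration under simple assumptions: the optimizing respondent recalls every ordering episode in the reference period and counts them, while the satisficing respondent multiplies a rough weekly impression by the length of the period. The function names and the example figures are invented for illustration and are not data from any study.

```python
def count_by_enumeration(order_days):
    """Optimizing: recall each episode in the reference period and add them up."""
    return len(order_days)


def estimate_by_rate(orders_per_week, reference_days=30):
    """Satisficing: scale a rough weekly impression to the reference period."""
    return round(orders_per_week * reference_days / 7)


# Hypothetical respondent: 11 actual orders, with an impression of "about twice a week".
order_days = [1, 3, 4, 8, 11, 12, 17, 19, 23, 26, 29]
exact = count_by_enumeration(order_days)
rough = estimate_by_rate(orders_per_week=2)
print(exact, rough, abs(exact - rough))
```

The difference between the two results stands in for the measurement error that a satisficing answer can introduce; how large it is in practice depends on how regular the behavior is and how good the respondent's impression is.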

满意策略的使用可能使得受访者的回答不准确,这是问卷调查的执行者和研究者所不希望看到的。调查研究在谈及满意策略时,从而通常表现出一种负面的态度。但是对受访者而言,最优的回答问题方式在很多情况下实际上很难做到。在资源、能力、周围环境等因素的限制下,满意策略可能是一种不准确但理智或可行的策略。比如,如果一个人经常但是不规律的点外卖,要准确地回答出过去一个月总共点了多少次外卖,虽然在理论上可行,但是实际上需要费很多精力才能计算清楚。又如,常见的对总体生活满意度的测量,需要受访者根据其对生活总体的满意程度给出一个分数通常而言,使用“1”表示非常不满意,使用“5”表示非常满意,要求受访者在15之间打一个分数。这种情况下,最优的回答策略至少需要受访者非常认真思考生活的各个方面。这种最优的回答策略需要耗费受访者相当的时间和精力,实际操作往往难以界定多认真才算认真通过对上述的案例进行总结可以发现,最优策略虽然理论上存在,但是现实中受访者采取“满意”策略可能是调查实务中的常态。
Satisficing may make respondents' answers inaccurate, which is not what survey administrators and researchers want to see. Survey research therefore usually takes a negative view of satisficing. For respondents, however, answering optimally is in many cases very hard to do. Given constraints on resources, ability, and the surrounding environment, satisficing may be an inaccurate but reasonable or feasible strategy. For example, if a person orders takeout frequently but irregularly, answering accurately how many times he or she ordered takeout in the past month is feasible in theory but takes a great deal of effort in practice. Another example is the common measure of overall life satisfaction, which asks respondents to give a score reflecting how satisfied they are with their life in general. Typically "1" indicates very dissatisfied and "5" indicates very satisfied, and the respondent is asked to give a score between 1 and 5. Here the optimal response strategy would require the respondent, at a minimum, to think very carefully about every aspect of his or her life. That demands considerable time and effort, and in practice it is often difficult to define how careful counts as "careful". Taken together, these cases show that although an optimal strategy exists in theory, satisficing is probably the norm among respondents in survey practice.

理论上而言,所有非最优的回答策略,都是满意策略。受访者采用满意策略并不一定意味其回答问题四个部分的每一步都产生非最优的结果。如在上述点外卖次数的例子中,由于调查问题简单明确,受访者在理解题目的这一步可能是充分的受访者能够清楚地明白这个问题的意思,满意测量可能主要出现在搜索信息这一步。满意策略对回复质量的影响取决于受访者使用满意策略的程度。Krosnick1991认为,“满意策略有强弱之分。如果受访者在回答问题时,虽然完成了理解提取判断和回答这四个步骤,但是没有足够认真地执行每一个步骤,可以被认为是弱“满意”的策略。如果受访者在读题后,略过提取和判断这两步,直接回答问题;或者在更加极端的情况下,受访者在没有读题的情况下直接给出回答,如在单选题里随机选择一个选项,可以被认为是强“满意”的策略。
In theory, every non-optimal response strategy is a satisficing strategy. A respondent who satisfices does not necessarily produce a non-optimal result at each of the four steps of answering a question. In the takeout example above, the question is simple and clear, so the respondent's comprehension of the question may be fully adequate, and satisficing may occur mainly at the step of searching for information. How much satisficing affects response quality depends on the degree to which the respondent satisfices. Krosnick (1991) distinguishes between weak and strong satisficing. If a respondent completes all four steps of comprehension, retrieval, judgment, and response, but does not carry out each step carefully enough, this can be considered weak satisficing. If the respondent reads the question, skips the retrieval and judgment steps, and answers directly, or in a more extreme case answers without reading the question at all, for example by picking an option at random in a single-choice question, this can be considered strong satisficing.
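The weak/strong distinction can be written down as a small decision rule. The sketch below is only an illustration: the three effort codes ("full", "partial", "skipped") and the rule that any skipped step counts as strong satisficing are simplifying assumptions made here, not part of Krosnick's (1991) formulation, and in real data the effort at each step is not directly observable.

```python
from enum import Enum

# The four steps of the response process, in order.
STEPS = ("comprehension", "retrieval", "judgment", "response")


class Strategy(Enum):
    OPTIMIZING = "optimizing"
    WEAK_SATISFICING = "weak satisficing"
    STRONG_SATISFICING = "strong satisficing"


def classify(effort):
    """Classify a response from a hypothetical per-step effort coding.

    effort maps each step name to "full", "partial", or "skipped".
    Assumed rule: any skipped step -> strong satisficing; otherwise
    any partially executed step -> weak satisficing; else optimizing.
    """
    levels = [effort[step] for step in STEPS]
    if "skipped" in levels:
        return Strategy.STRONG_SATISFICING
    if "partial" in levels:
        return Strategy.WEAK_SATISFICING
    return Strategy.OPTIMIZING


# Takeout example: the question is understood, but memory is searched only roughly.
takeout = {
    "comprehension": "full",
    "retrieval": "partial",
    "judgment": "full",
    "response": "full",
}
print(classify(takeout).value)
```

The takeout respondent who understands the question but estimates from a general impression falls into the weak category under this coding.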

Krosnick1991)提出三个影响受访者使用满意策略的因素,分别是任务难度受访者能力和受访者动机。任务难度的影响,与问题特征存在关联。根据回答调查问题的四步骤模型,回答问题的难度由理解、提取、判断、回答这四步骤的难度组成。调查研究对问题难度的评估主要通过回复时长加以反映。一般而言,回复时长越长,意味着问题的难度越大。Yan2007)发现网络调查中影响回复时长的问题特征包括从句的数量从句内平均单词数选项的数量和类型,以及问题在问卷中的位置等因素。具体到问卷类型,包括是否是量表量表中选项的文字标签如何设置等内容。
Krosnick (1991) proposed three factors that influence respondents' use of satisficing: task difficulty, respondent ability, and respondent motivation. The effect of task difficulty is tied to the characteristics of the question. Under the four-step model, the difficulty of answering a question is made up of the difficulty of each of the four steps: comprehension, retrieval, judgment, and response. Survey research assesses question difficulty mainly through response time. In general, a longer response time implies a more difficult question. Yan et al. (2007) found that the question characteristics affecting response time in web surveys include the number of clauses, the average number of words per clause, the number and type of response options, and the position of the question in the questionnaire. With respect to question type, these characteristics include whether the question is a scale and how the verbal labels of the scale points are set.

Krosnick1991)提出影响满意策略的受访者能力包括三个方面提取整合概括信息的能力对特定问题的思考程度,以及受访者是否在调查前就对相关问题具有明确的态度。目前对受访者满意策略的研究主要使用教育程度作为受访者能力的近似替代。国内外的一些研究发现教育程度越低,满意策略的使用程度越高(Krosnick et al.,2002Anduiza et al.,2017)。笔者在国内网络调查的研究也发现,教育程度较低的受访者,采满意策略的程度更高(Zhang et al.,2023)。Krosnick1991)认为影响受访者动机的因素很多,可能包括调查的重要性对调查的兴趣,受访者在多大程度上认为需要对自己的行为进行解释或者负责,以及受访者内在的认知需求等。
Krosnick (1991) suggests that the respondent abilities that influence satisficing cover three aspects: the ability to retrieve, integrate, and summarize information; how much the respondent has thought about the particular issue; and whether the respondent already held a clear attitude on the issue before the survey. Current research on satisficing mainly uses education level as an approximate proxy for respondent ability. Studies in China and abroad have found that the lower the level of education, the more respondents satisfice (Krosnick et al., 2002; Anduiza et al., 2017). The author's own research on web surveys in China also found that respondents with lower levels of education satisficed to a greater degree (Zhang et al., 2023). Krosnick (1991) argues that many factors affect respondent motivation, possibly including the importance of the survey, interest in the survey, the extent to which respondents feel they need to explain or be accountable for their behavior, and respondents' intrinsic need for cognition.

总体上来说,受访者在一道问卷问题上思考越充分,这个回答越接近最优的回答。基于Krosnick1991关于受访者动机和受访者能力与满意策略关系的阐述,Zhang2013作了更进一步的实证探索。如图1-2中的研究结果表明,受访者在回答一道问卷问题时存在一个可达到的最充分思考程度,和最优的回复之间的差距与受访者能力有关。与此同时,受访者在某道题目存在一个实际的思考程度,与最充分的思考程度之间的差距取决于受访者努力认真作答的动机
Overall, the more fully a respondent thinks about a survey question, the closer the response is to the optimal one. Building on Krosnick's (1991) account of how respondent motivation and respondent ability relate to satisficing, Zhang (2013) carried out a further empirical exploration. As the findings in Figure 1-2 show, when answering a survey question a respondent has a maximum attainable level of thinking, and the gap between that level and the optimal response is related to respondent ability. At the same time, the respondent has an actual level of thinking on a given question, and the gap between that actual level and the maximum attainable level depends on the respondent's motivation to answer carefully.

资料来源:Zhang2013)。
Source: Zhang (2013).

1-2 受访者思考问题充分程度与受访者动机和能力之间的关系
Figure 1-2 Relationship between how fully respondents think about a question and their motivation and ability

“满意”是受访者的一种主观决策,在一般情况下,“满意”策略本身不能直接被研究者观察到。Krosnick1991)提出了一系列关于“满意策略的表现形式,其中包括:第一,在一系列选项里选择第一个看起来合适的;第二,回答不知道”;第三,当一系列问题选项相同时倾向于给出一样的回答。如当都是“1=非常不满意,5=非常满意”时选择“非常满意。这些满意策略的表现形式,在调查方法的研究中也被称为满意行为或“满意指标。本书中,将采用满意指标说法。需要注意的是,满意指标的出现,并不意味受访者一定采用了满意策略,如受访者回答“不知道”时,可能真的是不知道。这就像流感的症状包括发烧,但是发烧不一定意味着一定得了流感。关于“满意”指标更加细致的讨论,详见本书第二章第三节的内容
Satisficing is a subjective decision by the respondent, and in general the satisficing strategy itself cannot be directly observed by the researcher. Krosnick (1991) proposed a series of ways in which satisficing manifests itself, including: first, choosing the first option in a list that seems acceptable; second, answering "don't know"; and third, tending to give the same answer to a series of questions that share the same response options, for example choosing "very satisfied" for every item when all items use "1 = very dissatisfied, 5 = very satisfied". In survey methodology research these manifestations are also called satisficing behaviors or satisficing indicators. This book uses the term "satisficing indicators". Note that the presence of a satisficing indicator does not necessarily mean that the respondent satisficed: a respondent who answers "don't know" may genuinely not know. Similarly, the symptoms of influenza include fever, but a fever does not necessarily mean that one has influenza. For a more detailed discussion of satisficing indicators, see Chapter 2, Section 3 of this book.
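To make the three indicators concrete, the following sketch computes them for one respondent's answers to a block of items that share the same response scale. The function name, the "don't know" code of 9, and the use of the first option's selection rate as a rough proxy for "choosing the first acceptable option" are all assumptions made here for illustration. As the text stresses, a flagged indicator is a symptom and not proof of satisficing.

```python
def satisficing_indicators(answers, dont_know_code=9, first_option=1):
    """Summarize three satisficing indicators for one respondent.

    answers: codes given to a series of items sharing one response scale.
    dont_know_code and first_option are hypothetical coding conventions.
    """
    if not answers:
        raise ValueError("answers must contain at least one item")
    n = len(answers)
    return {
        # Share of items answered with the first listed option.
        "first_option_rate": sum(a == first_option for a in answers) / n,
        # Share of items answered "don't know".
        "dont_know_rate": sum(a == dont_know_code for a in answers) / n,
        # Identical answer to every item in the block (straightlining).
        "straightlining": len(set(answers)) == 1,
    }


# A respondent who picks "5 = very satisfied" for every item in the block.
print(satisficing_indicators([5, 5, 5, 5]))
# A respondent with varied answers, including one "don't know".
print(satisficing_indicators([1, 9, 3, 1]))
```

Averaging such indicators over respondents in each experimental condition or survey mode gives the kind of comparison that indicator-based studies of response quality rely on.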

Krosnick1991)提出的一系列满意指标,对关于调查回复质量的研究产生了非常重要的影响。理想情况下,对于回复质量的评估需要真实值。继续前面点外卖次数的例子,如果要准确判断这道题的回复质量,需要知道受访者点外卖的真实情况。但是对于真实世界中的研究来说,类似的信息往往不可得。实际上,如果研究者已经知道相关信息,就没有必要在问卷中去提问了。在相关研究中,准确判断调查问题的回复质量通常是非常困难的。满意指标和满意策略并不是完美对应的关系,“满意指标有时并不意味着回复质量低。但上述的研究背景下,,Krosnick1991)提出了这些满意指标之后,调查方法领域出现了大量研究,利用满意指标对调查的回复质量进行评估。虽然满意指标的数值并不一定能直接反映出满意策略的程度,但研究者可以通过比较不同调查方法对满意指标的影响,来推断对数据质量的影响。如在对调查模式的研究,经常使用满意指标来判断面对面、电话访问和网络调查中哪种调查模式的回复质量更好Fricker et al.,2005Holbrook et al.,2003)。
The satisficing indicators proposed by Krosnick (1991) have had a very important influence on research on survey response quality. Ideally, assessing response quality requires the true value. Continuing the takeout example, judging the quality of responses to that question accurately would require knowing how often respondents actually ordered takeout. In real-world research, however, such information is usually unavailable. Indeed, if the researcher already knew it, there would be no need to ask the question. Accurately judging the quality of responses to survey questions is therefore usually very difficult. Satisficing indicators do not correspond perfectly to satisficing, and the presence of an indicator does not always mean that response quality is low. Against this background, however, after Krosnick (1991) proposed these indicators, a large body of research in survey methodology used them to assess survey response quality. Although the value of a satisficing indicator does not necessarily reflect the degree of satisficing directly, researchers can compare how different survey methods affect the indicators and from that infer their effect on data quality. For example, studies of survey modes often use satisficing indicators to judge which mode, face-to-face, telephone, or web, yields better response quality (Fricker et al., 2005; Holbrook et al., 2003).

第三节 解释性启发式
Section III. Interpretive heuristics

启发式是一个心理学概念,用于描述在不确定的情况下进行的判断或选择。在具有不确定性的背景下,人们有时会采用一些简化的启发式法则,而不是在对各种可能的结果及其发生概率进行全面、细致、深入的考量之后得到最优的判断或者决定(Tversky et al.,1974)。启发式不仅是心理学领域一个重要概念,对经济学、政治学、法学以及人工智能等多个领域都有非常重要的影响(Gilovich et al.,2002)。
Heuristics are a psychological concept used to describe judgments or choices made under uncertainty. Under uncertainty, people sometimes adopt simplified heuristic rules instead of reaching an optimal judgment or decision through comprehensive, detailed, and in-depth consideration of the possible outcomes and their probabilities (Tversky et al., 1974). Heuristics are not only an important concept in psychology but have also had a very important influence on many other fields, such as economics, political science, law, and artificial intelligence (Gilovich et al., 2002).

图兰吉和他的同事使用“启发式”的概念解释受访者有时候会借助一些看起来不重要的问卷设计元素来帮助自己解读和回答问题(Tourangeau et al.,2004,2007a,2013)。他们提出了五种解释性启发式的思路,并且通过一系列调查方法学实验证实了这些解释性启发式在受访者回答问题过程中确实发挥了作用。接下来,是对五解释性启发具体介绍
Tourangeau and his colleagues use the concept of heuristics to explain why respondents sometimes rely on seemingly unimportant elements of questionnaire design to help them interpret and answer questions (Tourangeau et al., 2004, 2007a, 2013). They proposed five interpretive heuristics and confirmed through a series of survey methodology experiments that these heuristics do play a role when respondents answer questions. The five interpretive heuristics are described in turn below.

第一,中间意味着典型(middle means typical在解读一个量表的选项时,受访者可能会认为量表的中间位置代表概念上的中点或者将其视为最典型最常见的回答。Tourangeau2004)通过使用实验的方法,比较了两种量表选项的设计。如图1-3所示其中一种是传统的等间距设计,即量表中选项之间的距离是相同的另一种量表的间距不等,越靠近左边的选项间距越窄。在间距不等的这种设计中,原来的中间选项(even chance)在视觉上不再是量表的中点相对于等间距设计,possibleunlikely两个选项在间距不等的设计中更加接近视觉的中点。实验研究的结果发现,相较于等间距设计,在间距不等的设计中受访者选择probablepossibleunlikely三个选项的比例之和显著更高。这意味着量表的中间位置对受访者具有特殊的意义当选项越接近视觉的中间位置时,受访者可能会认为这样的选项回答更加常见或者普遍,也就更加容易选择这个选项。
First, middle means typical. When interpreting the options on a scale, respondents may take the middle position of the scale to represent the conceptual midpoint, or treat it as the most typical and most common answer. Tourangeau et al. (2004) compared two designs of scale options experimentally. As shown in Figure 1-3, one is the traditional equally spaced design, in which the distance between adjacent options is the same; in the other, the spacing is unequal, with options closer to the left placed closer together. In the unequally spaced design, the original middle option ("even chance") is no longer the visual midpoint of the scale, and the "possible" and "unlikely" options are closer to the visual midpoint than they are in the equally spaced design. The experiment found that the combined proportion of respondents choosing "probable", "possible", and "unlikely" was significantly higher in the unequally spaced design than in the equally spaced design. This implies that the middle position of a scale has special meaning for respondents: the closer an option is to the visual middle, the more common or widespread respondents may take that answer to be, and the more readily they choose it.
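The geometric point behind this experiment can be illustrated with a short sketch that finds which option sits nearest the visual midpoint of a horizontal scale. The pixel positions below are invented for illustration and are not the layout used by Tourangeau et al. (2004); the function name is likewise hypothetical.

```python
def nearest_to_visual_midpoint(positions):
    """Return the index of the option closest to the scale's visual midpoint.

    positions: horizontal coordinates of the options, left to right.
    The visual midpoint is taken as halfway between the two end options.
    """
    midpoint = (min(positions) + max(positions)) / 2
    return min(range(len(positions)), key=lambda i: abs(positions[i] - midpoint))


# Hypothetical five-point scale; index 2 is the conceptual midpoint.
equal_spacing = [0, 100, 200, 300, 400]
# Same options, but spacing narrows toward the left.
unequal_spacing = [0, 50, 125, 225, 400]

print(nearest_to_visual_midpoint(equal_spacing))
print(nearest_to_visual_midpoint(unequal_spacing))
```

With equal spacing the conceptual and visual midpoints coincide; with the compressed left side, a different option becomes the one nearest the visual center, which is the shift the heuristic predicts respondents will react to.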

资料来源:Tourangeau2004
Source: Tourangeau et al. (2004).

1-3 关于中间意味着典型实验的示意图
Figure 1-3 Schematic of the "middle means typical" experiment

左和上意味着(选项的)开始(left and top mean first受访者在填写问卷时会期待选项从左边(对于横着排列的选项)或者上边(对于竖着排列的选项)开始;认为不同选项之间从左到右或者从上到下具有递进关系,如选项依次排列为非常同意”“比较同意”“一般”“比较不同意”“非常不同意。根据上述启发式的原则,如果选项顺序的排列不符合人们的这种预期,意味着受访者可能需要更多的时间用于理解和回答这个问题。如图1-4所示,可以在选项中改变“一般”的位置,将选项顺序调整为“一般”“非常同意”“非常不同意”“比较同意”“比较不同意”的顺序。Tourangeau2004结合实验结果,验证了上述的这个假设。他们发现当选项顺序不符合预期时,不仅回复时长有所,回答的分布也发生了改变。
Second, left and top mean first. When completing a questionnaire, respondents expect the options to start from the left (for options arranged horizontally) or from the top (for options arranged vertically), and they assume that the options progress in order from left to right or from top to bottom, for example "strongly agree", "somewhat agree", "neutral", "somewhat disagree", "strongly disagree". According to this heuristic, if the order of the options does not match that expectation, respondents may need more time to understand and answer the question. As shown in Figure 1-4, the position of "neutral" can be changed so that the options appear in the order "neutral", "strongly agree", "strongly disagree", "somewhat agree", "somewhat disagree". Tourangeau et al. (2004) tested this hypothesis experimentally. They found that when the order of the options did not match expectations, not only did response times increase, but the distribution of responses also changed.

资料来源:Tourangeau2004
Source: Tourangeau et al. (2004).

1-4左和上意味着选项开始实验的示意图
Figure 1-4 Schematic of the "left and top mean first" experiment

位置相近意味着相关(near means related当两个题目在问卷中的位置靠近时,受访者可能会推断两个题目在概念上也具有联系。关于这一解释性启发式特征的证据主要来自网络调查模式下的矩阵题不同的问题出现在同一页面中的同一个矩阵下面。受访者在回答这些问题时,可能在具体选项上表现出相关特征。研究发现相比于不同题项在不同的页面,当不同题项出现在一个矩阵题里时,题项之间表现出来的相关性更高(Tourangeau et al.,2004Couper et al.,2001)。
Third, near means related. When two items are positioned close to each other in a questionnaire, respondents may infer that the two items are also conceptually related. Evidence for this interpretive heuristic comes mainly from grid (matrix) questions in web surveys, in which different items appear in the same grid on the same page. When answering these items, respondents' choices may show related patterns. Studies have found that the correlations among items are higher when the items appear together in one grid than when they are placed on separate pages (Tourangeau et al., 2004; Couper et al., 2001).
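One simple way to quantify "the correlations among items" is the mean pairwise Pearson correlation across the items of a block, computed separately for each layout condition. The sketch below is a minimal illustration of that statistic; it is an assumption made here, not necessarily the measure used in the cited studies, and the response data are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of item scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_inter_item_correlation(items):
    """Average Pearson correlation over all pairs of items.

    items: one list of scores per item, with respondents in the same order.
    """
    pairs = list(combinations(items, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Invented responses from five respondents to three items in each condition.
grid_condition = [[1, 2, 3, 4, 5], [1, 2, 3, 5, 5], [2, 2, 3, 4, 5]]
separate_pages = [[1, 2, 3, 4, 5], [3, 1, 4, 2, 5], [2, 5, 3, 3, 4]]

print(round(mean_inter_item_correlation(grid_condition), 3))
print(round(mean_inter_item_correlation(separate_pages), 3))
```

A higher value in the grid condition than in the separate-pages condition would be consistent with the "near means related" heuristic, although straightlining in grids can also inflate the statistic.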

1-1多项目与单项屏幕对项目总体相关性和平均完成时间的影响
Table 1-1 Effects of multiple-item versus single-item screens on item-total correlations and mean completion time

上面意味着好(up means good在回答问题时,受访者可能会给位于上面的题项更加正面的评价。为了验证这个假设,Tourangeau等(2013)通过六个实验,在不同抽样方法的网络样本中以不同方式控制了题项在屏幕中所处的位置。在不同的实验内容中,包含了不同食物成分、社会群体等类型的评价对象。他们通过对这六个实验的元分析(meta-analysis,详见本书第三章第三节),发现与在屏幕下方相比,当同样的评价对象位于屏幕的上方时,人们倾向于给出更加正面的评价,尽管这个差异并不是很大。
Fourth, up means good. When answering questions, respondents may give more positive evaluations to items located higher up. To test this hypothesis, Tourangeau et al. (2013) ran six experiments in web samples drawn with different sampling methods, manipulating the position of items on the screen in various ways. The objects being evaluated differed across experiments and included food ingredients, social groups, and other types of objects. Through a meta-analysis of the six experiments (see Chapter 3, Section 3 of this book for details), they found that people tended to give more positive evaluations when the same object appeared at the top of the screen than when it appeared at the bottom, although the difference was not large.
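For readers unfamiliar with how results from several experiments are combined, the sketch below shows one common approach, a fixed-effect meta-analysis with inverse-variance weights. Whether Tourangeau et al. (2013) used this particular model is not stated here, and the six effect sizes and variances are invented solely to make the example runnable.

```python
from math import sqrt


def fixed_effect_pool(effects, variances):
    """Pool effect sizes with inverse-variance weights (fixed-effect model).

    Returns the pooled effect and its standard error.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    standard_error = sqrt(1.0 / sum(weights))
    return pooled, standard_error


# Hypothetical top-minus-bottom rating differences from six experiments.
effects = [0.05, 0.12, 0.02, 0.08, 0.10, 0.04]
variances = [0.002, 0.004, 0.001, 0.003, 0.005, 0.002]

pooled, se = fixed_effect_pool(effects, variances)
print(f"pooled effect = {pooled:.3f}, 95% CI half-width = {1.96 * se:.3f}")
```

Pooling in this way lets a small but consistent position effect be detected even when individual experiments are too noisy to show it clearly.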

资料来源:Tourangeau2013
Source: Tourangeau et al. (2013).

1-5 上面意味着好实验的示意图
Figure 1-5 Schematic of the "up means good" experiment

第五,视觉相似意味着意思相近(like means close。在一些情况下,受访者可能通过视觉上的相似程度推断概念上的联系。Tourangeau2007a进行的实验中控制了从非常反对非常赞成7点量表的颜色。如图1-6所示,在一种情况下,“非常反对这一端为深红,非常赞成这一端为深蓝;在另一种情况下,“非常反对这一端为深蓝,“非常赞成这一端为浅蓝。他们的研究发现深蓝-浅蓝组相比,深红-深蓝组中的受访者更倾向于选择赞成这一侧选项。上述研究同时发现,当7点量表中每个选项都有“有点不赞成”“比较不赞成”等对应的文字标签时,量表颜色对回答的影响减弱。这可能是因为,相比于“深蓝-浅蓝组,当反对这一侧的选项标记为红色时,受访者认为这些选项更极端减少对的选择。人们在回答量表问题时可能会借助量表的各种设计特征解读选项含义,当量表的信息不够明确时尤为如此如当量表中不是每个选项都有文字标签时,信息的缺失给受访者带来了更多的想象空间。在目前针对智能手机填写的网络调查,相关的设计内容非常常见。对诸如1-5”“1-7”等的量表选项,很多时候题目只给出两端的选项文字标签,如1代表非常不满意、“5代表非常满意。在这种情形下,需要警惕问卷设计中其他看似不重要的设计特征,可能会对受访者的回答产生复杂影响。
Fifth, like means close: visual similarity implies similarity of meaning. In some cases, respondents may infer conceptual connections from visual similarity. In an experiment by Tourangeau et al. (2007a), the colors of a 7-point scale running from "strongly oppose" to "strongly favor" were manipulated. As shown in Figure 1-6, in one condition the "strongly oppose" end was dark red and the "strongly favor" end was dark blue; in the other condition the "strongly oppose" end was dark blue and the "strongly favor" end was light blue. They found that respondents in the dark red to dark blue group were more likely than those in the dark blue to light blue group to choose options on the "favor" side. The study also found that when every point on the 7-point scale carried a verbal label, such as "somewhat oppose" or "moderately oppose", the effect of scale color on responses weakened. The color effect may arise because, compared with the dark blue to light blue group, respondents perceive the options on the "oppose" side as more extreme when they are marked in red, and so choose them less often. When answering scale questions, people may draw on various design features of the scale to interpret what the options mean, especially when the information the scale provides is not explicit enough. For example, when not every point on the scale has a verbal label, the missing information leaves respondents more room for interpretation. Such designs are very common in current web surveys intended to be completed on smartphones. For scales such as "1-5" or "1-7", questions often label only the two endpoints, for example "1" for very dissatisfied and "5" for very satisfied. In such cases, designers need to be alert to other seemingly unimportant design features of the questionnaire that may affect respondents' answers in complex ways.

资料来源:Tourangeau2007a
Source: Tourangeau et al. (2007a).

1-6 视觉相似意味着意思相近实验的示意图
Figure 1-6 Schematic of the "like means close" experiment

第四节 三个理论之间的关系
Section IV. Relationship between the three theories

四步骤模型提出回复问卷问题包括问题理解、记忆提取、综合判断、给出回复四个部分,它的贡献在于给分析受访者如何回答问题,以及这个过程与问卷设计的关系提供了一个框架。调查方法研究引入心理学“满意”的概念,用于描述受访者没有以最优的方式回答问题,而是给出“看似合适”“令人满意”或“足够好”回答的一种回复策略。调查方法研究常常将“满意”理论与四步骤模型相结合,探讨“满意”策略在回答调查问题四步骤中的具体表现。受访者可以将“满意”策略运用在回答过程的一个部分,如在记忆提取这部分时,受访者并没有尽全力努力搜索相关的记忆;或者多个部分,如在极端情况下受访者没有读题,也没有记忆搜索和判断,而是直接在问卷中选择了一个选项。
The four-step model proposes that answering a survey question consists of four parts: comprehending the question, retrieving information from memory, forming an overall judgment, and giving a response. Its contribution is to provide a framework for analyzing how respondents answer questions and how that process relates to questionnaire design. Survey methodology research introduced the psychological concept of satisficing to describe a response strategy in which respondents do not answer optimally but instead give a response that "seems acceptable", is "satisfactory", or is "good enough". Survey methodology research often combines satisficing theory with the four-step model to examine how satisficing manifests itself in each of the four steps. Respondents may satisfice in one part of the response process, for example by not making a full effort to search for relevant memories during retrieval, or in several parts, as in the extreme case where the respondent neither reads the question nor searches memory or forms a judgment, but simply selects an option.

启发式的概念同样来源于心理学,用来描述人们在做判断时基于的一些简化原则或方法,以及所带来的非最优决策过程。启发式也是一种“满意”策略,都是有限理性而非完全理性的反映。相对于“满意”理论,启发式更加聚焦在一些明确的法则、原则或者隐喻上面。如“上面意味着好”的原则特征,是解释性启发式的重要表现。目前调查方法学对于启发式的研究,主要用于解释为什么一些看似不重要的问卷设计元素会对调查结果产生复杂影响。
The concept of heuristics also comes from psychology, where it describes the simplifying principles or methods on which people base their judgments and the non-optimal decision processes that result. Heuristics are themselves a form of satisficing; both reflect bounded rather than complete rationality. Compared with satisficing theory, heuristics focus more on explicit rules, principles, or metaphors. The "up means good" principle, for example, is a prominent instance of an interpretive heuristic. In survey methodology, research on heuristics is currently used mainly to explain why some seemingly unimportant elements of questionnaire design can affect survey results in complex ways.

尽管关于人们如何回答问卷问题还存在其他的理论,但在过去的几十年间,四步骤模型、“满意”和解释性启发式三个理论对调查问卷设计的研究有着非常重要的作用。在前期建立研究假设,后期对研究结果进行解释时,很多研究都或多或少地在这些理论基础上进行论证和推演。随着技术不断发展,调查的模式发生不断变化。在2000年前后开始的网络调查,之后出现的移动网络调查(详见本书第五章),以及近几年受到关注的基于智能手机的主被动数据结合调查方法(详见本书第六章),上述理论在问卷设计中依然是研究和解释受访者回答行为的重要基石。
Although other theories exist about how people answer survey questions, three theories, the four-step model, satisficing, and interpretive heuristics, have played a very important role in research on questionnaire design over the past decades. Whether in forming hypotheses at the outset or in interpreting findings afterward, many studies build their arguments and inferences, to a greater or lesser extent, on these theories. As technology continues to develop, survey modes keep changing. From the web surveys that began around 2000, to the mobile web surveys that followed (see Chapter 5 of this book), to the smartphone-based methods that combine active and passive data and have attracted attention in recent years (see Chapter 6 of this book), these theories remain important cornerstones for studying and explaining respondents' answering behavior in questionnaire design.