Many of this book's chapters, particularly those in Part II, discuss research design. But these discussions occur in the context of a particular type of research or method. In contrast, this chapter covers broader aspects of research planning and presents basic research design principles that cut across all types of research. This chapter is therefore designed to give readers foundational knowledge for the pages that follow.
Scientific research is the systematic and organized gathering of information about a specific topic. Systematization of process across the design, data collection, and analysis phases of a study is what distinguishes scientific research from other forms of inquiry. Organization of and attention to procedures, both internal and external to a study, also characterize research. Though the degree of systematicity varies by study design, all types of research share certain basic principles and elements of design.
Viewed broadly, designing and planning research is a lot like planning a wedding. You need to think about
your event well in advance of its occurrence,
which and how many people to invite (sampling and sample sizes),
how to invite them (recruitment),
what activities will be included (method selection),
costs and budget parameters,
PART I: PLANNING AND PREPARING FOR RESEARCH
the location of the event (site selection),
potential obstacles/problems (will Uncle Fred get drunk and table dance?), and
how much time your event will take.
Other parallels include
the larger the event is, the more difficult it is to plan and the more expensive it is;
the more planners involved, the more expertise is included, but the longer it will take to reach consensus and plan (individual personalities are a key variable in this!);
no matter how much planning you do, you can't anticipate or control exactly how the event will play out (refer back to Uncle Fred); and
the fewer events and activities, the more control you have over the details; conversely, the more events and activities, the less control you have.
Research design is also, or at least should be, a highly iterative process. Going back to our wedding analogy, imagine the following (not uncommon) scenario. You and your fiancé have spent hours and hours planning your special day. Your dream wedding is set in motion. But after careful scrutiny of the projected costs, you realize that you can't afford it all, so you decide to cut costs by, say, inviting fewer guests and/or cutting back on food. Changes in either of these will, in turn, have consequences for you and your wedding guests. Or perhaps an even more stressful scenario occurs. Your future mother-in-law (in research the analogues might be funders, Institutional Review Boards, local stakeholders, research partners, or all four!) demands changing or removing one or more of the carefully thought out elements and/or adding an activity. These demands will often happen late in the planning process. Wanting to please your spouse's mother, you scramble to make the "suggested" changes and incorporate them into a revised wedding plan. These changes will have a snowball effect on other event parameters (often the budget), which will subsequently need to be adjusted. Welcome to the world of research design.
The basic principle that the wedding analogy is intended to convey is that research design is a complicated, iterative, and often frustrating process. To help readers with this complex process, we outline a series of high-level steps in the design process. The sections that follow describe each of these steps in more detail. Table 2.6 at the end of this chapter outlines this process in more detail and presents options and considerations for each decision point discussed.
Basic Steps in the Design Process (often revised at every step)
Find your focus
  Identify the research problem
  Specify research objectives
  Transform objectives into research questions and hypotheses
Narrow your focus
  Decide on the type of research study and design(s)
  Clarify units of observation and units of analysis
  Determine sample and recruitment procedures
  Operationalize study measures
Define data collection and management procedures
  Decide on the mechanics of data collection
  Develop data entry, storage, and transmission standards
Consider external logistics
  Consider research scope, time, and resources
  Consider research site context
Document the research plan
  Develop a protocol
  Draft an analysis plan
  Plan for communicating results
FIND YOUR FOCUS
Identify the Research Problem
The research problem, at least as we define it here, refers to the broad theoretical or real-world problem that motivates a researcher to initiate the research design process in the first place. It is the driving force behind a study. Depending on which textbook you read, the research problem might also be referred to as the goal, aim, or purpose of a study. Theoretically, research problems can emerge from anywhere, but in reality they usually come from one or more of the following, nonmutually exclusive sources:
A funder. This can come in the form of a personally communicated request, but much more commonly it is expressed through a formal funding announcement such as a request for proposals or applications (RFP, RFA).
The literature. A literature review may reveal a gap in knowledge about a particular topic and/or may inspire a researcher to develop and evaluate a new theory.
A real-world problem. The fundamental purpose of public health is to improve the well-being of communities and populations. Surveillance data, or personal experience with a particularly vulnerable population, can bring to light a public health problem and prompt a researcher (or funder) to seek a "solution."
When put into writing, a research problem is typically communicated in one or two sentences near the beginning of a research study description. It often begins something like "The aim/goal/purpose of this study is to ..." and then continues with a brief description of the study's desired outcome. Note that research problems can vary immensely in scope and detail. A general rule to remember: The smaller the scope of the research problem, and the more detailed its description, the easier it is to operationalize and execute.
Regardless of how focused and detailed a research problem is, a few sentences simply cannot provide enough information to begin the design process. A research problem first must be transformed into one or more research objectives.
Specify Research Objectives
A research problem is a brief statement of the purpose and overall goal of a research study. In a very few cases, it is detailed and simple enough to guide development of a research plan. In the vast majority of cases, though, a research problem must be deconstructed and made more explicit. Research objectives should conceptually fall under the scope of the research problem and simultaneously describe in more detail what the study intends to accomplish. They are actionable and provide a bridge between the larger research problem and the more detailed and refined research question(s) and/or hypotheses, which we describe later. A good way to think about and describe a research objective is to begin with the preposition "to" and follow it with a verb. These are some of the more commonly used verbs in this regard:
Identify, Explore, Describe, Explain, Compare, Assess, Evaluate, Measure, Predict, Test
Note that the choice of verb is important, since it conveys what type of general approach the research design will take. Looking at the verbs above from left to right, an underlying pattern of increasing structure in design is observable. Exploring a topic requires less structure than, say, evaluating or testing something. The verbs on the left side above imply a more descriptive-type study. As they move to the right they become increasingly relational and then explanatory.
Be careful about infusing meaning-laden words into an objective. Words like cause, relationship, and difference, among others, can substantially change the meaning of an objective even if the initial verb is constant. Consider the following four objectives:
To identify the sanitation issues in community X.
To identify the cause of sanitation issues in community X.
To identify the different sanitation issues between communities X and Y.
To identify the relationship between sanitation issues and dysentery in community X.
The first objective lends itself to an exploratory qualitative study. The second objective requires looking deeper and designing a causal study. The third objective requires a comparative design. And the fourth objective is correlational in nature. All of these warrant very different types of studies.
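The effect of wording on design can be sketched as a small lookup. The snippet below is only an illustration: the function name `implied_design`, the cue patterns, and their ordering are assumptions made for this example, and a real objective should of course be read by a person rather than matched against keywords.

```python
import re

# Hypothetical cue list, checked in order. The labels follow the four example
# objectives in the text; the matching rules are an assumption for illustration.
DESIGN_CUES = [
    (r"\bcauses?\b", "causal"),
    (r"\brelationship\b|\bassociat\w*", "correlational"),
    (r"\bdifferen\w*|\bbetween\b|\bcompar\w*", "comparative"),
]

def implied_design(objective: str) -> str:
    """Return the study type hinted at by meaning-laden words in an objective."""
    text = objective.lower()
    for pattern, design in DESIGN_CUES:
        if re.search(pattern, text):
            return design
    # No meaning-laden cue found: treat the objective as exploratory.
    return "exploratory"

objectives = [
    "To identify the sanitation issues in community X.",
    "To identify the cause of sanitation issues in community X.",
    "To identify the different sanitation issues between communities X and Y.",
    "To identify the relationship between sanitation issues and dysentery in community X.",
]
for objective in objectives:
    print(f"{implied_design(objective):>13}: {objective}")
```

All four objectives start with the same verb, yet each maps to a different design label, which is the point the examples make.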
The take-home point is that objectives must be carefully thought out and written in a precise and explicit manner. Figuring out how to phrase your research objectives is the first step in homing in on what you actually want to accomplish with your research. Moreover, most proposals and protocols require explicitly delineated objectives. The conceptual relationship between a research problem, objectives, and research questions/hypotheses (discussed next) is depicted in Figure 2.1.
Progressing from the research problem to explicit study objectives is probably one of the most critical steps in the design process. The total number and specificity of the objectives shape the scope and nature of the study. But developing objectives is not just a scientific enterprise; it is also a political and economic one. Different audiences for your research will have different expectations regarding its design and associated outcomes. Some audiences, for example, require more precision and/or rigor than others. Logistical factors also place a relatively firm boundary around study parameters. Budget and time are two of the stronger constraints, but other factors, such as the local political context (see Chapter 4) and access to the study population and data sources, can significantly affect the shape of a study.
Transform Objectives Into Research Questions and Hypotheses
Once the objectives have been specified, the next step is to operationalize them and transform the "to" statements into research questions and/or hypotheses. Research questions and hypotheses more precisely describe what the research findings will inform.
Figure 2.1 Conceptual Relationship Between Research Problem, Objectives, and Research Questions/Hypotheses
Types of Research Questions
Part of the research question development process is deciding how you will approach the topic and what you want to be able to say with the data you collect and analyze. There are three general types of research questions (Gilson, 2012; Robson, 2011; Shi, 2008). The first type, descriptive or exploratory, seeks to understand a particular phenomenon or problem when little is known about it or when there is a need to understand it in a new context (for example, to explore the characteristics, motivations, and medical history of women who delay childbearing until an advanced age). The second type, relational, seeks to understand the relationship between two concepts or measures. Relational questions investigate associations rather than causal relationships; for example, what is the relationship between women's access to family planning services and delayed childbearing? The third type, causal, seeks to understand whether one variable causes changes in the outcome variable of interest; for example, does use of a particular contraceptive device cause women to have trouble conceiving when they decide they are ready? Relational and causal questions are more often used to understand the effect of a health service or intervention on key outcomes, but they might also be used to understand the factors associated with certain behaviors, like the influence of social support on the use of health care services by people who use illegal drugs, in order to inform interventions.
Descriptive (aka exploratory) research questions seek to answer questions about a phenomenon. How many or how much are affected or implicated? Who is involved? Where does the phenomenon occur? For example, infant male circumcision is not routinely practiced in East Africa, so Young and colleagues (2012) interviewed parents in hospitals in a province in Kenya. They talked with parents who did and did not accept the procedure in order to understand more about their decision-making processes related to the procedure and whether they experienced any barriers or facilitators to uptake (Young et al., 2012).
Relational and causal (aka explanatory) research questions are evaluative in nature and seek to understand whether changes in one factor, such as the implementation of or improvement to a program, result in changes in behaviors, knowledge, and health outcomes (Gilson, 2012). Relational and causal research questions can also be formulated to understand why and how changes occurred. Understanding the mechanisms that resulted in change (not just whether the change occurred, yes or no) can help make recommendations for future research and program and policy implementation. For example, a 1989 study by Bongaarts and colleagues was one of the studies that helped uncover a strong population-level association between the presence of male circumcision and lower HIV prevalence in African contexts (Bongaarts, Reining, Way, & Conant, 1989). They noted that in areas and among cultures where circumcision was rare, HIV prevalence was higher. In areas where male circumcision was common, HIV prevalence was lower. This study is relational because the research cannot demonstrate whether the lack of male circumcision caused the higher HIV prevalence. These relational studies were, however, critical in the development of hypotheses that led to clinical trials of male circumcision. The subsequent randomized controlled trials eventually demonstrated at the individual level that male circumcision can reduce the risk of acquiring HIV (Bailey et al., 2007; R. H. Gray et al., 2007).
Whichever the type, a good research question is clearly articulated, focused, answerable, and measurable (Gilson, 2012). In other words, it should be clear what concepts are to be measured and how those concepts relate to one another, and it should be possible to answer the question with data that can feasibly be collected. We need to be pragmatic and balance the scope and size of the study against available resources and the utility of the answers provided by the research (Robson, 2011). Above all, a good research question should yield results that are useful to inform policy, programs, or theory or that serve to advance the field of inquiry. Therefore, when formulating research questions, it is important to be able to articulate what the research results will add to the existing knowledge base, to whom they will be useful, and how they will be useful (Gilson, 2012).
For exploratory research, research questions are typically stated in the form of an actual question. Most research questions, though, can be translated into hypotheses that contain a statement about a direction of influence of cause and effect. In other words, a hypothesis restates the research question in a way that describes the expected relationship between two or more variables (Fisher & Foreit, 2002). For example:
Variable A is associated with variable B.
Areas with higher levels of male circumcision are associated with lower levels of HIV.
A change in variable A will result in an increase or decrease in variable B.
Adult male circumcision will result in decreased risk of HIV acquisition for men.
The hypothesis describes the relationship between the independent variable (e.g., male circumcision) and the dependent variable (e.g., HIV). It also implies directionality (male circumcision influences HIV acquisition, not the reverse). The independent variable is often an intervention or program in public health research, such as the introduction of electronic medical records. But the independent variable can also be some other factor, such as education level, that is hypothesized to be associated with or cause a change in the dependent variable. The dependent variable is the outcome that is influenced, often a health behavior or health status.
A note about different terminology: Different disciplines may use different words for independent or dependent variables. For example, independent variables may also be called exposure variables or predictor variables. Dependent variables may be called outcomes or response variables.
Literature Reviews
As mentioned above, developing research questions and hypotheses requires specificity and precision. This, in turn, requires a certain degree of insight about where the knowledge gaps and/or applied needs are. A literature review will help you understand what is already known about your research topic. Literature reviews serve many purposes, detailed in Table 2.1. The most rigorous type of literature is that which is peer reviewed: those books and manuscripts that went through a process of independent review by scholars with relevant expertise in the field of study. While support for study ideas will be viewed more credibly if backed up by the peer-reviewed literature, good information (and possibly more recent and timely information) often exists in the "grey literature," that is, information that exists in conference presentations, reports, or working papers from various organizations, usually on their websites. The decision to include grey literature may be strengthened if the intervention being reviewed is complex, has a complex outcome, or lacks outcome measures; is underrepresented in the peer-reviewed literature; lacks quality evidence in the peer-reviewed literature; or takes place in a context that is important to implementation (Benzies, Premji, Hayden, & Serrett, 2006).
Table 2.1 Uses for Literature Reviews
Systematic reviews are different from literature reviews in that they are, as you might imagine, more systematic. They still cover the literature and summarize what is known, but typically systematic reviews are an end rather than a means, resulting in a comprehensive understanding of what is known about the effectiveness of health interventions. Typically, the process involves several partners, the results reflect precisely what is and is not known about interventions, and results are used to inform policy or clinical decisions (Institute of Medicine, 2011). The Institute of Medicine has produced very specific standards for systematic reviews.
Resources for conducting literature reviews include library databases, electronic databases (e.g., PubMed, CINAHL, PsycINFO, SSCI, AgeInfo, CareData, Social Services Abstracts, Popline, EMBASE), and search engines such as Google Scholar. A more complete list is available at the end of this chapter (see Search Engines and Databases for Conducting Literature Reviews). Take the time to learn how to effectively use keywords in your search; how to narrow your search using parentheses and OR, AND, or NOT statements; and how to use truncation or other "advanced" features. The more thorough your literature search is, the more important it becomes to document your search strategy by recording the combinations of keywords you have used and your strategy for eliminating and retaining articles to review.
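A search strategy of this kind can be assembled and logged in a few lines. The sketch below is a minimal illustration, not a database-specific recipe: the helper names `or_group` and `build_query`, the example terms, and the log layout are assumptions, and operator and truncation syntax varies between databases, so check each database's own help pages.

```python
def or_group(terms):
    """Join synonyms with OR inside parentheses, quoting multiword phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_query(concepts, exclude=()):
    """AND together one OR-group per concept, then NOT any excluded terms."""
    query = " AND ".join(or_group(terms) for terms in concepts)
    if exclude:
        query += " NOT " + or_group(exclude)
    return query

# Hypothetical example terms; "circumcis*" uses a truncation wildcard.
concepts = [
    ["male circumcision", "circumcis*"],
    ["HIV", "human immunodeficiency virus"],
]
query = build_query(concepts, exclude=["animal"])

# Keep a running record of what was searched, where, and what was retained.
search_log = [
    {"database": "PubMed", "query": query, "retained": None},  # fill in after screening
]
print(query)
```

Storing the exact query string alongside the database name makes it possible to rerun or report the search later.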
As you may know, bibliographic citation management software is widely available now, and there are a number of both paid and free packages to choose from (e.g., EndNote, ProCite). Programs may be hosted on your own personal computer or may be web-based or hosted by a central server. They are tools that can help you store and organize the materials you have gathered and create formatted bibliographies or reference lists in the style you need. They also serve as search engines within the databases you have created or have access to.
Using Theory and Conceptual Models to Inform/Refine Research Questions
Theory, research, and practice are all linked. Research provides evidence that informs how to organize and target interventions to improve care and ultimately health outcomes. Theory drives research priorities and how research is carried out. Theory is "a set of interrelated concepts, definitions, and propositions that presents a systematic view of events or situations by specifying relationship among variables in order to explain and predict the events or situations" (Glanz, Lewis, & Rimer, 1997, p. 21). Theories help researchers know who to target as the subject of research, what the researcher needs to know before developing the research or the program of study, and how to shape the interventions under study so that they have a greater likelihood of having a positive effect on outcomes of interest. The Appendix at the end of this book provides a description of the main theories and models used in public health research and practice.
Conceptual models are used to summarize existing knowledge about the relationships between variables of interest and to illustrate the research question using concept boxes and arrows. A conceptual model is "a diagram of proposed causal linkages among a set of concepts believed to be related to a particular public health problem" (Earp & Ennett, 1991, p. 164). A concept is an abstract idea or phenomenon that can be observed or measured. Variables are the operational definitions of the concepts (Glanz et al., 1997). At the center, a conceptual model contains the independent variable (or cause or intervention) that leads to an outcome (or dependent variable; Figure 2.2). Conceptual models are typically read from left to right, but other formulations exist. The model allows for mediating variables (intervening explanatory variables between the independent and outcome variables), antecedent variables, and modifying variables (variables that affect the direction or strength of the relationship between the independent and outcome variables). In Figure 2.2 we use a light example of proposal and marriage to illustrate the relationship of some of these variables.
Figure 2.2 Conceptual Model of Marriage Proposal Leading to Marriage
Although not shown in Figure 2.2, confounding variables are commonly included in conceptual models. They are factors that act to influence both the independent variable and the outcome or dependent variable. For example, while use of antenatal care is typically associated with better maternal and infant outcomes, pregnancy complications might be a confounding variable in the relationship between use of antenatal care and pregnancy outcome (Figure 2.3). A woman who experiences complications in a pregnancy might be more likely to use more antenatal care, and those pregnancy complications may also result in poorer pregnancy outcomes.
Figure 2.3 Example of a Variable Confounding the Relationship Between the Independent and Dependent Variable
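A small numeric sketch can make the antenatal care example concrete. The counts below are invented purely for illustration (they come from no study), and the names `counts` and `risk_difference` are hypothetical. The sketch compares the risk of a poor pregnancy outcome between women with high and low antenatal care (ANC) use, first pooled and then within each level of the confounder.

```python
# Hypothetical counts, invented for illustration only:
# (stratum, ANC use) -> (poor outcomes, number of women)
counts = {
    ("complications", "high ANC use"): (24, 80),
    ("complications", "low ANC use"): (8, 20),
    ("no complications", "high ANC use"): (1, 20),
    ("no complications", "low ANC use"): (8, 80),
}

def risk_difference(stratum=None):
    """Risk of a poor outcome with high ANC use minus risk with low ANC use.

    With stratum=None all women are pooled, which gives the crude comparison.
    """
    totals = {"high ANC use": [0, 0], "low ANC use": [0, 0]}
    for (s, use), (events, women) in counts.items():
        if stratum is None or s == stratum:
            totals[use][0] += events
            totals[use][1] += women
    high = totals["high ANC use"][0] / totals["high ANC use"][1]
    low = totals["low ANC use"][0] / totals["low ANC use"][1]
    return high - low

crude = risk_difference()
by_stratum = {s: risk_difference(s) for s in ("complications", "no complications")}

print(f"crude: {crude:+.2f}")
for stratum, diff in by_stratum.items():
    print(f"{stratum}: {diff:+.2f}")
```

With these made-up numbers the crude difference is positive (high ANC use looks harmful), while within each stratum it is negative. The reversal arises because women with complications both use more care and have worse outcomes, which is the pattern the paragraph describes.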
A research conceptual model does not tell the researcher how to intervene, but rather how the intervention is expected to influence the outcomes of interest and which other factors influence the effectiveness of the intervention. For research purposes, the concepts are translated into variables and are operationally defined so that they can be measured. Researchers typically include in the conceptual model only those variables that they can measure, though sometimes researchers include variables they know to influence the other variables but that they are not measuring. However, conceptual models are meant to be parsimonious. Table 2.2 provides examples that demonstrate the increasing specificity from concepts to variables to measures.
Table 2.2 Examples of Concept, Variable, and Measure
| Concept | Variable | Measure |
| --- | --- | --- |
| Use of maternal health care services | Seek antenatal care | Attended antenatal care with a trained provider four or more times during pregnancy (Y/N) |
| Economic strengthening | Social cash transfer intervention | Ultra poor household with orphans and vulnerable children receives US$20 per month (Y/N) |
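To show what an operational definition looks like once data are in hand, the sketch below turns the first row of Table 2.2 into a yes/no measure. The record layout (`AntenatalVisit`, `provider_trained`) and the function name are assumptions for illustration; an actual study would define them in its protocol and data dictionary.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AntenatalVisit:
    # Hypothetical field: whether the visit was with a trained provider.
    provider_trained: bool

def attended_anc_4plus(visits: List[AntenatalVisit]) -> bool:
    """Measure from Table 2.2: four or more antenatal care visits
    with a trained provider during the pregnancy (Y/N)."""
    trained_visits = sum(1 for visit in visits if visit.provider_trained)
    return trained_visits >= 4

# Concept -> variable -> measure, written down explicitly.
operationalization = {
    "concept": "Use of maternal health care services",
    "variable": "Seek antenatal care",
    "measure": attended_anc_4plus,
}

visits = [AntenatalVisit(True), AntenatalVisit(True), AntenatalVisit(False),
          AntenatalVisit(True), AntenatalVisit(True)]
print(operationalization["measure"](visits))  # five visits, four with a trained provider
```

Writing the rule as a function forces the decisions a vague concept leaves open, such as whether visits with untrained providers count.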
Conceptual models also help during data analysis because they describe the hierarchical relationships between factors. For example, poverty is often associated with poor health, but poverty is not the cause of poor health per se. Poverty acts through variables such as malnutrition to affect health outcomes like diarrhea in children. In multivariable and multivariate analyses (like those using linear and logistic regression models), researchers typically want to assess the effect of the independent variable (in this case malnutrition) on the outcome (diarrhea), taking into account the other factors (poverty). The analyst will want to understand the effect of malnutrition both without taking other variables into account and after adjusting for them. Thus, she or he may have three models: one with malnutrition alone, one with poverty alone, and one with both malnutrition and poverty (looking at the effects on diarrhea).
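The three-model comparison can be sketched with a small, self-contained example. Several things here are assumptions rather than anything from the text: the counts are invented, the model is a linear probability model (ordinary least squares on a 0/1 outcome) used as a simple stand-in for logistic regression, and `fit_ols` is a hand-rolled helper written to avoid external dependencies. A real analysis would use a statistical package.

```python
def fit_ols(rows, outcome, predictors):
    """Ordinary least squares via the normal equations (X'X)b = X'y.

    Returns a dict mapping "intercept" and each predictor name to its coefficient.
    """
    names = ["intercept"] + list(predictors)
    X = [[1.0] + [float(r[p]) for p in predictors] for r in rows]
    y = [float(r[outcome]) for r in rows]
    k = len(names)
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return {name: A[i][k] for i, name in enumerate(names)}

# Hypothetical data: (poverty, malnutrition, children, diarrhea cases)
cells = [(1, 1, 60, 30), (1, 0, 40, 12), (0, 1, 20, 6), (0, 0, 80, 8)]
rows = []
for poverty, malnutrition, children, cases in cells:
    for i in range(children):
        rows.append({"poverty": poverty, "malnutrition": malnutrition,
                     "diarrhea": 1 if i < cases else 0})

malnutrition_only = fit_ols(rows, "diarrhea", ["malnutrition"])
poverty_only = fit_ols(rows, "diarrhea", ["poverty"])
both = fit_ols(rows, "diarrhea", ["malnutrition", "poverty"])

for label, model in [("malnutrition only", malnutrition_only),
                     ("poverty only", poverty_only),
                     ("malnutrition + poverty", both)]:
    print(label, {name: round(coef, 3) for name, coef in model.items()})
```

With these made-up counts, the malnutrition coefficient falls from about 0.28 in the single-predictor model to 0.20 once poverty is included. Comparing the unadjusted and adjusted estimates is exactly what the three-model approach is for.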
Although most common in monitoring and evaluation, logic models (also known as program impact pathways) can be useful in other public health research endeavors as well. Logic models are linear, left-to-right models that describe program inputs and activities expected to influence outputs, outcomes, and impact.
NARROW YOUR FOCUS
Decide on the Type of Research Study and Design(s)
At this stage you are ready to determine the best research design to answer your study question. As noted earlier, the wording of your research objective and question begins to imply a design. The key is to determine which study design is the most effective way to answer your research question.
Basic Types of Research Studies
As stated above, there are different types of research questions: descriptive, relational, and causal. These types of research questions translate into exploratory, descriptive, or explanatory research. Exploratory research seeks to understand new or complex phenomena and to generate hypotheses, particularly when little is known about a topic. Once more is known about a situation or context, descriptive research is used to describe phenomena and the characteristics of people and their contexts. Explanatory research is carried out to understand the factors causing or associated with knowledge, behavior, health, or other variables, and the relationships between those factors and the outcomes (Robson, 2011). Explanatory research is interested in the relationships (i.e., causal or associative) between variables (Gilson, 2012).
Descriptive Studies
The type of research question will inform the study design, and descriptive questions usually translate into simpler study designs. Descriptive studies are observational, do not attempt to assign causality, and are good for problem identification. They are good at answering who, what, why, when, where, and so what (Grimes & Schulz, 2002).
Descriptive studies are conducted to investigate a population's health service needs, experiences, or behaviors in order to inform interventions. When descriptive studies gather information about experiences with health services or a particular intervention, they may actually be considered a posttest-only design, which is described in the context of study designs below. Types of descriptive studies include the following:
Case studies: A comprehensive focus on a single phenomenon (Yin, 1999). Qualitative and quantitative data can be used, and the researcher can rely on multiple types of data collection. Examples include studies of a financial system in a hospital, the political structure of a community, or policies of a managed care system. The methods used to select cases can vary widely; a case may be selected because it is thought to be a critical case, extreme case, revelatory case, model case, or modal case.
Cross-sectional studies: In a cross-sectional study, data are collected at one point in time. A variety of quantitative and qualitative methods can be employed with a sample of units (e.g., people, hospitals) to gather data about a particular health service or policy or to gain understanding about a particular phenomenon, measure, or context. Descriptive surveys aim to provide a picture of a situation, problem, or program (Veney & Kaluzny, 1998). Other data sources may be used in cross-sectional descriptive studies, including structured or semistructured interviews, reviews of documents, or observations.
Trend design: In a trend design, two or more cross-sectional studies are conducted over time. Data, often quantitative, are compiled to demonstrate how variables of interest change over time. An important data source for trend analyses in the health services is routine program information (Veney & Kaluzny, 1998). The number of clients served, client characteristics such as sex and age, diagnostic classification, and so on provide information about patterns of use of a particular program. Other trend data can come from state, national, or international statistics agencies tracking disease and injury prevalence and use of services.
Panel design: A panel design is similar to a trend design in that data are collected at two or more points over time, but in a panel design data are collected from the same people or cases.
Case-control and cohort studies (either retrospective or prospective): These are two other types of descriptive observational studies and are commonly used in epidemiology to understand disease etiology, incidence, prevalence, causes, and risk factors (Mann, 2003). You can find more information about cohort and case-control studies in Chapter 7.
Relational and Causal Studies
Relational and causal studies are used to understand a relationship between independent and dependent variables. In causal or explanatory studies, there is also an attempt to demonstrate a causal relationship. Study designs range from true experimental to quasi-experimental and nonexperimental (Table 2.3). The main feature of true experimental designs is that randomization is used to assign units to intervention (or experimental) and control groups. The unit of randomization can be the individual, or it can be at a higher level, such as a hospital or facility, or a geographic unit such as a district. Randomization is a technique that can deal with most of the threats to internal validity (see below), particularly selection. In other words, it increases our confidence that the participants in the intervention and control groups are similar in characteristics that might otherwise independently influence the outcome of interest.
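As a minimal sketch of random assignment at the level of a geographic unit, the snippet below shuffles a list of units and splits it in half. The district names, the `randomize` helper, and the fixed seed are all assumptions made for illustration; real trials typically use more elaborate schemes (for example, stratified or blocked randomization).

```python
import random


def randomize(units, seed=None):
    """Randomly split units into intervention and control groups of (nearly) equal size."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible and auditable
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Hypothetical units of randomization: ten districts.
districts = [f"district_{i}" for i in range(1, 11)]
groups = randomize(districts, seed=42)
print(groups)
```

Because chance alone decides which group a district joins, neither the researcher nor the district can select into the intervention, which is how randomization addresses the selection threat described above.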
True experiments, while having many positive attributes as a study design, also have some drawbacks. For one, true experiments do not do well at explaining complex social issues; when outcomes of true experiments are unexpected, researchers often face difficulties in explaining the reasons for those results (Robson, 2011). Some interventions, particularly in health services or psychosocial areas, are complex or operate through complex pathways of influence, so it may be difficult to randomize them or to assign adequate control groups. Another problem can arise when true experiments rely on volunteers as study participants. The simple act of volunteering may mean that the participants are more motivated or more interested in the study and therefore may react to the intervention differently from the rest of the population. This difference may cause problems when the intervention is made available to the general population but does not have the same effects as under experimental conditions (see Chapter 16 for a discussion of efficacy versus effectiveness randomized controlled trials).
Quasi-experimental designs again involve an intervention group and a comparison group, but the comparison groups are determined by means other than randomization. Quasi-experimental designs are suitable for environments where it is not feasible to randomize. In the nonequivalent control group design, the individual is not usually the primary sampling unit. Usually the groups are made up of units such as hospitals or facilities, or geographic units such as districts. A nonequivalent control group design has pre- and posttest measures for both the intervention and comparison groups; the main difference from a true experiment is that there is no randomization into groups. Researchers have other techniques they can use to make the intervention and comparison groups as similar as possible. They can rely on matching, whereby researchers try to make key characteristics as similar as possible between intervention and comparison groups. For example, in a situation where the groups are health facilities, researchers might try to match on characteristics such as number of beds, number of doctors or medical staff, rural/urban location, catchment area size, services offered, and so on. Also, multivariable and multivariate analysis methods can be used to try to control for differences between intervention and comparison groups. The main risk in both of these methods is that the researcher cannot identify or control for all factors that may result in differences between intervention and comparison groups (that is, researchers can never rule out selection bias).
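The matching idea can be sketched as follows. This is a deliberately simple, hypothetical procedure (greedy one-to-one matching on rural/urban location and closest number of beds); the facility records and the `match_facilities` helper are invented for illustration, and real studies usually match on more characteristics.

```python
def match_facilities(intervention, candidates):
    """Greedy 1:1 matching: same location type, then closest number of beds."""
    available = list(candidates)
    pairs = []
    for site in intervention:
        same_type = [c for c in available if c["location"] == site["location"]]
        if not same_type:
            continue  # no acceptable match; this site stays unmatched
        best = min(same_type, key=lambda c: abs(c["beds"] - site["beds"]))
        pairs.append((site["name"], best["name"]))
        available.remove(best)  # each comparison facility is used at most once
    return pairs


# Hypothetical facilities.
intervention_sites = [
    {"name": "A", "location": "rural", "beds": 20},
    {"name": "B", "location": "urban", "beds": 150},
]
comparison_pool = [
    {"name": "X", "location": "rural", "beds": 60},
    {"name": "Y", "location": "rural", "beds": 25},
    {"name": "Z", "location": "urban", "beds": 120},
]
pairs = match_facilities(intervention_sites, comparison_pool)
print(pairs)
```

Matching of this kind balances only the characteristics that were measured and used, which is why selection bias can never be fully ruled out.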
Time series designs can often offer convincing evidence of changes due to interventions. In a time series design, the comparison group is actually a set of repeated measures taken prior to an intervention. Time series designs are useful when the boundaries of interventions, such as mass media campaigns, are difficult to define, determine, or contain, so that control or comparison groups are not feasible. The more measures that are taken before and after the intervention, the better. These measures can be plotted on a graph for visual interpretation of effect, but researchers can also use analytic methods to test for statistical differences. A main limitation of time series designs is that some measures might naturally change over time. For example, more and more adolescents will naturally initiate sexual activity over time. A related model is the regression-discontinuity design. In this model, intervention and comparison groups are determined by a criterion cutoff value, with one group falling above the value and the other falling below.
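The limitation that measures may drift naturally over time can be illustrated with a rough calculation. The sketch below is an assumption-laden simplification, not a full interrupted time series analysis: it fits a straight line to invented pre-intervention measures, projects that trend forward, and compares the projection with invented post-intervention measures.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def excess_over_trend(pre, post):
    """Mean gap between observed post-intervention values and the projected pre-intervention trend."""
    slope, intercept = fit_line(range(len(pre)), pre)
    projected = [intercept + slope * t for t in range(len(pre), len(pre) + len(post))]
    return sum(obs - proj for obs, proj in zip(post, projected)) / len(post)


# Hypothetical measures (e.g., percent of a population reporting a behavior).
pre = [10, 12, 14, 16]   # rising before the intervention
post = [23, 25, 27]      # after the intervention

naive_change = sum(post) / len(post) - sum(pre) / len(pre)
trend_adjusted = excess_over_trend(pre, post)
print(naive_change, trend_adjusted)
```

In this made-up series, a simple before-and-after comparison of means overstates the change because the measure was already rising before the intervention began.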
Nonexperimental designs are more appropriate for collecting descriptive information about a program or for problem diagnosis (Fisher & Foreit, 2002). In a posttest-only design, measures are taken within an intervention group only after the intervention has occurred. Thus, this design is good for investigating demand for the intervention; characteristics of people participating in the intervention; provider perspectives, knowledge, or skills in implementing the intervention; or clients' satisfaction with the intervention.
Pre- and posttest study designs have no comparison group, but measures are made before and after the intervention. While researchers cannot make statements about the levels of change that are attributable to an intervention, pre- and posttest designs can be useful for piloting new interventions in order to get a sense of the potential for effectiveness before engaging in a more rigorous design.
A static group comparison has postintervention-only measures, but in both intervention and comparison groups. Because assignment into groups is not randomized and there is no baseline measure by which to compare changes over time, a researcher must seriously consider the added value of collecting data in the comparison group. If the researcher cannot ensure adequate matching or statistical analysis to justify the inclusion of the comparison group, it may be sufficient to stick with a posttest-only design and save the costs associated with the additional data collection.
Table 2.3 Summary of Types of Study Designs for Research and Evaluation
| Type of Design | Study Design | Random Assignment | Intervention Group | Control or Comparison Group | Measures Occur Before Intervention | Measures Occur After Intervention |
|---|---|---|---|---|---|---|
| Experimental | Pretest-posttest control group design | Yes | Yes | Yes | Yes | Yes |
| Experimental | Posttest-only control group design | Yes | Yes | Yes | No | Yes |
| Quasi-experimental | Nonequivalent control group | No | Yes | Yes | Yes | Yes |
| Quasi-experimental | Time series design | No | Yes | Yes (historical control) | Yes | Yes |
| Nonexperimental | Posttest-only design | No | Yes | No | No | Yes |
| Nonexperimental | Pretest-posttest design | No | Yes | No | Yes | Yes |
| Nonexperimental | Static-group comparison | No | Yes | Yes | No | Yes |
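One way to use Table 2.3 is as a lookup: given what is feasible in a setting, which designs remain possible? The sketch below encodes the table as data. The `Design` record and the `candidate_designs` helper are hypothetical names, and the time series row is coded as having no concurrent comparison group because its control is historical. Every design in the table has an intervention group and post-intervention measures, so those two columns are omitted.

```python
from typing import NamedTuple


class Design(NamedTuple):
    category: str
    name: str
    random_assignment: bool
    concurrent_comparison_group: bool
    pretest: bool


DESIGNS = [
    Design("experimental", "pretest-posttest control group", True, True, True),
    Design("experimental", "posttest-only control group", True, True, False),
    Design("quasi-experimental", "nonequivalent control group", False, True, True),
    # Table 2.3 lists a historical control: the pre-intervention measures.
    Design("quasi-experimental", "time series", False, False, True),
    Design("nonexperimental", "posttest-only", False, False, False),
    Design("nonexperimental", "pretest-posttest", False, False, True),
    Design("nonexperimental", "static-group comparison", False, True, False),
]


def candidate_designs(can_randomize, has_comparison_group, has_baseline):
    """Names of designs whose requirements fit within the stated constraints."""
    return [
        d.name
        for d in DESIGNS
        if (can_randomize or not d.random_assignment)
        and (has_comparison_group or not d.concurrent_comparison_group)
        and (has_baseline or not d.pretest)
    ]


# Example: no randomization and no comparison group, but baseline data exist.
print(candidate_designs(False, False, True))
```

Feasibility only narrows the list; choosing among the remaining designs still depends on the research question.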
Variations on Common Study Designs
Thus far we have described basic study designs, but there are a number of other designs that build on these. For example, it is possible to have participants assigned to two different intervention groups (A and B) and compare them to a control group (C), or to have participants assigned to an intervention group (A) and an enhanced intervention group. Whether you want to compare A and B to the control group only, or to compare (A vs. C), (B vs. C), and (A vs. B), depends on the research question. Similarly, you might want to compare one intervention group to two control groups. The more complex your design, the larger the sample size required.
In some cases, you might investigate whether two factors interact to increase their effect on the outcome. This factorial design results in a number of combinations, including (A vs. C), (B vs. C), and (AB vs. C). Economic components could be added here to understand more about the value of the additional component. For example, if providers are trained in a particular technique in the classroom setting, a researcher may be interested in whether the addition of web-based training enhances the providers' skills. The study design would compare the effect of classroom training alone with that of classroom training plus the web-based component. An excellent example of a randomized controlled trial with a factorial design compared the effects of different supplementation during pregnancy to prevent neural tube defects (MRC Vitamin Study Research Group, 1991). Women were assigned to one of four groups: folic acid only (A), folic acid and other vitamins (B), no vitamins or folic acid (C), or other vitamins but no folic acid (D). The design allowed the researchers to understand that folic acid had a protective effect against neural tube defects while there was no significant protective effect provided by the other vitamins.
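The structure of the four-group example can be written out explicitly. The sketch below encodes only the group definitions given in the text (no trial results are included); the `ARMS` dictionary and helper names are hypothetical. It checks that the four groups cover every combination of the two factors and shows which groups are pooled to estimate each factor's main effect.

```python
from itertools import product

FACTORS = ("folic_acid", "other_vitamins")

# Group labels follow the text; each group is one cell of the 2 x 2 design.
ARMS = {
    "A": {"folic_acid": True, "other_vitamins": False},
    "B": {"folic_acid": True, "other_vitamins": True},
    "C": {"folic_acid": False, "other_vitamins": False},
    "D": {"folic_acid": False, "other_vitamins": True},
}


def is_full_factorial(arms, factors):
    """True if the arms cover every on/off combination of the factors exactly."""
    observed = {tuple(cell[f] for f in factors) for cell in arms.values()}
    return observed == set(product((True, False), repeat=len(factors)))


def arms_for_main_effect(arms, factor):
    """Groups pooled on each side when estimating one factor's main effect."""
    with_factor = sorted(label for label, cell in arms.items() if cell[factor])
    without_factor = sorted(label for label, cell in arms.items() if not cell[factor])
    return with_factor, without_factor


print(is_full_factorial(ARMS, FACTORS))
print(arms_for_main_effect(ARMS, "folic_acid"))
print(arms_for_main_effect(ARMS, "other_vitamins"))
```

Because every participant contributes to the comparison for both factors, a factorial design can address two questions with one sample.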
In a parametric or dose-response analysis, researchers are interested in the effect of different levels of exposure to an intervention. The concept comes from clinical studies of whether increasing (or decreasing) drug doses results in corresponding improvements in health.
The stepped-wedge design has recently grown in popularity as an alternative to randomized controlled trials. Stepped-wedge designs can be useful when it is not feasible to exclude groups in order to create control or comparison groups, or when it is not feasible to delay program implementation in order to get baseline measures. In a stepped-wedge design, groups are randomly selected to receive the intervention in phases. The advantage of this design is that it allows all groups to eventually receive the intervention, but in a phased process that is usually compatible with program implementation. In some cases it may also allow for dose-response analyses to see whether increased exposure to the intervention is associated with changes in study outcomes.
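A phased rollout of this kind can be sketched as a schedule. The example below is a minimal illustration under assumptions: the clinic names, the number of steps, and both helper functions are hypothetical, and period 0 is treated as a baseline period in which no group has yet received the intervention.

```python
import random


def stepped_wedge_schedule(clusters, n_steps, seed=None):
    """Randomly order clusters, then spread them evenly across steps.

    Returns {cluster: step (1-based) at which it starts the intervention}.
    """
    rng = random.Random(seed)
    order = list(clusters)
    rng.shuffle(order)
    return {cluster: (i % n_steps) + 1 for i, cluster in enumerate(order)}


def exposure_matrix(schedule, n_steps):
    """For each cluster, a row over periods 0..n_steps; 1 means the intervention is active."""
    return {
        cluster: [1 if period >= start else 0 for period in range(n_steps + 1)]
        for cluster, start in schedule.items()
    }


clinics = [f"clinic_{i}" for i in range(1, 7)]  # hypothetical clusters
schedule = stepped_wedge_schedule(clinics, n_steps=3, seed=1)
matrix = exposure_matrix(schedule, n_steps=3)
for clinic in clinics:
    print(clinic, matrix[clinic])
```

Each row switches from 0 to 1 exactly once and never switches back, so groups that start earlier accumulate more periods of exposure, which is what makes the dose-response comparison mentioned above possible.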
Structure and Method/Design Selection
The degree of structure in a research project overall, and in the data collection processes more specifically, will help define both your method selection and the type of findings you will generate. In an ideal world, the overall process of scientific inquiry follows a funnel pattern, starting off with broader and more general types of questioning and moving to more specific and structured types of inquiry as more about a topic is learned. In this context, "research process" can refer to a single study with multiple components or the larger, collective body of multiple scientific studies pertaining to a particular topic (which can take place over years, decades, or even centuries).
In the earliest stages of the research process, the focus is likely to be exploratory, and the selection of data collection methods and development of instruments will reflect this. Not surprisingly, the early stages in the overall research process are when qualitative, and less structured, forms of inquiry are often employed. Using less structured forms of inquiry in the incipient stages of research improves inquiry validity (that is, it helps ensure that researchers are asking the right questions and in the right way). On the other end of the spectrum, rigorous hypothesis testing, such as in randomized controlled trials, requires a substantial amount of structure (Figure 2.4).
Figure 2.4 Structure and Implications for Research Design
Looking at Figure 2.4, you'll notice that structure, and the flexibility and inductive process that characterize a lack of structure, each come with advantages and disadvantages. More flexible, less structured forms of inquiry can help enhance inquiry validity, identify locally relevant issues, and contribute to a deeper understanding of a given research topic. They are not, however, well suited to comparative analyses, and they are even less useful for directly testing hypotheses. As a researcher, you will need to decide where your specific study context fits on the structure continuum.
Clarify Units of Observation and Units of Analysis
After a literature review, you will know where additional research is needed, and your research question(s) will begin to crystallize. Conceptualizing and specifying what you want to collect data from and/or about (the unit of observation) is another step in the research question development process. It is important to clarify this in the early stages of designing your research. Below is a list of the more common domains of observation in public health research. Note that many studies include two or more of these, often in a relational framework:
Human behavior
Human psychological aspects
Attitudes/opinions
Perceptions
Knowledge
Experiences
Emotions and values
Culturally shared meanings
Social structures
Social relationships
Biological outcomes
Socioeconomic metrics
Processes and systems
Environmental context
Disease incidence/prevalence
Disease vectors
Geographic units
Events
The unit of observation relates to the data collection process: it defines the types of data you plan to collect and the level at which you will collect them. In contrast, the unit of analysis in a study pertains to the analytic process; that is, how you plan to parse and compare data during analysis. The unit of analysis is the level of abstraction at which you will look for variability in your data, and the level at which data are synthesized and compared. Individual people are common units of analysis in public health research, as are groups defined by particular demographics, such as race, gender, or socioeconomic status. Theoretically, a study can collect and compare data at a level ranging from a specific isolated behavior (episode) to a country and its attributes. Some commonly employed units of observation in health research include the following:
Country or other large geographic region
City/town/village
Neighborhoods, districts, or areas within a city/town/village
Dyad (e.g., a married couple, patient/doctor)
Individual
Event
Note that the unit of observation and unit of analysis are often, but not always, the same thing. For example, a study may collect data from and about individuals (unit of observation) in a community, but aggregate those data during analysis and compare data among two or more communities (unit of analysis). Or biological data may be the unit of observation, but more than likely the unit of analysis in this case will be individuals. A cardinal rule in research design is to make sure that your unit of observation is at the same level of abstraction as your unit of analysis or, preferably, at a more granular one. Once collected, data can always be aggregated; they can never be disaggregated.
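The one-way nature of aggregation is easy to see in a small example. In the sketch below, the records, the community names, and the `coverage_by_community` helper are invented for illustration: data are observed at the individual level and then summarized at the community level for analysis.

```python
from collections import defaultdict

# Hypothetical individual-level records (unit of observation: the individual).
records = [
    {"community": "North", "vaccinated": True},
    {"community": "North", "vaccinated": True},
    {"community": "North", "vaccinated": False},
    {"community": "North", "vaccinated": True},
    {"community": "South", "vaccinated": False},
    {"community": "South", "vaccinated": True},
]


def coverage_by_community(records):
    """Aggregate individuals up to communities (unit of analysis: the community)."""
    totals = defaultdict(lambda: [0, 0])  # community -> [vaccinated count, total count]
    for record in records:
        totals[record["community"]][0] += int(record["vaccinated"])
        totals[record["community"]][1] += 1
    return {community: yes / n for community, (yes, n) in totals.items()}


coverage = coverage_by_community(records)
print(coverage)
```

Going the other way is impossible: if only the two community-level proportions had been collected, there would be no way to recover which individuals were vaccinated.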
Determine Sample and Recruitment Procedures
Public health researchers must decide how to select their study samples and how many units to include in each sample. A sample is a subset of a population of people, health facilities, households, geographic units, events, and so on. Chapter 17 covers the conceptual and practical aspects of sampling and recruitment. It is important to note here, however, that sampling is a critical component of research design. Sampling procedures determine how far a researcher can extrapolate and generalize beyond a study sample. A study that has high external validity (or generalizability) provides greater insight into how the behavior, knowledge, intervention, disease pathways, or incidence observed in one study would apply (or not) in other geographic areas, with different participants, or when made available to a larger segment of the population from which the original study participants were drawn (that is, when scaled up).
Operationalize Study Measures
Earlier in this chapter we discussed how studies consist of concepts to be measured by variables. At this point in the study design you will need to operationalize those variables for measurement. This means specifying more precisely how you will measure each variable. In social research, verbal reports or self-reports are the most common kind of measurement (Singleton, Straits, & Straits, 1993). For example, respondents are asked questions about their background characteristics such as level of education, their knowledge and attitudes, and certain health behaviors. These measures can also be combined to make composite measures, scales, or indexes (more information about scale development and validation is in Chapter 13). Measures may also be obtained through observation or through review of existing records or documentation. Chapter 11 has more information about observing client-provider interactions, for example, to study quality of health care. Other observed measures may include medical assessments of weight, height, blood pressure, blood or urine samples, and the like. Existing records or documents may include health records, public or private documents, and written or video communications (Singleton et al., 1993).
The operationalization of measures is highly dependent on what you have defined as the research question and hypothesis. There are also pragmatic considerations about what is feasible. Questionnaires have to be of reasonable length so as not to overburden the respondent, and all measures and procedures have to be ethically and technically feasible. See Table 2.2 above for examples of concepts, variables, and measures. Operationalizing your study measures will help you determine the data you need to obtain those measures. Broad data categories include primary data, or data collected by you for your study, and secondary data, which are existing sources of data (see Chapter 9). Data can be obtained from self-reports, observation, or biological sources.
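To make the step from variable to measure concrete, the first measure in Table 2.2 can be written as an explicit coding rule. The sketch below is one possible reading of that measure; the visit-record structure (`trained_provider`) and the function name are assumptions introduced for illustration, not part of any standard instrument.

```python
def attended_anc_4plus(visits):
    """Code the Table 2.2 measure: four or more antenatal visits with a trained provider (Y/N).

    `visits` is a list of hypothetical visit records for one pregnancy,
    each a dict with a boolean 'trained_provider' field.
    """
    qualifying_visits = sum(1 for visit in visits if visit["trained_provider"])
    return "Y" if qualifying_visits >= 4 else "N"


# Five visits reported, but only three were with a trained provider.
reported_visits = [
    {"trained_provider": True},
    {"trained_provider": True},
    {"trained_provider": False},
    {"trained_provider": True},
    {"trained_provider": False},
]
print(attended_anc_4plus(reported_visits))
```

Writing the rule down this way exposes decisions that the concept alone leaves open, such as whether visits with untrained providers count and what data each respondent must supply.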
The tool that will measure your variable is called an instrument. This can be a survey, device, checklist, procedure, or any other measurement tool. Prior to conducting the research, ensure that your instrument is reliable (measures are reproducible at different times or by different observers) and valid (data measure what they are actually supposed to measure). This is typically done through a process of pretesting (or pilot testing) the instrument. Then, in order to ensure proper use of the instrument, researchers train any data collection or study staff who will use it. Additional information about survey design and implementation is in Chapter 12.
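Reproducibility across observers can be quantified during pretesting. The text does not name a statistic, so the sketch below uses Cohen's kappa, a common chance-corrected agreement measure, as an assumed choice; the two observers' ratings are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters coding the same items.

    Undefined (division by zero) when expected agreement is exactly 1,
    for example when both raters use a single category throughout.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical pretest: two observers code the same ten checklist items as Y/N.
observer_1 = ["Y", "Y", "N", "N", "Y", "N", "Y", "Y", "N", "Y"]
observer_2 = ["Y", "Y", "N", "Y", "Y", "N", "Y", "N", "N", "Y"]

kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 2))
```

Here the observers agree on 8 of 10 items, but part of that agreement would be expected by chance alone, so kappa is lower than the raw 80% agreement.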
Validity
Within the research methods literature, there is big V validity and little v validity. The former refers to the concept as a whole, while the latter refers to specific subtypes of validity. As the term validity is commonly used in research vernacular across multiple disciplines, the definitions associated with it vary by field. Below are a few definitions of the general concept of validity (big V):
"An account is valid or true if it represents accurately those features of the phenomena, that it is intended to describe, explain or theorize" (Hammersley, 1992, p. 69)
"The accuracy and trustworthiness of instruments, data and findings in research" (Bernard, 2013, p. 45)
"The degree to which a test measures what it is intended to measure" (Social Science Dictionary, 2008)
"Validity can be considered as the extent to which a measurement, test, or study measures what it purports to measure" (Kirch, 2008, p. 1440)
"The degree to which data in a research study are accurate and credible" (D. E. Gray, 2009, p. 582)
"The validity of a measure is the extent to which it actually assesses what it purports to measure" (Shi, 2008, p. 464)
The definitions represented here (and typically those not represented as well) share a cross-cutting theme: the notion that we are actually assessing what we intend to assess.
The subtypes of validity most commonly discussed are face validity, content validity, construct validity, criterion validity, external validity, and internal validity (although one can find many more types of validity described throughout the literature; see Bernard, 2013; Maxwell, 1992). We provide a brief definition of several subtypes in Table 2.4.