Week 3: Multiple Linear Regression
Last week we ended by looking at simple linear regression, a statistical modelling technique used when we want to analyse the relationship between one continuous independent variable and one continuous dependent variable. In this week’s session we’re going to continue to look at linear regression models but focus on how this technique can be expanded to when we have hypotheses involving more than one independent variable impacting on the same dependent variable.
The relevant chapters for this week in the Andy Field Textbook are:
Chapter 6: Bias
Chapter 8: The Linear Model (regression)
If we consider a scenario where we have 2 (or more) hypothesised predictors (AKA independent variables) and we are interested in the effects these variables have on a single specific outcome (AKA the dependent variable) there are a number of different questions we could ask. For example:
1. Does variability in response in a group of predictor variables help to explain variance in an outcome variable?
2. What are the relative contributions of specific predictors, within a group of predictors, to explaining variance in an outcome variable?
3. To what extent does a specific predictor, within a larger group, help us explain variability in an outcome variable that the others cannot?
We can answer all these questions using Multiple Linear Regression (MLR) models but need to look at slightly different parts of an MLR model, or run slightly different types of MLR, to answer each of these questions.
In Exercise 1 we’ll look at how we can use MLR models to investigate questions 1 and 2 above.
In Exercise 2 we’ll build on this to see how we can use a slightly different type of MLR methodology to investigate question 3.
Exercise 1: Multiple Linear Regression (Forced Entry method)
Open up Exercise1_dataset.sav, which you’ll notice is a slightly modified version of the dataset we used in the final exercise last week. To briefly remind you, this dataset included measures of several personality traits (Conscientiousness, Agreeableness and Extraversion) that had been recorded for a sample of patients as they were being discharged from a mental health ward. The dataset also included a measure of these patients’ medication adherence following discharge.
In this updated dataset we’ve added in measures of two further personality traits (i.e. we now have measures of all the ‘Big 5’ personality traits). Imagine we want to test hypotheses about these independent variables’ relationships with medication adherence (the dependent variable).
Looking back at the questions in the intro (page 1), how would you express the first question in that list as an experimental hypothesis, if you wished to test hypothesised relationships between the independent and dependent variables in this dataset?
Similarly, how might you rephrase the second question as a more specific hypothesis relating to this dataset? In particular, imagine you had prior reason to believe Conscientiousness was likely to have the greatest influence of all the personality traits on medication adherence:
All MLR models are made up of ‘blocks’ (or as you might otherwise hear them called: ‘steps’ or ‘nested models’).
The simplest MLR model is made up of a single block and within this block the influence of all the independent variables on the dependent variable is considered (see illustration below). Entering all your independent variables in a single block like this is known as a ‘Forced Entry’ technique, which refers to the fact that all your hypothesised predictors are ‘forced’ by you into being considered simultaneously in a single block.
Baseline Model (Block 0) | Block 1 |
[No Predictors] | Conscientiousness, Agreeableness, Extraversion, Neuroticism, Openness |
Forced entry MLRs are useful when we want to:
Only assess the collective contribution a set of predictors makes to explaining variance in an outcome
(After having established a collective effect) explore what the relative contributions of each predictor in the model are to explaining variance in the outcome.1
The only difference between this type of model and the Simple Linear Regression you conducted last week is the number of variables you enter in the single block (i.e. last week you only entered 1 personality trait into block 1; this time you’re going to run a model where you enter all 5 traits into block 1). As such, it allows us to ask the same sorts of questions as we did last week about:
Effect size
Overall Model Fit
Parameter Estimates (i.e. regression coefficients)
Effect size and Overall Model Fit help us to evaluate the “collective contribution” referred to earlier, whilst the Parameter Estimates are used in understanding the “relative contributions”.
To run this type of analysis follow these steps:
Select Analyze > Regression > Linear…
Into the Dependent variable box enter the variable we want to predict (i.e. Medication Adherence)
Into the Independent variable box enter all 5 variables we want to use to try and predict the dependent variable (i.e. Conscientiousness, Agreeableness, Neuroticism, Openness and Extraversion)
Select OK
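For anyone who wants to see the arithmetic behind the SPSS output, here is a minimal Python sketch of a forced entry model. It is not a replacement for the SPSS steps above. It assumes NumPy is installed and uses simulated stand-in data: the sample size, the weights and the names `traits` and `adherence` are invented for illustration, so the numbers will not match your output.

```python
import numpy as np

# Simulated stand-in data (NOT Exercise1_dataset.sav): 200 cases, 5 trait scores.
rng = np.random.default_rng(42)
n, k = 200, 5
traits = rng.normal(size=(n, k))
# Hypothetical weights, used only to generate an outcome with some signal in it.
adherence = traits @ np.array([0.5, 0.2, 0.0, -0.1, 0.0]) + rng.normal(size=n)

# Block 1: all five predictors entered together, plus an intercept column.
X = np.column_stack([np.ones(n), traits])
b, *_ = np.linalg.lstsq(X, adherence, rcond=None)

residuals = adherence - X @ b
ss_residual = float(residuals @ residuals)
ss_total = float(((adherence - adherence.mean()) ** 2).sum())

r_squared = 1 - ss_residual / ss_total                        # effect size
f_value = (r_squared / k) / ((1 - r_squared) / (n - k - 1))   # overall model fit

print(f"R^2 = {r_squared:.3f}, F({k}, {n - k - 1}) = {f_value:.2f}")
print("b (intercept first):", np.round(b, 3))                 # parameter estimates
```

The three printed quantities correspond to the three things listed earlier: effect size, overall model fit and parameter estimates.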
The first question we’re interested in answering is whether this group of predictors, as a quintet (i.e. all 5 personality traits together), explains a meaningful and statistically significant amount of variation in the outcome (Medication Adherence). Based on last week’s exercises, which aspects of the output tell you about this overall contribution and what conclusions would you draw from your output? (i.e. what’s the evidence relating to the first hypothesis you proposed, back on page 2?)
Presuming we establish that this block of predictors explains a significant amount of variation in the outcome, our next question is: what are the relative contributions of each predictor in the model?
To examine this:
First, we need to distinguish between those Independent Variables that have statistically significant relationships with the outcome and those that don’t. From the Coefficients table in your output identify which (if any) of the personality traits are associated with medication adherence, answering below:
Second, do the results in your output support the second hypothesis you formulated (i.e. about the relative contribution of conscientiousness compared to other factors)?
To conclude this exercise write up an APA-formatted summary of this regression model and its findings (use the box at the top of the next page). Hint: formatting is similar to what you did last week in the self-test question, only this time you have more regression coefficients to report.
Looking ahead to the next exercise, a limitation of Forced Entry methods is that, whilst we can compare standardised betas, we cannot do more than this to “formally test” the relative contribution of different independent variables within these types of regression models.
If we specifically want to test the unique contribution of one predictor to explaining variance in the outcome, after controlling for the effects of all other predictor variables, then a Hierarchical MLR model is required. Usage of this model will be the focus of Exercise 2.
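To make the idea of comparing standardised betas concrete, the sketch below converts unstandardised coefficients (b) into standardised betas using beta = b × SD(predictor) / SD(outcome), then confirms the result by refitting the model on z-scored variables. It assumes NumPy is installed. The data are simulated, and the helper name `ols_slopes` is invented for this illustration.

```python
import numpy as np

def ols_slopes(X, y):
    """Slopes (intercept dropped) from an ordinary least squares fit."""
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]

# Simulated data: two predictors measured on very different scales.
rng = np.random.default_rng(3)
n = 120
X = rng.normal(size=(n, 2)) * np.array([1.0, 10.0])
y = 2.0 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(size=n)

b = ols_slopes(X, y)                                  # in raw units
beta = b * X.std(axis=0, ddof=1) / y.std(ddof=1)      # standardised betas

# Same thing another way: z-score everything first, then fit.
zX = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta_check = ols_slopes(zX, zy)

print("b:   ", np.round(b, 3))
print("beta:", np.round(beta, 3))
```

The b values differ largely because the two predictors are on different measurement scales. The betas are expressed in standard deviation units, which is why they are the values to compare across predictors.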
Exercise 2a: Simple to Multiple Linear Regression
In the box below is a description of an observational study, run by a university admissions team.
Task 1: Is Intrinsic Motivation Inventory Score a useful predictor of exam performance?
To investigate whether Intrinsic Motivation is a good predictor of Entrance exam score we will use simple linear regression to explore the dataset: Exercise2a_dataset.sav. To do this we will:
Run the regression model
AND, this time, we’ll also run through some additional steps you should follow to check the assumptions for running a valid regression model are met
Note: I skipped over checking assumptions in the first exercise but it is important you run these checks whenever using regression models. We’ll introduce a few absolutely key assumptions you should check here, then discuss a few more in Exercise 2b to give you a comprehensive list:
First (before starting any linear regression model), we need to check the assumption that there’s a plausible linear relationship between each independent variable (i.e. the two measures of Intrinsic Motivation) and the outcome (i.e. Exam Performance). Using the skills learned in previous weeks is it safe to say we satisfy this assumption of “linearity”?
Next, we set up the regression model. Normally, as we do this, we request a number of further checks of our assumptions. However, we’ll leave this until we get onto Exercise 2b. For now just request the following (which amounts to doing a simple linear regression, like in Exercise 1)
Select Analyze > Regression > Linear…
Into the Dependent variable box enter the variable we want to predict (i.e. Entrance Exam Performance [Entr_Exam])
Into the Independent variable box enter the variable we want to use to try and predict the dependent variable (i.e. Intrinsic Motivation Inventory score [IM])
Select OK
Based on the output from this model can you answer the following question?
Looking at the ANOVA table is our model a good fit for the data? What value(s) specifically suggest this?
Task 2 - Is Intrinsic Motivation Inventory Score a useful predictor of exam performance, after controlling for variability in exam score that is explained by Hours of Independent Study?
Now we’re going to run a hierarchical multiple linear regression that builds on top of the simple linear regression model we just created for Task 1.
Hierarchical MLRs are different from Forced Entry models in that they comprise multiple nested blocks. This means the researcher must choose the order in which variables are entered into the model. This requires the researcher to be hypothesis driven about the order of entry (usually deciding the order based on prior research or theories). This has the advantage of allowing for comparisons to be made between blocks, enabling the unique contribution of specific predictors added later in a series of steps/blocks to be quantified and tested.
At the same time, it is still possible to ask questions about the overall fit of the model, as we did with Forced Entry (see illustration on next page). This illustration depicts a hierarchical model developed to predict weight from height, gender and age. The blue text boxes highlight the different types of questions that can be asked using this approach, by comparing between different blocks.
Baseline Model (Block 0) | Block 1 | Block 2 | Block 3 |
[No Predictors] | Height | Height + Gender | Height + Gender + Age |
The term ‘nested’ is often used to describe the blocks/steps in a hierarchical regression model because each block builds on the previous one (i.e. it is ‘nested within it’). For example, the Independent Variables from block 1 are carried over to block 2, where some more are added in. This allows investigation of the impact of incrementally adding ‘one more thing’ at each stage of the hierarchical regression model.
Getting back to the example in this exercise, in the first Block of our hierarchical model we want to enter Independent Study Hours, to control for its influence on Exam Performance. Then we want to add in Intrinsic Motivation at the second step, to see if it is still a significant predictor of performance after controlling for variation already explained by Independent Study Hours. To run this analysis:
Select Analyze > Regression > Linear…
Into the Dependent variable box enter the variable we want to predict (i.e. Entrance Exam Performance)
Into the Independent(s) variable box enter the variable we want to control for (i.e. Independent Study score)
Then click on the Next button to move to the next ‘Block’ of our hierarchical model. Into this next step add the Independent(s) variable that we want to evaluate as a predictor, after controlling for the variable(s) already entered in Step 1 (note SPSS terms these ‘steps’ as Blocks). So that’ll be: Intrinsic Motivation Inventory score [IM]
Next, click on the Statistics… option and make sure the option for R squared change is selected. This will ensure we get some extra values in our output that test whether the increase in explanatory power of the model from one block/step to the next is statistically significant. In other words: do the Independent Variable(s) added in later blocks explain further ‘unique’ variance, over and above the proportion of variability in the outcome already explained by independent variables added in earlier steps (nested models)?
To finish, click Continue and then click OK.
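The two-block setup above can be sketched in Python to show how the overall F for a block differs from the F change between blocks. This is a rough illustration under stated assumptions: NumPy is installed, the data are simulated (not Exercise2a_dataset.sav), and the names `study_hours`, `motivation`, `exam` and the helper `r_squared` are invented here.

```python
import numpy as np

def r_squared(predictors, y):
    """R^2 from an OLS fit of y on the given predictor column(s) plus an intercept."""
    X = np.column_stack([np.ones(len(y)), predictors])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

# Simulated stand-in data; the two predictors deliberately overlap.
rng = np.random.default_rng(7)
n = 150
study_hours = rng.normal(size=n)
motivation = 0.5 * study_hours + rng.normal(size=n)
exam = 0.6 * study_hours + 0.3 * motivation + rng.normal(size=n)

r2_block1 = r_squared(study_hours, exam)                                  # Block 1
r2_block2 = r_squared(np.column_stack([study_hours, motivation]), exam)   # Block 2

df_residual = n - 2 - 1   # n minus the number of predictors in Block 2, minus 1

# Overall F (ANOVA table): Block 2 compared with the no-predictor baseline.
f_block2 = (r2_block2 / 2) / ((1 - r2_block2) / df_residual)

# F change (Model Summary): Block 2 compared with Block 1, one predictor added.
r2_change = r2_block2 - r2_block1
f_change = (r2_change / 1) / ((1 - r2_block2) / df_residual)

print(f"Block 1 R^2 = {r2_block1:.3f}, Block 2 R^2 = {r2_block2:.3f}")
print(f"Overall F(2, {df_residual}) = {f_block2:.2f}")
print(f"R^2 change = {r2_change:.3f}, F change(1, {df_residual}) = {f_change:.2f}")
```

Both F values share the same denominator. The overall F tests everything in the block against the baseline, whereas F change tests only what the newly added predictor contributes.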
Based on the output answer the following questions:
Looking at the ANOVA table, do both the Step 1 and Step 2 regression models explain a significant amount of variability in the outcome? What values indicate this?
Looking at the Model Summary how much variability does the model containing only Independent Study explain (Hint: it’ll be an R Square value you’re interested in)?
Looking at the Model Summary how much variability does the model containing Independent Study and Intrinsic Motivation explain (Hint: it’ll be an R Square value you’re interested in)?
Looking at the Model Summary how much more variability does the addition of Intrinsic Motivation explain and is this increase statistically significant (Hint: you’ll be interested in ‘Change Statistics’ values for Model 2)?
If you’re struggling to understand the difference between what the F value in the ANOVA table represents and what the F Change in the Model Summary represents, consider the illustration below:
Looking at the Coefficients table, which of the two Independent variables in Step 2 is having the greater influence in this model on exam performance (Hint: use the standardised betas to make comparisons)? Also, what surprising finding is there regarding one of these regression coefficients?
Because of the tiered (AKA ‘hierarchical’) nature of entry in a Hierarchical Regression Model we would typically describe and interpret the regression coefficient for Intrinsic Motivation, entered at step 2, as ‘the relationship between Exam Score and Intrinsic Motivation after controlling for Hours of Independent Study’. In what way does this allow us to say more about the contribution specific independent variables are making within our regression model, compared to if we’d used only a Forced Entry approach?
Hint: recall the “limitation” discussed at the bottom of page 4
Exercise 2b: Multiple Linear Regression
Below is a description of the data from an observational study, which is suitable to be analysed using a slightly more complicated hierarchical multiple regression design.
In this exercise we’re going to practice: (i) the proper formatting for presenting results from hierarchical regression models and (ii) checking the full set of assumptions we would normally check when running a regression model:
Task 1: run a comprehensive regression model, which includes checking all assumptions
For this exercise use the following dataset (i.e. Exercise2b_dataset.sav)
To carry out the analysis, enter the variables in two steps (corresponding to two sets) using the Hierarchical linear regression technique we used in Exercise 2a.
In Block 1 enter: loc, positive, negative
Then click Next and in the next Block enter: compul, control, impair, delegate, selfw
Specify disaff as the Dependent variable
We’re also going to request a full diagnostic check of all our assumptions3. Specifically:
In the linear regression window in Statistics make sure the following options are selected:
In Regression Coefficients: Estimates and Confidence Intervals
In Residuals: Durbin-Watson and Casewise diagnostics, with Outliers outside 2 standard deviations specified.
Model Fit, R Squared change, Descriptives, and Collinearity diagnostics.
In plots request plots for:
*ZRESID (y-axis) against *ZPRED (x-axis) to check assumptions of independent errors, homoscedasticity and linearity.
Also tick for a Histogram and Normal probability plot for your Standardized Residual Plots
In the Save menu ask SPSS to create the following new variables for each case in your dataset:
Standardized residuals
Mahalanobis, Cook’s and Leverage Value distances
DFBeta(s) and Covariance ratio Influence Statistics
To run the model, come out of the side menus and select OK.
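Two of the diagnostics requested above are simple enough to compute by hand, which can help when reading the output. The sketch below uses only the Python standard library. The function names and the residual values are invented for illustration and are not taken from Exercise2b_dataset.sav. The Durbin-Watson statistic ranges from 0 to 4, with values near 2 suggesting uncorrelated errors. The casewise check flags cases whose standardised residual lies more than 2 standard deviations from zero, mirroring the option selected above.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def casewise_flags(standardized_residuals, cutoff=2.0):
    """0-based positions of cases whose standardised residual exceeds the cutoff."""
    return [i for i, z in enumerate(standardized_residuals) if abs(z) > cutoff]

# Invented standardised residuals, listed in case order, for illustration only.
z_resid = [0.4, -1.1, 2.6, 0.3, -0.2, -2.3, 0.9, 0.1]

print(round(durbin_watson(z_resid), 2))
print(casewise_flags(z_resid))   # the third and sixth cases: [2, 5]
```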
Task 2: interpret your results
Looking at your output, I want you to first of all take a look at and familiarise yourself with:
Descriptive Stats: useful for reporting the data and for getting a feel for the distributions.
Correlation Matrix: Examine the (zero order) r values for marital disaffection against the predictor variables. You’ll see comparatively strong associations with control and impair (r > .300), whilst all others apart from locus of control seem to be statistically significant but relatively small contributors in terms of their Effect Sizes.
The Hierarchical Regression Analysis: The model summary shows a statistically significant relationship with marital disharmony being indicated in Model 1 (although R² suggests this has a small effect size). There’s then a significant increase in model fit with the addition of the second block of variables for Model 2 (see the jump in the R² value from Model 1 to Model 2). The coefficients output shows the regression weights (AKA the Standardized Regression Coefficients) for each variable within each given step/block of the model. This shows that, for block 1, the only significant contributor is negative affect, whilst in block 2 (our main focus in this study), both control and impair seem to be significant predictors (negative affect is no longer so). Also note, we now have a confidence interval for each of these parameter estimates because we asked for them in step 1a when setting up this regression model.
This analysis could be written up as demonstrated below:
“To examine a unique contribution that workaholism may make to explaining marital disaffection, a hierarchical multiple regression analysis was performed. Variables that explain marital disaffection were entered in two steps. In step 1, marital disaffection was the dependent variable and locus of control (LOC), positive affect, and negative affect were the independent variables. In step 2, the subscale scores from the WART (Workaholism questionnaire) were added to the regression model.
The results of step 1 indicated that the variance accounted for (R²) by the first three independent variables (LOC, positive and negative affect) equalled .04 (adjusted R² = .03), which was statistically significant (F(3, 290) = 3.47, p = .017). Negative affect was the only statistically significant independent variable, bNegative = 2.27 [0.44, 4.11], p = .015. In step 2, the five subscales of the WART were entered into the regression equation. The change in variance accounted for (ΔR²) was equal to .16, which was significantly different from zero (ΔF(5, 285) = 11.56, p < .001). At this step, though, only two of the subscales of workaholism contributed significantly to the explanation of marital disaffection, namely control (bControl = 3.83 [1.47, 6.40], p = .002) and impaired communication (bImpair. = 5.38 [2.79, 7.97], p < .001). Standardised beta values for these coefficients were similar in size (b*Control = .233, b*Impair. = .269).”
Notes: Instead of writing ‘change’ to indicate when an R² or F is referring to a change in model fit, it is technically more correct to use the “Δ” symbol.
Here I’ve again reported b’s and standardised betas (b*), just to give you both. Notice that for the b’s I’ve also included the confidence interval in brackets, which is again optional but considered good practice.
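As a worked check on where the ΔF in the write-up comes from, the step 2 value can be roughly reconstructed from the reported R² figures using ΔF = (ΔR² / df added) / ((1 − R² of the full model) / residual df). The sketch below plugs in the rounded values quoted above. The function name `f_change` is invented for this illustration. Because the inputs are rounded to two decimals, the result only approximates the reported 11.56.

```python
def f_change(r2_change, r2_full, df_added, df_residual):
    """F statistic for the increase in R^2 between two nested blocks."""
    return (r2_change / df_added) / ((1 - r2_full) / df_residual)

# Rounded values as quoted in the write-up.
r2_step1 = 0.04
r2_change_step2 = 0.16
r2_step2 = r2_step1 + r2_change_step2   # about .20 for the full model

f_step2 = f_change(r2_change_step2, r2_step2, df_added=5, df_residual=285)
print(round(f_step2, 2))   # 11.4 from the rounded inputs, versus the reported 11.56
```

The gap between 11.4 and 11.56 plausibly reflects rounding of the R² values, since SPSS computes the statistic from unrounded figures.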
Can you see where in the output the values have come from and why they are being used at certain points in the summary above to support the specific interpretive statements? If anything appears inconsistent/odd/unclear to you in this summary ask questions in the practical!
On the next pages is a complete (APA formatted) multiple regression summary table. Note how within the F, ∆F and β columns a legend is used to indicate values where the corresponding p-value is statistically significant (e.g. * p<.05; ** p<.01; *** p<.001). Also, see that values for ∆R² and ∆F for Model 1 are omitted as these are identical to the overall model fit for this block. In other words, the overall fit of the first block compared to the baseline model is one and the same as the ‘change’ in fit between block 1 and the baseline model, so you don’t write it twice!
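The star legend described above is mechanical enough to express as a small helper, which may be handy when building a summary table by hand. This sketch uses only the Python standard library. The function name `significance_stars` is invented here, and the thresholds are the ones given in the legend.

```python
def significance_stars(p):
    """Return the legend marker for a p-value: * p<.05, ** p<.01, *** p<.001."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

# p-values of the kind quoted in the write-up on the previous page.
for label, p in [("p = .017", 0.017), ("p = .002", 0.002), ("p < .001", 0.0005)]:
    print(label, significance_stars(p))
```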