1 Introduction

In recent years, applications making use of Artificial Intelligence (AI) have gained renewed popular interest. Expectations that AI might change the face of various life domains for the better are abundant [1,2,3]. Be it medicine, mobility, scientific progress, the economy, or politics, hopes are that AI will increase the veracity of input, the effectiveness and efficiency of procedures, and the overall quality of outcomes. Irrespective of whether changes apply to the workplace, public management, industries producing goods and services, or private life, there is, as usual with the diffusion of new technologies, tremendous uncertainty as to how exactly developments will play out [4], what social consequences will manifest, and to what extent the respective expectations of stakeholders and societal groups will materialize. Oftentimes, some people profit immensely from socio-technological innovations, while others are left behind and cannot cope with the unfolding of events [5]. Thus, whenever new technologies bring about social change, the success or failure of their implementation depends upon the reaction of the affected people. People might happily accept a new technology, they might neither care about nor use it at all, or they may even show severe reactance towards it [6]. Initial empirical evidence suggests that the general public shows considerable restraint when it comes to the broad societal diffusion of AI applications or robots, restraint that might even border on actual fear of such technology [7,8,9]. However, as fear and the respective threat perceptions are presuppositional theoretical constructs, they necessitate a more fine-grained approach that goes beyond broad claims of concerns or even fear regarding autonomous systems.

Accordingly, in this paper, we argue for an improved assessment of the perceived threats of AI and propose a survey scale to measure these threat perceptions. First, a broadly usable measurement would need to address perceived threats of AI as a precondition of any actual fear experienced. This conceptual distinction is subsequently grounded in the literature on fear and fear appeals. Second, the perceived threat of AI would need to take into account the context-dependency of the respective fears, as most real-world applications of AI are highly domain-specific. An AI that assists in the medical treatment of a person’s disease might be perceived vastly differently from an AI that takes over their job. Third, perceptions hinge not only on the domain in which people encounter AI applications; it is also necessary to differentiate between the extent of an AI’s actual autonomy and its reach in inflicting consequences upon a person. Thus, it needs to be asked to what extent the AI is merely used to analyze a given situation or, going further, whether the AI is used to actively give suggestions or even to make autonomous decisions.

As the field of application is crucial for the mechanism and effects of threat perceptions concerning AI, any standardized survey measure needs to be somewhat flexible and individually adaptable to accommodate the necessities of a broad application that considers AI’s functions and the context of implementation. That is why our scale construction opts for a design that can easily be adapted to varying research interests of AI scholars.

Consequently, we developed a scale addressing threats of AI that takes such necessary distinctions into account, and subsequently tested the proposed measure for three domains subject to an AI application (i.e. loan origination, job recruitment and medical treatment) in an online survey with German citizens. With our proposed measure of perceived threats of AI, we aim to cover all aspects of AI functionality and to make it applicable to the various societal fields in which AI applications are used. We highlight three contributions of our scale, which are addressed in the following:

  1. We underpin our scale development theoretically by connecting it with the psychological literature on fear appeals.

  2. The construction of the scale differentiates between the discrete functionalities of AI that may cause different emotional reactions.

  3. Moreover, we consider perceived threats of AI as dependent on the context of the AI’s implementation. This means that any measure must pay respect to AI’s domain-specificity.

The collected data support the factorial structure of the proposed TAI scale. Furthermore, results show that people differentiate between distinct AI functionalities, in that the extent of the functional reach and autonomy of an AI application evokes different degrees of threat perception irrespective of domain. Still, such distinct perceptions also differ between the domains tested. For instance, recognition and prediction with regard to a physical ailment, as well as the recommendation of a specific therapy made by an AI, do not evoke substantial threat perceptions. In contrast, autonomous decision-making in which an AI unilaterally decides on the prescribed treatment was met with relatively greater apprehension. At the same time, the application of AI in medical treatment was generally perceived as less fearsome than situations in which AI applications are used to screen applicants for a job or a financial loan.

Finally, to assess construct validity, we examined the effects of the Threats of Artificial Intelligence (TAI) scale on emotional fear. Threat perceptions are a necessary but not sufficient prerequisite of fear. While most research focuses directly on fear, we will subsequently argue for the benefits of addressing the preceding threat perceptions. Ultimately, the threat perceptions do in fact trigger emotional fear. Lastly, we discuss the adoption and use of the TAI scale in survey questionnaires and make suggestions for its application in empirical research, as well as general managerial recommendations with regard to public concerns about AI.

2 Public Perceptions of Recent Developments in Artificial Intelligence

In recent years, there has been renewed interest in applications of AI, driven by developments in computer technology that allow for the use of extensive processing power and the analysis of vast amounts of so-called Big Data through machine learning, deep learning and neural networks. Such applications gather under the label of AI, which is ascribed a huge impact on society as a whole [10]. AI has seen especially widespread use in business and public management [11]. As a consequence, the public discourse regarding AI is mainly driven by companies that provide AI technology and are looking for customers and markets for their products [1, 12, 13]. Meanwhile, empirical evidence from survey research supports the assumption that AI is not per se perceived as entirely positive by the public. A cross-national survey by Kelley et al. [10] shows that AI is connected with positive expectations in the field of medicine, but that reservations are prevalent concerning data privacy and job loss. Another concern is raised by Araujo et al. [14], who state that citizens perceive high risks regarding decision-making AI. Moreover, a representative opinion poll by Zhang and Dafoe [15] illustrates that Americans as well as citizens of the European Union (EU) believe that robots and AI could have harmful consequences for societies and should be carefully managed. Additionally, Gnambs and Appel [16] show that attitudes towards robots have recently changed for the worse in the EU. People express fear especially when it comes to the influence of robots on the economy and the substitution of the workforce [7, 9]. On a broader level, a recent study by Liang and Lee [8] inquiring about the fear of AI found that a considerable share of Americans reported fears when it comes to autonomous robots and AI.

3 Measuring the Fear of Autonomous Robots and Artificial Intelligence

Using data from the Chapman Survey of American Fears, Liang and Lee [8] set out to investigate the prevalence of fear of autonomous robots and artificial intelligence (FARAI). They come to the conclusion that roughly a quarter of the US population experienced a heightened level of FARAI. In the respective study, participants were confronted with the question “How afraid are you of the following?”. The FARAI scale was then built from these four items: (1) “Robots that can make their own decisions and take their own actions”, (2) “Robots replacing people in the work-force”, (3) “Artificial intelligence” and (4) “People trusting artificial intelligence to do work”. All items were rated on a four-point Likert scale with answers ranging from “not afraid (1), slightly afraid (2), afraid (3), to very afraid (4)” [8].
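To make the structure of the FARAI measure concrete, the following minimal sketch encodes the four items and the four-point answer format described above. The text only states that the scale was "built out of" the four items; combining them by a simple mean is an assumption made here for illustration, and the function name `farai_score` is hypothetical.

```python
from statistics import mean

# The four FARAI items as quoted from Liang and Lee [8].
FARAI_ITEMS = (
    "Robots that can make their own decisions and take their own actions",
    "Robots replacing people in the work-force",
    "Artificial intelligence",
    "People trusting artificial intelligence to do work",
)

# Four-point answer format as reported in the text.
ANSWER_LABELS = {
    1: "not afraid",
    2: "slightly afraid",
    3: "afraid",
    4: "very afraid",
}


def farai_score(responses):
    """Combine one respondent's four item ratings into a single score.

    ASSUMPTION: the items are averaged; the source does not specify
    how the composite was computed.
    """
    if len(responses) != len(FARAI_ITEMS):
        raise ValueError("expected exactly one rating per FARAI item")
    for rating in responses:
        if rating not in ANSWER_LABELS:
            raise ValueError(f"rating {rating!r} is outside the 1-4 answer format")
    return mean(responses)
```

A single composite of this kind mixes a general item (item 3), an economic item (item 2) and two functionality items (items 1 and 4), which is the heterogeneity discussed in the following paragraph.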

While the authors shed some first light on threat perceptions of artificial intelligence and generate valuable insights into various associations of FARAI with demographic and personal characteristics, there is room to improve the existing measurement of FARAI. As the FARAI scale was derived from a broad questionnaire concerning many possible fears people in the US might have, the measurement was not specifically developed to capture the distinctive fear of robots and of AI, respectively. The FARAI scale also varies in its scope. While item 3 broadly queries the fear of AI in general, item 2 inquires specifically about its impact on the economic sector. Items 1 and 4 query a specific functionality of AI, with item 4 focusing on the human-machine connection. Thus, the items are mixed in their expressiveness and aim at different aspects of AI’s impact. Accordingly, the scale does not allow for distinct assessments of AI or for the necessary specifications concerning its domain of application and the employed functions.

Besides, the public understanding of robots and AI might be influenced by popular imaginations from pop culture, science fiction, and the media, as is already implied by Liang and Lee [8] and Laakasuo et al. [17]. Due to the popularity of vastly different types of autonomous robots and AI in literature, film and comics, it is hard to pin down what exactly comes to a person’s mind when asked about either term. Delineating boundaries may not be possible when it comes to the public imagination. As survey research in the UK by Cave et al. [18] suggests, a quarter of the British population conflates the term AI with robots. Accordingly, a conceptual clarification of the distinction between the terms robot and artificial intelligence is required to begin with. The FARAI measure by Liang and Lee [8] mixes both terms, as two question items focus on each term, respectively. The two terms are often conflated in empirical research [19]. We believe that mixing the terms might lead to avoidable ambiguity and maybe even confusion, since people may think of two distinct and even completely different phenomena, or might not be able to distinguish between the two constructs at all. According to the Oxford English Dictionary, a robot is “an intelligent artificial being typically made of metal and resembling in some way a human or other animal” or “a machine capable of automatically carrying out a complex series of movements, esp. one which is programmable”, while AI is “the capacity of computers or other machines to exhibit or simulate intelligent behaviour”. There is certainly some conceptual overlap by definition, especially with regard to the capacity for intelligent behavior demonstrated by an artificial construct, that is, something that does not exist naturally but is human-made. It also cannot be ruled out that the appraisal of robots may be strongly associated with AI, especially when such robots are depicted as autonomous.

Recently, the term AI in particular has again received widespread attention. It describes techniques from computer science that gather many different concepts such as machine learning, deep learning or neural networks, which are the basis of autonomous functionality and pervasive implementation. As a consequence, we decided to focus our measurement solely on AI, as it captures the core issue of the nascent technology, i.e. autonomous intelligent behavior, which applies to many use cases that do not necessarily include a physical machine in motion.

4 Threat Perceptions as Precondition of Fear

There is plenty of literature on the subject of fear, especially from the field of human psychology. In general, fear is defined as a negative emotion that is aroused in response to a perceived threat [20]. When it comes to the origins of emotion, many studies rely on the appraisal theory of emotion: “The appraisal task for the person is to evaluate perceived circumstances in terms of a relatively small number of categories of adaptional significance, corresponding to different types of benefit or harm, each with different implications for coping” [21]. Accordingly, the authors define relational themes for different emotions. According to Smith and Lazarus [21], anxiety (or fear) is evoked when people perceive an ambiguous danger or threat that is motivationally relevant as well as incongruent with their goals. A threat is thereby seen as an “environmental characteristic that represents something that portends negative consequences for the individuum” [22]. Furthermore, people perceive low or uncertain coping potential. In other words, fear is the result of a person’s appraisal process in which a situation or an object is perceived as threatening and relevant, and in which no potential for avoidance can be seen. If these appraisals are made, people react with fear and try to avoid the threat [21], e.g. by turning away from the object.

Many scholars build on appraisal theory to develop more specific theories on the mechanisms of fear. Especially in health communication, much work has been done on so-called fear appeals [22, 23]. In a nutshell, most fear appeal theories state that a specific object, event or situation (e.g., a disease) threatens the well-being of a person [24, 25]. With the development of the Extended Parallel Process Model (EPPM), Witte [25] theorizes that this threat is at first processed cognitively. In this process, the severity of and susceptibility to the threat as well as the coping potential (self and general), i.e. the amount of efficacy [25] or control [26], are rated. Depending on the weights of these cognitive appraisals, people react differently. Fear emerges when the threat perception is high while the coping perception is low. As a result, message denial arises, which is mostly characterized by avoiding the threat. On the other hand, when threat as well as coping potential are perceived as high, message acceptance results. If this happens, people actively engage with the threat, for example by gathering information about the threat or actively combating potential harms. In this case, fear does not emerge. Although empirical examinations of the EPPM have not clearly confirmed the model [27] and many suggestions for extending it have been made [28, 29], scholars agree upon the central persuasion effects of threat and coping perceptions [30]. Moreover, the EPPM commonly serves as a framework for further research [31].
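The two-step appraisal logic of the EPPM described above can be sketched as a small decision rule. This is an illustrative simplification, not an implementation of Witte's model: the 0 to 1 rating range, the averaging of the two threat and the two coping components, the `cutoff` value and all names are assumptions introduced here. The low-threat branch is not spelled out in the text; returning "no response" for it is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appraisal:
    """Cognitive ratings of a threat, each on a hypothetical 0-1 range."""
    severity: float        # how serious the threat is perceived to be
    susceptibility: float  # how likely one is to be affected
    coping_self: float     # perceived own ability to cope
    coping_general: float  # perceived general coping potential


def eppm_outcome(appraisal, cutoff=0.5):
    """Map an appraisal onto the outcomes named in the text.

    ASSUMPTIONS: components are averaged and compared to a single
    illustrative cutoff; the model itself prescribes neither.
    """
    ratings = (
        appraisal.severity,
        appraisal.susceptibility,
        appraisal.coping_self,
        appraisal.coping_general,
    )
    if any(not 0.0 <= r <= 1.0 for r in ratings):
        raise ValueError("all ratings must lie between 0 and 1")

    threat = (appraisal.severity + appraisal.susceptibility) / 2
    coping = (appraisal.coping_self + appraisal.coping_general) / 2

    if threat < cutoff:
        # Not described in the text; assumed here for completeness.
        return "no response"
    if coping < cutoff:
        # High threat, low coping: fear emerges and the threat is avoided.
        return "fear and message denial"
    # High threat, high coping: active engagement, no fear.
    return "message acceptance"
```

For example, `eppm_outcome(Appraisal(0.9, 0.8, 0.2, 0.1))` falls into the fear branch, whereas raising both coping ratings to 0.8 yields message acceptance. The sketch makes visible that threat perception precedes fear and that fear is only one of several possible outcomes.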

Transferred to the subject of perceived threats of AI, we believe that AI is best described as an environmental factor that might cause fear. However, AI should not be treated as a specific fear itself. In our view, fear may be the result of a cognitive appraisal process in which AI represents a potential origin of threat. Thus, we explicitly focus on threats of AI, not fear of AI. This idea becomes clearer when considering an actual situation. For example, a person is confronted with an AI system that decides on the approval of a credit. This person will most likely not be afraid of the computer system, but will rather cognitively evaluate the threat that such a system might pose to their well-being. The person then rates the probability of the system causing harm (e.g., if it denies the credit). If this process ends in a negative evaluation for the person, fear will be evoked. However, this fear is based on the threat the AI system poses and not on the AI system itself. This is crucial for our understanding of threats of AI.

5 Context Specificity of Threat Perceptions

It is also important to address the social situation in which a threat is perceived. Smith and Lazarus [21] already stated that an “appraisal can, of course, change (1) as the person-environment relationship changes; (2) in consequence of self-protective coping activity (e.g. emotion-focused coping); (3) in consequence of changing social structures and culturally based values and meanings; or (4) when personality changes, as when goals or beliefs are abandoned as unservicable”. Furthermore, Tudor [32] proposed a sociological approach to the understanding of fear. He developed a concept in which he distinguishes parameters of fear including environments, cultures and social structures.

Contexts can vary in manifold ways. A rather simple example of what Tudor [32] refers to as an environmental context is the case of the wild animal: a human being could face a tiger, but arguably there is a huge difference in fear reaction depending on whether one is confronted with the tiger in a zoo or in its natural habitat. Thus, the environmental factor “cage” has a huge impact on the incitement of fear. Additionally, cultural backgrounds can affect the way threats are perceived: “If our cultures repeatedly warn us that this kind of activity is dangerous, or that sort of situation is likely to lead to trouble, then this provides the soil in which fearfulness may grow” [32]. Lastly, social structures imply that the societal role of an individual might influence threat perceptions. This could be, for instance, the job position of an employee or simply membership in a specific societal group.

Furthermore, different social actors are able to influence the social construction of public fears, i.e. whether and how environmental stimuli are treated as threats [26]. According to Dehne [26], the creation of fear depends, among other factors, upon the transmission of information in a society. In particular, scientific, economic, political and media actors affect the social construction of threats. Depending on which actors take the largest share in the public discourse, different threat perceptions might emerge. For instance, for an AI application in medicine, we assume that science as well as the media lead the debate. In contrast, the debate on an AI application in the field of recruiting will probably be led by economic actors. It is plausible that such context dependencies (who informs the public about a specific AI application) have an influence on (threat) perceptions.

In summary, there are many (social) factors that shape the way emotions are elicited, leading to the conclusion that threat perceptions heavily rely on the context in which an individual encounters AI. Of course, we are not able to cover all possible contexts of AI-related threats. However, we distinguish two context groups that are important for the understanding of TAI: AI functionality and distinct domains of AI applications.

5.1 Distinct Dimensions of AI Functionality

What an AI is capable of or supposed to do may have a decisive effect on the appraisal of AI applications. However, AI is a generic term that unites many different functionalities. In the scientific community, there are manifold definitions of the term AI and of what can and cannot be counted as an AI system. Although there is no single definition, most scholars agree upon central functionalities that AI systems can perform. Nevertheless, there is no consensus on how to group these functionalities. For example, Hofmann et al. [33] identify perceiving, identification, reasoning, predicting, decision-making, generating and acting as AI functions. We, however, base our approach on the periodic systems of AI [34] and group AI functionalities into four categories, which undoubtedly intersect with each other: recognition, prediction, recommendation and decision-making.

Notably, our approach is quite similar to that of Hofmann et al. [33]. However, we subsumed generating and acting under the category of decision-making, as we focus on AI that acts autonomously in that category. Additionally, we merged perceiving and identification into one category. This decision was based upon the results of a pre-test of the scale, which we conducted with 304 participants. Our results show that participants could not differentiate between the perceiving and identification functions.

In the following, we elaborate on our proposed AI function classes:

5.1.1 Recognition

Recognition describes the task of analyzing input data in various forms (e.g., images, audio) and recognizing specific patterns in the data. Depending on the application, these patterns can vary hugely. In a health application, AI recognition is used to detect and identify breast cancer [35]. In the economic sector AI systems promise to detect (personal) characteristics and abilities of potential employees via their voices and / or faces [36].

5.1.2 Prediction

In prediction tasks, AI applications forecast future conditions on the basis of the analyzed data. Prediction differs from recognition in that it forecasts the development of specific states, whereas recognition mostly classifies the given data. In the medical sector, AI applications are able to calculate the further development of diseases on the basis of medical diagnoses and (statistical) reports [37].

5.1.3 Recommendation

Recommendation describes a task in the field of human-computer interaction. Here, AI systems directly engage with humans, mostly decision makers, by recommending specific actions. These actions are again highly dependent on the actual field of application. In the medical example, this could mean that the AI application, taking into account all given data, proposes a medical treatment to the doctor [38]. Notably, the decision to accept or decline this suggestion is still made by the physician or the patient, respectively.

5.1.4 Decision-Making

Finally, the functionality of decision-making refers to AI systems that operate autonomously. Oftentimes, these applications are also called algorithmic decision-making (ADM) systems. Here, AI systems learn and act autonomously after being carefully trained by developers. The most prominent application is without doubt autonomous driving [39]. However, decision-making tasks can also be found in other domains of application. For example, in medicine, AI systems could directly decide on the medical treatment of patients; in the higher education sector, ADM could decide on the admission of applicants to university [40]. With regard to human-computer interaction, an ADM system replaces the human task completely.

Two points are important to mention. Firstly, the functionalities can depend on each other and are thus not completely separable. Secondly, AI applications in specific fields do not necessarily have to fulfill all AI functionalities. Mostly, AI systems perform just one task and do not provide the others. We stress that threat perceptions of AI should not, in technical terms, be treated as a second-order factor. Rather, our scale provides a toolbox that can be used to cover threat perceptions of the different functionalities. However, we expect significant correlations between the functionalities. In conclusion, we pose the following research question:

RQ1: Do respondents have distinct threat perceptions regarding the different functions AI systems perform?

5.2 Distinct Domains of AI Application

As usual, social science research addresses the social change induced by technological phenomena and artifacts in various domains of public and private life. Depending on the domain of application, AI may be wholeheartedly welcomed or seen as a severe threat [19, 41]. For instance, imagining to hand over certain decisions to an AI may appear rather innocuous for certain lifestyle choices such as buying a product or taking a faster route to a destination, but may lead to reactance when perceived individual stakes are high, e.g. when AI interferes in life altering decisions that affect one’s health or career decisions. As applications of AI are expected to get implemented in manifold life domains, research will need to address the respective perceptions of the people affected. The domain specificity of effects is already an established approach in social science research; for instance Acquisti et al. [42] as well as Bol et al. [43] found that distinct application domains do matter in terms of online privacy behavior. Additionally, Araujo et al. [14] analyzed perceptions of automated decision-making AI in three distinct domains. Thus, we believe that a measurement of threat perceptions also needs to be adaptable to a multi-faceted universe of AI related phenomena, some of which might not even be known to date. Concludingly, we propose a measurement that is adaptable to every AI domain. As follows, the proposed TAI scale is tested in three different domains, namely loan origination, job recruitment, and medical treatment.

5.2.1 Loan Origination: Assessing Creditworthiness

AI technologies are already applied in the finance sector, for example in credit approval [44]. As credit approval is a more or less mathematical problem, it is reasonable that AI-based algorithms are applied for this purpose. These algorithms analyze customer data, calculate the risk of payment default, and can ultimately decide whether a loan is approved. As individual goals greatly depend on such decisions, this may pose a threat to individuals who believe that their input data might be deficient or assume that the processing is biased.

5.2.2 Job Recruitment: Assessing the Qualification and Aptitude of Applicants

Recently, AI applications have been introduced in human resource management, for example in recruiting [45]. More specifically, AI can be used to analyze and predict the performance of employees. Furthermore, AI-based systems are able to recommend or select potential job candidates [46]. However, the use of AI systems in human resource management carries several potential risks. For instance, algorithms based on existing job performance data may be biased and lead to discrimination against specific population groups [45, 47].

5.2.3 Health: Medical Treatment of Diseases

One of the most important fields of AI development and implementation is without doubt health care and medicine [48]. AI applications are used especially frequently in fields that rely on imaging techniques (e.g., radiology) [49]. Recent work shows that AI applications are particularly well suited to detecting and classifying specific diseases, for example breast or skin cancer, in X-ray images [48, 49]. Moreover, another AI application can identify gene-related diseases in facial images of patients [50]. Generally, people tend to have optimistic perceptions of the use of AI in medicine [10].

Summing up, it may be assumed that distinct domains of AI application cause different threat perceptions. As mentioned earlier, one possible explanation is that the public discourse, through which individuals mostly encounter AI, is led by different actor groups. Another reason to believe that domains vary lies in the actual tasks AI systems perform and the severity individuals ascribe to them. Presumably, personal relevance appraisals also play a major role in the level of threat individuals ascribe to distinct domains. An individual who does not plan to apply for a loan will probably rate the use of an AI system for credit approval as less threatening than a person who is in dire need of one. Admittedly, we can only focus on a small sample of potential AI domains. We therefore formulate the following hypothesis:

H1: Threat perceptions of AI differ between distinct domains of AI application.

6 Fear Reactions Towards AI

As outlined in Sect. 4, perceived threats of AI are a precondition of emotional (fear) reactions. Thus, we assume that threat perceptions concerning AI actually trigger fear reactions. Accordingly, hypothesis 2 reads as follows:

H2: Threat perceptions of AI induce fear among respondents.

As threat perceptions of AI functionalities and domains may differ vastly from each other, we are interested in whether the amount of perceived fear (if any) that is explained by our proposed measure also differs by context. Arguably, not all threat perceptions necessarily cause the same fear reactions. For instance, if subjects perceive high levels of efficacy in dealing with the potential threat, a far weaker emotional reaction is likely to occur [25]. This becomes particularly obvious when comparing the recommendation and the decision-making functionality. Decision-making AI takes control away from the individual, whereas with recommendation at least one (other) human still has control over the process.

Therefore, we test whether the measure is able to capture induced fear reactions across the different contexts and whether these differ. Accordingly, we pose and address the following research question:

RQ2: Does the influence of threat perceptions of AI on fear differ between contexts?

7 Method

Accordingly, we set out to develop a measurement scale for survey research on AI that addresses the threat perceptions of people who are confronted with various forms of AI implementation. We explicitly emphasize that the proposed scale addresses the perceptions of individuals. Hence, what matters is not what an AI system actually does on a technical level, but how different ideal functionalities are seen by respondents, who usually do not have much knowledge about AI technology. To be valid as a measure of threat perceptions, the scale must be applicable to respondents and their individual imaginations of AI. Again, these need not correspond to the “technical” or “mathematical” level of actual AI systems. Rather, respondents only need to differentiate between the observable functions AI systems perform.

Thus, the aim is to reliably and validly assess the extent to which respondents perceive autonomous systems as a threat to themselves. Moreover, the scale needs to be standardized to allow for comparisons between samples from various populations, yet flexible enough to be applied in distinct domains of AI research. It also needs to be as concise as possible so that it can be included in brief questionnaires.

We tested our scale using a non-representative German online access panel. We used an online questionnaire with a split survey design comparing threat perceptions towards AI in three sectors: loan origination, job recruitment, and medical treatment. Participants were randomly assigned to one of the three groups. We chose the method of matching urns after completion of the questionnaire to ensure an even distribution among the different questionnaire groups.
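The paper does not describe the urn mechanism in detail. The following is a minimal sketch of one common form of urn randomization, assuming one ballot per group is drawn without replacement and the urn is refilled once it is empty. The function name `urn_assignments` and the seed are hypothetical and are not taken from the survey software.

```python
import random
from collections import Counter

GROUPS = ["loan origination", "job recruitment", "medical treatment"]

def urn_assignments(n_participants, groups=GROUPS, seed=None):
    """Assign participants by drawing group labels from an urn without replacement.

    Assumption: the urn holds one ballot per group and is refilled (and
    reshuffled) whenever it runs empty, so group sizes never differ by more
    than one.
    """
    rng = random.Random(seed)
    urn = []
    assignments = []
    for _ in range(n_participants):
        if not urn:
            urn = list(groups)
            rng.shuffle(urn)
        assignments.append(urn.pop())
    return assignments

counts = Counter(urn_assignments(891, seed=1))
print(counts)
```

Under this scheme the group sizes are balanced by construction. Unequal final group sizes can still arise when cases are excluded after assignment.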

8 Sample

Participants were recruited from the SoSci Open Access Panel between 30 September and 14 October 2019 [51]. In total, 917 subjects completed the questionnaire. During data cleaning, we dropped 26 cases from our data set. Data elimination was based on two criteria: the minus scores provided by the access panel and the time taken to complete the questionnaire. The minus scores are calculated on the basis of the sum of the deviations from the average time for answering individual questionnaire pages in order to identify inattentive respondents. We adopted the access panel’s recommended value for a conservative cut-off criterion to ensure a high-quality data set. Moreover, we checked the overall time participants needed to fill out the questionnaire. We excluded all participants with an answering time below five minutes, a minimum value we defined after a pre-test of our questionnaire. Thus, our final sample consists of n=891 participants.
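The two exclusion criteria can be expressed as a simple filter. This is an illustrative sketch only: the `Case` record, its field names, and the numeric minus-score cut-off are hypothetical, since the text gives the five-minute minimum but not the value of the panel's recommended cut-off.

```python
from dataclasses import dataclass

MIN_SECONDS = 5 * 60     # from the text: answering times below five minutes are excluded
MAX_MINUS_SCORE = 100    # hypothetical placeholder; the text only says "recommended value"

@dataclass
class Case:
    case_id: int
    minus_score: float   # inattentiveness score provided by the access panel
    seconds: float       # total time needed to complete the questionnaire

def keep(case: Case) -> bool:
    """A case is retained only if it passes both quality criteria."""
    return case.minus_score <= MAX_MINUS_SCORE and case.seconds >= MIN_SECONDS

# Four made-up cases: case 2 fails the minus-score check, case 3 the time check.
cases = [Case(1, 12, 840), Case(2, 150, 900), Case(3, 30, 240), Case(4, 0, 300)]
cleaned = [c for c in cases if keep(c)]
print([c.case_id for c in cleaned])  # prints [1, 4]

# Sample sizes reported in the text: 917 completed, 26 dropped.
assert 917 - 26 == 891
```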

Turning to the distribution by questionnaire group, 296 participants were assigned to the ‘loan origination’ group, 294 to the ‘job recruitment’ group and 301 to the ‘medical treatment’ group. Of all participants, 445 identify as female (50.0%) and 438 as male (49.2%), while seven (0.8%) identify as non-binary. The average age of the participants is approximately 46 years (SD=15.66). Because of the demographic structure of the access panel, 82 percent of our participants hold the highest German school leaving certificate. Our data are thus not representative of the German population, which should be acknowledged when interpreting the descriptive results. However, our primary interest, introducing a valid scale, remains unaffected by this limitation.

9 Measurement

9.1 Threat Perceptions of Artificial Intelligence

We propose a measurement for threat perceptions concerning AI based on the specific functions that AI systems can perform. We identified ‘recognition’, ‘prediction’, ‘recommendation’ and ‘decision-making’ as the core functions that current AI systems perform from a user’s perspective. The phenomenon of AI was first explained to the participants in a short text, which also mentioned that AI currently draws widespread public attention. Furthermore, a broad definition of AI systems and their functionality was given in a neutral tone, together with an explanation of how AI systems could be used in the specific context presented to the respondents.

To achieve context independence, we then developed formal items with cloze blanks for the specific thematic foci, which can be seen as a toolbox that is customizable for distinct areas of application. Altogether, participants had to rate twelve statements on 5-point Likert scales (1=“non-threatening” to 5=“very threatening”). The question block with the statements was introduced with the following text: “If you now think of the use of AI in [specific context], how threatening do you think computer applications of artificial intelligence are that...”. The items for the specific functionalities read as follows:

Recognition: “(...) detect (object)” (RCG1), “(...) record (object)” (RCG2) and “(...) identify (object)” (RCG3).

Prediction: “(...) forecast the development of (object)” (PDC1), “(...) predict the development of (object)” (PDC2) and “(...) calculate the development of (object)” (PDC3).

Recommendation: “(...) recommend (action)” (RCM1), “(...) propose (action)” (RCM2) and “(...) suggest (action)” (RCM3).

Decision-making: “(...) decide on (action)” (DSM1), “(...) define (action)” (DSM2) and “(...) preset (action)” (DSM3).

The brackets for object were filled with the terms I) “diseases”, II) “suitability of applicants/work performance”, and III) “probability of default of credits/creditworthiness”. The brackets for action were filled with the terms I) “medical treatment”, II) “hiring applicants”, and III) “granting of credits”. An example sentence reads as follows: “(...) how threatening do you think computer applications of artificial intelligence are that recommend a medical treatment.”
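The toolbox idea, twelve fixed item templates whose blanks are filled per domain, can be sketched as follows. The item codes, templates and fill-in terms are taken from the text; the dictionary layout and the helper `build_items` are illustrative assumptions. The sketch performs plain string substitution, so grammatical details such as the article in "recommend a medical treatment" would still need manual adjustment.

```python
# Item templates by code; {object} and {action} are the cloze blanks.
ITEM_TEMPLATES = {
    "RCG1": "detect {object}",
    "RCG2": "record {object}",
    "RCG3": "identify {object}",
    "PDC1": "forecast the development of {object}",
    "PDC2": "predict the development of {object}",
    "PDC3": "calculate the development of {object}",
    "RCM1": "recommend {action}",
    "RCM2": "propose {action}",
    "RCM3": "suggest {action}",
    "DSM1": "decide on {action}",
    "DSM2": "define {action}",
    "DSM3": "preset {action}",
}

# Fill-in terms per domain, as listed in the text.
DOMAIN_FILLS = {
    "medical treatment": {
        "object": "diseases",
        "action": "medical treatment",
    },
    "job recruitment": {
        "object": "suitability of applicants/work performance",
        "action": "hiring applicants",
    },
    "loan origination": {
        "object": "probability of default of credits/creditworthiness",
        "action": "granting of credits",
    },
}

def build_items(domain):
    """Return the twelve item wordings for one domain (hypothetical helper)."""
    fills = DOMAIN_FILLS[domain]
    return {code: template.format(**fills) for code, template in ITEM_TEMPLATES.items()}

for code, wording in build_items("medical treatment").items():
    print(code, "(...)", wording)
```

Adapting the scale to a new domain would then only require adding one entry with an object term and an action term.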

9.2 Emotional Fear Reaction

Lastly, emotional responses towards the use of AI in the respective domain of application were measured. Participants had to rate how strongly they experienced the emotion of fear on 5-point Likert scales (1=“not at all” to 5=“very strong”). Fear was measured through the items “afraid” (FEAR1), “frightened” (FEAR2), and “anxious” (FEAR3) [52].

10 Results

To test our measurement, we performed confirmatory factor analyses (CFA) with the lavaan package [53] in R (version 4.0) and computed several test statistics with the semTools package [54]. For visualization we used the semPlot package [55]. First, we calculated a CFA with configural invariance. Second, to check whether the factor loadings differ between the applications, we calculated a model with measurement invariance. Third, we constrained the item intercepts to check for scalar invariance, i.e., to be able to analyze whether threat perceptions differ between application areas. Lastly, we freed one intercept to obtain our final model; that is, our results suggest the TAI scale as a measurement with partial scalar invariance. To check the influence of the TAI scale on emotional fear, we built a structural equation model (SEM) with emotional fear as the dependent variable. We elaborate on our findings below.

10.1 Descriptives

We first calculated descriptive statistics (mean, standard deviation, skewness and kurtosis) for all scale items separately for each domain (Table 1). For each AI functionality, the descriptive values of the threat perceptions are quite similar across domains, but they differ considerably between functionalities. For example, the decision-making functionality provoked the highest threat perceptions in all domains.

Table 1 Descriptives

To check the reliability and factorial validity of the latent variables, we first calculated several test indices (Cronbach’s alpha, omega, omega2, omega3 and average variance extracted; Table 2). Cronbach’s alpha values are good, varying between .80 and .92, indicating a satisfactory reliability of the latent variables. The average variance extracted varies between min=.598 and max=.798, with values >.50 regarded as good [56].
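For readers unfamiliar with these indices, the following sketch shows how two of them are commonly computed: Cronbach's alpha from raw item scores, and the average variance extracted (AVE) as the mean squared standardized loading. The response data and loadings are made up for illustration. The paper itself obtained its values with semTools in R, and the omega coefficients are not covered here.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of k lists with one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def average_variance_extracted(std_loadings):
    """AVE for one factor, assuming fully standardized loadings."""
    return sum(loading ** 2 for loading in std_loadings) / len(std_loadings)

# Hypothetical answers of five respondents to three items of one factor (1-5 scale).
responses = [
    [2, 3, 4, 4, 5],
    [1, 3, 4, 5, 5],
    [2, 2, 4, 4, 5],
]
print(round(cronbach_alpha(responses), 3))

# Hypothetical standardized loadings of the three items.
print(round(average_variance_extracted([0.80, 0.75, 0.85]), 3))
```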

In combination with considerable covariance among the latent factors, this raises questions concerning the discriminant validity of the latent factors of the specified model [57]. Thus, a Fornell-Larcker test was performed for each model separately. For this test, the squared correlation between two factors is compared with the average variance extracted for each factor; the former needs to be lower than the latter. As this was the case for all factors in the three domains of application, the results suggest discriminant validity between the latent factors within each group.
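The Fornell-Larcker comparison described above amounts to a pairwise check, sketched here with hypothetical AVE values and factor correlations (the paper reports only the AVE range, so none of these numbers are the study's results). The helper name `fornell_larcker` is an assumption.

```python
from itertools import combinations

def fornell_larcker(ave, correlations):
    """Return the factor pairs that violate the Fornell-Larcker criterion.

    A pair passes if its squared correlation is lower than the AVE of both
    factors, i.e. lower than the smaller of the two AVE values.
    """
    violations = []
    for a, b in combinations(sorted(ave), 2):
        r = correlations[frozenset((a, b))]
        if r * r >= min(ave[a], ave[b]):
            violations.append((a, b))
    return violations

# Hypothetical values for illustration only.
ave = {
    "recognition": 0.60,
    "prediction": 0.65,
    "recommendation": 0.70,
    "decision-making": 0.80,
}
corr = {frozenset(pair): 0.50 for pair in combinations(ave, 2)}
print(fornell_larcker(ave, corr))  # an empty list means discriminant validity is supported

# A correlation of .85 (squared: about .72) exceeds the smaller AVE of .70.
corr[frozenset(("recommendation", "decision-making"))] = 0.85
print(fornell_larcker(ave, corr))
```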

Table 2 Reliability values

10.2 Measurement Invariance

Before addressing the hypotheses, we need to check for measurement invariance, i.e., whether the measurement of the construct can be considered invariant between groups. Only then is a mean comparison of the construct between groups viable. A first CFA model addresses configural invariance. In our model, the four latent factors are each measured with the three manifest indicators described in the measurement section. Furthermore, we assume covariances between all latent factors, as every dimension reflects a particular aspect of perceptions regarding threats of AI. The chi-square test of model fit reaches significance, χ2(144)=207.091, p<.001. However, the approximate fit indices show good fit for the model, TLI=.989, RMSEA=.038 (.026, .050), SRMR=.026. Following the suggestion by Vandenberg [58], we do not automatically reject the present model with high degrees of freedom and a considerable sample size on the basis of the strict chi-square test, but look into the reasons for any misspecification. Results show that the unexplained variance in the specified model stems from cross-loadings of items RCG2 and RCG3 on the dimension of prediction in the loan origination condition. While freeing the respective parameters would improve model fit, no such action is taken at this point.

In a first modification step, we calculated a CFA model with measurement invariance. To this end, we constrained the factor loadings of all items, assuming that the items load equally on the latent factors in each group. We compared the measurement invariance model with the original configural invariance model. The chi-square difference test shows that the measurement invariance model does not fit the data significantly worse than the configural invariance model, Δχ2(16)=26.152, p=.052. Accordingly, our data support the assumption of measurement invariance, i.e., that the factor loadings are equal across the different domains of application.

In a second modification step, we calculated a CFA model with scalar invariance. Accordingly, we constrained the intercepts of the items and compared the fit of the model with the measurement invariance model. The chi-square difference test suggests that the scalar invariance model performs significantly worse than the measurement invariance model, Δχ2(16)=32.642, p=.008. To detect non-invariant intercepts across the groups, we consulted the modification indices. These suggest that the intercept of item DSM3 is non-invariant (modification index = 7.44). Accordingly, we freed the intercept constraint of item DSM3 and calculated a partial scalar invariance model. The chi-square difference test shows that the model with partial scalar invariance does not fit the data significantly worse than the measurement invariance model, Δχ2(14)=21.311, p=.094.

In a third modification step, we also constrained the residuals and hence calculated a model with strict invariance. According to our results, the strict invariance model fits significantly worse, Δχ2(24) = 123.12, p<.001. Thus, the assumption of strict invariance is rejected.
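The p-values of these nested-model comparisons follow directly from the chi-square distribution with the reported difference in degrees of freedom. The sketch below recomputes them with the Python standard library only; `chi2_sf` is a hypothetical helper that uses the closed-form recurrence of the upper incomplete gamma function for integer degrees of freedom. It is not the routine used by lavaan.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for _ in range((df - 1) // 2):
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q

# (delta chi-square, delta df) as reported in the text.
comparisons = {
    "measurement vs. configural": (26.152, 16),
    "scalar vs. measurement": (32.642, 16),
    "partial scalar vs. measurement": (21.311, 14),
    "strict invariance step": (123.12, 24),
}
for label, (delta_chi2, delta_df) in comparisons.items():
    print(f"{label}: p = {chi2_sf(delta_chi2, delta_df):.3f}")
```

The recomputed values should agree with the reported p-values up to rounding, which offers a quick plausibility check of the invariance tests.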

Consequently, the model with partial scalar invariance will be discussed. The strict chi-square test for this model reaches significance, χ2(174)=254.555, p<.001. Again, the approximate fit indices show good fit for the model, TLI=.988, RMSEA=.039 (.028, .050), SRMR=.037. Once more, allowing for cross-loadings and correlated error terms of indicators from the same latent factor would improve model fit. However, no action for respecification is taken at this point.
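The degrees of freedom reported for the invariance models can be retraced by counting observed moments and free parameters. The bookkeeping below is a sketch under stated assumptions that the text does not spell out: one marker loading per factor is fixed for identification, a mean structure is included, and latent means are fixed to zero in the reference group and freed in the other groups once intercepts are constrained.

```python
GROUPS, ITEMS, FACTORS = 3, 12, 4

# Observed moments per group: variances/covariances plus means.
moments = ITEMS * (ITEMS + 1) // 2 + ITEMS               # 78 + 12 = 90

# Free parameters per group in the configural model.
free_loadings = ITEMS - FACTORS                           # 8 (one marker item per factor)
factor_covariances = FACTORS * (FACTORS - 1) // 2         # 6
params = (free_loadings + ITEMS + ITEMS                   # loadings, residual variances, intercepts
          + FACTORS + factor_covariances)                 # factor variances and covariances -> 42

df_configural = GROUPS * (moments - params)               # 3 * 48 = 144

# Equal loadings across groups: 8 constraints in each of the 2 non-reference groups.
df_measurement = df_configural + (GROUPS - 1) * free_loadings       # 160

# Equal intercepts: 24 constraints, minus 8 latent means freed in non-reference groups.
df_scalar = df_measurement + (GROUPS - 1) * (ITEMS - FACTORS)       # 176

# Partial scalar invariance: the DSM3 intercept is freed in the 2 non-reference groups.
df_partial_scalar = df_scalar - (GROUPS - 1) * 1                    # 174

print(df_configural, df_measurement, df_scalar, df_partial_scalar)
```

Under these assumptions the counts reproduce the reported χ2(144) for the configural model, the difference tests with 16 and 14 degrees of freedom, and the χ2(174) of the final model.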

Results suggest that the measurement in the first group, the loan origination domain, appears to be somewhat problematic. Not only did the estimated model suggest unanticipated cross-loadings of indicators of the recognition function on the latent factor of prediction, but one item intercept of the decision-making factor is also non-invariant. This indicates that the measurement in the loan origination group did not work as well as intended. However, model fit as indicated by the approximate fit indices was still satisfactory.

Turning to RQ1, we find that individuals in fact have different threat perceptions regarding distinct AI functionalities. Across all tested domains, respondents perceived recognition, prediction, recommendation and decision-making as different, yet related, functionalities of AI systems. This supports our proposed measurement. Irrespective of the slightly problematic values in the loan origination domain, the distinction between the functionalities proves to be quite stable and consequently appears to be feasible.

10.3 Mean Differences of AI Functions Between Conditions

After having established partial scalar invariance, the next step is to address the mean comparisons between the three domains to test H1. H1 states that there are differences regarding the threat perception of each function between the domains of AI application. The first domain regarding the application of AI in loan origination serves as a reference group. Accordingly, the means of the four latent factors of the four functions in this group are constrained to zero.

Results show that, compared with the reference group, prediction (ΔM=.918, p<.001) and decision-making (ΔM=.327, p<.001) appeared to be significantly more threatening in domain 2 (‘job recruitment’), while recognition (ΔM=.082, p=.355) and recommendation (ΔM=.137, p=.116) did not differ between the two conditions. Thus, the job recruitment domain was perceived as more threatening than the loan origination condition in two out of four functionalities.

In domain 3 (‘medical treatment’), recognition (ΔM=-1.190, p<.001), prediction (ΔM=-.501, p<.001) and recommendation (ΔM=-.763, p<.001) were perceived as significantly less threatening than in the reference group, while there was no difference with regard to decision-making (ΔM=-.125, p=.139). Here, the loan origination domain was perceived as more threatening in three out of four functionalities.

When comparing domain 2 (‘job recruitment’) with domain 3 (‘medical treatment’), the results indicate that recognition (ΔM=-1.273, p<.001), prediction (ΔM=-1.419, p<.001), recommendation (ΔM=-.900, p<.001), and decision-making (ΔM=-.452, p<.001) all differed significantly between the two groups. In other words, the AI application for the treatment of health problems was deemed less threatening than an AI that was deployed to assess candidates for a job.
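Because both sets of latent mean differences reported earlier are expressed relative to the loan origination reference group, the direct comparison between medical treatment and job recruitment should equal the difference of the two. The short check below uses only the values reported in the text; the variable names are illustrative.

```python
# Latent mean differences relative to the reference group (loan origination).
vs_reference = {
    "job recruitment": {
        "recognition": 0.082, "prediction": 0.918,
        "recommendation": 0.137, "decision-making": 0.327,
    },
    "medical treatment": {
        "recognition": -1.190, "prediction": -0.501,
        "recommendation": -0.763, "decision-making": -0.125,
    },
}

# Differences reported for medical treatment versus job recruitment.
reported = {
    "recognition": -1.273, "prediction": -1.419,
    "recommendation": -0.900, "decision-making": -0.452,
}

for function, reported_value in reported.items():
    implied = (vs_reference["medical treatment"][function]
               - vs_reference["job recruitment"][function])
    print(f"{function}: implied {implied:.3f}, reported {reported_value:.3f}")
```

The implied and reported values agree within rounding error of the third decimal, so the three pairwise comparisons are internally consistent.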

Summing up the results, the usage of AI in medical treatment is perceived as less threatening in nearly all functionalities compared to the usage of AI for loan origination or job recruitment. On the other hand, the job recruitment domain was rated as most threatening compared to the other domains - at least regarding the prediction and the decision-making functionality.

Consequently, H1, which assumed differences in the threat perceptions between the different domains of AI application, was partially accepted. Threat perceptions regarding AI functionalities appear to be largely domain-dependent.

10.4 Effects on Fear

Fig. 1 SEM model

To assess whether threat perceptions are able to predict respondents’ reported fear, a structural regression model was specified. Here, the three items that served as indicators of fear load on a latent factor that is modeled as an endogenous dependent variable. The four latent factors of threat perceptions of AI system functions are modeled as exogenous independent variables. Fig. 1 displays the model.

Before the analysis, measurement invariance of the fear construct was assessed. The results suggest that the measurement of the fear construct is non-invariant between the three groups. In particular, the loading of item FEAR1 is considerably non-invariant. Consequently, the equality constraints for the loading and intercept of this item were freed for the analysis. The chi-square difference test indicates that this model with partial invariance still fits significantly worse than the configural model, Δχ2(4) = 11.95, p = .018. However, there is a substantial improvement in the comparative fit indices between the model with measurement invariance and scalar invariance, ΔTLI = .003. While the actual measurement of the fear construct is not optimal, we still rely on this previously tested measurement of fear from the literature to analyze the connection between threat perceptions and self-reported fear.

For the actual analysis, in a first step, a model was estimated in which the regression coefficients of threat perceptions on the induced fear could vary freely, χ2(274)=376.578, p<.001. Again, the approximate fit indices show good fit for the model, TLI=.987, RMSEA=.036 (.026, .044), SRMR=.039. However, as the threat perceptions are highly correlated, the standard errors of the estimated parameters are inflated, and it needs to be tested whether their respective effects actually differ from each other [59, 60]. Accordingly, a second model was specified in which the effects of the four latent factors on fear were constrained to be equal within each group. The chi-square difference test for the first unconstrained model and the model with equality constraints shows that the second model does not fit the data significantly worse, Δχ2(9) = 12.976, p=.164.

The model with equality constraints for the threat perceptions within each group still suggests good fit, χ2(283) = 389.555, p<.001, TLI =.987, RMSEA =.036 (.026, .044), SRMR =.041. The parameter estimates show a small effect of the threat perceptions of AI on fear for group 1 (‘loan origination’), B(SE)=.184(.014), p<.001, group 2 (‘job recruitment’), B(SE)=.189(.018), p<.001, and group 3 (‘medical treatment’), B(SE)=.188(.017), p<.001, respectively. The effect size ranges from βmin =.135 to βmax =.203. Accordingly, H2 was accepted.

Finally, RQ2 asks whether the effect of threat perceptions on fear differs between the domains. The similarity of the parameter estimates suggests that the effect of threat perceptions of AI on fear is equal between the groups. Consequently, a model was specified in which the effect of the latent factors of threat perceptions on fear was constrained to be equal not only across the AI functions within each group, but also between the domains. The chi-square difference test comparing the model with within-group equality constraints and the model with additional between-group equality constraints shows that the latter does not fit the data significantly worse, Δχ2(2)=0.059, p=.971.

The model with equality constraints for the effect of threat perceptions of AI on perceived fear within and between the domains still suggests good fit, χ2(285)=389.614, p<.001, TLI=.987, RMSEA=.035 (.026, .044), SRMR=.041. The parameter estimates show that threat perceptions of AI have a small but significant effect on reported fear in group 1 (‘loan origination’), group 2 (‘job recruitment’), and group 3 (‘medical treatment’), B(SE)=.186(.010), p<.001. The effect size ranges from βmin=.133 to βmax=.202.

11 Discussion

In this paper, we introduced a scale to measure threat perceptions of artificial intelligence. The scale can be used to assess citizens’ concerns regarding the use of AI systems. As AI technologies are increasingly introduced into everyday life, it is crucial to understand under what circumstances citizens might feel threatened. Threats can be understood as a precondition of fear. According to the fear appeal literature, being frightened can in turn lead to denial and avoidance of the threatening object. Thus, if people perceive AI as a serious threat, this could lead to non-adoption of the technology.

However, AI is an umbrella term for a huge variety of different applications. AI applications can fulfill various functions and are used in almost every societal field. Arguably, threat perceptions vary considerably across different functions and domains of application. With the TAI scale, we propose a measurement that accounts for this context-specificity.

First, the results suggest that threat perceptions of distinct AI functions can be reliably differentiated by respondents. Recognition, prediction, recommendation, and decision-making are indeed perceived as different functions of AI systems. However, depending on the context evaluated, the measure showed diverging factorial validity. In one case, the indicator items shared significant variance with more than one dimension. This impairment of discriminatory power indicates that thorough pre-testing of the adapted measures and data quality control are of utmost importance when devising the survey instrument in subsequent study designs. In doing so, researchers need to make sure that respondents fully comprehend the item wording and that the object of potential threat is clearly recognizable. This becomes especially important when respondents are confronted with new and technically sophisticated AI systems with which they do not yet have enough direct personal experience.

Second, threat perceptions are shown to vary between the domains in which AI systems are deployed. This suggests that the notion of a general fear of AI needs to be extended in favor of a broader conception not only of what actions AI is able to perform, but also of what exactly is at stake in a given situation. While AI systems may seem useful and the consequences of their application insubstantial in one domain, the introduction of AI in another domain might evoke entirely opposite reactions. Thus, while general predispositions concerning digital technology certainly play a role in the evaluation of innovative AI systems, a more fine-grained approach is necessary and, with the developed measurement, appears to be fruitful. Respondents’ threat perceptions in this study varied considerably between domains. In particular, the use of AI in medical treatment was perceived as only mildly threatening, whereas threat perceptions were considerably higher in the domains of job recruitment and loan origination. Regarding the levels of threat perceptions concerning the functionalities, the decision-making function is perceived as most threatening within all three domains. Arguably, this might be due to a perceived loss of human autonomy. As this is only a hypothesis at this point, further studies should elaborate on these findings. Future applications of the TAI scale will yield further insights concerning the items’ and scales’ sensitivity with regard to different domains of AI applications.

Third, threat perceptions are reliable predictors of self-reported fear. As measuring actual fear by means of a survey instrument is rather complicated, assessing threat perceptions not only appears preferable; the results also suggest that threat perceptions are reasonably well connected to individual self-reports of experienced fear. Further studies should elaborate on these findings and focus on the behavioral impact of AI-related threat perceptions. As the fear appeal literature suggests, one might expect that high levels of AI-induced fear lead to rejection of the technology or even protest behavior.

Given the good fit of our scale, we encourage researchers to implement the TAI scale in research focusing on public perceptions of AI systems. As outlined earlier, the TAI scale can be seen as a toolbox. Hence, it is possible to integrate into a survey only those functional dimensions that actually fit the AI system under research. Nevertheless, we also advise researchers to be mindful when using the scale. In practice, there is a lot of confusion about the terminology of AI, even within the scientific community. Researchers using the scale have to make sure that the AI system under research actually performs AI tasks. Given that respondents are usually non-experts, scholars have the responsibility to inform them accurately about what the specific AI system under consideration is and what it is able to do. Otherwise, researchers would make claims about threat perceptions of AI without actually examining AI.

Future studies should test the TAI scale in surveys employing representative sampling in order to make statements about the actual level of threat perceptions regarding the different functionalities of AI systems in various domains. This would make it possible to grasp the public threat perception of AI systems and to draw conclusions for the further implementation of AI in society. However, such research needs to be thoroughly theorized, as mere information on the level of threat perceptions of AI among the public is of little academic value.

Another promising direction for future research is the role of knowledge in attitude formation. Knowledge about AI technology could influence the extent to which individuals feel threatened by AI. Additionally, such an extension of research would allow testing whether perceptions of what an AI system is capable of in fact match its actual technical capabilities. It is possible that what individuals imagine and how AI systems really perform do not correspond.

12 Limitations

As this study attempts to develop a more fine-grained approach to measuring threat perceptions regarding AI, its focus lay on scale construction and on testing the application of the scale in an online survey. The results are thus limited to German online users from a non-representative online access panel. Further research should extend the range of AI application domains and address further groups of stakeholders and, especially, the behavioral consequences of perceived threats of AI.

Furthermore, translating the scale into other languages appears to be another promising avenue. As Gnambs and Appel [16] showed based on longitudinal data from the Eurobarometer, attitudes towards robots and autonomous systems vary between countries and might be subject to cultural influences. This warrants research that illuminates divergent perceptions and their antecedents.

Finally, we point out that, although we refer to the periodic system of AI [34] and the study of Hofmann and colleagues [33], the functional classes may be considered somewhat arbitrary. As AI is a very broad term, there might be other possibilities for dimensional structures of a scale focusing on public threat perceptions. However, our results give support for the dimensional structure we proposed.

13 Conclusion

The public perception of Artificial Intelligence will become increasingly important as applications that make use of AI technologies proliferate further in various societal domains. A populace that perceives AI as threatening and consequently fears its proliferation may prove as detrimental as blind trust in the benevolence of actors that implement AI systems or a general overestimation of the veracity of assertions and decisions made by AI. Consequently, the survey of threat perceptions of various AI systems is of great research interest. In this paper, we proposed and constructed a measurement of threat perceptions regarding AI that is able to capture various functions performed by AI systems and that is adaptable to any context of application that is of interest. The developed TAI scale showed satisfactory results in that it reliably captured threat perceptions regarding the distinct functions of recognition, prediction, recommendation, and decision-making by AI. The results also suggest that the developed scale is able to elucidate differences in these threat perceptions between distinct domains of AI applications.