A Theoretical Framework for Conversational Search

Filip Radlinski*
filiprad@microsoft.com
Microsoft, United Kingdom

Nick Craswell
nickcr@microsoft.com
Microsoft, United States

Abstract

This paper studies conversational approaches to information retrieval, presenting a theory and model of information interaction in a chat setting. In particular, we consider the question of what properties would be desirable for a conversational information retrieval system so that the system can allow users to answer a variety of information needs in a natural and efficient manner. We study past work on human conversations, and propose a small set of properties that taken together could measure the extent to which a system is conversational. Following this, we present a theoretical model of a conversational system that implements the properties. We describe how this system could be implemented, making the action space of a conversational search agent explicit. Our analysis of this model shows that while theoretical, the model could be practically implemented to satisfy the desirable properties presented. In doing so, we show that the properties are also feasible.

Keywords: Conversational Search, Chatbot, Personal Agent

1. INTRODUCTION

Recent progress in Machine Learning has brought tremendous improvements in natural language dialogs between humans and conversational agents. This has led to a plethora of commercial conversational agents (also called chat bots or simply bots) that are able to answer user requests from ordering pizza to suggesting holiday destinations. Such systems are conversational in that they assist users using a dialog interaction, be it in written or spoken form, usually with a rich human-like vocabulary.
To build an information retrieval system with a conversational user interface, it is useful to define a computational model that describes the process of conversational search. The model should allow the user to make a natural language request, akin to a traditional information retrieval query. It should allow the system to propose search results, but also ask the user for clarification if necessary. It should allow the user to give feedback on the system's results and suggestions, including negative feedback. Over time, the process should allow the system to build a cumulative picture of the user's information need based on their query statements and other relevance feedback.
We observe that conversational search is in keeping with trends in the design of computing devices and interfaces [15]. Modern devices with small or no screen may provide responses via small on-screen cards and speech synthesis, so succinct conversational responses are appropriate. With speech recognition accuracy also improving due to progress in machine learning, the popularity of speech-based search input is also growing. Such a growth in natural language dialog between users and search systems may even lead to the dominant interaction model of one-shot keyword queries being displaced by conversational systems.
To build a computational model for conversational search it is important to define which steps are allowable during a conversation: the types of statements that can be made by the system and by the user. The system must build a model of the user's information need over the course of the conversation, such that he or she does not need to repeat important aspects of the information need. Cumulative clarifications should tend to move the process closer to success. To make the conversation more flexible and natural, ideally most conversational steps that humans would take should be interpretable by the system and also potentially generated by the system. For example, in a real conversation about restaurants a person might ask "Do you like sushi? I went to a great place yesterday" or perhaps ask key questions such as "Are you looking for somewhere fancy?" In certain contexts the question may not directly appear to be about food, such as "How much time do you have?" People take into account what they know about the other person from past conversations and other aspects of context, and even ask for direct feedback: "What did you think of the restaurant that I suggested to you last week?"
Many conversational search tasks are similar: People offer a reference point, or a key choice, to elicit the information they consider most important for separating the sorts of places that they might recommend. In the restaurant domain, people very rarely enumerate the types of cuisines or ask for a specific limit on the number of miles you are willing to travel. The same applies in other domains, such as when searching personal photo collections - a particular photo may serve as a reference point from which the target may for instance be earlier/later, in a different location, but with the same people.
In the field of spoken dialog systems, approaches already exist allowing conversational slot filling of a structured query within a schema (e.g. [44]). This allows users to book a ticket for a certain concert on a certain night, or set a certain reminder message to appear on their mobile device at a certain time. By contrast, in other conversational search scenarios there may not be a fixed schema or the underlying data may not be structured as a database. In those cases rather than slot filling it may be beneficial to use a more freeform conversation that nevertheless builds its understanding of the user's needs over multiple rounds of conversation and may provide responses as well as ask clarifying questions.
We hypothesize that two aspects of conversations are particularly pertinent to search settings. First, users often do not know how to describe their information need - be it for a recommendation, or information regarding a new topic. Part of the role of the conversation is to elicit the actual need from the user by helping them formulate it clearly [24]. Second, for many tasks particularly suited to multi-turn conversational interactions, a set of results interact to produce a single item response which satisfies the need. For instance, when selecting a product to purchase, it is often driven by a preference among available options [13]. On the other hand, in some settings the solution is by its very nature a set, as in when deciding upon a holiday destination which requires travel, accommodation and dining requirements to be satisfied [21].
This paper presents a formalism that allows for richer sharing of initiative, and longer-term adaptation and personalization. Our goal is to capture the desirable properties of conversation specifically from an information retrieval perspective. Intuitively, the setting is that a conversational agent (assistant) has been asked to satisfy an information need. At each point in time, the agent can perform one of a fixed set of actions, to which the user responds, with a back-and-forth. Mixed initiative and memory are key parts of this model. Our key contributions are:
  1. We suggest a formal definition of a conversation from an information retrieval perspective, showing why each property is desirable.
  2. We propose a theoretical model of conversations that allows agents to satisfy the formal properties, demonstrating that the definition is also practical.
Human conversations have been studied for decades, and conversational research can be understood in relation to a number of existing research areas. We present an overview of the space to place our work in context. We start with conversations as applied to information retrieval, then discuss the broader literature regarding the essence of conversations.

2.1 Search and Recommender Systems

In traditional ad hoc search, the system allows the user to provide a natural language request (query) describing results that they want. A minimal response from the system is a ranked list of results (akin to "try looking at these") and a search box that remains available (the system allowing the user to "tell me what you want"). Systems may also correct the user's query ("did you mean?"), suggest other queries ("try these related searches") or provide faceted browsing in the results ("refine by") [43]. In that sense the user is having a conversation with a system that is providing a variety of responses, and many results at once are bundled into a single search engine results page.
During such a conversation, even if the user's search task does not change their understanding of the task, their query vocabulary may change and they may apply a variety of search strategies [2]. In some systems the retrieval response is based only on the user's most recent query, but other systems can take into account past queries and other context. In that sense modern contextual information retrieval systems already allow some co-active development, where both the system and the human user develop their understanding over time. However, real conversations may have mixed initiative [37], where control of the conversation passes from one side to the other via assertions, commands, questions and prompts. For instance, Dredze et al. [14] showed how in the context of email search an agent may propose pertinent ways to select subsets of the result set by adding key-value pairs to the query.
Already over two decades ago, Belkin et al. [3] considered conversational information retrieval by characterizing information-seeking strategies. They proposed scripts that can be followed by a system for different types of retrieval tasks, using case-based reasoning to select next steps and offer users choices. This differs from our work, as we assume a simpler conversational interface (such as a chat) where users enter text in response to agent actions also consisting of simple statements. The users may or may not respond directly to the system's requests. Also, we model the retrieval problem as one where the system reasons about items that can be retrieved, rather than over the space of possible user intents.
Conversational agents for more advanced multi-turn tasks have been proposed continuously since then, for instance recent work to isolate and resolve technical issues typically handled by a help desk [42]. This differs from our work, as we address the task of information retrieval rather than guiding a process by which a problem may be resolved. While a related informational goal could be to identify an instruction document, our goal is a characterization of a more general class of conversations.
In a spoken conversation or on a device with a small screen, it also becomes important for the search system to choose one response or a small number of responses, rather than bundling a large number of results and suggestions into a results page. For example, if the user's query was ambiguous it may be optimal to show search results for just one intent and query suggestions for another [19]. If we consider the query suggestion to be a clarifying question, then showing such a suggestion prominently allows for a greater reward later by incurring an initially costly question. The idea of reinforcement learning, to optimally plan for a delayed reward rather than greedily always choosing the maximum immediate reward, is also explored under the card model of information retrieval [45]. These are important steps towards a mixed initiative conversational search system, although still with traditional system responses such as results and query suggestions.
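The trade-off between an immediate answer and a costly clarifying question can be illustrated with a small expected-reward comparison. This is only a sketch under strong assumptions that are not taken from the cited work: a reward of 1 when the shown results match the user's intent and 0 otherwise, a fixed cost for asking, and an answer that fully resolves the ambiguity. The function name and the example probabilities are hypothetical.

```python
def best_first_action(intent_probs, question_cost):
    """Compare showing results now against asking one clarifying question.

    Assumptions (illustrative only): reward is 1 if the shown results match
    the user's intent and 0 otherwise; the user's answer fully resolves the
    ambiguity; asking costs `question_cost` units of reward.
    """
    # Greedy choice: show results for the single most likely intent.
    show_now = max(intent_probs.values())
    # Delayed reward: pay the question cost, then show the right results.
    ask_first = 1.0 - question_cost
    if ask_first > show_now:
        return ("ask", ask_first)
    return ("show", show_now)


# An ambiguous query: neither intent dominates, so asking pays off.
ambiguous = best_first_action({"car": 0.55, "animal": 0.45}, question_cost=0.2)
# A nearly unambiguous query: the question is not worth its cost.
clear = best_first_action({"car": 0.9, "animal": 0.1}, question_cost=0.2)
```

Under these assumptions the question is worthwhile only when the most likely intent has probability below one minus the question cost.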
Methods for conversational recommendation have also been proposed. Recently, Christakopoulou et al. [9] studied whether to ask absolute or relative questions, comparing the utility of each for learning about users. They also asked questions contextually, based on what is already known. Much earlier, Linden et al. [21] proposed a conversational travel agent that allows the user to find an optimal or near-optimal trip by presenting the user with examples that characterize the solution space and allowing the user to express and modify their criteria. A key method for expressing such updates is critiquing, which gives feedback on facets of importance - such as airline, price or departure time of a flight - with respect to the options already presented. In general a critique can be directed at a particular attribute of a particular item, for example "like this one but cheaper" [23]. Another form of critiquing is at the item level, dating to the Rocchio relevance feedback algorithm where users of a search system may annotate results as relevant or not to refine the search query [29]. A critique differs in that it explains how a result could be modified to improve its utility to the user.
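An attribute-level critique such as "like this one but cheaper" can be sketched as a filter relative to the item the user is reacting to. The `Flight` type, its fields, and the `apply_critique` helper below are hypothetical names introduced for illustration; they are not taken from the cited critiquing systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flight:
    # Hypothetical facets, echoing the airline/price/departure-time example.
    airline: str
    price: float
    departure_hour: int


def apply_critique(items, reference, attribute, direction):
    """Keep the items that improve on `reference` along one attribute."""
    ref_value = getattr(reference, attribute)
    if direction == "lower":
        return [it for it in items if getattr(it, attribute) < ref_value]
    if direction == "higher":
        return [it for it in items if getattr(it, attribute) > ref_value]
    raise ValueError(f"unknown direction: {direction}")


flights = [
    Flight("A", 320.0, 9),
    Flight("B", 250.0, 6),
    Flight("C", 410.0, 14),
]
shown = flights[0]
# "Like this one but cheaper": a critique on the price facet of `shown`.
cheaper = apply_critique(flights, shown, "price", "lower")
```

Unlike item-level relevance feedback, the critique names both the attribute and the direction of change, which is what lets the system narrow the candidate set in one step.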
We note that in human conversation critiquing also happens, but it is not limited to a pre-defined set of facets. An ideal conversational information retrieval system might allow free-form critiquing of suggested results in natural language. To enable free-form queries and critiques, the information retrieval system could build its models based on the language modeling approach to information retrieval. However, more advanced forms of reasoning may be required, particularly when the user answers a question or refers to other parts of the conversation, suggesting the use of more sophisticated natural language technology. Despite this, today most end-to-end conversational systems based on deep neural networks lack the ability to explicitly focus on a search task, rather giving generic contextual responses (for example, [31]). Memory networks have proved very effective at complex question answering scenarios, able to provide correct answers given complex pieces of information and potentially a large knowledge base [33]. However, they are unable to request clarification of the task at hand when the solution is uncertain.

2.2 Spoken Dialog Systems

Spoken dialog system research enables a flexible conversation to take place including corrections and clarifications, usually in a closed domain such as setting a reminder or booking tickets (for instance, [44]). In an early system, Paek and Horvitz modeled spoken conversations using a Bayesian network that tracks confidence from the level of the audio signal obtained from the user through to predicting the user's goal with appropriate back-off depending on detected failure modes [26]. More recently, Chen et al. [8] have studied how a system can estimate the user's intent within a particular conversation step. In a simpler task, the Dialog State Tracking Challenge has pushed forward the ability of systems to fill known slots for the task of bus travel planning (e.g. [16]). Yet such a slot-driven approach differs from human recommendation where it is rarely important to fill all slots [9].
Co-reference resolution can successfully track references to entities across spoken queries, yet back references to preferences expressed in a search scenario have not been explored to the best of our knowledge. Further, these systems do not involve mixed initiative, with the system simply keeping up with the human. Even in closed domain dialog systems, additional work is needed to make the turn-taking behavior of the system more flexible and efficient [27]. In a more open domain, Jiang et al. recently studied the most popular commercial personal agent systems capable of multi-turn task solving. They identified a set of actions that agents tend to perform, albeit at a high level [17]. Such an agent performs a mix of slot filling and information tasks, although in many cases for an information retrieval task it resorts to a traditional search engine results page.

2.3 The Human Perspective

Finally we turn to work studying conversations as performed by people. Perhaps among the most famous attempts to replicate conversations, Eliza was one of the first chat bots, replying to user statements consistently with how a therapist may engage a patient [39]. The algorithm rephrased statements made by the patient, reformulating them as questions back to the patient. More recently, deep learning systems have attempted to build contextualized chit chat systems, for instance as a Twitter bot that responds to context [31]. We consider what roles conversations per se appear to play as part of information exchange.

Conversations as Revealment

One significant role of conversation from an information retrieval perspective is to allow the two parties to reach an understanding as to what is required by the user, and what the answerer knows.
Before Web search became prevalent, when much information retrieval occurred in libraries, it was noted that the role of librarians was to help users to express their information needs. In particular, [24] studied how librarians assist in this task. The author found that the method used by the librarians was less important than the fact that the conversation was happening. This suggests that automated conversational systems may also be effective even if using very different techniques.

Initiative and Engaging Behavior

A number of authors have studied how a "virtual human" should behave [6, 38]. For instance Traum et al. describe desirable aspects of a system conversing with humans, such as being real-time and incremental as utterances are formed over time [36]. Similarly there has been extensive work on multi-modal systems, expressing emotion and so forth. These aspects are beyond the scope of our work, as we restrict ourselves to chat type interfaces. Additionally, our focus is to consider conditions on what needs to be possible to be said rather than how the information should be conveyed.
One of the key aspects of human conversations is initiative. A number of authors have considered what constitutes initiative in dialog systems. Of key interest to us is mixed initiative: At different times in the conversation, the human or the agent may take initiative. We use a generic definition of mixed initiative compared to past work, defining it as both the human and the system having initiative at different points in time. For instance, the agent may take initiative to clarify or elicit information from the user whenever appropriate, while allowing the user to drive the conversation at other times.

Trust and Moral Character

A final important concept in agents emulating human behavior is one of moral character [18]. Any agent taking part in a conversation conveys a personality, and inherently builds a relationship with the user (for instance, trust with regards to what happens to information shared by the person with the agent). However, this aspect of conversational behavior is outside the scope of our work and is not a goal of the conversational model.
It is also the case that when information (such as advice or recommendations) is provided, the source matters to people - it has been established that different sources have different influence on purchase decisions [30]. Effectiveness of a conversational system would likely depend on the system saying why it made a specific recommendation [35]. As with moral character, we do not address this question in this work.
In this section we consider the properties of conversations, proposing aspects that are applicable to search.

3.1 What is a Conversation?

The Oxford English Dictionary defines a conversation as a talk, especially an informal one, between two or more people, in which news and ideas are exchanged. While broad, this provides some guidance in information retrieval settings. In particular, we note that information is exchanged, suggesting symmetry where initiative may belong to both sides at different points in the conversation (rather than, say, a lecture). Hence we postulate that a conversational search system is a mixed-initiative system.
We may also classify conversations by their outcomes. Often, a conversation may be an end in itself. We do not consider this type of conversation here as it does not involve information retrieval. Similarly, conversations may have as a goal to assist a person to follow a known sequence of steps. Once more, this type of conversation falls beyond the scope of this work. We focus on conversations that aim to elicit user preferences, and identify target information.
As a third aspect, we postulate that there is an element of memory: The conversation is a unit, and earlier statements can be referenced later in the conversation. Indeed, it should be possible to reference earlier statements in earlier conversations. A first consequence of the ability to index earlier statements is the existence of repair mechanisms, for instance the ability to clarify with "what I meant is..." [34, chapter 7]. More importantly in a search setting, memory allows information to be elicited from the user in a piecemeal fashion, maintaining simple steps that can together describe an arbitrarily complex information need. Indeed, it has been shown that loss of context is a common reason for user frustration with conversational systems [20]. It is important to note that memory thus plays two roles:
  1. The system remembers what was previously said by the user or the system to assist in resolving the user's information need.
  2. It is possible for the user to explicitly reference past information, for example to indicate what statements are not correct or should be "forgotten".
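The two roles of memory listed above can be sketched as a small data structure: statements are retained across turns, and the user can explicitly point back to one of them to retract it. This is a minimal illustration, not the model proposed in this paper; `Statement`, `ConversationMemory`, and their methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    turn: int
    speaker: str          # "user" or "agent"
    text: str
    active: bool = True   # stays True unless explicitly retracted


@dataclass
class ConversationMemory:
    statements: list = field(default_factory=list)

    def remember(self, speaker, text):
        """Role 1: keep what was said so it need not be repeated."""
        stmt = Statement(turn=len(self.statements), speaker=speaker, text=text)
        self.statements.append(stmt)
        return stmt.turn

    def retract(self, turn):
        """Role 2: the user references a past statement to be 'forgotten'."""
        self.statements[turn].active = False

    def active_statements(self):
        return [s.text for s in self.statements if s.active]


memory = ConversationMemory()
memory.remember("user", "I want sushi")
fancy_turn = memory.remember("user", "somewhere fancy")
memory.remember("agent", "How much time do you have?")
# Repair: "what I meant is..." withdraws the earlier constraint.
memory.retract(fancy_turn)
```

Retraction marks a statement inactive rather than deleting it, so the conversation history stays intact while only active statements inform the current picture of the information need.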
Finally, the conversation should be adaptive, with neither participant following a script, but rather adapting to the current context. This expands upon common definitions of personalization, while avoiding the challenge of sessions. In particular, a conversational search agent is essentially fulfilling a long-term task, which may otherwise have consisted of many sessions in the traditional search engine sense. The abstraction will prove useful below.
Taken together, these properties lead to the following definition:
DEFINITION 1. A conversational search system is a system for retrieving information that permits a mixed-initiative back and forth between a user and agent, where the agent's actions are chosen in response to a model of current user needs within the current conversation, using both short- and long-term knowledge of the user.
Further, the system has the following five properties, which we term the RRIMS properties:
User Revealment: The system helps the user express (potentially discover) their true information need, and possibly also long-term preferences.
System Revealment: The system reveals to the user its capabilities and corpus, building the user's expectations of what it can and cannot do.
Mixed Initiative: The system and user both can take initiative as appropriate.
Memory: The user can reference past statements, which implicitly also remain true unless contradicted.
Set Retrieval: The system can reason about the utility of sets of complementary items.

3.2 When should search involve conversation?

The appropriateness of a conversation for a search task is largely driven by the complexity of the task. The simplest search setting, where the user enters a single query and expects to immediately identify relevant results, clearly does not call for a back-and-forth with a search agent.
The next more complex type of task requires memoryless refinement: The user learns the right terms to describe their information need by iterating with a search system. If each step is only informed by the results from the previous iteration, this does not require memory nor agent initiative. In such a setting a more complex model may in fact reduce user utility and does not call for conversational approaches to search.
However, consider these more complex scenarios where a conversation is more likely to be appropriate:

Faceted Elicitation

The user is searching for an item with rich attributes that can be individually specified, but are much simpler to provide piecewise. For instance, it may be possible to describe a difficult-to-find email as follows: "I'm looking for an email that contains a link to a research paper that I got from a student who emailed me right after SIGIR last year. I can't remember the student's name, but I had never heard from her before."
The user is selecting among items based on facets - but cannot be expected to know how to reference these directly as this would involve memorizing a complex query language. As part of the search, the user is identifying aspects that can be used to describe a relevant item. In contrast to memoryless refinement, here the user may need to learn about a facet before returning to the top level of the search process with a tag describing how this facet can be satisfied. For instance, consider a similar case where the user is selecting a vacuum cleaner to purchase. Here, as an aside, the user may need to learn about relevant attributes, such as how loud a given number of decibels really is, and then return to their main task.

Multi-Item Elicitation

The user is searching for a single item supported by a set of nearby items. For instance, a photo may only be describable through the properties of other photos taken earlier, such as "the photo Alice took of me right after I took her picture a few months ago". In this case, the system may need to learn who Alice is.
While the search is for an item whose relevance is easy to establish, the user's only known description of this item (i.e. the query) depends on other items, which may themselves need to be found. The search system must then estimate the relevance of the whole set of items.

Multi-Item Faceted Elicitation

In this setting, the user is searching for a set of items directly. Importantly, not only must the system estimate the utility of each single item, it must combine the utilities of multiple items to reach an assessment of an entire set.
For instance, consider planning a vacation where the results consist of a hotel, travel arrangements, restaurant plans, places to see, and so forth. During the conversation, the agent prompts the user to describe relevant aspects of different destinations, hotels, transport options and attractions. Then, it must elicit information from the user to learn how to combine the utilities of a whole set of items to reach a final decision about a holiday as a package.
It is this last setting in which we hypothesize that conversational approaches to search have the highest usefulness.

Bounding Choices / Building Expectations

Simultaneously, a conversational interface may simplify the problem of need elicitation by providing users with bounded choices. It may be easier for users to clarify their needs when given precise choices than when expected to come up with particular terms. Similarly, bounded choices can help the user understand complex features available in a search system, because they serve as examples of how a need can be expressed.
For instance, users engage with facets for email search much more often when these are suggested contextually, rather than relying on the user to generate the relevant terms [14]. Similarly, expert search users are much more likely to use advanced operators [40], presumably because less expert users are unaware of the options available.
This concept of bounding choices can also be considered as revealment from the side of the system, showing the user examples of the possibilities the system offers.

3.3 Learning, curiosity and serendipity

During any search interaction, the agent may acquire knowledge that is useful to answer the user's current information need, while also building a model to improve personalization in future. For instance, an example in the previous section required the agent to learn to identify Alice in a query. This knowledge would allow the agent to answer future queries that refer to Alice without requiring the label to be provided anew.
It is worth noting that there are also cases where the agent may provide the user with long-term utility at a cost in the current query, perhaps by eliciting information that happens to be related but not directly relevant. For instance, consider a user who searches for a restaurant recommendation and specifies that it should be vegetarian. The agent may clarify if the requirement is simply for the current query, or indicates that the user always requires vegetarian restaurants. Similarly, in a photo search scenario, the agent may elicit a name that can be applied to a face common in candidate photos.
We do not exclude such scenarios from the expectations of a conversational agent, although such actions return to issues of trust and moral character, and thus further treatment of them is beyond the scope of the current work.

4. CONVERSATIONAL SEARCH MODEL

For our model, we assume there is a user searching for information, and a system or agent that is assisting the user. Search is performed over a well-defined corpus, where the user is looking for an item (which may not be unique) that contains the information needed. Such needed items are said to be of high utility to the user. Within a conversation, the system must estimate the utility of each item. We note that the agent may have a prior estimate of utility over items before the user has specified anything, based on long-term knowledge, although we do not further consider how this prior utility is maintained.

4.1 Interaction Approaches

In each back and forth step in a conversation, the system provides some information to the user, and the user responds. Depending on what is provided, and what the response is, we find ourselves in different conversation settings seen in prior work. Existing approaches of which we are aware are summarized in Figure 1.
From the system perspective, our model provides for three basic types of information that the system may provide to the user. We term these the actions that the system may perform. In increasing specificity, these are nothing, a partial item and a complete item. In particular, an item may be partially described in many ways. In the simplest case, the system may select a specific item feature to focus on, such as the concept of price in a product search scenario. Alternatively, the system may provide a suggested value for a field, e.g. a price lying between a lower and an upper bound. Finally, the system may present a cluster of items, for instance a grouping of products that are somehow similar; this can be thought of as a dynamic field having been created for, say, "electronic gadgets that make good gifts for a teenager".

Conversely, the user may be expected to provide (equivalently, the system may understand user statements that provide), feedback of different types. The simplest design would be for the user to provide either a binary or ordinal score in response to a question, or a preference given two or more choices. A more sophisticated feedback from the user would be a critique that indicates in what way the item or partial item presented by the system does not represent the user's information need. The most detailed level of feedback a user may provide would be free text. Clearly, the meaning of the user's feedback is only well defined given a specific question.
Considering previous work, we note that each prior system typically falls into a single cell as indicated in the Figure. We now describe each of the labeled cells in turn. This will describe the basis of the richer action space model we propose in this paper.

Null System - Free Text User

This is the starting point for most information retrieval systems such as Web search engines, and often for conversational systems where the user may specify many possible requests (such as commercial intelligent agents including Cortana and Siri). The user is simply presented with a search box into which any query can be entered.

Partial Item System - Pref/Rating User

A user may be presented with partial information about matching items in various ways. The most common approach is for a conversational system to confirm a slot that has been inferred, such as "you are looking for an Italian restaurant, correct?" (see, for example, [17]). Some systems may also cluster items and ask for a preference. For instance, a system might ask "would you prefer a fancy restaurant, or an inexpensive one?". A third interaction mode, where a preference is elicited over a set of (feature, value) pairs, would for instance be "would you prefer a laptop with a 12 inch screen at one price, or a laptop with a 14 inch screen at another?". Note that all of these interaction modes, as well as the critiques and free text entry discussed below, may also be considered "faceted search".

Partial Item System - Critique User

When the user may provide a richer answer than a simple score or preference, this presents a more powerful information retrieval paradigm. In the simplest case, fielded search provides users with a selection of known fields, and users may select values or specify ranges for any property they desire. This is common in online shopping scenarios, where the allowed field values are often pre-specified. In other settings a user is presented with specific individual facet values. Some commercial intelligent agents allow users to clarify in this way, rather than requiring a simple yes/no. For instance, in response to a prompt "you are looking for an Italian restaurant, correct?", the user may reply "no, I'm looking for an Indian restaurant."

Partial Item System - Free Text User

When a system asks a user to fill in a particular aspect of an information need, this is usually referred to as slot filling. For instance, many recommendation systems work in this way. As an example, systems taking part in the Dialog State Tracking Challenge [41] require users to specify travel details to complete a structured query over a public transit schedule.

Complete Item System - Pref/Rating User

Classic approaches to recommendation often request ratings of items to learn a user model for further recommendations. These may be absolute rating requests ("how much did you enjoy the movie Kill Bill?") or preference requests ("did you enjoy Kill Bill or Pride and Prejudice more?"). For example, Christakopoulou et al. [9] describe such a system in the restaurant domain.

Figure 1: Conversation action space, as matched to names from past work. The system may provide three types of feedback, and expect three types of responses in return. In each cell, we describe related work that falls into the appropriate category. We also note that many of the partial item (field or field+value) interaction approaches are often considered variants of faceted search.

Complete Item System - Critique User

In this case, a system may select a given item, then allow the user to refine their information need by anchoring on the properties of the item. For instance, Reilly et al. [28] describe a system where users are presented with an item and possible ways the information need can be refined. Users may select a pre-defined rich critique that allows the system to move closer to the user's goals. Su et al. [32] describe a more sophisticated restaurant recommendation system, where queries are matched to a complete item, which can then be refined or for which further metadata can be requested.
We also propose an extension to the critique model to allow the agent to learn about the collection directly from user feedback. For example, suppose the agent has been asked to recommend a movie. Given a movie, the user might respond with a critique "that movie is too gory". This can inform the agent about the existence of the concept "gory", which may not have been known to the agent previously. Once given a name and one example, the agent may learn to model it through further interactions with this and other users.

4.2 Interaction Choice

Based on the above variety of existing user/system interaction models, we can now formally model a conversation as a back-and-forth in which the user and agent take turns. For convenience, the conversation always starts with the agent. Each time it is the agent's turn, it will (1) select an action to perform and (2) request that the user provide a specific type of response. Specifically, the actions available to the agent are:
The null action: the system provides nothing. User is requested to provide free text describing the information need.
System provides a single partial item/cluster. User is requested to provide a rating, critique, or free text.
System provides two or a small number of partial items. User is requested to provide a preference, critique, or free text.
System provides a single complete item. User is requested to provide a rating, critique, or free text.
System provides two or a small number of complete items. User is requested to provide a preference, critique, or free text.
The user responses are of the following types:
A rating of the current item/partial item.
A preference among the presented items/partial items.
A lack of preference, either indicating that none of the options is suitable, or indicating that all options are equally suitable.
A critique of the current item/partial item, indicating how the current item/partial item could be modified to be of higher relevance to the user.
Unstructured text describing their information need.
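The action and response spaces listed above can be written down as a small data structure. The following Python sketch is illustrative only: the names `SystemAction`, `UserResponse`, `REQUESTABLE` and `is_expected` are our own labels rather than notation from the model, and the rule that a lack of preference only answers a preference request is an assumption.

```python
from enum import Enum


class SystemAction(Enum):
    NULL = "null"                    # provide nothing
    ONE_PARTIAL = "one_partial"      # a single partial item or cluster
    FEW_PARTIAL = "few_partial"      # two or a small number of partial items
    ONE_COMPLETE = "one_complete"    # a single complete item
    FEW_COMPLETE = "few_complete"    # two or a small number of complete items


class UserResponse(Enum):
    RATING = "rating"
    PREFERENCE = "preference"
    NO_PREFERENCE = "no_preference"
    CRITIQUE = "critique"
    FREE_TEXT = "free_text"


# Response types the agent may request alongside each action, as listed above.
REQUESTABLE = {
    SystemAction.NULL: {UserResponse.FREE_TEXT},
    SystemAction.ONE_PARTIAL: {
        UserResponse.RATING, UserResponse.CRITIQUE, UserResponse.FREE_TEXT},
    SystemAction.FEW_PARTIAL: {
        UserResponse.PREFERENCE, UserResponse.CRITIQUE, UserResponse.FREE_TEXT},
    SystemAction.ONE_COMPLETE: {
        UserResponse.RATING, UserResponse.CRITIQUE, UserResponse.FREE_TEXT},
    SystemAction.FEW_COMPLETE: {
        UserResponse.PREFERENCE, UserResponse.CRITIQUE, UserResponse.FREE_TEXT},
}


def is_expected(action, response):
    """Return True if `response` is one of the reply types that fit `action`."""
    if response is UserResponse.FREE_TEXT:
        # Free text is requestable after every action in the list above.
        return True
    if response is UserResponse.NO_PREFERENCE:
        # Assumption: a lack of preference only answers a preference request.
        return UserResponse.PREFERENCE in REQUESTABLE[action]
    return response in REQUESTABLE[action]
```

A response for which `is_expected` returns False is not an error. It is a reply of a type other than the one requested, which the agent still has to interpret.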

4.3 Action Selection

As described at the start of Section 4, the system maintains a distribution over utility values for each item. The goal in an information retrieval setting is to find an item with maximal utility. Thus, a conversational search algorithm must select actions to maximize user satisfaction while tracking expectations over user responses. The motivation behind the above model is that the choice available to the system is simple enough that the utility of each action and response request can be estimated, yet provides the richness necessary for a true conversational retrieval system.
Specific satisfaction metrics that can be optimized are beyond the scope of this work, given the large amount of past work on this topic (see recent work by Kiseleva et al. [20] for an overview). However, we will describe an example algorithm for this selection process in the analysis of our model below.

5. ANALYSIS

We finally present an analysis of the conversational model. In particular we assess four natural analysis questions: Are the criteria suggested for conversations both necessary and sufficient? Are the system action and user response spaces sufficiently rich? How can the system select among the possible actions? How can the system correctly interpret user feedback in this rich environment? We discuss each in turn.

5.1 Are the conversational properties presented both necessary and sufficient?

The goal of the conversational model presented in this work is to allow efficient and effective conversational information retrieval without more complexity than necessary. Thus the first research question we must address considers what kinds of tasks can be solved with the properties presented. While showing that a wide variety of previously studied information retrieval tasks can be addressed, we argue that each of the properties presented is also necessary.

Example application 1: Basic information retrieval.

We start by considering a standard information retrieval task. We take the first topic from the most recent TREC Web Track [11]:
Topic ID: 251
Query: Identifying spider bites
Description: Find data on how to identify spider bites.
While the task description is a short query, it suggests the user may need to identify a specific spider bite. As the variety of spiders is large, this need would likely be satisfied best in a conversation where the user and the system exchange information to assist the user in narrowing down the candidate set of all possible spiders into the most likely one(s). During this process, the system needs to remember what information the user has already provided, to allow the user to answer individual questions one at a time. It may also be the case that the user needs to revisit or alter answers by referring to past statements. For instance, the user may incorrectly answer one of the questions and not realize until later in the conversation.
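To make the role of memory in this example concrete, the following Python sketch narrows a candidate set one answer at a time and lets the user revise an earlier answer. It is a toy illustration under our own assumptions: the class `ConversationMemory`, the candidate names and the attribute values are all invented placeholders and carry no factual claims about real spiders.

```python
class ConversationMemory:
    """Remembers the user's answers and filters candidates accordingly."""

    def __init__(self, candidates):
        # candidates: mapping from item name to its attribute dictionary
        self.candidates = candidates
        self.answers = {}

    def answer(self, attribute, value):
        # A later answer for the same attribute overrides the earlier one,
        # which is how the user corrects a past statement.
        self.answers[attribute] = value

    def retract(self, attribute):
        # The user withdraws a past statement without replacing it.
        self.answers.pop(attribute, None)

    def remaining(self):
        # Items consistent with every answer remembered so far.
        return sorted(
            name
            for name, attrs in self.candidates.items()
            if all(attrs.get(a) == v for a, v in self.answers.items())
        )


# Invented placeholder data, for illustration only.
spiders = {
    "spider_a": {"marks": "double", "swelling": "yes"},
    "spider_b": {"marks": "double", "swelling": "no"},
    "spider_c": {"marks": "single", "swelling": "yes"},
}

memory = ConversationMemory(spiders)
memory.answer("marks", "single")      # answered incorrectly at first
memory.answer("swelling", "yes")
before_correction = memory.remaining()
memory.answer("marks", "double")      # the user revisits the earlier answer
after_correction = memory.remaining()
```

Because the answers are stored rather than applied destructively, correcting the first answer restores candidates that had previously been ruled out.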
Being able to satisfy such needs illustrates the importance of user revealment, as well as memory in the conversational search setting.

Example application 2: Personal information search.

We consider a second information retrieval setting with increasing research attention - search over personal information. This may involve searching for personal emails (e.g. [5]), documents (e.g. [12]) or over personal records that allow a user to investigate for instance where a particular event took place [7]. Such search tasks involve heterogeneous items with rich metadata, where a user may remember a variety of contextual information. An effective conversational retrieval system must aid the user in specifying such information, without requiring the user to remember a complex query language.
In this setting, mixed initiative is particularly valuable, where a system may prompt the user with information that the user may remember. This in turn may lead the user to recall other pertinent information that they wish to provide. Traditional search interfaces for personal information of this nature tend to present rich user interfaces. A mixed initiative system can present users with choices when appropriate to refine the search space, yet allow users to describe their information need in free text when this is the optimal strategy. Additionally, the value of system revealment and memory is clear.

Example application 3: Product recommendation.
示例应用程序 3:产品推荐。

A common information retrieval task is product selection given a general information need. For instance, consider a new parent who must purchase a stroller for the first time. The parent may be unaware of the qualities of such a product without having previously engaged in this task. In an offline recommendation setting, the parent visits a retailer, and an assistant will describe the range of products available, how they differ and elicit the information from the parent as to which features are important, and ultimately guide the parent to suitable choices.
In a conversational search setting, the agent must similarly reveal to the user characteristics of the available search space, knowledge of which features exist in the corpus of items, and assist the user in expressing their information need suitably. Thus this example illustrates particularly the necessity for system and user revealment.

Example application 4: Travel planning.

In some settings, search involves heterogeneous items that give rise to two distinct types of user utility: a given item has a utility to the user, while a set of items is needed to answer the user's information need, yet has a different type of utility.
One common example is travel planning (see for example [21]), where travel and accommodation are both necessary yet each have their own utilities (cost, convenience, brand, etc.), while the combination is the user's ultimate need and has its own utility to the user. For example, even an otherwise perfect inexpensive five star hotel is likely to have low utility for a cost-conscious traveler if it can only be reached by private helicopter on the intended travel date. Other distinct aspects of travel plans, including attractions and restaurants, further complicate the retrieval setting, giving rise to distinct item-based and set-based utility functions.
While in the previous example total cost might be considered a strong indicator of utility for both an item and a set of items, in related settings these utilities may be quite different. For instance, consider an itinerary within a city: The user may have preferences about what types of attractions they prefer to visit, giving rise to an item utility. However for a complete itinerary the user may also value diversity, so as not to spend the entire day visiting only attractions of one type.

Summary.

As we have seen, each of the properties presented (User Revealment, System Revealment, Mixed Initiative, Memory and Set Retrieval) is natural for at least one of the example applications. These applications typify the settings in which information retrieval systems are commonly used, and thus those in which conversational information retrieval should be possible. As such, we argue that the proposed properties are necessary, as well as sufficient to enable many common information retrieval tasks.

5.2 Are the system and user action spaces sufficiently rich?

After presenting the desirable properties for a conversational IR system, we presented a model in Section 4. We now analyze to what extent this model satisfies the desired properties.

User Revealment.

During search, a common strategy to assist users in refining their information needs is to present alternatives within the space of extant items. As such, the actions that present alternatives provide efficient ways for the user to identify alternative items, as well as dimensions on which the items differ, and ambiguities within the information need described so far.
For instance, when searching for a suitable product in a class with which the user is unfamiliar (say, the first time a new parent must purchase a stroller), choices help reveal the relevant features to the user.

System Revealment.

Free-form text entry systems are known to have low discoverability (e.g. [40]). By presenting users with confirmations and requesting a rating or critique, as well as by presenting partial item choices, the system demonstrates both the ways in which it can partition and refine the search space and the common properties of the available corpus.

Set Retrieval.

In modeling partial item presentation as clusters, the model allows for retrieval of sets of items. For instance, taking a travel set retrieval scenario, the system may assist the user in identifying high utility items of various types, then subsequently present sets of complementary items as candidate solutions to the user's information need (for instance, inviting critiques of proposed combinations). We note that as seen in Figure 1, to the best of our knowledge such structured set-based retrieval has not been previously studied in a conversational setting, rather relying on a rich user interface.

Memory.

As previously noted, we postulated that memory plays two distinct roles in a conversational search setting: (1) the system recalling past statements by default, and (2) the ability to refer explicitly to past statements (for instance to indicate that they are no longer correct). The first is addressed in the model implicitly: a conversation is designed to be a continuous process, so continuing a conversation implicitly continues the user's information need, and with it all earlier conversational steps.
The second is addressed by the requirement that the model always allows the user to enter free text. While this is a weaker constraint on the implementation, this possibility, assuming the user text is interpreted correctly by the system, allows the user to refer to a previous statement in order to override it specifically. However, the details of how this could be implemented are left as future work.

Mixed Initiative.

In providing the system with a number of alternative interaction modes, from basic free text to structured and preference-based interactions, the system is designed to choose the right level of initiative for an information task. In allowing the user to always return unstructured text, the user can at any time take the initiative from the system.

5.3 Could a system optimally select the next action from the search space?

Given the choices of actions, the system must have a model allowing it to decide which action it is to take at any given point in time. It must (1) select an action in a given context, and (2) interpret the user's response given the previous conversational history. We focus on the action selection process here, arguing that a system could reasonably implement the model presented while leaving the practical details as an open challenge for future work.
In our model, the system maintains a distribution over utility values for each item. As the goal in an information retrieval setting is to find an item with maximal utility, the action must be selected so as to maximize how efficiently this is achieved. While a number of algorithms may be optimal with different implementations of the model, here we present one possible implementation to demonstrate feasibility. Other approaches may be more efficient or lead to better user experiences. One of the goals of this paper is to inspire such approaches, hence we leave them as future work. Rather, we argue here that our model is suitable for describing a conversational information retrieval system.
For each action the system may take, if the user response comes from a known distribution, we can infer the update to the utility of each item. Specifically, suppose that for each item the system has an estimate of utility, as well as an estimate of the uncertainty of that utility. If the system were to take some action and observe some response, the system can predict the update of the item utility and uncertainty (for instance, [9] propose a specific virtual-update multi-armed bandit algorithm for this purpose). Summing over all items and possible user responses, the utility of a system action can be computed as the expected reduction in uncertainty about which items have highest utility. This is more difficult in cases where the distribution over user responses is unknown, such as where the user is requested to provide free text. Here, the utility must be estimated based on prior observations of the system when such an action was requested. A deployed conversational system may estimate the expected utility gain of such open-ended requests, and use this to select when to perform such actions.
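The expected reduction in uncertainty can be illustrated with a deliberately simplified sketch. Instead of a distribution over utility values per item, the Python code below assumes a single probability distribution over which item is the one the user wants, and measures uncertainty as Shannon entropy. This is a substitute formulation chosen for brevity, not the bandit algorithm of [9]. All function names, action names and numbers are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy (in bits) of a distribution over items."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def expected_uncertainty_reduction(belief, response_model):
    """Expected entropy reduction from taking one action.

    belief: item -> P(item is the target)
    response_model: response -> {item: P(response | item is the target)}
    """
    gain = entropy(belief)
    for likelihood in response_model.values():
        unnormalized = {i: belief[i] * likelihood[i] for i in belief}
        p_response = sum(unnormalized.values())
        if p_response == 0:
            continue  # this response can never be observed
        posterior = {i: w / p_response for i, w in unnormalized.items()}
        gain -= p_response * entropy(posterior)
    return gain


def select_action(belief, actions):
    """Pick the action with the largest expected uncertainty reduction."""
    return max(
        actions,
        key=lambda name: expected_uncertainty_reduction(belief, actions[name]),
    )


# Four equally likely items and two candidate actions (invented example).
belief = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
actions = {
    # Asking about a feature that splits the items into two halves.
    "ask_price": {
        "cheap": {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0},
        "expensive": {"a": 0.0, "b": 0.0, "c": 1.0, "d": 1.0},
    },
    # Showing the complete item "a" and asking for a yes/no rating.
    "show_item_a": {
        "yes": {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0},
        "no": {"a": 0.0, "b": 1.0, "c": 1.0, "d": 1.0},
    },
}
best = select_action(belief, actions)
```

In this toy setting the feature question removes one full bit of uncertainty, while showing a single item removes less, so `select_action` prefers the feature question. Free-text requests do not fit this scheme directly because their response model is unknown, which matches the difficulty noted above.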
While the above addresses the system-side utility of an action, a second aspect of question utility is the cost that a system action imposes on the user. For instance, a question that has high utility in terms of uncertainty reduction may be difficult for a user to answer. This cost needs to be estimated by the system when selecting actions. A simple approach would be to assume a fixed cost for users to respond to any action. This assumption is commonly used by recommendation systems where users are requested to label particular examples as part of the learning process. An alternative would be for a conversational system to observe how long it takes a user to respond to a given type of action, and/or how often the user response is not of the requested type for the given system action.
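The two cost estimates mentioned above (a fixed cost, or a cost learned from response times and off-type replies) could be combined along the following lines. This is only a sketch: the class `ActionCostTracker`, the penalty and trade-off values, and the linear form of `net_value` are all assumptions made for illustration and are not specified by the model.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ActionCostTracker:
    fixed_cost: float = 5.0          # assumed fallback cost before any observations
    off_type_penalty: float = 10.0   # assumed extra cost per off-type reply
    latencies: dict = field(default_factory=dict)
    off_type: dict = field(default_factory=dict)

    def observe(self, action_type, seconds, matched_type):
        """Record response time and whether the reply was of the requested type."""
        self.latencies.setdefault(action_type, []).append(seconds)
        self.off_type.setdefault(action_type, []).append(
            0.0 if matched_type else 1.0
        )

    def cost(self, action_type):
        """Mean latency plus a penalty scaled by the observed off-type rate."""
        if action_type not in self.latencies:
            return self.fixed_cost  # no data yet: fall back to the fixed cost
        return (mean(self.latencies[action_type])
                + self.off_type_penalty * mean(self.off_type[action_type]))


def net_value(gain, cost, tradeoff=0.1):
    """Uncertainty reduction minus weighted user cost (assumed linear form)."""
    return gain - tradeoff * cost


tracker = ActionCostTracker()
tracker.observe("rate_item", seconds=2.0, matched_type=True)
tracker.observe("rate_item", seconds=4.0, matched_type=False)
```

An action type with no observations falls back to the fixed cost, so the simple assumption and the learned estimate can coexist in one system.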
Finally, in the more complex case of set retrieval, the goal of the user is to find a suitable set of items rather than a single item. In this case, once the system has been able to identify items of high utility, it must learn a combined utility function. The form of this utility function would depend on the particular type of information need being addressed. For instance, in a travel planning scenario there may be complex constraints (e.g. only one hotel is needed at a time, the hotel must be near an airport to which there is a suitable flight, and so forth). While the action space is sufficiently rich to allow the system to propose tradeoffs between combinations, we believe that the details of the utility function learning need to be addressed in a task-by-task manner.
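To make the travel example concrete, the sketch below scores flight and hotel pairs under a single constraint. Everything in it is hypothetical: the item data, the field names, and in particular the additive form of `set_utility`, which is one possible choice among many since the form of the combined utility depends on the task.

```python
from itertools import product

# Hypothetical items with already-estimated individual utilities.
flights = [
    {"id": "F1", "airport": "OSL", "utility": 0.5},
    {"id": "F2", "airport": "BGO", "utility": 1.0},
]
hotels = [
    {"id": "H1", "near": "OSL", "utility": 0.75},
    {"id": "H2", "near": "TRD", "utility": 1.0},
]


def set_utility(flight, hotel):
    """Utility of a (flight, hotel) set, or None if the set is infeasible.

    Constraint: the single hotel must be near the airport the flight serves.
    The additive combination is an assumption made for this sketch.
    """
    if hotel["near"] != flight["airport"]:
        return None
    return flight["utility"] + hotel["utility"]


def best_trip(flights, hotels):
    """Return (utility, flight id, hotel id) of the best feasible set, if any."""
    scored = []
    for f, h in product(flights, hotels):
        u = set_utility(f, h)
        if u is not None:
            scored.append((u, f["id"], h["id"]))
    return max(scored) if scored else None


trip = best_trip(flights, hotels)
```

Here the individually best flight and hotel cannot be combined, so the best set is built from lower-utility items. This is why a combined utility must be considered rather than per-item utilities alone.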

5.4 How can the system interpret user feedback?

Finally, we consider the question of whether our conversational model allows a system to interpret user feedback effectively.
In the model, the system takes an action at each turn, requesting a specific response from the user. If the user response is of the expected type, the system will clearly be able to interpret it. However, we have also shown that it is important that the model allows the user to ignore the system's action (question) and provide alternative feedback, as in a real conversation. In this way, the user would be taking his or her own initiative, for instance if the agent is not providing useful information.
One way to reduce the likelihood of unexpected feedback is to explicitly model common conversational outcomes. For instance, our model specifies that one of the possible preference answers is that there is no preference among the options being presented (with the uniform label still being positive or negative). As noted by [9], such feedback is particularly useful in some settings.
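One way to picture explicitly modelled outcomes is as a small closed set of response types, with anything else treated as unexpected. The sketch below is illustrative only: the `Feedback` enumeration, the surface forms in `EXPECTED`, and the exact-match lookup are assumptions, and a real system would need proper language understanding rather than string matching.

```python
from enum import Enum


class Feedback(Enum):
    PREFER_FIRST = "prefer_first"
    PREFER_SECOND = "prefer_second"
    NO_PREFERENCE_BOTH_GOOD = "both_good"   # no preference, positive label
    NO_PREFERENCE_BOTH_BAD = "both_bad"     # no preference, negative label
    UNEXPECTED = "unexpected"               # user took the initiative


# Hypothetical surface forms for the expected answers to a preference question.
EXPECTED = {
    "first": Feedback.PREFER_FIRST,
    "second": Feedback.PREFER_SECOND,
    "both": Feedback.NO_PREFERENCE_BOTH_GOOD,
    "neither": Feedback.NO_PREFERENCE_BOTH_BAD,
}


def interpret(response: str) -> Feedback:
    """Map a reply to an expected feedback type, else flag it as unexpected."""
    return EXPECTED.get(response.strip().lower(), Feedback.UNEXPECTED)
```

For example, `interpret("Neither")` yields the negative no-preference outcome, while a reply such as "show me flights instead" is flagged as unexpected and would need separate handling.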
While "unexpected" answers are thus the most problematic, we believe that our model provides the right amount of structure to facilitate interpretability. If the user chooses to provide alternative free-text feedback, interpretation of this feedback is relatively limited given the conversational context. Future success of conversational systems adhering to our general model will hinge on how often users choose to (or need to) revert to other answer types, and how well the system captures such deviations. A system that supports memory is also more robust to errors.
Therefore we claim that our model provides a suitable framework for effectively interpreting user feedback.

6. CONCLUSIONS AND FUTURE WORK

In this paper, we have described the characteristics of a conversational information retrieval system. These characteristics are based on a broad overview of previous work on human conversations. We have argued that the properties of conversations described are both necessary and sufficient to allow a rich variety of information retrieval tasks to be naturally performed using a conversational interface. We consider the primary contribution of these properties to be a framework for the design and evaluation of future conversational information retrieval systems. By allowing approaches to be compared, the types of tasks that such systems can address, and the way in which they differ from human-level conversations, can be more easily characterized.
In doing so, we also discussed when conversational approaches appear most valuable for information retrieval, illustrating with a number of tasks that appear to be well suited to chat-based search.
Following this, we presented a theoretical model that satisfies the conversational properties. While theoretical, the model provides the framework for a conversational search system that appears practical in implementation. The model is a generalization of many specialized systems that have previously been implemented and shown to be effective by previous authors. However, none of the previous systems satisfy all the proposed properties of a conversational information retrieval system. We view the contribution of this model as a proposed structure that can be employed towards obtaining true conversational information retrieval. Implementing the proposed model is the most important future extension of this work.
It is further worth considering limitations in the characterization of conversational information retrieval. It may be the case that some of the properties presented can be replaced with others that serve a similar function but lead to higher user satisfaction through more natural interaction. The properties described reflect previous findings about human conversations, thus it may be the case that automatic conversational systems do not need to reflect human-level conversation to be widely useful for information retrieval. In particular, we have chosen to represent knowledge about the corpus as a utility function defined over items and sets of items. It is possible that such a utility function may not exist, or may be too complex to model in relatively short interactions. Incorporating prior knowledge about global utility and popularity may allow the conversational properties to be refined.

Acknowledgments

The authors would like to thank Cecily Morrison and Alex Taylor for insightful discussions about the nature of (human) conversations. We also thank Paul Thomas for useful feedback on related work, and the anonymous reviewers for helpful comments.

References

[1] J. Allen, C. I. Guinn, and E. Horvitz. Mixed-initiative interaction. IEEE Intelligent Systems and their Applications, 14(5):14-23, 1999.
[2] M. J. Bates. The design of browsing and berrypicking techniques for the online search interface. Online Review, 13(5):407-424, 1989.
[3] N. J. Belkin, C. Cool, A. Stein, and U. Thiel. Cases, scripts, and information seeking strategies: on the design of interactive information retrieval systems. Expert Systems with Applications, 9:379-395, 1995.
[4] P. N. Bennett, R. W. White, W. Chu, S. T. Dumais, P. Bailey, F. Borisyuk, and X. Cui. Modeling the impact of short- and long-term behavior on search personalization. In Proceedings of the ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), 2012.
[5] D. Carmel, G. Halawi, L. Lewin-Eytan, Y. Maarek, and A. Raviv. Rank by time or by relevance?: Revisiting email search. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), pages 283-292, 2015.
[6] J. Cassell. More than just another pretty face: Embodied conversational interface agents. Communications of the ACM, 43(4):70-78, 2000.
[7] Y. Chen and G. J. F. Jones. Are episodic context features helpful for refinding tasks? lessons learnt from a case study with lifelogs. In Proceedings of the Information Interaction in Context Symposium (IIiX), 2014.
[8] Y.-N. Chen, D. Hakkani-Tur, and X. He. Zero-shot learning of intent embeddings for expansion by convolutional deep structured semantic models. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2016.
[9] K. Christakopoulou, F. Radlinski, and K. Hofmann. Towards conversational recommender systems. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.
[10] J. Chu-Carroll and M. K. Brown. An evidential model for tracking initiative in collaborative dialogue interactions. In S. Haller, A. Kobsa, and S. McRoy, editors, Computational Models of Mixed-Initiative Interaction, pages 49-87. Springer, 1999.
[11] K. Collins-Thompson, C. Macdonald, P. Bennett, and E. M. Voorhees. TREC 2014 web track overview. NIST Special Publication 500-308: The Twenty-Third Text REtrieval Conference Proceedings (TREC), 2014.
[12] E. Cutrell, S. T. Dumais, and J. Teevan. Searching to eliminate personal information management. Communications of the ACM, 49(1):58-64, 2006.
[13] R. Dhar. Consumer preference for a no-choice option. Journal of Consumer Research, 24(2):215-231, 1997.
[14] M. Dredze, B. N. Schilit, and P. Norvig. Suggesting email view filters for triage and search. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2009.
[15] M. A. Hearst. 'natural' search user interfaces. Communications of the ACM, 54(11):60-67, 2011.
[16] M. Henderson, B. Thomson, and S. J. Young. Deep neural network approach for the dialog state tracking challenge. In Proceedings of the Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL), 2013.
[17] J. Jiang, A. H. Awadallah, R. Jones, U. Ozertem, I. Zitouni, R. G. Kulkarni, and O. Z. Khan. Automatic online evaluation of intelligent assistants. In Proceedings of the International World Wide Web Conference (WWW), 2015.
[18] D. G. Johnson and T. M. Powers. Computers as surrogate agents. In J. van den Hoven and J. Weckert, editors, Information Technology and Moral Philosophy, chapter 13. Cambridge University Press, 2008.
[19] M. P. Kato and K. Tanaka. To suggest, or not to suggest for queries with diverse intents: Optimizing search result presentation. In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), pages 133-142. ACM, 2016.
[20] J. Kiseleva, K. Williams, J. Jiang, A. H. Awadallah, A. C. Crook, I. Zitouni, and T. Anastasakos. Understanding user satisfaction with intelligent assistants. In Proceedings of the ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR), 2016.
[21] G. Linden, S. Hanks, and N. Lesh. Interactive assessment of user preference models: The automated travel assistant. In User Modeling, pages 67-78. Springer, 1997.
[22] N. Matthijs and F. Radlinski. Personalizing web search using long term browsing history. In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), pages 25-34, 2011.
[23] K. McCarthy, Y. Salem, and B. Smyth. Experience-based critiquing: Reusing critiquing experiences to improve conversational recommendation. In Proceedings of the International Conference on Case-Based Reasoning, pages 480-494. Springer, 2010.
[24] R. Nordlie. "user revealment" - a comparison of initial queries and ensuing question development in online searching and in human reference interactions. In Proceedings of the ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), pages 11-18, 1999.
[25] D. G. Novick and S. Sutton. What is mixed-initiative interaction. In Proceedings of the AAAI Spring Symposium on Computational Models for Mixed Initiative Interaction, pages 114-116, 1997.
[26] T. Paek and E. Horvitz. Conversation as action under uncertainty. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pages 455-464, 2000.
[27] A. Raux and M. Eskenazi. Optimizing the turn-taking behavior of task-oriented spoken dialog systems. ACM Transactions on Speech Language Processing, 9(1):1:1-1:23, May 2012.
[28] J. Reilly, K. McCarthy, L. McGinty, and B. Smyth. Incremental critiquing. In Proceedings of the SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, 2004.
[29] J. J. Rocchio. Relevance feedback in information retrieval. In G. Salton, editor, The SMART Retrieval System - Experiments in Automatic Document Processing. Prentice Hall, Englewood Cliffs, New Jersey, 1971.
[30] S. Senecal and J. Nantel. The influence of online product recommendations on consumers' online choices. Journal of Retailing, 80:159-169, 2004.
[31] A. Sordoni, M. Galley, M. Auli, C. Brockett, Y. Ji, M. Mitchell, J.-Y. Nie, J. Gao, and B. Dolan. A neural network approach to context-sensitive generation of conversational responses. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT), June 2015.
[32] P.-H. Su, M. Gašić, N. Mrkšić, L. Rojas-Barahona, S. Ultes, D. Vandyke, T.-H. Wen, and S. Young. On-line active reward learning for policy optimisation in spoken dialogue systems. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2016.
[33] S. Sukhbaatar, A. Szlam, J. Weston, and R. Fergus. End-to-end memory networks. In Proceedings of the International Conference on Advances in Neural Information Processing Systems (NIPS), pages 2440-2448, 2015.
[34] P. ten Have. Doing Conversation Analysis: A Practical Guide. SAGE Publications Ltd, 2007.
[35] N. Tintarev and J. Masthoff. Effective explanations of recommendations: user-centered design. In Proceedings of the ACM conference on Recommender Systems (RecSys), pages 153-156, 2007.
[36] D. Traum, D. DeVault, J. Lee, Z. Wang, and S. Marsella. Incremental dialogue understanding and feedback for multiparty, multimodal conversation. In Proceedings of Intelligent Virtual Agents (IVA), pages 275-288, 2012.
[37] M. Walker and S. Whittaker. Mixed initiative in dialogue: An investigation into discourse segmentation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 70-78, 1990.
[38] Z. Wang, J. Lee, and S. Marsella. Towards more comprehensive listening behavior: Beyond the bobble head. In Proceedings of Intelligent Virtual Agents (IVA), pages 216-227, 2011.
[39] J. Weizenbaum. ELIZA - a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9:36-45, January 1966.
[40] R. W. White and D. Morris. Investigating the querying and browsing behavior of advanced search engine users. In Proceedings of the ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), 2007.
[41] J. Williams, A. Raux, D. Ramachandran, and A. Black. The dialog state tracking challenge. In Proceedings of the Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL), 2013.
[42] K. Yao and G. Zweig. Attention with intention for neural network conversation model. In NIPS Workshop on Machine Learning for Spoken Language Understanding and Interaction, 2015.
[43] K.-P. Yee, K. Swearingen, K. Li, and M. Hearst. Faceted metadata for image search and browsing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI), pages 401-408. ACM, 2003.
[44] S. Young, M. Gašić, B. Thomson, and J. D. Williams. POMDP-based statistical spoken dialog systems: A review. Proceedings of the IEEE, 101(5):1160-1179, 2013.
[45] Y. Zhang and C. Zhai. A sequential decision formulation of the interface card model for interactive IR. In Proceedings of the ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), pages 85-94. ACM, 2016.

  1. *Now at Google UK.
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
    CHIIR '17, March 07 - 11, 2017, Oslo, Norway
    (c) 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-4677-1/17/03...$15.00
    DOI: http://dx.doi.org/10.1145/3020165.3020183