GraphRAG: Unlocking LLM discovery on narrative private data

Published February 13, 2024

By Jonathan Larson, Senior Principal Data Architect, and Steven Truitt, Principal Program Manager


Editor’s note, Apr. 2, 2024 – Figure 1 was updated to clarify the origin of each source.

Perhaps the greatest challenge – and opportunity – of LLMs is extending their powerful capabilities to solve problems beyond the data on which they have been trained, and to achieve comparable results with data the LLM has never seen. This opens new possibilities in data investigation, such as identifying themes and semantic concepts with context and grounding on datasets. In this post, we introduce GraphRAG, created by Microsoft Research, as a significant advance in enhancing the capability of LLMs.


Retrieval-Augmented Generation (RAG) is a technique to search for information based on a user query and provide the results as reference for an AI answer to be generated. This technique is an important part of most LLM-based tools and the majority of RAG approaches use vector similarity as the search technique. GraphRAG uses LLM-generated knowledge graphs to provide substantial improvements in question-and-answer performance when conducting document analysis of complex information. This builds upon our recent research, which points to the power of prompt augmentation when performing discovery on private datasets. Here, we define private dataset as data that the LLM is not trained on and has never seen before, such as an enterprise’s proprietary research, business documents, or communications. Baseline RAG[1] was created to help solve this problem, but we observe situations where baseline RAG performs very poorly. For example:

Baseline RAG struggles to connect the dots. This happens when answering a question requires traversing disparate pieces of information through their shared attributes in order to provide new synthesized insights.
Baseline RAG performs poorly when being asked to holistically understand summarized semantic concepts over large data collections or even singular large documents.

To address this, the tech community is working to develop methods that extend and enhance RAG (e.g., LlamaIndex). Microsoft Research’s new approach, GraphRAG, uses the LLM to create a knowledge graph based on the private dataset. This graph is then used alongside graph machine learning to perform prompt augmentation at query time. GraphRAG shows substantial improvement in answering the two classes of questions described above, demonstrating intelligence or mastery that outperforms other approaches previously applied to private datasets.

Applying RAG to private datasets

To demonstrate the effectiveness of GraphRAG, let’s start with an investigation using the Violent Incident Information from News Articles (VIINA) dataset. This dataset was chosen due to its complexity and the presence of differing opinions and partial information. It is a messy real-world test case that was recent enough not to be included in the LLM base model’s training.

For this research, we use thousands of news articles from both Russian and Ukrainian news sources for the month of June 2023, translated into English, to create a private dataset on which we will perform our LLM-based retrieval. The dataset is far too large to fit into an LLM context window, thus demanding a RAG approach.

We start with an exploratory query, which we pose to both a baseline RAG system and to our new approach, GraphRAG:

Query: “What is Novorossiya?”

Baseline RAG:

The term “Novorossiya” is not mentioned in the provided context. However, to provide some information, Novorossiya, or New Russia, is a historical term referring to a region in the south of Ukraine that was conquered by the Russian Empire in the 18th century. In recent years, the term has been revived by Russian-backed separatists in Eastern Ukraine to refer to the territories they claim, but it is not officially recognized as a separate entity by international bodies.

GraphRAG:

Novorossiya, also known as New Russia, is a term with historical and contemporary significance. Historically, it referred to a region in the southern part of the Russian Empire that is now part of Ukraine. In the contemporary context, it has been used by pro-Russian separatists to refer to certain areas of Ukraine [Data: Entities (6494, 912, 6493)].

The term Novorossiya has been used in the context of a political movement in Ukraine, particularly by Russian nationalists. This movement has been associated with efforts to destabilize Ukraine and create a so-called “Novorossiya” [Data: Entities (912); Relationships (15211, 15206)]. The term has also been used to denote a separatist state in Ukraine, known as the Federal Republic of Novorossiya (FRN) [Data: Entities (7690)].

In these results, we can see both systems perform well – highlighting a class of query on which baseline RAG performs well. Let’s try a query that requires connecting the dots:

Query: “What has Novorossiya done?”

Baseline RAG:

The text does not provide specific information on what Novorossiya has done.

GraphRAG:

Novorossiya, a political movement in Ukraine, has been involved in a series of destructive activities, particularly targeting various entities in Ukraine [Entities (6494, 912)]. The movement has been linked to plans to destroy properties of several Ukrainian entities, including Rosen, the Odessa Canning Factory, the Odessa Regional Radio Television Transmission Center, and the National Television Company of Ukraine [Relationships (15207, 15208, 15209, 15210)]. In addition to these, Novorossiya has also been associated with plans to destroy properties of Roshen, the Odessa Conservatory Factory, the Odesa Regional Radio Television Transmission Center, and the National Telecommunications Company of Ukraine [Relationships (8329, 8330, 8331, 8332)]. Furthermore, Novorossiya has been implicated in plans to blow up ATMs of PrivatBank, the largest state bank in Ukraine [Relationships (8335, 8335)].

The Office of the General Prosecutor in Ukraine has reported on the creation of Novorossiya, indicating the government’s awareness and potential concern over the activities of this movement [Relationships (8264, 8264)]. The destructive activities of Novorossiya have had significant impacts on the targeted entities and have likely contributed to the ongoing tensions and conflicts in the region.

In summary, Novorossiya has been involved in a series of destructive activities targeting various entities in Ukraine. These activities have had significant impacts on the targeted entities and have likely contributed to the ongoing tensions and conflicts in the region. The Ukrainian government, through the Office of the General Prosecutor, has acknowledged the existence and activities of Novorossiya, indicating a level of concern over the movement’s actions.

Baseline RAG fails to answer this question. Looking at the source documents inserted into the context window (Figure 1), none of the text segments discuss Novorossiya, resulting in this failure.

Figure 1: Baseline RAG retrieved context [2]. A table entitled “Relevant chunks of source documents” with 10 rows of text segments and the news source that they were pulled from in the VIINA dataset. Each text segment mentions a news event happening in Ukraine and Russia. None include the term ‘Novorossiya’.

In comparison, the GraphRAG approach discovered an entity in the query, Novorossiya. This allows the LLM to ground itself in the graph and results in a superior answer that contains provenance through links to the original supporting text. For example, Figure 2 below shows the exact content the LLM used for the LLM-generated statement, “Novorossiya has been implicated in plans to blow up ATMs.” We see the snippet from the raw source documents (after English translation) that the LLM used to support the assertion that a specific bank was a target for Novorossiya via the relationship that exists between the two entities in the graph.
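One way to picture how graph-grounded retrieval can carry provenance is to let every relationship record keep a pointer to the text it was extracted from. The sketch below is only an illustration under that assumption, not GraphRAG's implementation: the `Relationship` class, the `retrieve_with_provenance` helper, the record IDs, and the snippet strings are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical graph edge that remembers where it came from."""
    rel_id: int        # placeholder ID, cited alongside the answer
    source: str        # entity at one end of the relationship
    target: str        # entity at the other end
    description: str   # what the relationship asserts
    source_text: str   # placeholder for the supporting source snippet


# Toy records for illustration only; IDs and snippets are invented.
RELATIONSHIPS = [
    Relationship(1, "Novorossiya", "PrivatBank", "implicated in plans against ATMs", "<snippet A>"),
    Relationship(2, "Novorossiya", "Roshen", "linked to plans to destroy property", "<snippet A>"),
    Relationship(3, "Kremlin", "US State Department", "mentioned together", "<snippet B>"),
]


def retrieve_with_provenance(query, relationships):
    """Return relationships whose source or target entity is named in the query."""
    q = query.lower()
    return [r for r in relationships if r.source.lower() in q or r.target.lower() in q]


hits = retrieve_with_provenance("What has Novorossiya done?", RELATIONSHIPS)
for r in hits:
    # Each retrieved statement is printed with its citation and supporting snippet.
    print(f"{r.source} -> {r.target}: {r.description} "
          f"[Relationships ({r.rel_id})] supported by {r.source_text}")
```

Because each hit carries its own `source_text`, a reader can check an assertion against the snippet that produced it, which is the auditing behavior the post describes.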

Figure 2: GraphRAG provenance. An image of the GraphRAG system displaying a table of the VIINA source text used to ground the connection between Novorossiya and PrivatBank. The table has three columns for source, date, and text. There is a single row of content shown. The row shows the source is from ‘interfaxua’, the date of publication is June 8, 2023, and the text box contains a paragraph taken from the source document. In summary, the text describes the creation of Novorossiya with intent to commit acts of terrorism targeting PrivatBank, the Regional Radio and Television Broadcasting Center, and other targets. It describes recruitment of residents of Odessa. Highlighted in the text box are two separate strings of text. The first is the word ‘Novorossiya’ and the second is the text ‘criminal blew up buildings of military commissariats, ATMs’.

By using the LLM-generated knowledge graph, GraphRAG vastly improves the “retrieval” portion of RAG, populating the context window with higher relevance content, resulting in better answers and capturing evidence provenance.

Being able to trust and verify LLM-generated results is always important. We care that the results are factually correct, coherent, and accurately represent content found in the source material. GraphRAG provides the provenance, or source grounding information, as it generates each response. It demonstrates that an answer is grounded in the dataset. Having the cited source for each assertion readily available also enables a human user to quickly and accurately audit the LLM’s output directly against the original source material.

However, this isn’t all that’s possible using GraphRAG.

Whole dataset reasoning

Baseline RAG struggles with queries that require aggregation of information across the dataset to compose an answer. Queries such as “What are the top 5 themes in the data?” perform terribly because baseline RAG relies on a vector search of semantically similar text content within the dataset. There is nothing in the query to direct it to the correct information.
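To make the failure mode concrete, here is a toy sketch of similarity-based retrieval. It assumes a bag-of-words vector as a stand-in for a real dense embedding model, and the three chunks are invented placeholders rather than VIINA text; `embed`, `cosine`, and `top_k` are hypothetical helper names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': term counts. Real systems use a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Invented placeholder chunks, not taken from the dataset.
CHUNKS = [
    "The forum covered themes of urban development.",
    "Shelling was reported near the front line overnight.",
    "Officials discussed the flooding crisis and damaged infrastructure.",
]

best = top_k("What are the top 5 themes in the data?", CHUNKS, k=1)
print(best)
```

The chunk that wins is the one that happens to contain the word "themes", even though the other chunks say more about what the collection is actually about. Nothing in the query points the search toward content that is representative of the dataset as a whole.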

However, with GraphRAG we can answer such questions, because the structure of the LLM-generated knowledge graph tells us about the structure (and thus themes) of the dataset as a whole. This allows the private dataset to be organized into meaningful semantic clusters that are pre-summarized. The LLM uses these clusters to summarize these themes when responding to a user query.

We illustrate whole-dataset reasoning abilities by posing the following question to the two systems:

Query: “What are the top 5 themes in the data?”

Baseline RAG:

1. Improving the quality of life in cities and urban development.
2. The meeting between Vladimir Putin and the President of the United Arab Emirates, Mohammed bin Zayed Al Nahyan.
3. The National Rating of the Investment Climate in the Regions of Russia.
4. The state of the Russian economy and the growth of domestic brands.
5. The discussion on the meaning of life and the role of science in understanding it.

GraphRAG:

The data presents a wide range of themes, but the top five most prevalent themes can be identified as follows:

1. Conflict and Military Activity: A significant portion of the data revolves around the ongoing conflict in Ukraine, with numerous reports detailing military activities, occupations, and attacks by various entities. This theme includes the actions of the Ukrainian military, Russian forces, and various other military units and entities [Data: Reports (513, 241, 151, 493, 343, +more)].

2. Political and Governmental Entities: Many reports focus on political figures and governmental entities, highlighting their roles, relationships, and actions in the context of the current situation. This includes entities like the Kremlin, the President of Ukraine, the US State Department, and various regional administrations [Data: Reports (98, 493, 265, 129, 196, +more)].

3. Infrastructure and Environmental Concerns: Several reports highlight potential environmental impacts and infrastructure threats. These include potential environmental damage due to industrial activities, flooding crises, and threats to infrastructure due to military conflicts [Data: Reports (308, 154, 201, 348, 216, +more)].

4. Community Analysis and Threat Assessment: Several reports provide detailed analyses of specific communities, often centered around a particular location or entity. These analyses typically discuss the key entities within the community, their relationships, and any associated threats. Many reports also include assessments of potential threats associated with the entities or situations they discuss [Data: Reports (323, 596, 518, 583, 179, +more)].

5. Health, Humanitarian Concerns, and Emergency Situations: Some reports focus on health and humanitarian issues, such as the Shereshevsky-Turner Syndrome community and the potential risk of intestinal diseases in the Odessa region. The data also covers emergency situations, including natural disasters, accidents, and attacks [Data: Reports (204, 149, 372, 154, 532, +more)].

These themes provide a comprehensive overview of the current situation, highlighting the complexity and interconnectedness of the issues at hand.

Looking at the results from baseline RAG, we see that none of the listed themes has much to do with the war between the two countries. As anticipated, the vector search retrieved irrelevant text, which was inserted into the LLM’s context window. Results that were included were likely keying on the word “theme,” resulting in a less than useful assessment of what is going on in the dataset.

Observing the results from GraphRAG, we can clearly see that the results are far more aligned with what is going on in the dataset as a whole. The answer provides the five main themes as well as supporting details that are observed in the dataset. The referenced reports are pre-generated by the LLM for each semantic cluster in GraphRAG and, in turn, provide provenance back to original source material.

Creating LLM-generated knowledge graphs

We note the basic flow that underpins GraphRAG, which builds upon our prior research and repositories using graph machine learning:

The LLM processes the entire private dataset, creating references to all entities and relationships within the source data, which are then used to create an LLM-generated knowledge graph.
This graph is then used to create a bottom-up clustering that organizes the data hierarchically into semantic clusters (indicated by using color in Figure 3 below). This partitioning allows for pre-summarization of semantic concepts and themes, which aids in holistic understanding of the dataset.
At query time, both of these structures are used to provide materials for the LLM context window when answering a question.
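The three steps above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not the GraphRAG pipeline: the triples are hand-written stand-ins for LLM extraction output (using names that appear in the example answers in this post), connected components stand in for the bottom-up hierarchical clustering, and `pre_summarize` merely concatenates relationships where the real system would have the LLM write a report. All function names are hypothetical.

```python
from collections import defaultdict

# Step 1 stand-in: (entity, relation, entity) triples an LLM might extract.
TRIPLES = [
    ("Novorossiya", "linked to plans against", "PrivatBank"),
    ("Novorossiya", "linked to plans against", "Roshen"),
    ("Vladimir Putin", "met with", "Mohammed bin Zayed Al Nahyan"),
]


def build_graph(triples):
    """Build an undirected adjacency map from extracted triples."""
    graph = defaultdict(set)
    for source, _relation, target in triples:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def cluster(graph):
    """Step 2 stand-in: connected components instead of hierarchical clustering."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


def pre_summarize(members, triples):
    """Placeholder for an LLM-written cluster report."""
    inside = set(members)
    return "; ".join(f"{s} {r} {t}" for s, r, t in triples if s in inside and t in inside)


def build_context(query, graph, reports):
    """Step 3: gather entity neighborhoods and cluster reports for the prompt."""
    q = query.lower()
    context = []
    for entity in sorted(graph):
        if entity.lower() in q:
            context.append(f"{entity} is connected to: {', '.join(sorted(graph[entity]))}")
    for members, report in reports:
        if any(m.lower() in q for m in members):
            context.append(f"Cluster report: {report}")
    return context


graph = build_graph(TRIPLES)
clusters = cluster(graph)
reports = [(members, pre_summarize(members, TRIPLES)) for members in clusters]
degree = {entity: len(neighbors) for entity, neighbors in graph.items()}
context = build_context("What has Novorossiya done?", graph, reports)
print("\n".join(context))
```

The `degree` map corresponds to the relationship count that a visualization could use to size each entity, and the cluster membership is what it could use for color.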

An example visualization of the graph is shown in Figure 3. Each circle is an entity (e.g., a person, place, or organization), with the entity size representing the number of relationships that entity has, and the color representing groupings of similar entities. The color partitioning is a bottom-up clustering method built on top of the graph structure, which enables us to answer questions at varying levels of abstraction.

Figure 3: LLM-generated knowledge graph built from a private dataset using GPT-4 Turbo. A knowledge graph visualization represented by a collection in 3D space projected onto a 2D image of circles of varying sizes and colors. The circles are grouped together in space by color, and within each color area the larger circles are surrounded by many smaller circles. Each circle represents an entity within the knowledge graph.

Result metrics

The illustrative examples above are representative of GraphRAG’s consistent improvement across multiple datasets in different subject domains. We assess this improvement by performing an evaluation using an LLM grader to determine a pairwise winner between GraphRAG and baseline RAG. We use a set of qualitative metrics, including comprehensiveness (completeness within the framing of the implied context of the question), human enfranchisement (provision of supporting source material or other contextual information), and diversity (provision of differing viewpoints or angles on the question posed). Initial results show that GraphRAG consistently outperforms baseline RAG on these metrics. 
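The bookkeeping behind a pairwise comparison like this can be sketched as a simple tally. The snippet assumes each grader verdict has already been reduced to a `(metric, winner)` pair; the `win_rates` helper, the system labels "A" and "B", and the verdicts themselves are made up to exercise the function and are not results from this study.

```python
from collections import Counter, defaultdict


def win_rates(judgments, outcomes=("A", "B", "tie")):
    """Turn (metric, winner) verdicts into per-metric outcome fractions."""
    tallies = defaultdict(Counter)
    for metric, winner in judgments:
        tallies[metric][winner] += 1
    return {
        metric: {o: counts[o] / sum(counts.values()) for o in outcomes}
        for metric, counts in tallies.items()
    }


# Placeholder verdicts for illustration only; not measured data.
JUDGMENTS = [
    ("comprehensiveness", "A"),
    ("comprehensiveness", "A"),
    ("comprehensiveness", "B"),
    ("comprehensiveness", "tie"),
    ("diversity", "A"),
    ("diversity", "B"),
]

rates = win_rates(JUDGMENTS)
for metric, fractions in rates.items():
    print(metric, fractions)
```

Keeping ties as their own outcome avoids forcing the grader to pick a winner when two answers are equally good on a metric.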

In addition to relative comparisons, we also use SelfCheckGPT to perform an absolute measurement of faithfulness to help ensure factual, coherent results grounded in the source material. Results show that GraphRAG achieves a similar level of faithfulness to baseline RAG. We are currently developing an evaluation framework to measure performance on the class of problems above. This will include more robust mechanisms for generating question-answer test sets as well as additional metrics, such as accuracy and context relevance.

Next steps

By combining LLM-generated knowledge graphs and graph machine learning, GraphRAG enables us to answer important classes of questions that we cannot attempt with baseline RAG alone. We have seen promising results after applying this technology to a variety of scenarios, including social media, news articles, workplace productivity, and chemistry. Looking forward, we plan to work closely with customers on a variety of new domains as we continue to apply this technology while working on metrics and robust evaluation. We look forward to sharing more as our research continues.

[1] As baseline RAG in this comparison we use LangChain’s Q&A, a well-known representative example of this class of RAG tools in widespread use today.

[2] This dataset contains sensitive topics. The dataset was chosen solely to showcase tools for data analysis that surface all relevant information including its origin. The tools, grounded by that dataset information, enable a human user to more rapidly reach informed conclusions within the context of opposing viewpoints from both Ukrainian (unian) and Russian (ria) news articles sourced in their native languages. The tools highlight the source of each statement, which can be used to identify where the information is originating.

Related publications

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine


Research Areas

Artificial intelligence

Related tools

Graspologic

Related projects

Project GraphRAG
