
Velociraptor

This is an introductory article explaining the rationale behind Velociraptor’s design and, in particular, how Velociraptor evolved, with some historical context and comparison to other DFIR tooling. We took a lot of inspiration and learned many lessons from other great tools, and Velociraptor is our attempt at pushing the field forward.

Digital forensics is primarily focused on answering questions. Most practitioners frame their cases around high-level questions: Did the user access a particular file? Was malware run on the user’s workstation? Did an attacker crack an account?

Over the years, DFIR practitioners have developed and refined methodologies for answering such questions. For example, by examining the timestamps stored in the NTFS filesystem we are able to build a timeline tracing an intruder’s path through the network. These methodologies are often encoded informally in practitioners’ experience and training. Wouldn’t it be great to have a way to formally document and encode these methodologies?

In many digital evidence based cases, time is of the essence. The forensic practitioner is looking to answer questions quickly and efficiently, since the amount and size of digital evidence increase with every generation of new computing devices. We now see the emergence of triage techniques to quickly classify a machine as worthy of further forensic analysis. When triaging a system, the practitioner has to be surgical in their approach — examining specific artifacts before even acquiring the hard disk or memory.

Triaging is particularly prevalent in enterprise incident response. In this scenario it is rare for legal prosecution to take place; instead, the enterprise is interested in quickly containing the incident and learning of possible impacts. As part of this analysis, the practitioner may need to triage many thousands of machines to find those that were compromised, avoiding the acquisition of bit-for-bit forensically sound images.

The rise of the endpoint DFIR agent

This transition from traditional forensic techniques to highly scalable distributed analysis has resulted in multiple offerings of endpoint agents. An agent is specialized software running on enterprise endpoints, providing forensic analysis and telemetry to central servers. This architecture enables detection of attackers as they traverse the network and provides distributed detection coverage across many assets simultaneously.

One of the first notable endpoint agents was GRR, a Google internal project open sourced around 2012. GRR is an agent installed on many endpoints and controlled by a central server. The agent is able to perform some low-level forensic analysis by incorporating other open source tools such as The Sleuth Kit and the Rekall memory forensics suite. The GRR framework was one of the first to offer the concept of hunting — actively seeking forensic anomalies on many endpoints at the same time. For the first time, analysts could pose a question — such as “which endpoints contain this registry key?” — to thousands of endpoints at once and receive an answer within hours.

Hunting is particularly useful for rapid triaging — we can focus our attention only on those machines which show potential signs of compromise. GRR also provides interactive remote access, allowing the user to inspect the endpoint (for example, interactively examining files, directories and registry keys).

As useful as GRR’s approach was at the time, there were some shortfalls, mainly around lack of flexibility and limited scale and performance. GRR features are built into the agent, making it difficult to rapidly push new code updates or new capabilities in response to changing needs. It is also difficult to control the amount of data transferred from the endpoint, which often ends up being far more detailed than necessary, leading to performance issues on the server.

The next breakthrough in the field was the release of Facebook’s OSQuery. This revolutionary tool allows one to query endpoints using a SQL-like syntax. By querying the endpoint, it is possible to adapt the results sent, apply arbitrary filtering and combine different modules in new creative ways. OSQuery’s approach proved to be very flexible in the rapidly evolving stages of incident response, where users need to modify their queries quickly in response to emerging needs.

Introducing Velociraptor

Learning from these early projects, Velociraptor was released in 2019. Similar to GRR, Velociraptor allows for hunting across many thousands of machines. Inspired by OSQuery, Velociraptor implements a new query language dubbed VQL (Velociraptor Query Language), which is similar to SQL but extends the query language in more powerful ways. Velociraptor also emphasizes ease of installation and very low latency — typically collecting artifacts from thousands of endpoints in a matter of seconds.

Figure 1 above shows an overview of the Velociraptor architecture. The Velociraptor server maintains communications with the endpoint agents (called Clients) for command and control. The web-based administration user interface is used to task individual clients, run hunts and collect data.

Ultimately, Velociraptor agents are simply VQL engines: every task sent to the agent is a VQL query that the engine executes. VQL queries, just like database queries, produce a table with columns (as dictated by the query) and multiple rows. The agent executes the query and sends the results back to the server, which simply stores them as files. This means the server does little processing of the results beyond storing them, so the load on the server is minimal, allowing for vastly scalable performance.
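
To give a flavour of what such a query looks like, here is a minimal sketch using the built-in pslist() plugin. The column names are ones we recall the plugin exposing, and the regex filter is purely illustrative:

```
SELECT Pid, Name, Exe, CommandLine
FROM pslist()
WHERE Name =~ "powershell"
```

The agent runs the query and ships back a table with exactly those four columns, one row per matching process, which the server writes straight to disk.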

Velociraptor artifacts

Writing free-form queries is a powerful tool, but from a user experience perspective it is not ideal: users would need to remember potentially complex queries. Velociraptor solves this by implementing “**Artifacts**”. An artifact is a text file written in YAML which encapsulates the VQL, adds a human-readable description and provides parameters allowing users to customize the operation of the artifact to some extent.
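
As a rough sketch of the format (the field names follow the documented artifact layout, while the artifact name and query are invented for illustration), an artifact might look like this:

```
name: Custom.Demo.ProcessList
description: |
  List running processes whose name matches a user supplied regex.

parameters:
  - name: ProcessRegex
    default: .

sources:
  - query: |
      SELECT Pid, Name, Exe, CommandLine
      FROM pslist()
      WHERE Name =~ ProcessRegex
```

A user collecting Custom.Demo.ProcessList from the GUI only needs to adjust ProcessRegex; the VQL itself stays hidden behind the artifact.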

As an example of this process, consider Windows Scheduled Tasks. These tasks are often added by attackers as a way of gaining persistence and a backdoor into a compromised system (see ATT&CK technique T1053). Velociraptor can collect and analyse these tasks if provided with the appropriate VQL query. By writing the query into an artifact we make it possible for other users to simply re-use our VQL.
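
The VQL behind such an artifact boils down to globbing for the task XML files and parsing each one. The sketch below is a simplified stand-in for the real Windows.System.TaskScheduler source, assuming the glob() and parse_xml() plugins; the dotted path into the parsed XML is indicative only, and the real artifact also handles details we gloss over here, such as the UTF-16 encoding of the task files:

```
SELECT FullPath,
       parse_xml(file=FullPath).Task.Actions.Exec.Command AS Command
FROM glob(globs="C:/Windows/System32/Tasks/**")
WHERE NOT IsDir
```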

Figure 2 shows the Windows.System.TaskScheduler artifact as viewed in the GUI. The artifact contains some human-readable background information, parameters and the VQL source. As Figure 3 below shows, in the GUI one simply needs to search for the scheduled tasks artifact, select it and collect it from the endpoint.

As soon as we issue the collection request, the client will run the VQL query and send the results to the server within seconds. If the agent is not online at the time, the task will be queued on the server until the endpoint comes back online, at which point the artifact will be collected immediately.

Figure 4 shows the result of this collection. We see the agent took 5 seconds to upload the 180 scheduled task XML files, totalling 5.7 MB. We can now click the “Prepare Download” button to prepare a zip file containing these files for export. We can then download the zip file through the GUI and store it as evidence as required.

Figure 5 shows the results from this artifact. The VQL query also instructed the endpoint to parse the XML files and extract the launched command directly. It is now possible to quickly triage all the scheduled tasks, looking for unusual or suspicious tasks. The exported zip file will also contain the CSV files produced by this analysis, which can be processed with any tool that supports CSV formatted data (e.g. Excel, MySQL or Elastic via Logstash).

Hunting with Velociraptor

Continuing our example of scheduled tasks, we now wish to hunt for these across the entire enterprise. This captures the state of the deployment at the point in time when the hunt was collected, and allows us to go back later and see which new scheduled tasks appeared after that point.

Hunting is simply a way to collect the same artifact from many machines at the same time. The GUI then packages the results from these collections into a single exported file.

Figure 7 shows a hunt created to collect the Windows.System.TaskScheduler artifact. We can see the total number of clients scheduled and completed, and that the hunt will expire in one week. If new machines appear within this time, the artifact will be collected from them as well. Finally, we can prepare an export zip file for download that contains all of the clients’ collected artifacts.

Forensic analysis on the endpoint

To be really effective, Velociraptor implements many forensic capabilities directly on the endpoint. This allows artifacts to leverage this analysis, either surgically (identifying the relevant data directly) or to enrich the results by automatically providing more context to the analyst. In this section we examine some of these common use cases and see how they can be leveraged through artifacts.

Searching for files

A common task for analysts is to search for particular filenames. For example, in a drive-by download or phishing email case, we already know in advance the name of the dropped file, and we simply want to know if the file exists on any of our endpoints.

The Windows.Search.FileFinder artifact is designed to search for various files by filename. Figure 8 below illustrates the parameters that can be used to customize the collection. For a typical drive-by download, we might want to search for all binaries downloaded recently within the user’s home directories. We can also collect matching files centrally to further analyse those binaries. The artifact also allows us to filter by keywords appearing within file contents.
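
Under the hood, this kind of search reduces to a glob over the interesting paths. A minimal hand-written equivalent (not the actual Windows.Search.FileFinder source) might look like the sketch below, where dropper.exe is a hypothetical filename and hash() records a hash of any match for later comparison:

```
SELECT FullPath, Size, Mtime,
       hash(path=FullPath).SHA256 AS SHA256
FROM glob(globs="C:/Users/**/dropper.exe")
```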

Searching for files is a very common operation which covers many of the common use cases, but it is limited to finding files that have not been deleted. Velociraptor also includes a complete NTFS filesystem parser, available through a VQL plugin, which allows us to extract low-level information from every MFT entry.
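
The Windows.NTFS.MFT artifact is essentially a thin wrapper around this parser. A hedged sketch of the underlying query, assuming the parse_mft() plugin and the ntfs accessor (the column names are from memory and may vary between Velociraptor versions), looks like this:

```
SELECT EntryNumber, InUse, FullPath, FileName, FileSize,
       Created0x10, Created0x30,
       LastModified0x10, LastModified0x30
FROM parse_mft(filename="C:/$MFT", accessor="ntfs")
```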

**Figure 10** shows a sample of this output. We can see details like the FILE_NAME timestamps, as well as the STANDARD_INFORMATION timestamps (useful for detecting time stomping).

While the Windows.NTFS.MFT artifact dumps all MFT entries from the endpoint, we can be more surgical and specifically search for deleted executables. To do this we need to modify the VQL query to add an additional filter.

Modifying or customizing an artifact is easy to do through the GUI. Simply search for the artifact in the “View Artifacts” screen, then click the “Modify Artifact” button to bring up an editor allowing the YAML to be edited directly. (Note that all customized artifacts automatically receive the prefix “Custom” in their name, setting them apart from the curated artifacts.)

In the figure above we added the condition “WHERE FileName =~ '.exe$' AND NOT InUse” to restrict output to deleted executables only. We now select this customized version and collect it on the endpoint as before. Since this query returns only those executables which have been deleted, the result set is much smaller and somewhat quicker to compute. Figure 11 below shows a single binary was found on our test system, still recoverable in an unused MFT entry.
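
Putting it together, the customised source ends up being something like the following sketch (again assuming the parse_mft() column names shown earlier):

```
SELECT EntryNumber, FullPath, FileName, FileSize, InUse
FROM parse_mft(filename="C:/$MFT", accessor="ntfs")
WHERE FileName =~ '.exe$' AND NOT InUse
```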

Figure 11 shows an MFT entry for a binary that had been removed from disk. If we are lucky we can attempt to recover the deleted file using the **Windows.NTFS.Recover** artifact. This artifact simply dumps out all the attribute streams of the specified MFT entry (including the $DATA attribute) and uploads them to the server. Figure 12 below shows how we select this artifact for collection and specify the MFT entry reported in the previous collection as a parameter to the artifact.
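
For completeness, an artifact like this can also be invoked from VQL itself using the Artifact.X() plugin notation, where X is the artifact name. In the sketch below the parameter name MFTId is our assumption for illustration and may not match the artifact’s actual parameter name:

```
SELECT * FROM Artifact.Windows.NTFS.Recover(MFTId="1234")
```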

Figure 13 shows the output from the **Windows.NTFS.Recover** artifact, showing that the **$DATA** stream was correctly recovered, as verified by its hash.

The previous example demonstrates how valuable advanced forensic analysis capabilities are during endpoint monitoring. The drive-by download example required us to confirm whether a particular executable was present on any of our endpoints. We started by performing a simple filename search for executables. Realizing this would only yield currently existing files, we moved on to low-level NTFS analysis, dumping all MFT entry information. We then modified the VQL query to restrict the output to only the subset of results relevant to our case.

This modified query can now be run as a hunt across the entire fleet to determine which executables have recently been deleted anywhere, which would reveal whether the malware was run on other machines we are not aware of. We can then potentially use NTFS recovery techniques to recover the binary for further analysis. Without the flexibility of the powerful Velociraptor Query Language it would be difficult to adapt to such a fluid and rapidly developing incident.

Conclusions

Velociraptor includes many other low-level analysis modules, such as a prefetch file parser, raw registry access (for AMCache analysis), an ESE database parser (facilitating SRUM database forensics and Internet Explorer history analysis), SQLite parsers (for Chrome and Firefox history) and much more.

The true power of Velociraptor is in combining these low-level modules with other VQL queries to further enrich the output or narrow down queries, making them more surgical and reducing the number of false positives. This more targeted approach is critical when hunting at scale in order to reduce the amount of data collected and to help the operator focus on the truly important evidence quickly and efficiently.

The type of analysis performed is driven by a flexible VQL query written into an artifact by the user. This unprecedented level of flexibility and scale in a forensic tool allows for novel approaches to response and collection; it is really only limited by the imagination of the user.

We opened this article by imagining a world where experienced forensic practitioners could transfer and encode their knowledge and experience into actionable artifacts. Velociraptor’s artifacts help bring this vision to life: allowing experienced users to encode their workflows in VQL artifacts opens these techniques up to other practitioners in a more consistent and automated fashion. We hope to inspire a vibrant community of VQL artifact authors to facilitate the exchange of experience, techniques and approaches between practitioners and researchers alike.

Velociraptor is available under an open source license on GitHub. You can download the latest Velociraptor release and use it immediately, or clone the source repository and contribute to the project. You can also contribute VQL snippets or artifacts directly to the project in order to share commonly used artifacts with the larger community.

About the author: Mike Cohen is a digital forensic researcher and senior software engineer. He has supported leading open-source DFIR projects, including as a core developer of Volatility and lead developer of both Rekall and Google’s GRR Rapid Response. Mike founded Velocidex in 2018 after working at Google for the previous eight years developing cutting-edge DFIR tools. Velocidex is the company behind the Velociraptor open source endpoint visibility tool.