1. Overview
1.1. Introduction
pugixml is a light-weight C++ XML processing library. It consists of a DOM-like interface with rich traversal/modification capabilities, an extremely fast XML parser which constructs the DOM tree from an XML file/buffer, and an XPath 1.0 implementation for complex data-driven tree queries. Full Unicode support is also available, with two Unicode interface variants and conversions between different Unicode encodings (which happen automatically during parsing/saving). The library is extremely portable and easy to integrate and use. pugixml has been developed and maintained since 2006 and has many users. All code is distributed under the MIT license, making it completely free to use in both open-source and proprietary applications.
pugixml enables very fast, convenient and memory-efficient XML document processing. However, since pugixml has a DOM parser, it can’t process XML documents that do not fit in memory; also the parser is a non-validating one, so if you need DTD or XML Schema validation, the library is not for you.
This is the complete manual for pugixml, which describes all features of the library in detail. If you want to start writing code as quickly as possible, you are advised to read the quick start guide first.
Note
No documentation is perfect; neither is this one. If you find errors or omissions, please don’t hesitate to submit an issue or open a pull request with a fix.
1.2. Feedback
If you believe you’ve found a bug in pugixml (bugs include compilation problems such as errors or warnings, crashes, performance degradation and incorrect behavior), please file an issue via the issue submission form. Be sure to include the relevant information so that the bug can be reproduced: the version of pugixml, compiler version and target architecture, the code that uses pugixml and exhibits the bug, etc.
Feature requests can be reported the same way as bugs, so if you’re missing some functionality in pugixml or if the API is rough in some places and you can suggest an improvement, file an issue. However, please note that there are many factors when considering API changes (compatibility with previous versions, API redundancy, etc.), so generally features that can be implemented via a small function without pugixml modification are not accepted. That said, all rules have exceptions.
If you have a contribution to pugixml, such as a build script for some build system/IDE, or a well-designed set of helper functions, or a binding to some language other than C++, please file an issue or open a pull request. Your contribution has to be distributed under the terms of a license that’s compatible with the pugixml license; in particular, GPL/LGPL licensed code is not accepted.
If filing an issue is not possible due to privacy or other concerns, you can contact the pugixml author by e-mail directly: arseny.kapoulkine@gmail.com.
1.3. Acknowledgments
pugixml could not have been developed without the help of many people; some of them are listed in this section. If you’ve played a part in pugixml development and you cannot find yourself on this list, I’m truly sorry; please send me an e-mail so I can fix this.
Thanks to Kristen Wegner for the pugxml parser, which was used as a basis for pugixml.
Thanks to Neville Franks for contributions to the pugxml parser.
Thanks to Artyom Palvelev for suggesting a lazy gap contraction approach.
Thanks to Vyacheslav Egorov for documentation proofreading and fuzz testing.
1.4. License
The pugixml library is distributed under the MIT license:
Copyright (c) 2006-2023 Arseny Kapoulkine
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
This means that you can freely use pugixml in your applications, both open-source and proprietary. If you use pugixml in a product, it is sufficient to add an acknowledgment like this to the product distribution:
This software is based on pugixml library (https://pugixml.org). pugixml is Copyright (C) 2006-2023 Arseny Kapoulkine.
2. Installation
2.1. Getting pugixml
pugixml is distributed in source form. You can either download a source distribution or clone the Git repository.
2.1.1. Source distributions
You can download the latest source distribution as an archive:
pugixml-1.14.zip (Windows line endings) / pugixml-1.14.tar.gz (Unix line endings)
The distribution contains library source, documentation (the manual you’re reading now and the quick start guide) and some code examples. After downloading the distribution, install pugixml by extracting all files from the compressed archive.
If you need an older version, you can download it from the version archive.
2.1.2. Git repository
The Git repository is located at https://github.com/zeux/pugixml/. There is a Git tag "v{version}" for each version; also there is the "latest" tag, which always points to the latest stable release.
For example, to check out the current version, you can use this command:
git clone https://github.com/zeux/pugixml
cd pugixml
git checkout v1.14
The repository contains library source, documentation, code examples and full unit test suite.
Use the latest tag if you want to automatically get new versions. Use other tags if you want to switch to new versions only explicitly. Also please note that the master branch contains the work-in-progress version of the code; while this means that you can get new features and bug fixes from master without waiting for a new release, this also means that occasionally the code can be broken in some configurations.
2.1.3. Subversion repository
You can access the Git repository via Subversion using the URL https://github.com/zeux/pugixml. For example, to check out the current version, you can use this command:
svn checkout https://github.com/zeux/pugixml/tags/v1.14 pugixml
2.1.4. Packages
pugixml is available as a package via various package managers. Note that most packages are maintained separately from the main repository so they do not necessarily contain the latest version.
Here’s an incomplete list of pugixml packages in various systems:
- Linux (Ubuntu, Debian, Fedora, Arch Linux, other distributions)
2.2. Building pugixml
pugixml is distributed in source form without any pre-built binaries; you have to build them yourself.
The complete pugixml source consists of three files - one source file, pugixml.cpp, and two header files, pugixml.hpp and pugiconfig.hpp. pugixml.hpp is the primary header which you need to include in order to use pugixml classes/functions; pugiconfig.hpp is a supplementary configuration file (see Additional configuration options). The rest of this guide assumes that pugixml.hpp is either in the current directory or in one of the include directories of your projects, so that #include "pugixml.hpp" can find the header; however, you can also use a relative path (e.g. #include "../libs/pugixml/src/pugixml.hpp") or an include directory-relative path (e.g. #include <xml/thirdparty/pugixml/src/pugixml.hpp>).
2.2.1. Building pugixml as a part of another static library/executable
The easiest way to build pugixml is to compile the source file, pugixml.cpp, along with the existing library/executable. This process depends on the method of building your application; for example, if you’re using Microsoft Visual Studio [1], Apple Xcode, Code::Blocks or any other IDE, just add pugixml.cpp to one of your projects.
If you’re using Microsoft Visual Studio and the project has precompiled headers turned on, you’ll see the following error messages:
pugixml.cpp(3477) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add '#include "stdafx.h"' to your source?
The correct way to resolve this is to disable precompiled headers for pugixml.cpp; you have to set the "Create/Use Precompiled Header" option (Properties dialog → C/C++ → Precompiled Headers → Create/Use Precompiled Header) to "Not Using Precompiled Headers". You’ll have to do it for all project configurations/platforms (you can select Configuration "All Configurations" and Platform "All Platforms" before editing the option).
2.2.2. Building pugixml as a standalone static library
It’s possible to compile pugixml as a standalone static library. This process depends on the method of building your application; pugixml distribution comes with project files for several popular IDEs/build systems. There are project files for Apple XCode, Code::Blocks, Codelite, Microsoft Visual Studio 2005, 2008, 2010+, and configuration scripts for CMake and premake4. You’re welcome to submit project files/build scripts for other software; see Feedback.
There are two projects for each version of Microsoft Visual Studio: one for dynamically linked CRT, which has a name like pugixml_vs2008.vcproj, and another one for statically linked CRT, which has a name like pugixml_vs2008_static.vcproj. You should select the version that matches the CRT used in your application; the default option for new projects created by Microsoft Visual Studio is dynamically linked CRT, so unless you changed the defaults, you should use the version with dynamic CRT (e.g. pugixml_vs2008.vcproj for Microsoft Visual Studio 2008).
In addition to adding the pugixml project to your workspace, you’ll have to make sure that your application links with the pugixml library. If you’re using Microsoft Visual Studio 2005/2008, you can add a dependency from your application project to the pugixml one. If you’re using Microsoft Visual Studio 2010+, you’ll have to add a reference to your application project instead. For other IDEs/systems, consult the relevant documentation.
[Screenshots: Microsoft Visual Studio 2005/2008 | Microsoft Visual Studio 2010+]
2.2.3. Building pugixml as a standalone shared library
It’s possible to compile pugixml as a standalone shared library. The process is usually similar to the static library approach; however, no preconfigured projects/scripts are included in the pugixml distribution, so you’ll have to do it yourself. Generally, if you’re using a GCC-based toolchain, the process does not differ from building any other library as a DLL (adding -shared to compilation flags should suffice); if you’re using an MSVC-based toolchain, you’ll have to explicitly mark exported symbols with a declspec attribute. You can do it by defining the PUGIXML_API macro, e.g. via pugiconfig.hpp:
#ifdef _DLL
#define PUGIXML_API __declspec(dllexport)
#else
#define PUGIXML_API __declspec(dllimport)
#endif
Caution
If you’re using STL-related functions, you should use the shared runtime library to ensure that a single heap is used for STL allocations in your application and in pugixml; in MSVC, this means setting the 'Runtime library' property to 'Multithreaded DLL' or 'Multithreaded Debug DLL' (the /MD or /MDd compiler switch). You should also make sure that your runtime library choice is consistent between different projects.
2.2.4. Using pugixml in header-only mode
It’s possible to use pugixml in header-only mode. This means that all source code for pugixml will be included in every translation unit that includes pugixml.hpp. This is how most of Boost and STL libraries work.
Note that this approach has advantages and drawbacks. Header mode may improve tree traversal/modification performance (because many simple functions will be inlined) if your compiler toolchain does not support link-time optimization, or if you have it turned off (with link-time optimization the performance should be similar to non-header mode). However, since the compiler now has to compile pugixml source once for each translation unit that includes it, compilation times may increase noticeably. If you want to use pugixml in header mode but do not need XPath support, you can consider disabling it with the PUGIXML_NO_XPATH define to improve compilation time.
To enable header-only mode, you have to define PUGIXML_HEADER_ONLY. You can either do it in pugiconfig.hpp, or provide it via the compiler command-line.
Note that it is safe to compile pugixml.cpp if PUGIXML_HEADER_ONLY is defined - so if you want to, for example, use header-only mode only in Release configuration, you can include pugixml.cpp in your project (see Building pugixml as a part of another static library/executable), and conditionally enable header-only mode in pugiconfig.hpp like this:
#ifndef _DEBUG
#define PUGIXML_HEADER_ONLY
#endif
2.2.5. Additional configuration options
pugixml uses several defines to control the compilation process. There are two ways to define them: either put the needed definitions to pugiconfig.hpp (it has some examples that are commented out) or provide them via compiler command-line. Consistency is important: the definitions should match in all source files that include pugixml.hpp (including pugixml sources) throughout the application. Adding defines to pugiconfig.hpp lets you guarantee this, unless your macro definition is wrapped in a preprocessor #if/#ifdef directive and this directive is not consistent. pugiconfig.hpp will never contain anything but comments, which means that when upgrading to a new version, you can safely leave your modified version intact.
PUGIXML_WCHAR_MODE define toggles between UTF-8 style interface (the in-memory text encoding is assumed to be UTF-8, most functions use char as character type) and UTF-16/32 style interface (the in-memory text encoding is assumed to be UTF-16/32, depending on wchar_t size, most functions use wchar_t as character type). See Unicode interface for more details.
PUGIXML_COMPACT define activates a different internal representation of document storage that is much more memory efficient for documents with a lot of markup (i.e. nodes and attributes), but is slightly slower to parse and access. For details see Compact mode.
PUGIXML_NO_XPATH define disables XPath. Both XPath interfaces and XPath implementation are excluded from compilation. This option is provided in case you do not need XPath functionality and need to save code space.
PUGIXML_NO_STL define disables use of STL in pugixml. The functions that operate on STL types are no longer present (e.g. load/save via iostream) if this macro is defined. This option is provided in case your target platform does not have a standard-compliant STL implementation.
PUGIXML_NO_EXCEPTIONS define disables use of exceptions in pugixml. This option is provided in case your target platform does not have exception handling capabilities.
PUGIXML_API, PUGIXML_CLASS and PUGIXML_FUNCTION defines let you specify custom attributes (e.g. declspec or calling conventions) for pugixml classes and non-member functions. In absence of PUGIXML_CLASS or PUGIXML_FUNCTION definitions, the PUGIXML_API definition is used instead. For example, to specify a fixed calling convention, you can define PUGIXML_FUNCTION to e.g. __fastcall. Another example is DLL import/export attributes in MSVC (see Building pugixml as a standalone shared library).
Note
In that example PUGIXML_API is inconsistent between several source files; this is an exception to the consistency rule.
PUGIXML_MEMORY_PAGE_SIZE, PUGIXML_MEMORY_OUTPUT_STACK and PUGIXML_MEMORY_XPATH_PAGE_SIZE can be used to customize certain important sizes to optimize memory usage for the application-specific patterns. For details see Memory consumption tuning.
PUGIXML_HAS_LONG_LONG define enables support for the long long type in pugixml. This define is automatically enabled if your platform is known to have long long support (i.e. has C++11 support or uses a reasonably modern version of a known compiler); if pugixml does not recognize that your platform supports long long but in fact it does, you can enable the define manually.
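As a rough illustration of the options above, a customized pugiconfig.hpp might look like the sketch below. The macro names are the ones described in this section; the particular selection and the numeric values are assumptions made for the example, not recommendations.

```cpp
// pugiconfig.hpp (sketch): enable only what every source file in the application agrees on.

// Exclude XPath interfaces and implementation to save code space.
#define PUGIXML_NO_XPATH

// Avoid exceptions on targets without exception handling support.
// #define PUGIXML_NO_EXCEPTIONS

// Switch the interface from char (UTF-8) to wchar_t (UTF-16/32).
// #define PUGIXML_WCHAR_MODE

// Fix the calling convention of non-member functions (MSVC-specific keyword).
// #define PUGIXML_FUNCTION __fastcall

// Memory tuning; the numbers below are placeholders for illustration only.
// #define PUGIXML_MEMORY_PAGE_SIZE 32768
// #define PUGIXML_MEMORY_OUTPUT_STACK 10240
// #define PUGIXML_MEMORY_XPATH_PAGE_SIZE 4096
```

Keeping the definitions unconditional (outside of #if/#ifdef blocks) is the simplest way to satisfy the consistency requirement.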
2.3. Portability
pugixml is written in standard-compliant C++ with some compiler-specific workarounds where appropriate. pugixml is compatible with the C++11 standard, but does not require C++11 support. Each version is tested with a unit test suite with code coverage exceeding 99%.
pugixml runs on a variety of desktop platforms (including Microsoft Windows, Linux, FreeBSD, Apple MacOSX and Sun Solaris), game consoles (including Microsoft Xbox 360, Microsoft Xbox One, Nintendo Wii, Sony Playstation Portable and Sony Playstation 3) and mobile platforms (including Android, iOS, BlackBerry, Samsung bada and Microsoft Windows CE).
pugixml supports various architectures, such as x86/x86-64, PowerPC, ARM, MIPS and SPARC. In general it should run on any architecture since it does not use architecture-specific code and does not rely on features such as unaligned memory access.
pugixml can be compiled using any C++ compiler; it was tested with all versions of Microsoft Visual C++ from 6.0 up to 2015, GCC from 3.4 up to 5.2, Clang from 3.2 up to 3.7, as well as a variety of other compilers (e.g. Borland C++, Digital Mars C++, Intel C++, Metrowerks CodeWarrior and PathScale). The code is written to avoid compilation warnings even on reasonably high warning levels.
Note that some platforms may have very bare-bones support of C++; in some cases you’ll have to use PUGIXML_NO_STL and/or PUGIXML_NO_EXCEPTIONS to compile without issues. This mostly applies to old game consoles and embedded systems.
3. Document object model
pugixml stores XML data in a DOM-like way: the entire XML document (both document structure and element data) is stored in memory as a tree. The tree can be loaded from a character stream (file, string, C++ I/O stream), then traversed with the special API or XPath expressions. The whole tree is mutable: both node structure and node/attribute data can be changed at any time. Finally, the result of document transformations can be saved to a character stream (file, C++ I/O stream or custom transport).
3.1. Tree structure
The XML document is represented with a tree data structure. The root of the tree is the document itself, which corresponds to the C++ type xml_document. The document has one or more child nodes, which correspond to the C++ type xml_node. Nodes have different types; depending on its type, a node can have a collection of child nodes, a collection of attributes, which correspond to the C++ type xml_attribute, and some additional data (e.g. a name).
The tree nodes can be of one of the following types (which together form the enumeration xml_node_type):
- Document node (node_document) - this is the root of the tree, which consists of several child nodes. This node corresponds to xml_document class; note that xml_document is a sub-class of xml_node, so the entire node interface is also available. However, document node is special in several ways, which are covered below. There can be only one document node in the tree; document node does not have any XML representation. Document generally has one child element node (see document_element()), although documents parsed from XML fragments (see parse_fragment) can have more than one.
- Element/tag node (node_element) - this is the most common type of node, which represents XML elements. Element nodes have a name, a collection of attributes and a collection of child nodes (both of which may be empty). The attribute is a simple name/value pair. The example XML representation of element nodes is as follows:
<node attr="value"><child/></node>
There are two element nodes here: one has name "node", single attribute "attr" and single child "child", another has name "child" and does not have any attributes or child nodes.
- Plain character data nodes (node_pcdata) represent plain text in XML. PCDATA nodes have a value, but do not have a name or children/attributes. Note that plain character data is not a part of the element node but instead has its own node; an element node can have several child PCDATA nodes. The example XML representation of text nodes is as follows:
<node> text1 <child/> text2 </node>
Here the "node" element has three children, two of which are PCDATA nodes with values " text1 " and " text2 ".
- Character data nodes (node_cdata) represent text in XML that is quoted in a special way. CDATA nodes do not differ from PCDATA nodes except in XML representation - the above text example looks like this with CDATA:
<node> <![CDATA[text1]]> <child/> <![CDATA[text2]]> </node>
CDATA nodes make it easy to include non-escaped <, & and > characters in plain text. CDATA value can not contain the character sequence ]]>, since it is used to determine the end of node contents.
- Comment nodes (node_comment) represent comments in XML. Comment nodes have a value, but do not have a name or children/attributes. The example XML representation of a comment node is as follows:
<!-- comment text -->
Here the comment node has value "comment text". By default comment nodes are treated as non-essential part of XML markup and are not loaded during XML parsing. You can override this behavior with parse_comments flag.
- Processing instruction nodes (node_pi) represent processing instructions (PI) in XML. PI nodes have a name and an optional value, but do not have children/attributes. The example XML representation of a PI node is as follows:
<?name value?>
Here the name (also called PI target) is "name", and the value is "value". By default PI nodes are treated as non-essential part of XML markup and are not loaded during XML parsing. You can override this behavior with parse_pi flag.
- Declaration node (node_declaration) represents document declarations in XML. Declaration nodes have a name ("xml") and an optional collection of attributes, but do not have value or children. There can be only one declaration node in a document; moreover, it should be the topmost node (its parent should be the document). The example XML representation of a declaration node is as follows:
<?xml version="1.0"?>
Here the node has name "xml" and a single attribute with name "version" and value "1.0". By default declaration nodes are treated as non-essential part of XML markup and are not loaded during XML parsing. You can override this behavior with parse_declaration flag. Also, by default a dummy declaration is output when XML document is saved unless there is already a declaration in the document; you can disable this with format_no_declaration flag.
-
Document type declaration node (node_doctype) represents document type declarations in XML. Document type declaration nodes have a value, which corresponds to the entire document type contents; no additional nodes are created for inner elements like <!ENTITY>. There can be only one document type declaration node in a document; moreover, it should be the topmost node (its parent should be the document). The example XML representation of a document type declaration node is as follows:
<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>
Here the node has value "greeting [ <!ELEMENT greeting (#PCDATA)> ]". By default document type declaration nodes are treated as a non-essential part of XML markup and are not loaded during XML parsing. You can override this behavior with the parse_doctype flag.
Finally, here is a complete example of XML document and the corresponding tree representation (samples/tree.xml):
[Example listing and tree diagram for samples/tree.xml are not reproduced here.]
3.2. C++ interface
Note
|
All pugixml classes and functions are located in the pugi namespace; you have to either use explicit name qualification (i.e. pugi::xml_node), or to gain access to relevant symbols via using directive (i.e. using pugi::xml_node; or using namespace pugi;). The namespace will be omitted from all declarations in this documentation hereafter; all code examples will use fully qualified names. |
Despite the fact that there are several node types, there are only three C++ classes representing the tree (xml_document, xml_node, xml_attribute); some operations on xml_node are only valid for certain node types. The classes are described below.
xml_document is the owner of the entire document structure; it is a non-copyable class. The interface of xml_document consists of loading functions (see Loading document), saving functions (see Saving document) and the entire interface of xml_node, which allows for document inspection and/or modification. Note that while xml_document is a sub-class of xml_node, xml_node is not a polymorphic type; the inheritance is present only to simplify usage. Alternatively you can use the document_element function to get the element node that’s the immediate child of the document.
Default constructor of xml_document initializes the document to the tree with only a root node (document node). You can then populate it with data using either tree modification functions or loading functions; all loading functions destroy the previous tree with all occupied memory, which puts existing node/attribute handles for this document to invalid state. If you want to destroy the previous tree, you can use the xml_document::reset function; it destroys the tree and replaces it with either an empty one or a copy of the specified document. Destructor of xml_document also destroys the tree, thus the lifetime of the document object should exceed the lifetimes of any node/attribute handles that point to the tree.
Caution
|
While technically node/attribute handles can be alive when the tree they’re referring to is destroyed, calling any member function for these handles results in undefined behavior. Thus it is recommended to make sure that the document is destroyed only after all references to its nodes/attributes are destroyed. |
xml_node is the handle to a document node; it can point to any node in the document, including the document node itself. There is a common interface for nodes of all types; the actual node type can be queried via the xml_node::type() method. Note that xml_node is only a handle to the actual node, not the node itself - you can have several handles pointing to the same underlying object. Destroying an xml_node handle does not destroy the node and does not remove it from the tree. The size of xml_node is equal to that of a pointer, so it is nothing more than a lightweight wrapper around a pointer; you can safely pass or return xml_node objects by value without additional overhead.
There is a special value of xml_node type, known as null node or empty node (such nodes have type node_null). It does not correspond to any node in any document, and thus resembles null pointer. However, all operations are defined on empty nodes; generally the operations don’t do anything and return empty nodes/attributes or empty strings as their result (see documentation for specific functions for more detailed information). This is useful for chaining calls; i.e. you can get the grandparent of a node like so: node.parent().parent(); if a node is a null node or it does not have a parent, the first parent() call returns null node; the second parent() call then also returns null node, which makes error handling easier.
xml_attribute is the handle to an XML attribute; it has the same semantics as xml_node, i.e. there can be several handles pointing to the same underlying object and there is a special null attribute value, which propagates to function results.
Both xml_node and xml_attribute have the default constructor which initializes them to null objects.
xml_node and xml_attribute try to behave like pointers, that is, they can be compared with other objects of the same type, making it possible to use them as keys in associative containers. All handles to the same underlying object are equal, and any two handles to different underlying objects are not equal. Null handles only compare as equal to null handles. The result of relational comparison can not be reliably determined from the order of nodes in file or in any other way. Do not use relational comparison operators except for search optimization (i.e. associative container keys).
If you want to use xml_node or xml_attribute objects as keys in hash-based associative containers, you can use the hash_value member functions. They return the hash values that are guaranteed to be the same for all handles to the same underlying object. The hash value for null handles is 0. Note that hash value does not depend on the content of the node, only on the location of the underlying structure in memory - this means that loading the same document twice will likely produce different hash values, and copying the node will not preserve the hash.
Finally handles can be implicitly cast to boolean-like objects, so that you can test if the node/attribute is empty with the following code: if (node) { … } or if (!node) { … } else { … }. Alternatively you can check if a given xml_node/xml_attribute handle is null by calling the following methods:
bool xml_attribute::empty() const;
bool xml_node::empty() const;
Nodes and attributes do not exist without a document tree, so you can’t create them without adding them to some document. Once underlying node/attribute objects are destroyed, the handles to those objects become invalid. While this means that destruction of the entire tree invalidates all node/attribute handles, it also means that destroying a subtree (by calling xml_node::remove_child) or removing an attribute invalidates the corresponding handles. There is no way to check handle validity; you have to ensure correctness through external mechanisms.
3.3. Unicode interface
There are two choices of interface and internal representation when configuring pugixml: you can either choose the UTF-8 (also called char) interface or UTF-16/32 (also called wchar_t) one. The choice is controlled via PUGIXML_WCHAR_MODE define; you can set it via pugiconfig.hpp or via preprocessor options, as discussed in Additional configuration options. If this define is set, the wchar_t interface is used; otherwise (by default) the char interface is used. The exact wide character encoding is assumed to be either UTF-16 or UTF-32 and is determined based on the size of wchar_t type.
Note
|
If the size of wchar_t is 2, pugixml assumes UTF-16 encoding instead of UCS-2, which means that some characters are represented as two code units (a surrogate pair). |
All tree functions that work with strings work with either C-style null terminated strings or STL strings of the selected character type. For example, node name accessors look like this in char mode:
const char* xml_node::name() const;
bool xml_node::set_name(const char* value);
and like this in wchar_t mode:
const wchar_t* xml_node::name() const;
bool xml_node::set_name(const wchar_t* value);
There is a special type, pugi::char_t, that is defined as the character type and depends on the library configuration; it will be also used in the documentation hereafter. There is also a type pugi::string_t, which is defined as the STL string of the character type; it corresponds to std::string in char mode and to std::wstring in wchar_t mode.
In addition to the interface, the internal implementation changes to store XML data as pugi::char_t; this means that these two modes have different memory usage characteristics - generally UTF-8 mode is more memory and performance efficient, especially if sizeof(wchar_t) is 4. The conversion to pugi::char_t upon document loading and from pugi::char_t upon document saving happen automatically, which also carries minor performance penalty. The general advice however is to select the character mode based on usage scenario, i.e. if UTF-8 is inconvenient to process and most of your XML data is non-ASCII, wchar_t mode is probably a better choice.
There are cases when you’ll have to convert string data between UTF-8 and wchar_t encodings; the following helper functions are provided for such purposes:
std::string as_utf8(const wchar_t* str);
std::wstring as_wide(const char* str);
Both functions accept a null-terminated string as an argument str, and return the converted string. as_utf8 performs conversion from UTF-16/32 to UTF-8; as_wide performs conversion from UTF-8 to UTF-16/32. Invalid UTF sequences are silently discarded upon conversion. str has to be a valid string; passing null pointer results in undefined behavior. There are also two overloads with the same semantics which accept a string as an argument:
std::string as_utf8(const std::wstring& str);
std::wstring as_wide(const std::string& str);
Note
|
Most examples in this documentation assume char interface and therefore will not compile with PUGIXML_WCHAR_MODE. This is done to simplify the documentation; usually the only change you’ll have to make is to pass wchar_t string literals instead of char ones, i.e. L"..." instead of "...". |
3.4. Thread-safety guarantees
Almost all functions in pugixml have the following thread-safety guarantees:
-
it is safe to call free (non-member) functions from multiple threads
-
it is safe to perform concurrent read-only accesses to the same tree (all constant member functions do not modify the tree)
-
it is safe to perform concurrent read/write accesses to different trees, provided that any single tree has only one read or write access at a time
Concurrent modification and traversing of a single tree requires synchronization, for example via reader-writer lock. Modification includes altering document structure and altering individual node/attribute data, i.e. changing names/values.
The only exception is set_memory_management_functions; it modifies global variables and as such is not thread-safe. Its usage policy has more restrictions, see Custom memory allocation/deallocation functions.
3.5. Exception guarantees
With the exception of XPath, pugixml itself does not throw any exceptions. Additionally, most pugixml functions have a no-throw exception guarantee.
This is not applicable to functions that operate on STL strings or IOstreams; such functions have either strong guarantee (functions that operate on strings) or basic guarantee (functions that operate on streams). Also functions that call user-defined callbacks (i.e. xml_node::traverse or xml_node::find_node) do not provide any exception guarantees beyond the ones provided by the callback.
If exception handling is not disabled with PUGIXML_NO_EXCEPTIONS define, XPath functions may throw xpath_exception on parsing errors; also, XPath functions may throw std::bad_alloc in low memory conditions. Still, XPath functions provide strong exception guarantee.
3.6. Memory management
pugixml requests the memory needed for document storage in big chunks, and allocates document data inside those chunks. This section discusses replacing functions used for chunk allocation and internal memory management implementation.
3.6.1. Custom memory allocation/deallocation functions
All memory for tree structure, tree data and XPath objects is allocated via globally specified functions, which default to malloc/free. You can set your own allocation functions with the set_memory_management_functions function. The function interfaces are the same as that of malloc/free:
typedef void* (*allocation_function)(size_t size);
typedef void (*deallocation_function)(void* ptr);
You can use the following accessor functions to change or get current memory management functions:
void set_memory_management_functions(allocation_function allocate, deallocation_function deallocate);
allocation_function get_memory_allocation_function();
deallocation_function get_memory_deallocation_function();
Allocation function is called with the size (in bytes) as an argument and should return a pointer to a memory block with alignment that is suitable for storage of primitive types (usually a maximum of void* and double types alignment is sufficient) and size that is greater than or equal to the requested one. If the allocation fails, the function has to either return null pointer or to throw an exception.
Deallocation function is called with the pointer that was returned by some call to allocation function; it is never called with a null pointer. If memory management functions are not thread-safe, library thread safety is not guaranteed.
This is a simple example of custom memory management (samples/custom_memory_management.cpp):
void* custom_allocate(size_t size)
{
return new (std::nothrow) char[size];
}
void custom_deallocate(void* ptr)
{
delete[] static_cast<char*>(ptr);
}
pugi::set_memory_management_functions(custom_allocate, custom_deallocate);
When setting new memory management functions, care must be taken to make sure that there are no live pugixml objects. Otherwise when the objects are destroyed, the new deallocation function will be called with the memory obtained by the old allocation function, resulting in undefined behavior.
3.6.2. Memory consumption tuning
There are several important buffering optimizations in pugixml that rely on predefined constants. These constants have default values that were tuned for common usage patterns; for some applications, changing these constants might improve memory consumption or increase performance. Changing these constants is not recommended unless their default values result in visible problems.
These constants can be tuned via configuration defines, as discussed in Additional configuration options; it is recommended to set them in pugiconfig.hpp.
-
PUGIXML_MEMORY_PAGE_SIZE controls the page size for document memory allocation. Memory for node/attribute objects is allocated in pages of the specified size. The default size is 32 Kb; for some applications the size is too large (i.e. embedded systems with little heap space or applications that keep lots of XML documents in memory). A minimum size of 1 Kb is recommended.
-
PUGIXML_MEMORY_OUTPUT_STACK controls the cumulative stack space required to output the node. Any output operation (i.e. saving a subtree to file) uses an internal buffering scheme for performance reasons. The default size is 10 Kb; if you’re using node output from threads with little stack space, decreasing this value can prevent stack overflows. A minimum size of 1 Kb is recommended.
-
PUGIXML_MEMORY_XPATH_PAGE_SIZE controls the page size for XPath memory allocation. Memory for XPath query objects as well as internal memory for XPath evaluation is allocated in pages of the specified size. The default size is 4 Kb; if you have a lot of resident XPath query objects, you might need to decrease the size to improve memory consumption. A minimum size of 256 bytes is recommended.
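For a memory-constrained target, the defines listed above might be set along these lines in pugiconfig.hpp. The values are illustrative only (given in bytes and chosen to stay at or above the recommended minimums); they are not recommendations.

```cpp
// pugiconfig.hpp (fragment) - illustrative values, in bytes

// Smaller pages for node/attribute storage (default is 32 Kb).
#define PUGIXML_MEMORY_PAGE_SIZE 4096

// Less stack used by output operations (default is 10 Kb).
#define PUGIXML_MEMORY_OUTPUT_STACK 2048

// Smaller pages for XPath queries and evaluation (default is 4 Kb).
#define PUGIXML_MEMORY_XPATH_PAGE_SIZE 1024
```

The same defines have to be visible to every translation unit that includes pugixml.hpp, which is why setting them in pugiconfig.hpp is the safer route.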
3.6.3. Document memory management internals
Constructing a document object using the default constructor does not result in any allocations; document node is stored inside the xml_document object.
When the document is loaded from file/buffer, unless an inplace loading function is used (see Loading document from memory), a complete copy of character stream is made; all names/values of nodes and attributes are allocated in this buffer. This buffer is allocated via a single large allocation and is only freed when document memory is reclaimed (i.e. if the xml_document object is destroyed or if another document is loaded in the same object). Also when loading from file or stream, an additional large allocation may be performed if encoding conversion is required; a temporary buffer is allocated, and it is freed before load function returns.
All additional memory, such as memory for document structure (node/attribute objects) and memory for node/attribute names/values is allocated in pages on the order of 32 Kb; actual objects are allocated inside the pages using a memory management scheme optimized for fast allocation/deallocation of many small objects. Because of the scheme specifics, the pages are only destroyed if all objects inside them are destroyed; also, generally destroying an object does not mean that subsequent object creation will reuse the same memory. This means that it is possible to devise a usage scheme which will lead to higher memory usage than expected; one example is adding a lot of nodes, and then removing all even numbered ones; not a single page is reclaimed in the process. However this is an example specifically crafted to produce unsatisfying behavior; in all practical usage scenarios the memory consumption is less than that of a general-purpose allocator because allocation meta-data is very small in size.
3.6.4. Compact mode
By default nodes and attributes are optimized for efficiency of access. This can cause them to take a significant amount of memory - for documents with a lot of nodes and not a lot of contents (short attribute values/node text), and depending on the pointer size, the document structure can take noticeably more memory than the document itself (e.g. on a 64-bit platform in UTF-8 mode a markup-heavy document with the file size of 2.1 Mb can use 2.1 Mb for document buffer and 8.3 Mb for document structure).
If you are processing big documents or your platform is memory constrained and you’re willing to sacrifice a bit of performance for memory, you can compile pugixml with PUGIXML_COMPACT define which will activate compact mode. Compact mode uses a different representation of the document structure that assumes locality of reference between nodes and attributes to optimize memory usage. As a result you get significantly smaller node/attribute objects; usually most objects in most documents don’t require additional storage, but in the worst case - if assumptions about locality of reference don’t hold - additional memory will be allocated to store the extra data required.
The compact storage supports all existing operations - including tree modification - with the same amortized complexity (that is, all basic document manipulations are still O(1) on average). The operations are slightly slower; you can usually expect 10-50% slowdown in terms of processing time unless your processing was memory-bound.
On 32-bit architectures document structure in compact mode is typically reduced by around 2.5x; on 64-bit architectures the ratio is around 5x. Thus for big markup-heavy documents compact mode can make the difference between the processing of a multi-gigabyte document running completely from RAM vs requiring swapping to disk. Even if the document fits into memory, compact storage can use CPU caches more efficiently by taking less space and causing fewer cache/TLB misses.
4. Loading document
pugixml provides several functions for loading XML data from various places - files, C++ iostreams, memory buffers. All functions use an extremely fast non-validating parser. This parser is not fully W3C conformant - it can load any valid XML document, but does not perform some well-formedness checks. While considerable effort is made to reject invalid XML documents, some validation is not performed for performance reasons. Also some XML transformations (i.e. EOL handling or attribute value normalization) can impact parsing speed and thus can be disabled. However for vast majority of XML documents there is no performance difference between different parsing options. Parsing options also control whether certain XML nodes are parsed; see Parsing options for more information.
XML data is always converted to internal character format (see Unicode interface) before parsing. pugixml supports all popular Unicode encodings (UTF-8, UTF-16 (big and little endian), UTF-32 (big and little endian); UCS-2 is naturally supported since it’s a strict subset of UTF-16) as well as some non-Unicode encodings (Latin-1) and handles all encoding conversions automatically. Unless explicit encoding is specified, loading functions perform automatic encoding detection based on source XML data, so in most cases you do not have to specify document encoding. Encoding conversion is described in more detail in Encodings.
4.1. Loading document from file
The most common source of XML data is files; pugixml provides dedicated functions for loading an XML document from file:
xml_parse_result xml_document::load_file(const char* path, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
xml_parse_result xml_document::load_file(const wchar_t* path, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
These functions accept the file path as their first argument, and also two optional arguments, which specify parsing options (see Parsing options) and input data encoding (see Encodings). The path has the target operating system format, so it can be a relative or absolute one, it should have the delimiters of the target system, it should have the exact case if the target file system is case-sensitive, etc.
File path is passed to the system file opening function as is in case of the first function (which accepts const char* path); the second function either uses a special file opening function if it is provided by the runtime library or converts the path to UTF-8 and uses the system file opening function.
load_file destroys the existing document tree and then tries to load the new tree from the specified file. The result of the operation is returned in an xml_parse_result object; this object contains the operation status and the related information (i.e. last successfully parsed position in the input file, if parsing fails). See Handling parsing errors for error handling details.
This is an example of loading XML document from file (samples/load_file.cpp):
pugi::xml_document doc;
pugi::xml_parse_result result = doc.load_file("tree.xml");
std::cout << "Load result: " << result.description() << ", mesh name: " << doc.child("mesh").attribute("name").value() << std::endl;
4.2. Loading document from memory
Sometimes XML data should be loaded from some other source than a file, i.e. HTTP URL; also you may want to load XML data from file using non-standard functions, i.e. to use your virtual file system facilities or to load XML from GZip-compressed files. All these scenarios require loading document from memory. First you should prepare a contiguous memory block with all XML data; then you have to invoke one of buffer loading functions. These functions will handle the necessary encoding conversions, if any, and then will parse the data into the corresponding XML tree. There are several buffer loading functions, which differ in the behavior and thus in performance/memory usage:
xml_parse_result xml_document::load_buffer(const void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
xml_parse_result xml_document::load_buffer_inplace(void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
xml_parse_result xml_document::load_buffer_inplace_own(void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
All functions accept the buffer, which is represented by a pointer to XML data, contents, and the data size in bytes, size. There are also two optional arguments, which specify parsing options (see Parsing options) and input data encoding (see Encodings). The buffer does not have to be zero-terminated.
The load_buffer function works with an immutable buffer - it does not ever modify the buffer. Because of this restriction it has to create a private buffer and copy XML data to it before parsing (applying encoding conversions if necessary). This copy operation carries a performance penalty, so inplace functions are provided - load_buffer_inplace and load_buffer_inplace_own store the document data in the buffer, modifying it in the process. In order for the document to stay valid, you have to make sure that the buffer's lifetime exceeds that of the tree if you're using inplace functions. In addition to that, load_buffer_inplace does not assume ownership of the buffer, so you'll have to destroy it yourself; load_buffer_inplace_own assumes ownership of the buffer and destroys it once it is not needed. This means that if you're using load_buffer_inplace_own, you have to allocate memory with the pugixml allocation function (you can get it via get_memory_allocation_function).
The best way from the performance/memory point of view is to load the document using load_buffer_inplace_own; this function has maximum control of the buffer with XML data, so it is able to avoid redundant copies and reduce peak memory usage while parsing. This is the recommended function if you have to load the document from memory and performance is critical.
There is also a simple helper function for cases when you want to load the XML document from a null-terminated character string:
xml_parse_result xml_document::load_string(const char_t* contents, unsigned int options = parse_default);
It is equivalent to calling load_buffer with size being either strlen(contents) or wcslen(contents) * sizeof(wchar_t), depending on the character type. This function assumes native encoding for input data, so it does not do any encoding conversion. In general, this function is fine for loading small documents from string literals, but has more overhead and less functionality than the buffer loading functions.
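The size computation described above can be sketched in standard C++ without pugixml. The helper name buffer_size_in_bytes is hypothetical and not part of the library; the sketch only shows how the byte count differs between narrow and wide strings.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper: size in bytes of a null-terminated narrow string,
// not counting the terminator.
std::size_t buffer_size_in_bytes(const char* contents)
{
    return std::strlen(contents);
}

// Hypothetical helper: size in bytes of a null-terminated wide string,
// not counting the terminator. wcslen counts characters, so the result
// is scaled by the platform-dependent size of wchar_t.
std::size_t buffer_size_in_bytes(const wchar_t* contents)
{
    return std::wcslen(contents) * sizeof(wchar_t);
}
```

Note that sizeof applied to a character array also counts the terminating null character, so it yields one more than strlen for the same literal.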
This is an example of loading an XML document from memory using different functions (samples/load_memory.cpp):
const char source[] = "<mesh name='sphere'><bounds>0 0 1 1</bounds></mesh>";
size_t size = sizeof(source);
// You can use load_buffer to load document from immutable memory block:
pugi::xml_parse_result result = doc.load_buffer(source, size);
// You can use load_buffer_inplace to load document from mutable memory block; the block's lifetime must exceed that of document
char* buffer = new char[size];
memcpy(buffer, source, size);
// The block can be allocated by any method; the block is modified during parsing
pugi::xml_parse_result result = doc.load_buffer_inplace(buffer, size);
// You have to destroy the block yourself after the document is no longer used
delete[] buffer;
// You can use load_buffer_inplace_own to load document from mutable memory block and to pass the ownership of this block
// The block has to be allocated via pugixml allocation function - using e.g. operator new here is incorrect
char* buffer = static_cast<char*>(pugi::get_memory_allocation_function()(size));
memcpy(buffer, source, size);
// The block will be deleted by the document
pugi::xml_parse_result result = doc.load_buffer_inplace_own(buffer, size);
// You can use load_string to load document from null-terminated strings, for example literals:
pugi::xml_parse_result result = doc.load_string("<mesh name='sphere'><bounds>0 0 1 1</bounds></mesh>");
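Preparing the contiguous memory block is left to the application. A minimal sketch using only the standard library follows; read_file is a hypothetical helper, not a pugixml function, and it assumes the whole file fits in memory.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at 'path' into 'out' as raw bytes.
// Returns false if the file cannot be opened or read.
bool read_file(const std::string& path, std::vector<char>& out)
{
    // Open in binary mode, positioned at the end so tellg() reports the size
    std::ifstream file(path.c_str(), std::ios::binary | std::ios::ate);
    if (!file) return false;

    std::streamoff size = file.tellg();
    if (size < 0) return false;

    file.seekg(0, std::ios::beg);
    out.resize(static_cast<std::size_t>(size));

    // Reading zero bytes is skipped so that an empty file still succeeds
    if (size > 0 && !file.read(out.data(), static_cast<std::streamsize>(size)))
        return false;

    return true;
}
```

The resulting vector could then be passed to one of the buffer loading functions, for example as doc.load_buffer(data.data(), data.size()). With load_buffer_inplace the vector would have to outlive the document, and it is not suitable for load_buffer_inplace_own, since its memory does not come from the pugixml allocation function.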
4.3. Loading document from C++ IOstreams
To enhance interoperability, pugixml provides functions for loading a document from any object which implements the C++ std::istream interface. This allows you to load documents from any standard C++ stream (e.g. a file stream) or any third-party compliant implementation (e.g. Boost Iostreams). There are two functions: one works with narrow character streams, the other handles wide character ones:
xml_parse_result xml_document::load(std::istream& stream, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
xml_parse_result xml_document::load(std::wistream& stream, unsigned int options = parse_default);
load with std::istream argument loads the document from the stream from the current read position to the end, treating the stream contents as a byte stream of the specified encoding (with encoding autodetection as necessary). Thus calling xml_document::load on an opened std::ifstream object is equivalent to calling xml_document::load_file.
load with std::wistream argument treats the stream contents as a wide character stream (encoding is always encoding_wchar). Because of this, using load with wide character streams requires careful (usually platform-specific) stream setup (e.g. using the imbue function). Generally the use of wide streams is discouraged; however, it gives you the ability to load documents from non-Unicode encodings, e.g. you can load Shift-JIS encoded data if you set the correct locale.
This is a simple example of loading an XML document from a file using streams (samples/load_stream.cpp); read the sample code for more complex examples involving wide streams and locales:
std::ifstream stream("weekly-utf-8.xml");
pugi::xml_parse_result result = doc.load(stream);
4.4. Handling parsing errors
All document loading functions return the parsing result via an xml_parse_result object. It contains the parsing status, the offset of the last successfully parsed character from the beginning of the source stream, and the encoding of the source stream:
struct xml_parse_result
{
    xml_parse_status status;
    ptrdiff_t offset;
    xml_encoding encoding;

    operator bool() const;
    const char* description() const;
};
Parsing status is represented as the xml_parse_status enumeration and can be one of the following:
- status_ok means that no error was encountered during parsing; the source stream represents a valid XML document which was fully parsed and converted to a tree.
- status_file_not_found is only returned by the load_file function and means that the file could not be opened.
- status_io_error is returned by the load_file function and by load functions with std::istream/std::wistream arguments; it means that some I/O error has occurred while reading the file/stream.
- status_out_of_memory means that there was not enough memory during some allocation; any allocation failure during parsing results in this error.
- status_internal_error means that something went horribly wrong; currently this error does not occur.
- status_unrecognized_tag means that parsing stopped due to a tag with either an empty name or a name which starts with an incorrect character, such as #.
- status_bad_pi means that parsing stopped due to an incorrect document declaration/processing instruction.
- status_bad_comment, status_bad_cdata, status_bad_doctype and status_bad_pcdata mean that parsing stopped due to an invalid construct of the respective type.
- status_bad_start_element means that parsing stopped because the starting tag either had no closing > symbol or contained some incorrect symbol.
- status_bad_attribute means that parsing stopped because there was an incorrect attribute, such as an attribute without a value or with a value that is not quoted (note that <node attr=1> is incorrect in XML).
- status_bad_end_element means that parsing stopped because the ending tag had incorrect syntax (e.g. extra non-whitespace symbols between the tag name and >).
- status_end_element_mismatch means that parsing stopped because the closing tag did not match the opening one (e.g. <node></nedo>) or because some tag was not closed at all.
- status_no_document_element means that no element nodes were discovered during parsing; this usually indicates an empty or invalid document.
The description() member function can be used to convert the parsing status to a string; the returned message is always in English, so you'll have to write your own function if you need a localized string. However, please note that the exact messages returned by the description() function may change from version to version, so any complex status handling should be based on the status value. Note that description() returns a char string even in PUGIXML_WCHAR_MODE; you'll have to call as_wide to get the wchar_t string.
If parsing failed because the source data was not valid XML, the resulting tree is not destroyed - despite the fact that the load function returns an error, you can use the part of the tree that was successfully parsed. Obviously, the last element may have an unexpected name/value; for example, if the attribute value does not end with the necessary quotation mark, as in the <node attr="value>some data</node> example, the value of attribute attr will contain the string value>some data</node>.
In addition to the status code, the parsing result has an offset member, which contains the offset of the last successfully parsed character if parsing failed because of an error in source data; otherwise offset is 0. For parsing efficiency reasons, pugixml does not track the current line during parsing; this offset is in units of pugi::char_t (bytes for character mode, wide characters for wide character mode). Many text editors support a 'Go To Position' feature - you can use it to locate the exact error position. Alternatively, if you're loading the document from memory, you can display the error chunk along with the error description (see the example code below).
Caution
Offset is calculated in the XML buffer in native encoding; if encoding conversion is performed during parsing, offset can not be used to reliably track the error position.
The parsing result also has an encoding member, which can be used to check that the source data encoding was correctly guessed. It is equal to the exact encoding used during parsing (i.e. with the exact endianness); see Encodings for more information.
The parsing result object can be implicitly converted to bool; if you do not want to handle parsing errors thoroughly, you can just check the return value of load functions as if it was a bool: if (doc.load_file("file.xml")) { … } else { … }.
This is an example of handling loading errors (samples/load_error_handling.cpp):
pugi::xml_document doc;
pugi::xml_parse_result result = doc.load_string(source);
if (result)
{
std::cout << "XML [" << source << "] parsed without errors, attr value: [" << doc.child("node").attribute("attr").value() << "]\n\n";
}
else
{
std::cout << "XML [" << source << "] parsed with errors, attr value: [" << doc.child("node").attribute("attr").value() << "]\n";
std::cout << "Error description: " << result.description() << "\n";
std::cout << "Error offset: " << result.offset << " (error at [..." << (source + result.offset) << "]\n\n";
}
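Since pugixml reports only an offset, an application that wants line and column numbers has to compute them from the original buffer. The following is a minimal sketch under these assumptions: the buffer holds narrow-character data, no encoding conversion was performed during parsing (see the caution above), and columns are counted in bytes rather than characters. The names text_position and offset_to_position are hypothetical and not part of the library.

```cpp
#include <cstddef>

// Hypothetical result type: 1-based line and column of a buffer offset.
struct text_position
{
    std::size_t line;
    std::size_t column;
};

// Hypothetical helper: walks the buffer up to 'offset' and counts newlines.
// Offsets past the end of the buffer are clamped to the buffer size;
// negative offsets are reported as the start of the buffer.
text_position offset_to_position(const char* data, std::size_t size, std::ptrdiff_t offset)
{
    text_position pos = {1, 1};
    if (offset < 0) return pos;

    std::size_t end = static_cast<std::size_t>(offset);
    if (end > size) end = size;

    for (std::size_t i = 0; i < end; ++i)
    {
        if (data[i] == '\n')
        {
            ++pos.line;
            pos.column = 1;
        }
        else
        {
            ++pos.column;
        }
    }

    return pos;
}
```

With a parse result named result and the source buffer at hand, the call would look like offset_to_position(source, strlen(source), result.offset).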
4.5. Parsing options
All document loading functions accept the optional parameter options. This is a bitmask that customizes the parsing process: you can select the node types that are parsed and various transformations that are performed with the XML text. Disabling certain transformations can improve parsing performance for some documents; however, the code for all transformations is very well optimized, and thus the majority of documents won't get any performance benefit. As a rule of thumb, only modify parsing flags if you want to get some nodes in the document that are excluded by default (e.g. declaration or comment nodes).
Note
You should use the usual bitwise arithmetic to manipulate the bitmask: to enable a flag, use mask | flag; to disable a flag, use mask & ~flag.
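The two operations can be sketched with stand-in constants. The flag values and helper names below are hypothetical and chosen only for illustration; real code would use the pugi::parse_* constants directly.

```cpp
// Stand-in flag values for illustration only (assumption: one bit per flag).
const unsigned int flag_comments = 1u << 0;
const unsigned int flag_escapes  = 1u << 1;
const unsigned int flag_eol      = 1u << 2;

// Stand-in default mask: escapes and EOL handling on, comments off.
const unsigned int mask_default = flag_escapes | flag_eol;

// Enabling a flag sets its bit and leaves the others untouched.
inline unsigned int enable_flag(unsigned int mask, unsigned int flag)
{
    return mask | flag;
}

// Disabling a flag clears its bit and leaves the others untouched.
inline unsigned int disable_flag(unsigned int mask, unsigned int flag)
{
    return mask & ~flag;
}

// Tests whether a flag is present in the mask.
inline bool has_flag(unsigned int mask, unsigned int flag)
{
    return (mask & flag) != 0;
}
```

When both operations are combined in one expression, parentheses matter: in C++ the & operator binds more tightly than |, so (mask | a) & ~b and mask | (a & ~b) are different masks.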
These flags control the resulting tree contents:
- parse_declaration determines if the XML document declaration (node with type node_declaration) is to be put in the DOM tree. If this flag is off, it is not put in the tree, but is still parsed and checked for correctness. This flag is off by default.
- parse_doctype determines if the XML document type declaration (node with type node_doctype) is to be put in the DOM tree. If this flag is off, it is not put in the tree, but is still parsed and checked for correctness. This flag is off by default.
- parse_pi determines if processing instructions (nodes with type node_pi) are to be put in the DOM tree. If this flag is off, they are not put in the tree, but are still parsed and checked for correctness. Note that <?xml …?> (document declaration) is not considered to be a PI. This flag is off by default.
- parse_comments determines if comments (nodes with type node_comment) are to be put in the DOM tree. If this flag is off, they are not put in the tree, but are still parsed and checked for correctness. This flag is off by default.
- parse_cdata determines if CDATA sections (nodes with type node_cdata) are to be put in the DOM tree. If this flag is off, they are not put in the tree, but are still parsed and checked for correctness. This flag is on by default.
- parse_trim_pcdata determines if leading and trailing whitespace characters are to be removed from PCDATA nodes. While for some applications leading/trailing whitespace is significant, often the application only cares about the non-whitespace contents, so it's easier to trim whitespace from text during parsing. This flag is off by default.
- parse_ws_pcdata determines if PCDATA nodes (nodes with type node_pcdata) that consist only of whitespace characters are to be put in the DOM tree. Often whitespace-only data is not significant for the application, and the cost of allocating and storing such nodes (both memory and speed-wise) can be significant. For example, after parsing the XML string <node> <a/> </node>, the <node> element will have three children when parse_ws_pcdata is set (a child with type node_pcdata and value " ", a child with type node_element and name "a", and another child with type node_pcdata and value " "), and only one child when parse_ws_pcdata is not set. This flag is off by default.
- parse_ws_pcdata_single determines if whitespace-only PCDATA nodes that have no sibling nodes are to be put in the DOM tree. In some cases the application needs to parse the whitespace-only contents of nodes, e.g. <node> </node>, but is not interested in whitespace markup elsewhere. It is possible to use the parse_ws_pcdata flag in this case, but it results in excessive allocations and complicates document processing; this flag can be used to avoid that. As an example, after parsing the XML string <node> <a> </a> </node> with the parse_ws_pcdata_single flag set, the <node> element will have one child <a>, and the <a> element will have one child with type node_pcdata and value " ". This flag has no effect if parse_ws_pcdata is enabled. This flag is off by default.
- parse_embed_pcdata determines if PCDATA contents is to be saved as element values. Normally element nodes have names but not values; this flag forces the parser to store the contents as a value if PCDATA is the first child of the element node (otherwise a PCDATA node is created as usual). This can significantly reduce the memory required for documents with many PCDATA nodes. To retrieve the data you can use xml_node::value() on the element nodes or any of the higher-level functions like child_value or text. Since this flag significantly changes the DOM structure, it is only recommended for parsing documents with many PCDATA nodes in memory-constrained environments. This flag is off by default.
- parse_merge_pcdata determines if PCDATA contents is to be merged with the previous PCDATA node when no intermediary nodes are present between them. If CDATA sections, PI nodes, or comments appear between two pieces of PCDATA and the corresponding flag (parse_cdata, parse_pi or parse_comments) is not set, so that no node is created for them, the contents of the PCDATA node will be merged with the previous one. This flag is off by default.
- parse_fragment determines if the document should be treated as a fragment of a valid XML. Parsing a document as a fragment causes top-level PCDATA content (i.e. text that is not located inside a node) to be added to the tree, and additionally treats documents without element nodes as valid and permits multiple top-level element nodes (currently multiple top-level element nodes are also permitted when the flag is off, but that behavior should not be relied on). This flag is off by default.
Caution
Using in-place parsing (load_buffer_inplace) with the parse_fragment flag may result in the loss of the last character of the buffer if it is a part of PCDATA. Since PCDATA values are null-terminated strings, the only way to resolve this is to provide a null-terminated buffer as an input to load_buffer_inplace - e.g. doc.load_buffer_inplace("test\0", 5, pugi::parse_default | pugi::parse_fragment).
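To make the whitespace-related flags in the list above more concrete, the following sketch shows the two text checks they are described in terms of: whether a piece of text is whitespace-only (the case parse_ws_pcdata is about) and what trimming leading/trailing whitespace yields (the effect described for parse_trim_pcdata). This is an illustration written against the standard library, not pugixml's implementation; the function names are hypothetical, and the set of whitespace characters (space, tab, carriage return, line feed) is the one defined by the XML specification.

```cpp
#include <cstddef>
#include <string>

// XML whitespace: space, tab, carriage return, line feed.
inline bool is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// True if the text contains nothing but XML whitespace
// (an empty string also counts as whitespace-only here).
bool is_whitespace_only(const std::string& text)
{
    for (char c : text)
        if (!is_xml_space(c)) return false;
    return true;
}

// Returns the text with leading and trailing XML whitespace removed;
// whitespace between non-whitespace characters is kept.
std::string trim_pcdata(const std::string& text)
{
    std::size_t begin = 0;
    std::size_t end = text.size();

    while (begin < end && is_xml_space(text[begin])) ++begin;
    while (end > begin && is_xml_space(text[end - 1])) --end;

    return text.substr(begin, end - begin);
}
```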
These flags control the transformation of tree element contents:
- parse_escapes determines if character and entity references are to be expanded during the parsing process. Character references have the form &#…; or &#x…; (… is the Unicode numeric representation of the character in either decimal (&#…;) or hexadecimal (&#x…;) form), entity references are &lt;, &gt;, &amp;, &apos; and &quot; (note that as pugixml does not handle DTD, the only allowed entities are predefined ones). If a character/entity reference can not be expanded, it is left as is, so you can do additional processing later. Reference expansion is performed on attribute values and PCDATA content. This flag is on by default.
- parse_eol determines if EOL handling (that is, replacing sequences \r\n by a single \n character, and replacing all standalone \r characters by \n) is to be performed on input data (that is, comment contents, PCDATA/CDATA contents and attribute values). This flag is on by default.
- parse_wconv_attribute determines if attribute value normalization should be performed for all attributes. This means that whitespace characters (new line, tab and space) are replaced with space (' '). New line characters are always treated as if parse_eol is set, i.e. \r\n is converted to a single space. This flag is on by default.
- parse_wnorm_attribute determines if extended attribute value normalization should be performed for all attributes. This means that after attribute values are normalized as if parse_wconv_attribute was set, leading and trailing space characters are removed, and all sequences of space characters are replaced by a single space character. parse_wconv_attribute has no effect if this flag is on. This flag is off by default.
Note
The parse_wconv_attribute option performs transformations that are required by the W3C specification for attributes that are declared as CDATA; parse_wnorm_attribute performs transformations required for NMTOKENS attributes. In the absence of a document type declaration all attributes should behave as if they are declared as CDATA, thus parse_wconv_attribute is the default option.
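The text transformations described for parse_eol, parse_wconv_attribute and parse_wnorm_attribute can be sketched as plain string functions. This is an illustration of the described semantics using only the standard library, not pugixml's actual code (which works on its internal buffer); all function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// EOL handling: "\r\n" becomes "\n", and a standalone "\r" becomes "\n".
std::string normalize_eol(const std::string& in)
{
    std::string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] == '\r')
        {
            out += '\n';
            // Swallow the '\n' of a "\r\n" pair so it maps to a single '\n'
            if (i + 1 < in.size() && in[i + 1] == '\n') ++i;
        }
        else
        {
            out += in[i];
        }
    }

    return out;
}

// Attribute value normalization: after EOL handling, every new line and tab
// is replaced with a space, so "\r\n" ends up as a single space.
std::string wconv_attribute(const std::string& in)
{
    std::string out = normalize_eol(in);

    for (char& c : out)
        if (c == '\n' || c == '\t') c = ' ';

    return out;
}

// Extended normalization: as above, then leading/trailing spaces are removed
// and each run of spaces is collapsed into one.
std::string wnorm_attribute(const std::string& in)
{
    std::string conv = wconv_attribute(in);
    std::string out;

    for (char c : conv)
    {
        // Skip leading spaces and any space that follows another space
        if (c == ' ' && (out.empty() || out.back() == ' ')) continue;
        out += c;
    }

    // At most one trailing space can remain after collapsing
    if (!out.empty() && out.back() == ' ') out.pop_back();

    return out;
}
```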
Additionally there are three predefined option masks:
- parse_minimal has all options turned off. This option mask means that pugixml does not add declaration nodes, document type declaration nodes, PI nodes, CDATA sections and comments to the resulting tree and does not perform any conversion for input data, so theoretically it is the fastest mode. However, as mentioned above, in practice parse_default is usually equally fast.
- parse_default is the default set of flags, i.e. it has all options set to their default values. It includes parsing CDATA sections (comments/PIs are not parsed), performing character and entity reference expansion, replacing whitespace characters with spaces in attribute values and performing EOL handling. Note that PCDATA sections consisting only of whitespace characters are not parsed (by default) for performance reasons.
- parse_full is the set of flags which adds nodes of all types to the resulting tree and performs default conversions for input data. It includes parsing CDATA sections, comments, PI nodes, document declaration node and document type declaration node, performing character and entity reference expansion, replacing whitespace characters with spaces in attribute values and performing EOL handling. Note that PCDATA sections consisting only of whitespace characters are not parsed in this mode.
This is an example of using different parsing options (samples/load_options.cpp):
const char* source = "<!--comment--><node>&lt;</node>";
// Parsing with default options; note that comment node is not added to the tree, and entity reference &lt; is expanded
doc.load_string(source);
std::cout << "First node value: [" << doc.first_child().value() << "], node child value: [" << doc.child_value("node") << "]\n";
// Parsing with additional parse_comments option; comment node is now added to the tree
doc.load_string(source, pugi::parse_default | pugi::parse_comments);
std::cout << "First node value: [" << doc.first_child().value() << "], node child value: [" << doc.child_value("node") << "]\n";
// Parsing with additional parse_comments option and without the (default) parse_escapes option; &lt; is not expanded
doc.load_string(source, (pugi::parse_default | pugi::parse_comments) & ~pugi::parse_escapes);
std::cout << "First node value: [" << doc.first_child().value() << "], node child value: [" << doc.child_value("node") << "]\n";
// Parsing with minimal option mask; comment node is not added to the tree, and &lt; is not expanded
doc.load_string(source, pugi::parse_minimal);
std::cout << "First node value: [" << doc.first_child().value() << "], node child value: [" << doc.child_value("node") << "]\n";
4.6. Encodings
pugixml supports all popular Unicode encodings (UTF-8, UTF-16 (big and little endian), UTF-32 (big and little endian); UCS-2 is naturally supported since it's a strict subset of UTF-16) as well as some non-Unicode encodings (Latin-1) and handles all encoding conversions. Most loading functions accept the optional parameter encoding. This is a value of enumeration type xml_encoding, which can have the following values:
- encoding_auto means that pugixml will try to guess the encoding based on source XML data. The algorithm is a modified version of the one presented in Appendix F of the XML recommendation. It tries to find a Byte Order Mark of one of the supported encodings first; if that fails, it checks if the first few bytes of the input data look like a representation of < or <? in one of the UTF-16 or UTF-32 variants; if that fails as well, the encoding is assumed to be either UTF-8 or one of the non-Unicode encodings - to make the final decision the algorithm tries to parse the encoding attribute of the XML document declaration, ultimately falling back to UTF-8 if the document declaration is not present or does not specify a supported encoding.
- encoding_utf8 corresponds to UTF-8 encoding as defined in the Unicode standard; UTF-8 sequences with length equal to 5 or 6 are not standard and are rejected.
- encoding_utf16_le corresponds to little-endian UTF-16 encoding as defined in the Unicode standard; surrogate pairs are supported.
- encoding_utf16_be corresponds to big-endian UTF-16 encoding as defined in the Unicode standard; surrogate pairs are supported.
- encoding_utf16 corresponds to UTF-16 encoding as defined in the Unicode standard; the endianness is assumed to be that of the target platform.
- encoding_utf32_le corresponds to little-endian UTF-32 encoding as defined in the Unicode standard.
- encoding_utf32_be corresponds to big-endian UTF-32 encoding as defined in the Unicode standard.
- encoding_utf32 corresponds to UTF-32 encoding as defined in the Unicode standard; the endianness is assumed to be that of the target platform.
- encoding_wchar corresponds to the encoding of the wchar_t type; it has the same meaning as either encoding_utf16 or encoding_utf32, depending on wchar_t size.
- encoding_latin1 corresponds to ISO-8859-1 encoding (also known as Latin-1).
The algorithm used for encoding_auto correctly detects any supported Unicode encoding for all well-formed XML documents (since they start with a document declaration) and for all other XML documents that start with <; if your XML document does not start with < and has an encoding that is different from UTF-8, use the specific encoding.
Note
|
The current behavior for Unicode conversion is to skip all invalid UTF sequences during conversion. This behavior should not be relied upon; moreover, if no encoding conversion is performed, the invalid sequences are not removed, so you'll get them as is in node/attribute contents. |
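The Byte Order Mark step of the detection described above can be sketched as follows. This is a minimal, self-contained illustration and not pugixml's actual implementation: the bom_encoding enum and the detect_bom function are hypothetical names, and the later steps (matching < or <? byte patterns and parsing the encoding attribute) are omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; this is not pugixml's API.
enum class bom_encoding { none, utf8, utf16_le, utf16_be, utf32_le, utf32_be };

// Inspects the first bytes of a buffer for a Byte Order Mark.
inline bom_encoding detect_bom(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    // UTF-32 has to be tested before UTF-16: the UTF-32 LE mark (FF FE 00 00)
    // begins with the UTF-16 LE mark (FF FE).
    if (size >= 4 && data[0] == 0x00 && data[1] == 0x00 && data[2] == 0xFE && data[3] == 0xFF)
        return bom_encoding::utf32_be;
    if (size >= 4 && data[0] == 0xFF && data[1] == 0xFE && data[2] == 0x00 && data[3] == 0x00)
        return bom_encoding::utf32_le;
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return bom_encoding::utf16_be;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return bom_encoding::utf16_le;
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return bom_encoding::utf8;

    // No mark found; a full detector would continue with the remaining steps.
    return bom_encoding::none;
}
```

A buffer without a mark yields bom_encoding::none, which is the point at which the remaining heuristics would take over.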
4.7. Conformance to W3C specification
pugixml is not fully W3C conformant - it can load any valid XML document, but does not perform some well-formedness checks. While considerable effort is made to reject invalid XML documents, some validation is not performed for performance reasons.
There is only one non-conformant behavior when dealing with valid XML documents: pugixml does not use information supplied in the document type declaration for parsing. This means that entities declared in DOCTYPE are not expanded, and all attribute/PCDATA values are always processed in a uniform way that depends only on parsing options.
As for rejecting invalid XML documents, there are a number of incompatibilities with the W3C specification, including:
- Multiple attributes of the same node can have equal names.
- Tag and attribute names are not fully validated for consisting of allowed characters, so some invalid tags are not rejected.
- Attribute values which contain < are not rejected.
- Invalid entity/character references are not rejected and are instead left as is.
- Comment values can contain --.
- XML data is not required to begin with a document declaration; additionally, a document declaration can appear after comments and other nodes.
- Invalid document type declarations are silently ignored in some cases.
- Unicode validation is not performed, so invalid UTF sequences are not rejected.
- A document can contain multiple top-level element nodes.
5. Accessing document data
pugixml features an extensive interface for getting various types of data from the document and for traversing the document. This section documents all such functions that do not modify the tree, with the exception of XPath-related functions; see XPath for the XPath reference. As discussed in C++ interface, there are two types of handles to tree data - xml_node and xml_attribute. The handles have special null (empty) values which propagate through various functions and thus are useful for writing more concise code; see this description for details. The documentation in this section explicitly states the results of all functions in case of null inputs.
Basic traversal functions
The internal representation of the document is a tree, where each node has a list of child nodes (the order of children corresponds to their order in the XML representation), and additionally element nodes have a list of attributes, which is also ordered. Several functions are provided to let you get from one node in the tree to another. These functions roughly correspond to the internal representation, and thus are usually building blocks for other methods of traversing (e.g. XPath traversals are based on these functions).
xml_node xml_node::parent() const;
xml_node xml_node::first_child() const;
xml_node xml_node::last_child() const;
xml_node xml_node::next_sibling() const;
xml_node xml_node::previous_sibling() const;
xml_attribute xml_node::first_attribute() const;
xml_attribute xml_node::last_attribute() const;
xml_attribute xml_attribute::next_attribute() const;
xml_attribute xml_attribute::previous_attribute() const;
The parent function returns the node's parent; all non-null nodes except the document have a non-null parent. first_child and last_child return the first and last child of the node, respectively; note that only document nodes and element nodes can have a non-empty child node list. If the node has no children, both functions return null nodes. next_sibling and previous_sibling return the node that's immediately to the right/left of this node in the children list, respectively - for example, in <a/><b/><c/>, calling next_sibling for a handle that points to <b/> results in a handle pointing to <c/>, and calling previous_sibling results in a handle pointing to <a/>. If the node does not have a next/previous sibling (this happens if it is the last/first node in the list, respectively), the functions return null nodes. The first_attribute, last_attribute, next_attribute and previous_attribute functions behave similarly to the corresponding child node functions and allow you to iterate through the attribute list in the same way.
Note
|
For memory consumption reasons, attributes do not have a link to their parent nodes. Thus there is no xml_attribute::parent() function. |
Calling any of the functions above on the null handle results in a null handle - for example, node.first_child().next_sibling() returns the second child of node, and a null handle if node is null, has no children at all or has only one child node.
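The null-propagation rule can be illustrated with a toy handle type. This is only a sketch of the idea under simplifying assumptions: toy_node_data and toy_node are hypothetical types that mimic the shape of a handle wrapping a pointer, and they are not pugixml classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy types; they only mimic the shape of a handle over tree data.
struct toy_node_data
{
    toy_node_data* first_child;
    toy_node_data* next_sibling;
};

class toy_node
{
public:
    toy_node() : data_(nullptr) {}
    explicit toy_node(toy_node_data* data) : data_(data) {}

    // Every accessor checks for null first, so calling it on a null handle
    // yields another null handle instead of dereferencing a null pointer.
    toy_node first_child() const { return data_ ? toy_node(data_->first_child) : toy_node(); }
    toy_node next_sibling() const { return data_ ? toy_node(data_->next_sibling) : toy_node(); }

    // A handle is "true" only when it refers to actual data.
    explicit operator bool() const { return data_ != nullptr; }
    bool operator==(const toy_node& rhs) const { return data_ == rhs.data_; }

private:
    toy_node_data* data_;
};
```

With such a type, a chained call like first_child().next_sibling() needs a single check at the end of the chain rather than one after every step.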
With these functions, you can iterate through all child nodes and display all of their attributes like this (samples/traverse_base.cpp):
for (pugi::xml_node tool = tools.first_child(); tool; tool = tool.next_sibling())
{
    std::cout << "Tool:";

    for (pugi::xml_attribute attr = tool.first_attribute(); attr; attr = attr.next_attribute())
    {
        std::cout << " " << attr.name() << "=" << attr.value();
    }

    std::cout << std::endl;
}
5.1. Getting node data
Apart from structural information (parent, child nodes, attributes), nodes can have a name and a value, both of which are strings. Depending on the node type, the name or value may be absent. node_document nodes do not have a name or value; node_element and node_declaration nodes always have a name but never have a value; node_pcdata, node_cdata, node_comment and node_doctype nodes never have a name but always have a value (it may be empty though); node_pi nodes always have a name and a value (again, the value may be empty). To get a node's name or value, you can use the following functions:
const char_t* xml_node::name() const;
const char_t* xml_node::value() const;
If the node does not have a name or value, or if the node handle is null, both functions return empty strings - they never return null pointers.
It is common to store data as text contents of some node - i.e. <node><description>This is a node</description></node>. In this case, the <description> node does not have a value, but instead has a child of type node_pcdata with value "This is a node". pugixml provides several helper functions to parse such data:
const char_t* xml_node::child_value() const;
const char_t* xml_node::child_value(const char_t* name) const;
xml_text xml_node::text() const;
child_value() returns the value of the first child with type node_pcdata or node_cdata; child_value(name) is a simple wrapper for child(name).child_value(). For the above example, calling node.child_value("description") and description.child_value() will both produce the string "This is a node". If there is no child with a relevant type, or if the handle is null, the child_value functions return an empty string.
text() returns a special object that can be used for working with PCDATA contents in more complex cases than just retrieving the value; it is described in the Working with text contents section.
There is an example of using some of these functions at the end of the next section.
5.2. Getting attribute data
All attributes have a name and a value, both of which are strings (the value may be empty). There are two corresponding accessors, like for xml_node:
const char_t* xml_attribute::name() const;
const char_t* xml_attribute::value() const;
If the attribute handle is null, both functions return empty strings - they never return null pointers.
If you need a non-empty string when the attribute handle is null (for example, you need to get an option value from an XML attribute, but if it is not specified, you need it to default to "sorted" instead of ""), you can use the as_string accessor:
const char_t* xml_attribute::as_string(const char_t* def = "") const;
It returns the def argument if the attribute handle is null. If you do not specify the argument, the function is equivalent to value().
In many cases attribute values have types that are not strings - for example, an attribute may always contain values that should be treated as integers, despite the fact that they are represented as strings in XML. pugixml provides several accessors that convert the attribute value to some other type:
int xml_attribute::as_int(int def = 0) const;
unsigned int xml_attribute::as_uint(unsigned int def = 0) const;
double xml_attribute::as_double(double def = 0) const;
float xml_attribute::as_float(float def = 0) const;
bool xml_attribute::as_bool(bool def = false) const;
long long xml_attribute::as_llong(long long def = 0) const;
unsigned long long xml_attribute::as_ullong(unsigned long long def = 0) const;
as_int, as_uint, as_llong, as_ullong, as_double and as_float convert attribute values to numbers. If the attribute handle is null, the def argument is returned (which is 0 by default). Otherwise, all leading whitespace characters are truncated, and the remaining string is parsed as an integer number in either decimal or hexadecimal form (applicable to as_int, as_uint, as_llong and as_ullong; hexadecimal format is used if the number has a 0x or 0X prefix) or as a floating point number in either decimal or scientific form (as_double or as_float).
If the input string contains a non-numeric character sequence or a number that is out of the target numeric range, the result is undefined.
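The integer parsing rules above (skip leading whitespace, then choose decimal or hexadecimal based on a 0x/0X prefix) can be sketched with the C standard library. This is not pugixml's code: parse_integer_sketch is a hypothetical helper, the handling of an optional sign before the prefix is an assumption of this sketch, and non-numeric input, which the manual leaves undefined, simply follows std::strtol behavior here.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper for illustration; not part of pugixml.
inline long parse_integer_sketch(const char* text)
{
    assert(text != nullptr);

    // Skip leading whitespace (space, tab, CR, LF).
    const char* p = text;
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n') ++p;

    // Look past an optional sign to see whether a 0x / 0X prefix follows.
    // The short-circuit keeps digits[1] from being read past the terminator.
    const char* digits = (*p == '-' || *p == '+') ? p + 1 : p;
    const bool is_hex = digits[0] == '0' && (digits[1] == 'x' || digits[1] == 'X');

    // std::strtol accepts the sign and, for base 16, the optional prefix.
    return std::strtol(p, nullptr, is_hex ? 16 : 10);
}
```

Choosing the base explicitly, rather than passing base 0 to std::strtol, avoids treating a leading 0 as an octal prefix, which the rules above do not mention.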
Caution
|
Number conversion functions depend on the current C locale as set with setlocale, so they may return unexpected results if the locale is different from "C". |
as_bool converts the attribute value to boolean as follows: if the attribute handle is null, the def argument is returned (which is false by default). If the attribute value is empty, false is returned. Otherwise, true is returned if the first character is one of '1', 't', 'T', 'y', 'Y'. This means that strings like "true" and "yes" are recognized as true, while strings like "false" and "no" are recognized as false. For more complex matching you'll have to write your own function.
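The first-character rule is simple enough to restate as code. The sketch below is a hypothetical stand-alone function, not pugixml's implementation; a null pointer stands in for a null attribute handle.

```cpp
#include <cassert>

// Hypothetical stand-alone restatement of the documented rule; not pugixml code.
// A null pointer plays the role of a null attribute handle.
inline bool as_bool_sketch(const char* value, bool def = false)
{
    if (!value) return def;

    // For an empty string the first character is the terminator, so the
    // comparison below fails and the result is false.
    const char first = value[0];

    return first == '1' || first == 't' || first == 'T' || first == 'y' || first == 'Y';
}
```

Because only the first character is inspected, a value such as "Treasure" is also treated as true, which is why stricter matching needs a custom function.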
Note
|
as_llong and as_ullong are only available if your platform has reliable support for the long long type, including string conversions. |
This is an example of using these functions, along with the node data retrieval ones (samples/traverse_base.cpp):
for (pugi::xml_node tool = tools.child("Tool"); tool; tool = tool.next_sibling("Tool"))
{
    std::cout << "Tool " << tool.attribute("Filename").value();
    std::cout << ": AllowRemote " << tool.attribute("AllowRemote").as_bool();
    std::cout << ", Timeout " << tool.attribute("Timeout").as_int();
    std::cout << ", Description '" << tool.child_value("Description") << "'\n";
}
5.3. Contents-based traversal functions
Since a lot of document traversal consists of finding the node/attribute with the correct name, there are special functions for that purpose:
xml_node xml_node::child(const char_t* name) const;
xml_attribute xml_node::attribute(const char_t* name) const;
xml_node xml_node::next_sibling(const char_t* name) const;
xml_node xml_node::previous_sibling(const char_t* name) const;
child and attribute return the first child/attribute with the specified name; next_sibling and previous_sibling return the first sibling in the corresponding direction with the specified name. All string comparisons are case-sensitive. If the node handle is null or there is no node/attribute with the specified name, a null handle is returned.
The child and next_sibling functions can be used together to loop through all child nodes with the desired name like this:
for (pugi::xml_node tool = tools.child("Tool"); tool; tool = tool.next_sibling("Tool"))
The attribute function needs to look for the target attribute by name. If a node has many attributes, finding each by name can be time consuming. If you have an idea of how attributes are ordered in the node, you can use a faster function:
xml_attribute xml_node::attribute(const char_t* name, xml_attribute& hint) const;
The extra hint argument is used to guess where the attribute might be, and is updated to the location of the next attribute so that if you search for multiple attributes in the right order, the performance is maximized. Note that hint has to be either null or has to belong to the node, otherwise the behavior is undefined.
You can use this function as follows:
xml_attribute hint;
xml_attribute id = node.attribute("id", hint);
xml_attribute name = node.attribute("name", hint);
xml_attribute version = node.attribute("version", hint);
This code is correct regardless of the order of the attributes, but it's faster if "id", "name" and "version" occur in that order.
Occasionally the needed node is specified not by a unique name but instead by the value of some attribute; for example, it is common to have node collections with each node having a unique id: <group><item id="1"/> <item id="2"/></group>. There are two functions for finding child nodes based on the attribute values:
xml_node xml_node::find_child_by_attribute(const char_t* name, const char_t* attr_name, const char_t* attr_value) const;
xml_node xml_node::find_child_by_attribute(const char_t* attr_name, const char_t* attr_value) const;
The three-argument function returns the first child node with the specified name which has an attribute with the specified name/value; the two-argument function skips the name test for the node, which can be useful for searching in heterogeneous collections. If the node handle is null or if no node is found, a null handle is returned. All string comparisons are case-sensitive.
In all of the above functions, all arguments have to be valid strings; passing null pointers results in undefined behavior.
This is an example of using these functions (samples/traverse_base.cpp):
std::cout << "Tool for *.dae generation: " << tools.find_child_by_attribute("Tool", "OutputFileMasks", "*.dae").attribute("Filename").value() << "\n";

for (pugi::xml_node tool = tools.child("Tool"); tool; tool = tool.next_sibling("Tool"))
{
    std::cout << "Tool " << tool.attribute("Filename").value() << "\n";
}
5.4. Range-based for-loop support
If your C++ compiler supports the range-based for-loop (this is a C++11 feature; at the time of writing it's supported by Microsoft Visual Studio 2012+, GCC 4.6+ and Clang 3.0+), you can use it to enumerate nodes/attributes. Additional helpers are provided to support this; note that they are also compatible with Boost Foreach, and possibly other pre-C++11 foreach facilities.
implementation-defined-type xml_node::children() const;
implementation-defined-type xml_node::children(const char_t* name) const;
implementation-defined-type xml_node::attributes() const;
The children function allows you to enumerate all child nodes; the children function with a name argument allows you to enumerate all child nodes with a specific name; the attributes function allows you to enumerate all attributes of the node. Note that you can also use the node object itself in a range-based for construct, which is equivalent to using children().
This is an example of using these functions (samples/traverse_rangefor.cpp):
for (pugi::xml_node tool: tools.children("Tool"))
{
    std::cout << "Tool:";

    for (pugi::xml_attribute attr: tool.attributes())
    {
        std::cout << " " << attr.name() << "=" << attr.value();
    }

    for (pugi::xml_node child: tool.children())
    {
        std::cout << ", child " << child.name();
    }

    std::cout << std::endl;
}
While using children() makes the intent of the code clear, note that each node can be treated as a container of child nodes, since it provides the begin()/end() member functions described in the next section. Because of this, you can iterate through a node's children simply by using the node itself:
for (pugi::xml_node child: tool)
5.5. Traversing node/attribute lists via iterators
Child node lists and attribute lists are simply double-linked lists; while you can use previous_sibling/next_sibling and other such functions for iteration, pugixml additionally provides node and attribute iterators, so that you can treat nodes as containers of other nodes or attributes:
class xml_node_iterator;
class xml_attribute_iterator;
typedef xml_node_iterator xml_node::iterator;
iterator xml_node::begin() const;
iterator xml_node::end() const;
typedef xml_attribute_iterator xml_node::attribute_iterator;
attribute_iterator xml_node::attributes_begin() const;
attribute_iterator xml_node::attributes_end() const;
begin and attributes_begin return iterators that point to the first node/attribute, respectively; end and attributes_end return the past-the-end iterator for the node/attribute list, respectively - this iterator can't be dereferenced, but decrementing it results in an iterator pointing to the last element in the list (except for empty lists, where decrementing the past-the-end iterator results in undefined behavior). The past-the-end iterator is commonly used as a termination value for iteration loops (see sample below). If you want to get an iterator that points to an existing handle, you can construct the iterator with the handle as a single constructor argument, like so: xml_node_iterator(node). For xml_attribute_iterator, you'll have to provide both an attribute and its parent node.
and return equal iterators if called on null node; such iterators can’t be dereferenced. and behave the same way. For correct iterator usage this means that child node/attribute collections of null nodes appear to be empty.end
attributes_begin
attributes_end
如果在
null 节点上调用,则 begin 并返回相等的迭代器;此类迭代器不能被取消引用。并且行为方式相同。为了正确使用迭代器,这意味着 null 节点的子节点/属性集合似乎为空。结束
attributes_begin
attributes_end
Both types of iterators have bidirectional iterator semantics (i.e. they can be incremented and decremented, but efficient random access is not supported) and support all usual iterator operations - comparison, dereference, etc. The iterators are invalidated if the node/attribute objects they're pointing to are removed from the tree; adding nodes/attributes does not invalidate any iterators.
Here is an example of using iterators for document traversal (samples/traverse_iter.cpp):
for (pugi::xml_node_iterator it = tools.begin(); it != tools.end(); ++it)
{
    std::cout << "Tool:";

    for (pugi::xml_attribute_iterator ait = it->attributes_begin(); ait != it->attributes_end(); ++ait)
    {
        std::cout << " " << ait->name() << "=" << ait->value();
    }

    std::cout << std::endl;
}
Caution
|
Node and attribute iterators are somewhere in the middle between const and non-const iterators. While the dereference operation yields a non-constant reference to the object, so that you can use it for tree modification operations, modifying this reference using assignment - i.e. passing iterators to a function like std::sort - will not give expected results, as assignment modifies the local handle that's stored in the iterator. |
5.6. Recursive traversal with xml_tree_walker
The methods described above allow traversal of immediate children of some node; if you want to do a deep tree traversal, you'll have to do it via a recursive function or some equivalent method. However, pugixml provides a helper for depth-first traversal of a subtree. In order to use it, you have to implement the xml_tree_walker interface and call the traverse function:
class xml_tree_walker
{
public:
    virtual bool begin(xml_node& node);
    virtual bool for_each(xml_node& node) = 0;
    virtual bool end(xml_node& node);

    int depth() const;
};
bool xml_node::traverse(xml_tree_walker& walker);
The traversal is launched by calling the traverse function on the traversal root and proceeds as follows:
- First, the begin function is called with the traversal root as its argument.
- Then, the for_each function is called for all nodes in the traversal subtree in depth-first order, excluding the traversal root. The node is passed as an argument.
- Finally, the end function is called with the traversal root as its argument.
If begin, end or any of the for_each calls return false, the traversal is terminated and false is returned as the traversal result; otherwise, the traversal results in true. Note that you don't have to override the begin or end functions; their default implementations return true.
You can get the node's depth relative to the traversal root at any point by calling the depth function. It returns -1 if called from begin/end, and returns the 0-based depth if called from for_each - the depth is 0 for all children of the traversal root, 1 for all grandchildren and so on.
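The begin / for_each / end protocol, including early termination and the depth values, can be modeled on a toy tree. Everything in this sketch is hypothetical (toy_tree_node, toy_walker, recording_walker and the free traverse function are illustrative names); it mirrors the behavior described above but is not pugixml's implementation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical toy tree using first-child / next-sibling links.
struct toy_tree_node
{
    int id;
    toy_tree_node* first_child;
    toy_tree_node* next_sibling;
};

// Hypothetical walker interface modeled on the description above.
struct toy_walker
{
    int depth; // -1 during begin/end, 0-based during for_each

    toy_walker() : depth(-1) {}
    virtual ~toy_walker() {}

    virtual bool begin(const toy_tree_node&) { return true; }
    virtual bool for_each(const toy_tree_node& node) = 0;
    virtual bool end(const toy_tree_node&) { return true; }
};

// Visits all descendants in depth-first order; stops on the first false result.
inline bool visit_children(const toy_tree_node& node, toy_walker& walker, int depth)
{
    for (const toy_tree_node* child = node.first_child; child; child = child->next_sibling)
    {
        walker.depth = depth;
        if (!walker.for_each(*child)) return false;
        if (!visit_children(*child, walker, depth + 1)) return false;
    }
    return true;
}

// begin(root), then for_each for every descendant, then end(root).
inline bool traverse(const toy_tree_node& root, toy_walker& walker)
{
    walker.depth = -1;
    if (!walker.begin(root)) return false;
    if (!visit_children(root, walker, 0)) return false;
    walker.depth = -1;
    return walker.end(root);
}

// Records (id, depth) pairs and stops after visiting the node with id stop_id.
struct recording_walker : toy_walker
{
    std::vector<std::pair<int, int> > visits;
    int stop_id;

    explicit recording_walker(int stop = -1) : stop_id(stop) {}

    virtual bool for_each(const toy_tree_node& node)
    {
        visits.push_back(std::make_pair(node.id, depth));
        return node.id != stop_id;
    }
};
```

In this model the root is passed only to begin and end, so a walker that needs to process the root itself has to do so in one of those two callbacks.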
This is an example of traversing the tree hierarchy with xml_tree_walker (samples/traverse_walker.cpp):
struct simple_walker: pugi::xml_tree_walker
{
    virtual bool for_each(pugi::xml_node& node)
    {
        for (int i = 0; i < depth(); ++i) std::cout << " "; // indentation

        std::cout << node_types[node.type()] << ": name='" << node.name() << "', value='" << node.value() << "'\n";

        return true; // continue traversal
    }
};

simple_walker walker;
doc.traverse(walker);
5.7. Searching for nodes/attributes with predicates
While there are existing functions for getting a node/attribute with known contents, they are often not sufficient for simple queries. As an alternative to manual iteration through nodes/attributes until the needed one is found, you can make a predicate and call one of the find_ functions:
template <typename Predicate> xml_attribute xml_node::find_attribute(Predicate pred) const;
template <typename Predicate> xml_node xml_node::find_child(Predicate pred) const;
template <typename Predicate> xml_node xml_node::find_node(Predicate pred) const;
The predicate should be either a plain function or a function object which accepts one argument of type xml_attribute (for find_attribute) or xml_node (for find_child and find_node), and returns bool. The predicate is never called with a null handle as an argument.
The find_attribute function iterates through all attributes of the specified node, and returns the first attribute for which the predicate returned true. If the predicate returned false for all attributes or if there were no attributes (including the case where the node is null), a null attribute is returned.
The find_child function iterates through all child nodes of the specified node, and returns the first node for which the predicate returned true. If the predicate returned false for all nodes or if there were no child nodes (including the case where the node is null), a null node is returned.
The find_node function performs a depth-first traversal through the subtree of the specified node (excluding the node itself), and returns the first node for which the predicate returned true. If the predicate returned false for all nodes or if the subtree was empty, a null node is returned.
This is an example of using predicate-based functions (samples/traverse_predicate.cpp):
// Excerpt: needs <cstring> (for strcmp), <iostream> and pugixml.hpp
bool small_timeout(pugi::xml_node node)
{
    return node.attribute("Timeout").as_int() < 20;
}

struct allow_remote_predicate
{
    bool operator()(pugi::xml_attribute attr) const
    {
        return strcmp(attr.name(), "AllowRemote") == 0;
    }

    bool operator()(pugi::xml_node node) const
    {
        return node.attribute("AllowRemote").as_bool();
    }
};

// Find child via predicate (looks for direct children only)
std::cout << tools.find_child(allow_remote_predicate()).attribute("Filename").value() << std::endl;

// Find node via predicate (looks for all descendants in depth-first order)
std::cout << doc.find_node(allow_remote_predicate()).attribute("Filename").value() << std::endl;

// Find attribute via predicate
std::cout << tools.last_child().find_attribute(allow_remote_predicate()).value() << std::endl;

// We can use simple functions instead of function objects
std::cout << tools.find_child(small_timeout).attribute("Filename").value() << std::endl;
5.8. Working with text contents
It is common to store data as text contents of some node - i.e. <node><description>This is a node</description></node>. In this case, the <description> node does not have a value, but instead has a child of type node_pcdata with value "This is a node". pugixml provides a special class, xml_text, to work with such data. Working with text objects to modify data is described in the documentation for modifying document data; this section describes the access interface of xml_text.
You can get the text object from a node by using method:text()
您可以使用 method:text() 从节点获取文本对象
xml_text xml_node::text() const;
If the node has a type node_pcdata or node_cdata, then the node itself is used to return data; otherwise, a first child node of type node_pcdata or node_cdata is used.
You can check if the text object is bound to a valid PCDATA/CDATA node by using it as a boolean value, i.e. if (text) { ... } or if (!text) { ... }. Alternatively you can check it by using the empty() method:
bool xml_text::empty() const;
Given a text object, you can get the contents (i.e. the value of PCDATA/CDATA node) by using the following function:
const char_t* xml_text::get() const;
In case text object is empty, the function returns an empty string - it never returns a null pointer.
If you need a specific non-empty default string when the text object is empty, or if the text contents is actually a number or a boolean that is stored as a string, you can use the following accessors:
const char_t* xml_text::as_string(const char_t* def = "") const;
int xml_text::as_int(int def = 0) const;
unsigned int xml_text::as_uint(unsigned int def = 0) const;
double xml_text::as_double(double def = 0) const;
float xml_text::as_float(float def = 0) const;
bool xml_text::as_bool(bool def = false) const;
long long xml_text::as_llong(long long def = 0) const;
unsigned long long xml_text::as_ullong(unsigned long long def = 0) const;
All of the above functions have the same semantics as similar xml_attribute members: they return the default argument if the text object is empty, and they convert the text contents to a target type using the same rules and restrictions. You can refer to documentation for the attribute functions for details.
xml_text is essentially a helper class that operates on xml_node values. It is bound to a node of type node_pcdata or node_cdata. You can use the following function to retrieve this node:
xml_node xml_text::data() const;
Essentially, assuming text is an xml_text object, calling text.get() is equivalent to calling text.data().value().
This is an example of using xml_text object (samples/text.cpp):
std::cout << "Project name: " << project.child("name").text().get() << std::endl;
std::cout << "Project version: " << project.child("version").text().as_double() << std::endl;
std::cout << "Project visibility: " << (project.child("public").text().as_bool(/* def= */ true) ? "public" : "private") << std::endl;
std::cout << "Project description: " << project.child("description").text().get() << std::endl;
5.9. Miscellaneous functions
If you need to get the document root of some node, you can use the following function:
xml_node xml_node::root() const;
This function returns the node with type node_document, which is the root node of the document the node belongs to (unless the node is null, in which case null node is returned).
While pugixml supports complex XPath expressions, sometimes a simple path handling facility is needed. There are two functions, for getting node path and for converting path to a node:
string_t xml_node::path(char_t delimiter = '/') const;
xml_node xml_node::first_element_by_path(const char_t* path, char_t delimiter = '/') const;
Node paths consist of node names, separated with a delimiter (which is / by default); also paths can contain self (.) and parent (..) pseudo-names, so that this is a valid path: "../../foo/./bar". path returns the path to the node from the document root; first_element_by_path looks for a node represented by a given path. A path can be an absolute one (absolute paths start with the delimiter), in which case the rest of the path is treated as document root relative, or a relative one, in which case it is resolved relative to the given node. For example, in the following document: <a><b><c/></b></a>, node <c/> has path "a/b/c"; calling first_element_by_path for document with path "a/b" results in node <b/>; calling first_element_by_path for node <a/> with path "../a/./b/../." results in node <a/>; calling first_element_by_path with path "/a" results in node <a/> for any node.
In case path component is ambiguous (if there are two nodes with given name), the first one is selected; paths are not guaranteed to uniquely identify nodes in a document. If any component of a path is not found, the result of first_element_by_path is null node; also first_element_by_path returns null node for null nodes, in which case the path does not matter. path returns an empty string for null nodes.
Note: path function returns the result as STL string, and thus is not available if PUGIXML_NO_STL is defined.
pugixml does not record row/column information for nodes upon parsing for efficiency reasons. However, if the node has not changed in a significant way since parsing (the name/value are not changed, and the node itself is the original one, i.e. it was not deleted from the tree and re-added later), it is possible to get the offset from the beginning of XML buffer:
ptrdiff_t xml_node::offset_debug() const;
If the offset is not available (this happens if the node is null, was not originally parsed from a stream, or has changed in a significant way), the function returns -1. Otherwise it returns the offset to node’s data from the beginning of XML buffer in pugi::char_t units. For more information on parsing offsets, see parsing error handling documentation.
6. Modifying document data
The document in pugixml is fully mutable: you can completely change the document structure and modify the data of nodes/attributes. This section provides documentation for the relevant functions. All functions take care of memory management and structural integrity themselves, so they always result in structurally valid tree - however, it is possible to create an invalid XML tree (for example, by adding two attributes with the same name or by setting attribute/node name to empty/invalid string). Tree modification is optimized for performance and for memory consumption, so if you have enough memory you can create documents from scratch with pugixml and later save them to file/stream instead of relying on error-prone manual text writing and without too much overhead.
All member functions that change node/attribute data or structure are non-constant and thus can not be called on constant handles. However, you can easily convert constant handle to non-constant one by simple assignment: void foo(const pugi::xml_node& n) { pugi::xml_node nc = n; }, so const-correctness here mainly provides additional documentation.
Setting node data
As discussed before, nodes can have name and value, both of which are strings. Depending on node type, name or value may be absent. node_document nodes do not have a name or value, node_element and node_declaration nodes always have a name but never have a value, node_pcdata, node_cdata, node_comment and node_doctype nodes never have a name but always have a value (it may be empty though), node_pi nodes always have a name and a value (again, value may be empty). In order to set node’s name or value, you can use the following functions:
bool xml_node::set_name(const char_t* rhs);
bool xml_node::set_name(const char_t* rhs, size_t sz);
bool xml_node::set_value(const char_t* rhs);
bool xml_node::set_value(const char_t* rhs, size_t size);
Both functions try to set the name/value to the specified string, and return the operation result. The operation fails if the node can not have name or value (for instance, when trying to call set_name on a node_pcdata node), if the node handle is null, or if there is insufficient memory to handle the request. The provided string is copied into document managed memory and can be destroyed after the function returns (for example, you can safely pass stack-allocated buffers to these functions). The name/value content is not verified, so take care to use only valid XML names, or the document may become malformed.
This is an example of setting node name and value (samples/modify_base.cpp):
pugi::xml_node node = doc.child("node");
// change node name
std::cout << node.set_name("notnode");
std::cout << ", new node name: " << node.name() << std::endl;
// change comment text
std::cout << doc.last_child().set_value("useless comment");
std::cout << ", new comment text: " << doc.last_child().value() << std::endl;
// we can't change value of the element or name of the comment
std::cout << node.set_value("1") << ", " << doc.last_child().set_name("2") << std::endl;
6.1. Setting attribute data
All attributes have name and value, both of which are strings (value may be empty). You can set them with the following functions:
bool xml_attribute::set_name(const char_t* rhs);
bool xml_attribute::set_name(const char_t* rhs, size_t sz);
bool xml_attribute::set_value(const char_t* rhs);
bool xml_attribute::set_value(const char_t* rhs, size_t size);
Both functions try to set the name/value to the specified string, and return the operation result. The operation fails if the attribute handle is null, or if there is insufficient memory to handle the request. The provided string is copied into document managed memory and can be destroyed after the function returns (for example, you can safely pass stack-allocated buffers to these functions). The name/value content is not verified, so take care to use only valid XML names, or the document may become malformed.
In addition to string functions, several functions are provided for handling attributes with numbers and booleans as values:
bool xml_attribute::set_value(int rhs);
bool xml_attribute::set_value(unsigned int rhs);
bool xml_attribute::set_value(long rhs);
bool xml_attribute::set_value(unsigned long rhs);
bool xml_attribute::set_value(double rhs);
bool xml_attribute::set_value(double rhs, int precision);
bool xml_attribute::set_value(float rhs);
bool xml_attribute::set_value(float rhs, int precision);
bool xml_attribute::set_value(bool rhs);
bool xml_attribute::set_value(long long rhs);
bool xml_attribute::set_value(unsigned long long rhs);
The above functions convert the argument to string and then call the base set_value function. Integers are converted to a decimal form, floating-point numbers are converted to either decimal or scientific form, depending on the number magnitude, boolean values are converted to either "true" or "false".
Caution: Number conversion functions depend on current C locale as set with setlocale, so may generate unexpected results if the locale is different from "C".
Note: set_value overloads with long long type are only available if your platform has reliable support for the type, including string conversions.
For convenience, all set_value functions have the corresponding assignment operators:
xml_attribute& xml_attribute::operator=(const char_t* rhs);
xml_attribute& xml_attribute::operator=(int rhs);
xml_attribute& xml_attribute::operator=(unsigned int rhs);
xml_attribute& xml_attribute::operator=(long rhs);
xml_attribute& xml_attribute::operator=(unsigned long rhs);
xml_attribute& xml_attribute::operator=(double rhs);
xml_attribute& xml_attribute::operator=(float rhs);
xml_attribute& xml_attribute::operator=(bool rhs);
xml_attribute& xml_attribute::operator=(long long rhs);
xml_attribute& xml_attribute::operator=(unsigned long long rhs);
These operators simply call the right set_value function and return the attribute they're called on; the return value of set_value is ignored, so errors are ignored.
This is an example of setting attribute name and value (samples/modify_base.cpp):
pugi::xml_attribute attr = node.attribute("id");
// change attribute name/value
std::cout << attr.set_name("key") << ", " << attr.set_value("345");
std::cout << ", new attribute: " << attr.name() << "=" << attr.value() << std::endl;
// we can use numbers or booleans
attr.set_value(1.234);
std::cout << "new attribute value: " << attr.value() << std::endl;
// we can also use assignment operators for more concise code
attr = true;
std::cout << "final attribute value: " << attr.value() << std::endl;
6.2. Adding nodes/attributes
Nodes and attributes do not exist without a document tree, so you can’t create them without adding them to some document. A node or attribute can be created at the end of node/attribute list or before/after some other node:
xml_attribute xml_node::append_attribute(const char_t* name);
xml_attribute xml_node::prepend_attribute(const char_t* name);
xml_attribute xml_node::insert_attribute_after(const char_t* name, const xml_attribute& attr);
xml_attribute xml_node::insert_attribute_before(const char_t* name, const xml_attribute& attr);
xml_node xml_node::append_child(xml_node_type type = node_element);
xml_node xml_node::prepend_child(xml_node_type type = node_element);
xml_node xml_node::insert_child_after(xml_node_type type, const xml_node& node);
xml_node xml_node::insert_child_before(xml_node_type type, const xml_node& node);
xml_node xml_node::append_child(const char_t* name);
xml_node xml_node::prepend_child(const char_t* name);
xml_node xml_node::insert_child_after(const char_t* name, const xml_node& node);
xml_node xml_node::insert_child_before(const char_t* name, const xml_node& node);
append_attribute and append_child create a new node/attribute at the end of the corresponding list of the node the method is called on; prepend_attribute and prepend_child create a new node/attribute at the beginning of the list; insert_attribute_after, insert_attribute_before, insert_child_after and insert_child_before add the node/attribute before or after the specified node/attribute.
Attribute functions create an attribute with the specified name; you can specify the empty name and change the name later if you want to. Node functions with the type argument create the node with the specified type; since node type can't be changed, you have to know the desired type beforehand. Also note that not all types can be added as children; see below for clarification. Node functions with the name argument create the element node (node_element) with the specified name.
All functions return the handle to the created object on success, and null handle on failure. There are several reasons for failure:
- Adding fails if the target node is null;
- Only node_element nodes can contain attributes, so attribute adding fails if node is not an element;
- Only node_document and node_element nodes can contain children, so child node adding fails if the target node is not an element or a document;
- node_document and node_null nodes can not be inserted as children, so passing node_document or node_null value as type results in operation failure;
- node_declaration nodes can only be added as children of the document node; attempt to insert declaration node as a child of an element node fails;
- Adding node/attribute results in memory allocation, which may fail;
- Insertion functions fail if the specified node or attribute is null or is not in the target node's children/attribute list.
Even if the operation fails, the document remains in consistent state, but the requested node/attribute is not added.
Caution: attribute() and child() functions do not add attributes or nodes to the tree, so code like node.attribute("id") = 123; will not do anything if node does not have an attribute with name "id". Make sure you're operating with existing attributes/nodes by adding them if necessary.
This is an example of adding new attributes/nodes to the document (samples/modify_add.cpp):
// add node with some name
pugi::xml_node node = doc.append_child("node");
// add description node with text child
pugi::xml_node descr = node.append_child("description");
descr.append_child(pugi::node_pcdata).set_value("Simple node");
// add param node before the description
pugi::xml_node param = node.insert_child_before("param", descr);
// add attributes to param node
param.append_attribute("name") = "version";
param.append_attribute("value") = 1.1;
param.insert_attribute_after("type", param.attribute("name")) = "float";
6.3. Removing nodes/attributes
If you do not want your document to contain some node or attribute, you can remove it with one of the following functions:
bool xml_node::remove_attribute(const xml_attribute& a);
bool xml_node::remove_attributes();
bool xml_node::remove_child(const xml_node& n);
bool xml_node::remove_children();
remove_attribute removes the attribute from the attribute list of the node, and returns the operation result. remove_child removes the child node with the entire subtree (including all descendant nodes and attributes) from the document, and returns the operation result. remove_attributes removes all the attributes of the node, and returns the operation result. remove_children removes all the child nodes of the node, and returns the operation result. Removing fails if one of the following is true:
- The node the function is called on is null;
- The attribute/node to be removed is null;
- The attribute/node to be removed is not in the node's attribute/child list.
Removing the attribute or node invalidates all handles to the same underlying object, and also invalidates all iterators pointing to the same object. Removing node also invalidates all past-the-end iterators to its attribute or child node list. Be careful to ensure that all such handles and iterators either do not exist or are not used after the attribute/node is removed.
If you want to remove the attribute or child node by its name, two additional helper functions are available:
bool xml_node::remove_attribute(const char_t* name);
bool xml_node::remove_child(const char_t* name);
These functions look for the first attribute or child with the specified name, and then remove it, returning the result. If there is no attribute or child with such name, the function returns false; if there are two nodes with the given name, only the first node is deleted. If you want to delete all nodes with the specified name, you can use code like this: while (node.remove_child("tool")) ;.
This is an example of removing attributes/nodes from the document (samples/modify_remove.cpp):
// remove description node with the whole subtree
pugi::xml_node node = doc.child("node");
node.remove_child("description");
// remove value attribute
pugi::xml_node param = node.child("param");
param.remove_attribute("value");
// we can also remove nodes/attributes by handles
pugi::xml_attribute id = param.attribute("name");
param.remove_attribute(id);
6.4. Working with text contents
pugixml provides a special class, xml_text, to work with text contents stored as a value of some node, i.e. <node><description>This is a node</description></node>. Working with text objects to retrieve data is described in the documentation for accessing document data; this section describes the modification interface of xml_text.
Once you have an xml_text object, you can set the text contents using the following function:
bool xml_text::set(const char_t* rhs);
bool xml_text::set(const char_t* rhs, size_t size);
This function tries to set the contents to the specified string, and returns the operation result. The operation fails if the text object was retrieved from a node that can not have a value and is not an element node (i.e. it is a node_declaration node), if the text object is empty, or if there is insufficient memory to handle the request. The provided string is copied into document managed memory and can be destroyed after the function returns (for example, you can safely pass stack-allocated buffers to this function). Note that if the text object was retrieved from an element node, this function creates the PCDATA child node if necessary (i.e. if the element node does not have a PCDATA/CDATA child already).
In addition to a string function, several functions are provided for handling text with numbers and booleans as contents:
bool xml_text::set(int rhs);
bool xml_text::set(unsigned int rhs);
bool xml_text::set(long rhs);
bool xml_text::set(unsigned long rhs);
bool xml_text::set(double rhs);
bool xml_text::set(double rhs, int precision);
bool xml_text::set(float rhs);
bool xml_text::set(float rhs, int precision);
bool xml_text::set(bool rhs);
bool xml_text::set(long long rhs);
bool xml_text::set(unsigned long long rhs);
The above functions convert the argument to string and then call the base set function. These functions have the same semantics as similar xml_attribute functions. You can refer to documentation for the attribute functions for details.
For convenience, all set functions have the corresponding assignment operators:
xml_text& xml_text::operator=(const char_t* rhs);
xml_text& xml_text::operator=(int rhs);
xml_text& xml_text::operator=(unsigned int rhs);
xml_text& xml_text::operator=(long rhs);
xml_text& xml_text::operator=(unsigned long rhs);
xml_text& xml_text::operator=(double rhs);
xml_text& xml_text::operator=(float rhs);
xml_text& xml_text::operator=(bool rhs);
xml_text& xml_text::operator=(long long rhs);
xml_text& xml_text::operator=(unsigned long long rhs);
These operators simply call the right set function and return the text object they're called on; the return value of set is ignored, so errors are ignored.
This is an example of using xml_text object to modify text contents (samples/text.cpp):
// change project version
project.child("version").text() = 1.2;
// add description element and set the contents
// note that we do not have to explicitly add the node_pcdata child
project.append_child("description").text().set("a test project");
6.5. Cloning nodes/attributes
With the help of previously described functions, it is possible to create trees with any contents and structure, including cloning the existing data. However since this is an often needed operation, pugixml provides built-in node/attribute cloning facilities. Since nodes and attributes do not exist without a document tree, you can’t create a standalone copy - you have to immediately insert it somewhere in the tree. For this, you can use one of the following functions:
xml_attribute xml_node::append_copy(const xml_attribute& proto);
xml_attribute xml_node::prepend_copy(const xml_attribute& proto);
xml_attribute xml_node::insert_copy_after(const xml_attribute& proto, const xml_attribute& attr);
xml_attribute xml_node::insert_copy_before(const xml_attribute& proto, const xml_attribute& attr);
xml_node xml_node::append_copy(const xml_node& proto);
xml_node xml_node::prepend_copy(const xml_node& proto);
xml_node xml_node::insert_copy_after(const xml_node& proto, const xml_node& node);
xml_node xml_node::insert_copy_before(const xml_node& proto, const xml_node& node);
These functions mirror the structure of append_child, prepend_child, insert_child_before and related functions - they take the handle to the prototype object, which is to be cloned, insert a new attribute/node at the appropriate place, and then copy the attribute data or the whole node subtree to the new object. The functions return the handle to the resulting duplicate object, or null handle on failure.
The attribute is copied along with the name and value; the node is copied along with its type, name and value; additionally attribute list and all children are recursively cloned, resulting in the deep subtree clone. The prototype object can be a part of the same document, or a part of any other document.
The failure conditions resemble those of append_child, insert_child_before and related functions; consult their documentation for more information. There are additional caveats specific to cloning functions:
- Cloning null handles results in operation failure;
- Node cloning starts with insertion of the node of the same type as that of the prototype; for this reason, cloning functions can not be directly used to clone entire documents, since node_document is not a valid insertion type. The example below provides a workaround.
- It is possible to copy a subtree as a child of some node inside this subtree, i.e. node.append_copy(node.parent().parent());. This is a valid operation, and it results in a clone of the subtree in the state before cloning started, i.e. no infinite recursion takes place.
This is an example with one possible implementation of include tags in XML (samples/include.cpp). It illustrates node cloning and usage of other document modification functions:
bool load_preprocess(pugi::xml_document& doc, const char* path);

bool preprocess(pugi::xml_node node)
{
    for (pugi::xml_node child = node.first_child(); child; )
    {
        if (child.type() == pugi::node_pi && strcmp(child.name(), "include") == 0)
        {
            pugi::xml_node include = child;

            // load new preprocessed document (note: ideally this should handle relative paths)
            const char* path = include.value();

            pugi::xml_document doc;
            if (!load_preprocess(doc, path)) return false;

            // insert the comment marker above include directive
            node.insert_child_before(pugi::node_comment, include).set_value(path);

            // copy the document above the include directive (this retains the original order!)
            for (pugi::xml_node ic = doc.first_child(); ic; ic = ic.next_sibling())
            {
                node.insert_copy_before(ic, include);
            }

            // remove the include node and move to the next child
            child = child.next_sibling();

            node.remove_child(include);
        }
        else
        {
            if (!preprocess(child)) return false;

            child = child.next_sibling();
        }
    }

    return true;
}

bool load_preprocess(pugi::xml_document& doc, const char* path)
{
    pugi::xml_parse_result result = doc.load_file(path, pugi::parse_default | pugi::parse_pi); // for <?include?>

    return result ? preprocess(doc) : false;
}
6.6. Moving nodes
Sometimes instead of cloning a node you need to move an existing node to a different position in a tree. This can be accomplished by copying the node and removing the original; however, this is expensive since it results in a lot of extra operations. For moving nodes within the same document tree, you can use one of the following functions instead:
xml_node xml_node::append_move(const xml_node& moved);
xml_node xml_node::prepend_move(const xml_node& moved);
xml_node xml_node::insert_move_after(const xml_node& moved, const xml_node& node);
xml_node xml_node::insert_move_before(const xml_node& moved, const xml_node& node);
These functions mirror the structure of append_copy, prepend_copy, insert_copy_before and insert_copy_after - they take the handle to the moved object and move it to the appropriate place with all attributes and/or child nodes. The functions return the handle to the resulting object (which is the same as the moved object), or null handle on failure.
The failure conditions resemble those of append_child, insert_child_before and related functions; consult their documentation for more information. There are additional caveats specific to moving functions:
-
Moving null handles results in operation failure;
-
Moving is only possible for nodes that belong to the same document; attempting to move nodes between documents will fail.
-
insert_move_after and insert_move_before functions fail if the moved node is the same as the node argument (this operation would be a no-op otherwise).
-
It is impossible to move a subtree to a child of some node inside this subtree, i.e. node.append_move(node.parent().parent()); will fail.
6.7. Assembling document from fragments
pugixml provides several ways to assemble an XML document from other XML documents. Assuming there is a set of document fragments, represented as in-memory buffers, the implementation choices are as follows:
-
Use a temporary document to parse the data from a string, then clone the nodes to a destination node. For example:
bool append_fragment(pugi::xml_node target, const char* buffer, size_t size)
{
    pugi::xml_document doc;
    if (!doc.load_buffer(buffer, size)) return false;

    for (pugi::xml_node child = doc.first_child(); child; child = child.next_sibling())
        target.append_copy(child);

    return true;
}
-
Cache the parsing step - instead of keeping in-memory buffers, keep document objects that already contain the parsed fragment:
bool append_fragment(pugi::xml_node target, const pugi::xml_document& cached_fragment)
{
    for (pugi::xml_node child = cached_fragment.first_child(); child; child = child.next_sibling())
        target.append_copy(child);

    return true;
}
-
Use xml_node::append_buffer directly:
xml_parse_result xml_node::append_buffer(const void* contents, size_t size, unsigned int options = parse_default, xml_encoding encoding = encoding_auto);
The first method is more convenient, but slower than the other two. The relative performance of append_copy and append_buffer depends on the buffer format - usually append_buffer is faster if the buffer is in native encoding (UTF-8 or wchar_t, depending on PUGIXML_WCHAR_MODE). At the same time it might be less efficient in terms of memory usage - the implementation makes a copy of the provided buffer, and the copy has the same lifetime as the document - the memory used by that copy will be reclaimed after the document is destroyed, but no sooner. Even deleting all nodes in the document, including the appended ones, won't reclaim the memory.
append_buffer behaves in the same way as xml_document::load_buffer - the input buffer is a byte buffer, with size in bytes; the buffer is not modified and can be freed after the function returns.
Since append_buffer needs to append child nodes to the current node, it only works if the current node is either document or element node. Calling append_buffer on a node with any other type results in an error with status_append_invalid_root status.
7. Saving document
Often after creating a new document or loading the existing one and processing it, it is necessary to save the result back to a file. Also it is occasionally useful to output the whole document or a subtree to some stream; use cases include debug printing, serialization via network or other text-oriented medium, etc. pugixml provides several functions to output any subtree of the document to a file, stream or another generic transport interface; these functions allow you to customize the output format (see Output options), and also perform necessary encoding conversions (see Encodings). This section documents the relevant functionality.
Before writing to the destination the node/attribute data is properly formatted according to the node type; all special XML symbols, such as < and &, are properly escaped (unless format_no_escapes flag is set). In order to guard against forgotten node/attribute names, empty node/attribute names are printed as ":anonymous". For well-formed output, make sure all node and attribute names are set to meaningful values.
CDATA sections with values that contain "]]>" are split into several sections as follows: section with value "pre]]>post" is written as <![CDATA[pre]]]]><![CDATA[>post]]>. While this alters the structure of the document (if you load the document after saving it, there will be two CDATA sections instead of one), this is the only way to escape CDATA contents.
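The splitting rule can be reproduced in a few lines of standard C++. The function below is an illustrative sketch of the technique, not pugixml's own implementation; the name write_cdata is an assumption.

```cpp
#include <cassert>
#include <string>

// Serializes `value` as one or more CDATA sections, splitting at every "]]>"
// so that the terminator never appears inside a section.
// (write_cdata is a hypothetical name used only for this sketch.)
std::string write_cdata(const std::string& value)
{
    std::string out;
    std::string::size_type pos = 0;

    for (;;)
    {
        std::string::size_type end = value.find("]]>", pos);

        out += "<![CDATA[";

        if (end == std::string::npos)
        {
            // no terminator left: emit the rest of the value and stop
            out.append(value, pos, std::string::npos);
            out += "]]>";
            return out;
        }

        // keep "]]" in the current section and start the next one at ">"
        out.append(value, pos, end + 2 - pos);
        out += "]]>";
        pos = end + 2;
    }
}
```

For the value "pre]]>post" this produces the two sections shown above; concatenating the contents of the sections on load gives back the original value.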
7.1. Saving document to a file
If you want to save the whole document to a file, you can use one of the following functions:
bool xml_document::save_file(const char* path, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
bool xml_document::save_file(const wchar_t* path, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
These functions accept the file path as their first argument, and also three optional arguments, which specify indentation and other output options (see Output options) and output data encoding (see Encodings). The path has the target operating system format, so it can be a relative or absolute one, it should have the delimiters of the target system, it should have the exact case if the target file system is case-sensitive, etc. The functions return true on success and false if the file could not be opened or written to.
File path is passed to the system file opening function as is in case of the first function (which accepts const char* path); the second function either uses a special file opening function if it is provided by the runtime library or converts the path to UTF-8 and uses the system file opening function.
save_file opens the target file for writing, outputs the requested header (by default a document declaration is output, unless the document already has one), and then saves the document contents. Calling save_file is equivalent to creating an xml_writer_file object with FILE* handle as the only constructor argument and then calling save; see Saving document via writer interface for writer interface details.
This is a simple example of saving XML document to file (samples/save_file.cpp):
// save document to file
std::cout << "Saving result: " << doc.save_file("save_file_output.xml") << std::endl;
7.2. Saving document to C++ IOstreams
To enhance interoperability pugixml provides functions for saving document to any object which implements C++ std::ostream interface. This allows you to save documents to any standard C++ stream (e.g. file stream) or any third-party compliant implementation (e.g. Boost Iostreams). Most notably, this allows for easy debug output, since you can use std::cout stream as saving target. There are two functions, one works with narrow character streams, another handles wide character ones:
void xml_document::save(std::ostream& stream, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
void xml_document::save(std::wostream& stream, const char_t* indent = "\t", unsigned int flags = format_default) const;
save with std::ostream argument saves the document to the stream in the same way as save_file (i.e. with requested header and with encoding conversions). On the other hand, save with std::wostream argument saves the document to the wide stream with encoding_wchar encoding. Because of this, using save with wide character streams requires careful (usually platform-specific) stream setup (i.e. using the imbue function). Generally use of wide streams is discouraged, however it provides you with the ability to save documents to non-Unicode encodings, e.g. you can save Shift-JIS encoded data if you set the correct locale.
Calling save with stream target is equivalent to creating an xml_writer_stream object with stream as the only constructor argument and then calling save; see Saving document via writer interface for writer interface details.
This is a simple example of saving XML document to standard output (samples/save_stream.cpp):
// save document to standard output
std::cout << "Document:\n";
doc.save(std::cout);
7.3. Saving document via writer interface
All of the above saving functions are implemented in terms of writer interface. This is a simple interface with a single function, which is called several times during output process with chunks of document data as input:
class xml_writer
{
public:
    virtual void write(const void* data, size_t size) = 0;
};

void xml_document::save(xml_writer& writer, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto) const;
In order to output the document via some custom transport, for example sockets, you should create an object which implements xml_writer interface and pass it to save function. xml_writer::write function is called with a buffer as an input, where data points to buffer start, and size is equal to the buffer size in bytes. write implementation must write the buffer to the transport; it cannot save the passed buffer pointer, as the buffer contents will change after write returns. The buffer contains the chunk of document data in the desired encoding.
write function is called with relatively large blocks (size is usually several kilobytes, except for the last block that may be small), so there is often no need for additional buffering in the implementation.
This is a simple example of custom writer for saving document data to STL string (samples/save_custom_writer.cpp); read the sample code for more complex examples:
struct xml_string_writer: pugi::xml_writer
{
    std::string result;

    virtual void write(const void* data, size_t size)
    {
        result.append(static_cast<const char*>(data), size);
    }
};
7.4. Saving a single subtree
While the previously described functions save the whole document to the destination, it is easy to save a single subtree. The following functions are provided:
void xml_node::print(std::ostream& os, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto, unsigned int depth = 0) const;
void xml_node::print(std::wostream& os, const char_t* indent = "\t", unsigned int flags = format_default, unsigned int depth = 0) const;
void xml_node::print(xml_writer& writer, const char_t* indent = "\t", unsigned int flags = format_default, xml_encoding encoding = encoding_auto, unsigned int depth = 0) const;
These functions have the same arguments with the same meaning as the corresponding xml_document::save functions, and allow you to save the subtree to either a C++ IOstream or to any object that implements xml_writer interface.
Saving a subtree differs from saving the whole document: the process behaves as if format_write_bom is off, and format_no_declaration is on, even if actual values of the flags are different. This means that BOM is not written to the destination, and document declaration is only written if it is the node itself or is one of node's children. Note that this also holds if you call print on the document node itself; this example (samples/save_subtree.cpp) illustrates the difference:
// get a test document
pugi::xml_document doc;
doc.load_string("<foo bar='baz'><call>hey</call></foo>");
// print document to standard output (prints <?xml version="1.0"?><foo bar="baz"><call>hey</call></foo>)
doc.save(std::cout, "", pugi::format_raw);
std::cout << std::endl;
// print document to standard output as a regular node (prints <foo bar="baz"><call>hey</call></foo>)
doc.print(std::cout, "", pugi::format_raw);
std::cout << std::endl;
// print a subtree to standard output (prints <call>hey</call>)
doc.child("foo").child("call").print(std::cout, "", pugi::format_raw);
std::cout << std::endl;
7.5. Output options
All saving functions accept the optional parameter flags. This is a bitmask that customizes the output format; you can select the way the document nodes are printed and select the needed additional information that is output before the document contents.
Note
|
You should use the usual bitwise arithmetic to manipulate the bitmask: to enable a flag, use mask | flag; to disable a flag, use mask & ~flag. |
These flags control the resulting tree contents:
-
format_indent determines if all nodes should be indented with the indentation string (this is an additional parameter for all saving functions, and is "\t" by default). If this flag is on, the indentation string is printed several times before every node, where the amount of indentation depends on the node's depth relative to the output subtree. This flag has no effect if format_raw is enabled. This flag is on by default.
-
format_indent_attributes determines if all attributes should be printed on a new line, indented with the indentation string according to the attribute's depth. This flag implies format_indent. This flag has no effect if format_raw is enabled. This flag is off by default.
-
format_raw switches between formatted and raw output. If this flag is on, the nodes are not indented in any way, and also no newlines that are not part of document text are printed. Raw mode can be used for serialization where the result is not intended to be read by humans; also it can be useful if the document was parsed with parse_ws_pcdata flag, to preserve the original document formatting as much as possible. This flag is off by default.
-
format_no_escapes disables output escaping for attribute values and PCDATA contents. If this flag is off, special symbols (", &, <, >) and all non-printable characters (those with codepoint values less than 32) are converted to XML escape sequences (e.g. &amp;) during output. If this flag is on, no text processing is performed; therefore, output XML can be malformed if output contents contains invalid symbols (e.g. having a stray < in the PCDATA will make the output malformed). This flag is off by default.
-
format_no_empty_element_tags determines if start/end tags should be output instead of empty element tags for empty elements (that is, elements with no children). This flag is off by default.
-
format_skip_control_chars enables skipping characters belonging to range [0; 32) instead of "&#xNN;" encoding. This flag is off by default.
-
format_attribute_single_quote enables using single quotes ' instead of double quotes " for enclosing attribute values. This flag is off by default.
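To make the effect of format_no_escapes more concrete, the following standard C++ sketch mimics the escaping described above for the case when the flag is off. It is a simplified illustration and not pugixml's implementation: the function name escape_text and the decimal form of the numeric references are assumptions, and the library may apply different rules to attribute values and PCDATA.

```cpp
#include <cassert>
#include <string>

// Escapes the four special symbols as named entities and characters with
// code points below 32 as numeric character references.
// (escape_text is a hypothetical name used only for this sketch.)
std::string escape_text(const std::string& text)
{
    std::string out;

    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        unsigned char ch = static_cast<unsigned char>(text[i]);

        switch (ch)
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        case '"': out += "&quot;"; break;

        default:
            if (ch < 32)
            {
                // assumption: decimal numeric reference, e.g. "&#9;" for a tab
                out += "&#";
                if (ch >= 10) out += static_cast<char>('0' + ch / 10);
                out += static_cast<char>('0' + ch % 10);
                out += ';';
            }
            else
            {
                out += text[i];
            }
        }
    }

    return out;
}
```

With format_no_escapes enabled, the text would be written unchanged instead, which is why a stray < can then produce malformed output.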
These flags control the additional output information:
-
format_no_declaration disables default node declaration output. By default, if the document is saved via save or save_file function, and it does not have any document declaration, a default declaration is output before the document contents. Enabling this flag disables this declaration. This flag has no effect in xml_node::print functions: they never output the default declaration. This flag is off by default.
-
format_write_bom enables Byte Order Mark (BOM) output. By default, no BOM is output, so in case of non UTF-8 encodings the resulting document's encoding may not be recognized by some parsers and text editors, if they do not implement sophisticated encoding detection. Enabling this flag adds an encoding-specific BOM to the output. This flag has no effect in xml_node::print functions: they never output the BOM. This flag is off by default.
-
format_save_file_text changes the file mode when using save_file function. By default, file is opened in binary mode, which means that the output file will contain platform-independent newline \n (ASCII 10). If this flag is on, file is opened in text mode, which on some systems changes the newline format (e.g. on Windows you can use this flag to output XML documents with \r\n (ASCII 13 10) newlines). This flag is off by default.
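Additionally, there is one predefined option mask:
-
format_default is the default set of flags, i.e. it has all options set to their default values. It sets formatted output with indentation, without BOM and with default node declaration, if necessary.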
This is an example that shows the outputs of different output options (samples/save_options.cpp):
// get a test document
pugi::xml_document doc;
doc.load_string("<foo bar='baz'><call>hey</call></foo>");
// default options; prints
// <?xml version="1.0"?>
// <foo bar="baz">
//     <call>hey</call>
// </foo>
doc.save(std::cout);
std::cout << std::endl;
// default options with custom indentation string; prints
// <?xml version="1.0"?>
// <foo bar="baz">
// --<call>hey</call>
// </foo>
doc.save(std::cout, "--");
std::cout << std::endl;
// default options without indentation; prints
// <?xml version="1.0"?>
// <foo bar="baz">
// <call>hey</call>
// </foo>
doc.save(std::cout, "\t", pugi::format_default & ~pugi::format_indent); // can also pass "" instead of indentation string for the same effect
std::cout << std::endl;
// raw output; prints
// <?xml version="1.0"?><foo bar="baz"><call>hey</call></foo>
doc.save(std::cout, "\t", pugi::format_raw);
std::cout << std::endl << std::endl;
// raw output without declaration; prints
// <foo bar="baz"><call>hey</call></foo>
doc.save(std::cout, "\t", pugi::format_raw | pugi::format_no_declaration);
std::cout << std::endl;
7.6. Encodings
pugixml supports all popular Unicode encodings (UTF-8, UTF-16 (big and little endian), UTF-32 (big and little endian); UCS-2 is naturally supported since it's a strict subset of UTF-16) and handles all encoding conversions during output. The output encoding is set via the encoding parameter of saving functions, which is of type xml_encoding. The possible values for the encoding are documented in Encodings; the only flag that has a different meaning is encoding_auto.
While all other flags set the exact encoding, encoding_auto is meant for automatic encoding detection. The automatic detection does not make sense for output encoding, since there is usually nothing to infer the actual encoding from, so here encoding_auto means UTF-8 encoding, which is the most popular encoding for XML data storage. This is also the default value of output encoding; specify another value if you do not want UTF-8 encoded output.
Also note that wide stream saving functions do not have encoding argument and always assume encoding_wchar encoding.
Note
|
The current behavior for Unicode conversion is to skip all invalid UTF sequences during conversion. This behavior should not be relied upon; if your node/attribute names do not contain any valid UTF sequences, they may be output as if they are empty, which will result in malformed XML document. |
7.7. Customizing document declaration
When you are saving the document using xml_document::save() or xml_document::save_file(), a default XML document declaration is output, if format_no_declaration is not specified and if the document does not have a declaration node. However, the default declaration is not customizable. If you want to customize the declaration output, you need to create the declaration node yourself.
Note
|
By default the declaration node is not added to the document during parsing. If you just need to preserve the original declaration node, you have to add the flag parse_declaration to the parsing flags; the resulting document will contain the original declaration node, which will be output during saving. |
Declaration node is a node with type node_declaration; it behaves like an element node in that it has attributes with values (but it does not have child nodes). Therefore setting custom version, encoding or standalone declaration involves adding attributes and setting attribute values.
This is an example that shows how to create a custom declaration node (samples/save_declaration.cpp):
// get a test document
pugi::xml_document doc;
doc.load_string("<foo bar='baz'><call>hey</call></foo>");
// add a custom declaration node
pugi::xml_node decl = doc.prepend_child(pugi::node_declaration);
decl.append_attribute("version") = "1.0";
decl.append_attribute("encoding") = "UTF-8";
decl.append_attribute("standalone") = "no";
// <?xml version="1.0" encoding="UTF-8" standalone="no"?>
// <foo bar="baz">
//     <call>hey</call>
// </foo>
doc.save(std::cout);
std::cout << std::endl;
8. XPath
If the task at hand is to select a subset of document nodes that match some criteria, it is possible to code a function using the existing traversal functionality for any practical criteria. However, often either a data-driven approach is desirable, in case the criteria are not predefined and come from a file, or it is inconvenient to use traversal interfaces and a higher-level DSL is required. There is a standard language for XML processing, XPath, that can be useful for these cases. pugixml implements an almost complete subset of XPath 1.0. Because of differences in document object model and some performance implications, there are minor violations of the official specifications, which can be found in Conformance to W3C specification. The rest of this section describes the interface for XPath functionality. Please note that if you wish to learn to use XPath language, you have to look for other tutorials or manuals; for example, you can read W3Schools XPath tutorial or the XPath 1.0 specification.
8.1. XPath types
Each XPath expression can have one of the following types: boolean, number, string or node set. Boolean type corresponds to bool type, number type corresponds to double type, string type corresponds to either std::string or std::wstring, depending on whether wide character interface is enabled, and node set corresponds to xpath_node_set type. There is an enumeration, xpath_value_type, which can take the values xpath_type_boolean, xpath_type_number, xpath_type_string or xpath_type_node_set, accordingly.
Because an XPath node can be either a node or an attribute, there is a special type, xpath_node, which is a discriminated union of these types. A value of this type contains two node handles, one of xml_node type, and another one of xml_attribute type; at most one of them can be non-null. The accessors to get these handles are available:
xml_node xpath_node::node() const;
xml_attribute xpath_node::attribute() const;
XPath nodes can be null, in which case both accessors return null handles.
Note that as per XPath specification, each XPath node has a parent, which can be retrieved via this function:
xml_node xpath_node::parent() const;
parent function returns the node's parent if the XPath node corresponds to xml_node handle (equivalent to node().parent()), or the node to which the attribute belongs, if the XPath node corresponds to xml_attribute handle. For null nodes, parent returns null handle.
Like node and attribute handles, XPath node handles can be implicitly cast to boolean-like object to check if it is a null node, and also can be compared for equality with each other.
You can also create XPath nodes with one of the three constructors: the default constructor, the constructor that takes node argument, and the constructor that takes attribute and node arguments (in which case the attribute must belong to the attribute list of the node). The constructor from xml_node is implicit, so you can usually pass xml_node to functions that expect xpath_node. Apart from that you usually don't need to create your own XPath node objects, since they are returned to you via selection functions.
XPath expressions operate not on single nodes, but instead on node sets. A node set is a collection of nodes, which can be optionally ordered in either a forward document order or a reverse one. Document order is defined in XPath specification; an XPath node is before another node in document order if it appears before it in XML representation of the corresponding document.
Node sets are represented by xpath_node_set object, which has an interface that resembles that of sequential random-access containers. It has an iterator type along with usual begin/past-the-end iterator accessors:
typedef const xpath_node* xpath_node_set::const_iterator;
const_iterator xpath_node_set::begin() const;
const_iterator xpath_node_set::end() const;
const xpath_node& xpath_node_set::operator[](size_t index) const;
size_t xpath_node_set::size() const;
bool xpath_node_set::empty() const;
All of the above operations have the same semantics as that of std::vector: the iterators are random-access, all of the above operations are constant time, and accessing the element at index that is greater than or equal to the set size results in undefined behavior. You can use both iterator-based and index-based access for iteration, however the iterator-based one can be faster.
The order of iteration depends on the order of nodes inside the set; the order can be queried via the following function:
enum xpath_node_set::type_t {type_unsorted, type_sorted, type_sorted_reverse};
type_t xpath_node_set::type() const;
The type function returns the current order of nodes; type_sorted means that the nodes are in forward document order, type_sorted_reverse means that the nodes are in reverse document order, and type_unsorted means that neither order is guaranteed (nodes can accidentally be in a sorted order even if type() returns type_unsorted). If you require a specific order of iteration, you can change it via sort function:
void xpath_node_set::sort(bool reverse = false);
Calling sort sorts the nodes in either forward or reverse document order, depending on the argument; after this call type() will return type_sorted or type_sorted_reverse.
Often the actual iteration is not needed; instead, only the first element in document order is required. For this, a special accessor is provided:
xpath_node xpath_node_set::first() const;
This function returns the first node in forward document order from the set, or null node if the set is empty. Note that while the returned node does not depend on the order of nodes in the set (i.e. on the result of type()), the complexity does: if the set is sorted, the complexity is constant, otherwise it is linear in the number of elements or worse.
While in the majority of cases the node set is returned by XPath functions, sometimes there is a need to manually construct a node set. For such cases, a constructor is provided which takes an iterator range (const_iterator is a typedef for const xpath_node*), and an optional type:
xpath_node_set::xpath_node_set(const_iterator begin, const_iterator end, type_t type = type_unsorted);
The constructor copies the specified range and sets the specified type. The objects in the range are not checked in any way; you'll have to ensure that the range contains no duplicates, and that the objects are sorted according to the type parameter. Otherwise XPath operations with this set may produce unexpected results.
8.2. Selecting nodes via XPath expression
If you want to select nodes that match some XPath expression, you can do it with the following functions:
xpath_node xml_node::select_node(const char_t* query, xpath_variable_set* variables = 0) const;
xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
The select_nodes function compiles the expression and then executes it with the node as a context node, and returns the resulting node set. select_node returns only the first node in document order from the result, and is equivalent to calling select_nodes(query).first(). If the XPath expression does not match anything, or the node handle is null, select_nodes returns an empty set, and select_node returns null XPath node.
If exception handling is not disabled, both functions throw xpath_exception if the query can not be compiled or if it returns a value with type other than node set; see Error handling for details.
While compiling expressions is fast, the compilation time can introduce a significant overhead if the same expression is used many times on small subtrees. If you’re doing many similar queries, consider compiling them into query objects (see Using query objects for further reference). Once you get a compiled query object, you can pass it to select functions instead of an expression string:
xpath_node xml_node::select_node(const xpath_query& query) const;
xpath_node_set xml_node::select_nodes(const xpath_query& query) const;
If exception handling is not disabled, both functions throw xpath_exception if the query returns a value with type other than node set.
This is an example of selecting nodes using XPath expressions (samples/xpath_select.cpp):
pugi::xpath_node_set tools = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote='true' and @DeriveCaptionFrom='lastparam']");
std::cout << "Tools:\n";
for (pugi::xpath_node_set::const_iterator it = tools.begin(); it != tools.end(); ++it)
{
    pugi::xpath_node node = *it;
    std::cout << node.node().attribute("Filename").value() << "\n";
}
pugi::xpath_node build_tool = doc.select_node("//Tool[contains(Description, 'build system')]");
if (build_tool)
    std::cout << "Build tool: " << build_tool.node().attribute("Filename").value() << "\n";
8.3. Using query objects
When you call select_nodes with an expression string as an argument, a query object is created behind the scenes. A query object represents a compiled XPath expression. Query objects can be needed in the following circumstances:
- You can precompile expressions to query objects to save compilation time if it becomes an issue;
- You can use query objects to evaluate XPath expressions which result in booleans, numbers or strings;
- You can get the type of expression value via query object.
Query objects correspond to xpath_query type. They are immutable and non-copyable: they are bound to the expression at creation time and can not be cloned. If you want to put query objects in a container, either allocate them on heap via new operator and store pointers to xpath_query in the container, or use a C++11 compiler (query objects are movable in C++11).
You can create a query object with the constructor that takes XPath expression as an argument:
explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
The expression is compiled and the compiled representation is stored in the new query object. If compilation fails, xpath_exception is thrown if exception handling is not disabled (see Error handling for details). After the query is created, you can query the type of the evaluation result using the following function:
xpath_value_type xpath_query::return_type() const;
You can evaluate the query using one of the following functions:
bool xpath_query::evaluate_boolean(const xpath_node& n) const;
double xpath_query::evaluate_number(const xpath_node& n) const;
string_t xpath_query::evaluate_string(const xpath_node& n) const;
xpath_node_set xpath_query::evaluate_node_set(const xpath_node& n) const;
xpath_node xpath_query::evaluate_node(const xpath_node& n) const;
All functions take the context node as an argument, compute the expression and return the result, converted to the requested type. According to XPath specification, value of any type can be converted to boolean, number or string value, but no type other than node set can be converted to node set. Because of this, evaluate_boolean, evaluate_number and evaluate_string always return a result, but evaluate_node_set and evaluate_node result in an error if the return type is not node set (see Error handling).
Note: Calling node.select_nodes("query") is equivalent to calling xpath_query("query").evaluate_node_set(node). Calling node.select_node("query") is equivalent to calling xpath_query("query").evaluate_node(node).
Note that evaluate_string function returns the STL string; as such, it's not available in PUGIXML_NO_STL mode and also usually allocates memory. There is another string evaluation function:
size_t xpath_query::evaluate_string(char_t* buffer, size_t capacity, const xpath_node& n) const;
This function evaluates the string, and then writes the result to buffer (but at most capacity characters); then it returns the full size of the result in characters, including the terminating zero. If capacity is not 0, the resulting buffer is always zero-terminated. You can use this function as follows:
- First call the function with buffer = 0 and capacity = 0; then allocate the returned amount of characters, and call the function again, passing the allocated storage and the amount of characters;
- First call the function with small buffer and buffer capacity; then, if the result is larger than the capacity, the output has been trimmed, so allocate a larger buffer and call the function again.
This is an example of using query objects (samples/xpath_query.cpp):
// Select nodes via compiled query
pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote='true']");
pugi::xpath_node_set tools = query_remote_tools.evaluate_node_set(doc);
std::cout << "Remote tool: ";
tools[2].node().print(std::cout);
// Evaluate numbers via compiled query
pugi::xpath_query query_timeouts("sum(//Tool/@Timeout)");
std::cout << query_timeouts.evaluate_number(doc) << std::endl;
// Evaluate strings via compiled query for different context nodes
pugi::xpath_query query_name_valid("string-length(substring-before(@Filename, '_')) > 0 and @OutputFileMasks");
pugi::xpath_query query_name("concat(substring-before(@Filename, '_'), ' produces ', @OutputFileMasks)");
for (pugi::xml_node tool = doc.first_element_by_path("Profile/Tools/Tool"); tool; tool = tool.next_sibling())
{
    std::string s = query_name.evaluate_string(tool);
    if (query_name_valid.evaluate_boolean(tool)) std::cout << s << std::endl;
}
8.4. Using variables
XPath queries may contain references to variables; this is useful if you want to use queries that depend on some dynamic parameter without manually preparing the complete query string, or if you want to reuse the same query object for similar queries.
Variable references have the form $name; in order to use them, you have to provide a variable set, which includes all variables present in the query with correct types. This set is passed to xpath_query constructor or to select_nodes/select_node functions:
explicit xpath_query::xpath_query(const char_t* query, xpath_variable_set* variables = 0);
xpath_node xml_node::select_node(const char_t* query, xpath_variable_set* variables = 0) const;
xpath_node_set xml_node::select_nodes(const char_t* query, xpath_variable_set* variables = 0) const;
If you're using query objects, you can change the variable values before evaluate/select calls to change the query behavior.
Note: The variable set pointer is stored in the query object; you have to ensure that the lifetime of the set exceeds that of query object.
Variable sets correspond to xpath_variable_set type, which is essentially a variable container.
You can add new variables with the following function:
xpath_variable* xpath_variable_set::add(const char_t* name, xpath_value_type type);
The function tries to add a new variable with the specified name and type; if the variable with such name does not exist in the set, the function adds a new variable and returns the variable handle; if there is already a variable with the specified name, the function returns the variable handle if variable has the specified type. Otherwise the function returns null pointer; it also returns null pointer on allocation failure.
New variables are assigned the default value which depends on the type: 0 for numbers, false for booleans, empty string for strings and empty set for node sets.
You can get the existing variables with the following functions:
xpath_variable* xpath_variable_set::get(const char_t* name);
const xpath_variable* xpath_variable_set::get(const char_t* name) const;
The functions return the variable handle, or null pointer if the variable with the specified name is not found.
Additionally, there are the helper functions for setting the variable value by name; they try to add the variable with the corresponding type, if it does not exist, and to set the value. If the variable with the same name but with different type is already present, they return false; they also return false on allocation failure. Note that these functions do not perform any type conversions.
bool xpath_variable_set::set(const char_t* name, bool value);
bool xpath_variable_set::set(const char_t* name, double value);
bool xpath_variable_set::set(const char_t* name, const char_t* value);
bool xpath_variable_set::set(const char_t* name, const xpath_node_set& value);
The variable values are copied to the internal variable storage, so you can modify or destroy them after the functions return.
If setting variables by name is not efficient enough, or if you have to inspect variable information or get variable values, you can use variable handles. A variable corresponds to the xpath_variable type, and a variable handle is simply a pointer to xpath_variable.
In order to get variable information, you can use one of the following functions:
const char_t* xpath_variable::name() const;
xpath_value_type xpath_variable::type() const;
Note that each variable has a distinct type which is specified upon variable creation and can not be changed later.
In order to get variable value, you should use one of the following functions, depending on the variable type:
bool xpath_variable::get_boolean() const;
double xpath_variable::get_number() const;
const char_t* xpath_variable::get_string() const;
const xpath_node_set& xpath_variable::get_node_set() const;
These functions return the value of the variable. Note that no type conversions are performed; if the type mismatch occurs, a dummy value is returned (false for booleans, NaN for numbers, empty string for strings and empty set for node sets).
In order to set variable value, you should use one of the following functions, depending on the variable type:
bool xpath_variable::set(bool value);
bool xpath_variable::set(double value);
bool xpath_variable::set(const char_t* value);
bool xpath_variable::set(const xpath_node_set& value);
These functions modify the variable value. Note that no type conversions are performed; if the type mismatch occurs, the functions return false; they also return false on allocation failure. The variable values are copied to the internal variable storage, so you can modify or destroy them after the functions return.
This is an example of using variables in XPath queries (samples/xpath_variables.cpp):
// Select nodes via compiled query
pugi::xpath_variable_set vars;
vars.add("remote", pugi::xpath_type_boolean);
pugi::xpath_query query_remote_tools("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
vars.set("remote", true);
pugi::xpath_node_set tools_remote = query_remote_tools.evaluate_node_set(doc);
vars.set("remote", false);
pugi::xpath_node_set tools_local = query_remote_tools.evaluate_node_set(doc);
std::cout << "Remote tool: ";
tools_remote[2].node().print(std::cout);
std::cout << "Local tool: ";
tools_local[0].node().print(std::cout);
// You can pass the context directly to select_nodes/select_node
pugi::xpath_node_set tools_local_imm = doc.select_nodes("/Profile/Tools/Tool[@AllowRemote = string($remote)]", &vars);
std::cout << "Local tool imm: ";
tools_local_imm[0].node().print(std::cout);
8.5. Error handling
There are two different mechanisms for error handling in XPath implementation; the mechanism used depends on whether exception support is disabled (this is controlled with PUGIXML_NO_EXCEPTIONS define).
By default, XPath functions throw xpath_exception object in case of errors; additionally, in the event any memory allocation fails, an std::bad_alloc exception is thrown. Also xpath_exception is thrown if the query is evaluated to a node set, but the return type is not node set. If the query constructor succeeds (i.e. no exception is thrown), the query object is valid. Otherwise you can get the error details via one of the following functions:
virtual const char* xpath_exception::what() const throw();
const xpath_parse_result& xpath_exception::result() const;
If exceptions are disabled, then in the event of parsing failure the query is initialized to invalid state; you can test if the query object is valid by using it in a boolean expression: if (query) { … }. Additionally, you can get parsing result via the result() accessor:
const xpath_parse_result& xpath_query::result() const;
Without exceptions, evaluating invalid query results in false, empty string, NaN or an empty node set, depending on the type; evaluating a query as a node set results in an empty node set if the return type is not node set.
The information about parsing result is returned via xpath_parse_result object. It contains parsing status and the offset of last successfully parsed character from the beginning of the source stream:
struct xpath_parse_result
{
    const char* error;
    ptrdiff_t offset;
    operator bool() const;
    const char* description() const;
};
Parsing result is represented as the error message; it is either a null pointer, in case there is no error, or the error message in the form of ASCII zero-terminated string.
The description() member function can be used to get the error message; it never returns the null pointer, so you can safely use description() even if query parsing succeeded. Note that description() returns a char string even in PUGIXML_WCHAR_MODE; you'll have to call as_wide to get the wchar_t string.
In addition to the error message, parsing result has an offset member, which contains the offset of last successfully parsed character. This offset is in units of pugi::char_t (bytes for character mode, wide characters for wide character mode).
Parsing result object can be implicitly converted to bool like this: if (result) { … } else { … }.
This is an example of XPath error handling (samples/xpath_error.cpp):
// Exception is thrown for incorrect query syntax
try
{
    doc.select_nodes("//nodes[#true()]");
}
catch (const pugi::xpath_exception& e)
{
    std::cout << "Select failed: " << e.what() << std::endl;
}
// Exception is thrown for incorrect query semantics
try
{
    doc.select_nodes("(123)/next");
}
catch (const pugi::xpath_exception& e)
{
    std::cout << "Select failed: " << e.what() << std::endl;
}
// Exception is thrown for query with incorrect return type
try
{
    doc.select_nodes("123");
}
catch (const pugi::xpath_exception& e)
{
    std::cout << "Select failed: " << e.what() << std::endl;
}
8.6. Conformance to W3C specification
Because of the differences in document object models, performance considerations and implementation complexity, pugixml does not provide a fully conformant XPath 1.0 implementation. This is the current list of incompatibilities:
- Consecutive text nodes sharing the same parent are not merged, i.e. in <node>text1 <![CDATA[data]]> text2</node> node should have one text node child, but instead has three.
- Since the document type declaration is not used for parsing, id() function always returns an empty node set.
- Namespace nodes are not supported (affects namespace:: axis).
- Name tests are performed on QNames in XML document instead of expanded names; for <foo xmlns:ns1='uri' xmlns:ns2='uri'><ns1:child/><ns2:child/></foo>, query foo/ns1:* will return only the first child, not both of them. Compliant XPath implementations can return both nodes if the user provides appropriate namespace declarations.
- String functions consider a character to be either a single char value or a single wchar_t value, depending on the library configuration; this means that some string functions are not fully Unicode-aware. This affects substring(), string-length() and translate() functions.
9. Changelog
v1.14 2023-10-01
Maintenance release. Changes:
- Improvements:
  - xml_attribute::set_name and xml_node::set_name now have overloads that accept pointer to non-null-terminated string and size
  - Implement parse_merge_pcdata parsing mode in which PCDATA contents is merged into a single node when original document had comments that were skipped during parsing
  - xml_document::load_file now returns a more consistent error status when given a path to a folder
- Bug fixes:
  - Fix assertion in XPath number→string conversion when using non-English locales
  - Fix PUGIXML_STATIC_CRT CMake option to correctly select static CRT when using MSVC and recent CMake
- Compatibility improvements:
  - Fix GCC 2.95/3.3 builds
  - Fix CMake 3.27 deprecation warnings
  - Fix XCode 14 sprintf deprecation warning when compiling in C++03 mode
  - Fix clang/gcc warnings -Wweak-vtables, -Wreserved-macro-identifier
v1.13 2022-11-01
Maintenance release. Changes:
- Improvements:
  - xml_attribute::set_value, xml_node::set_value and xml_text::set now have overloads that accept pointer to non-null-terminated string and size
  - Improve performance of tree traversal when using compact mode (PUGIXML_COMPACT)
- Bug fixes:
  - Fix error handling in xml_document::save_file that could result in the function succeeding while running out of disk space
  - Fix memory leak during error handling of some out-of-memory conditions during xml_document::load
- Compatibility improvements:
  - Fix exported symbols in CMake DLL builds when using CMake
  - Fix exported symbols in CMake shared object builds when using -fvisibility=hidden
v1.12 2022-02-09
Maintenance release. Changes:
- Bug fixes:
  - Fix a bug in xml_document move construction when the source of the move is empty
  - Fix const-correctness issues with iterator objects to support C++20 ranges
- XPath improvements:
  - Improved detection of overly complex queries that may result in stack overflow during parsing
- Compatibility improvements:
  - Fix Cygwin support for DLL builds
  - Fix Windows CE support
  - Add NuGet builds and project files for VS2022
- Build system changes:
  - All CMake options now have the prefix PUGIXML_. This may require changing dependent build configurations.
  - Many build settings are now exposed via CMake settings, most notably PUGIXML_COMPACT and PUGIXML_WCHAR_MODE can be set without changing pugiconfig.hpp
v1.11 2020-11-26
Maintenance release. Changes:
- New features:
  - Add xml_node::remove_attributes and xml_node::remove_children
  - Add a way to customize floating point precision via xml_attribute::set and xml_text::set overloads
- XPath improvements:
  - XPath parser now limits recursion depth which prevents stack overflow on malicious queries
- Compatibility improvements:
  - Fix Visual Studio warnings when built using clang-cl compiler
  - Fix Wconversion warnings in gcc
  - Fix Wzero-as-null-pointer-constant warnings in pugixml.hpp
  - Work around several static analysis false positives
- Build system changes:
  - The CMake package for pugixml now provides a pugixml::pugixml target rather than a pugixml target. A compatibility pugixml target is provided if at least version 1.11 is not requested.
v1.10 2019-09-15
Maintenance release. Changes:
- Behavior changes:
  - Tab characters (ASCII 9) in attribute values are now encoded as '&#9;' to survive roundtripping
  - > characters are no longer escaped in attribute values
- New features:
  - Add Visual Studio .natvis files to improve debugging experience
  - CMake improvements (USE_POSTFIX and BUILD_SHARED_AND_STATIC_LIBS options for building multiple versions and pkg-config tweaks)
  - Add format_skip_control_chars formatting flag to skip non-printable ASCII characters that are invalid to use in well-formed XML files
  - Add format_attribute_single_quote formatting flag to use single quotes for attribute values instead of default double quotes.
- XPath improvements:
  - XPath union now results in a stable order that doesn't depend on memory allocations; crucially, this may require sorting the output of XPath query operation if you rely on the document-ordered traversal
  - Improve performance of XPath union operation, making it ~2x faster
- Compatibility improvements:
  - Fix Visual Studio warnings when built in a DLL configuration
  - Fix static analysis false positives in Coverity and clang
  - Fix Wdouble-promotion warnings in gcc
  - Add Visual Studio 2019 support for NuGet packages
v1.9 2018-04-04
Maintenance release. Changes:
-
Specification changes:
-
xml_document::load(const char*) (deprecated in 1.5) now has deprecated attribute; use xml_document::load_string instead
-
xml_node::select_single_node (deprecated in 1.5) now has deprecated attribute; use xml_node::select_node instead
-
-
New features:
-
Add move semantics support for xml_document and improve move semantics support for other objects
-
CMake build now exports include directories
-
CMake build with BUILD_SHARED_LIBS=ON now uses dllexport attribute for MSVC
-
-
XPath improvements:
-
Rework parser/evaluator to not rely on exceptional control flow; longjmp is no longer used when exceptions are disabled
-
Improve error messages for certain invalid expressions such as .[1] or (1
-
Minor performance improvements
-
-
Compatibility improvements:
-
Fix Texas Instruments compiler warnings
-
Fix compilation issues with limits.h for some versions of gcc
-
Fix compilation issues with Clang/C2
-
Fix implicit fallthrough warnings in gcc 7
-
Fix unknown attribute directive warnings in gcc 8
-
Fix cray++ compiler errors
-
Fix unsigned integer overflow errors with -fsanitize=integer
-
Fix undefined behavior sanitizer issues in compact mode
-
v1.8 2016-11-24
Maintenance release. Changes:
-
Specification changes:
-
When printing empty elements, a space is no longer added before / in format_raw mode
-
-
New features:
-
Added parse_embed_pcdata parsing mode in which PCDATA value is stored in the element node if possible (significantly reducing memory consumption for some documents)
-
Added auto-detection support for Latin-1 (ISO-8859-1) encoding during parsing
-
Added format_no_empty_element_tags formatting flag that outputs start/end tags instead of empty element tags for empty elements
-
-
Performance improvements:
-
Minor memory allocation improvements (yielding up to 1% memory savings in some cases)
-
-
Compatibility improvements:
-
Fixed compilation issues for Borland C++ 5.4
-
Fixed compilation issues for some distributions of MinGW 3.8
-
Fixed various Clang/GCC warnings
-
Enabled move semantics support for XPath objects for MSVC 2010 and above
-
v1.7 2015-10-19
Major release, featuring performance and memory improvements along with some new features. Changes:
-
Compact mode:
-
Introduced a new tree storage mode that takes significantly less memory (2-5x smaller DOM) at some performance cost.
-
The mode can be enabled using PUGIXML_COMPACT define.
-
-
New integer parsing/formatting implementation:
-
Functions that convert from and to integers (e.g. as_int/set_value) do not rely on CRT any more.
-
New implementation is 3-5x faster and is always correct with respect to overflow or underflow. This is a behavior change - where previously as_uint() would return UINT_MAX on a value "-1", it now returns 0.
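The clamping behavior described above can be illustrated with a small CRT-free parser. This is a minimal sketch under stated assumptions, not pugixml's actual implementation: parse_uint_saturating is a hypothetical name, only decimal input is handled, and out-of-range values are clamped to the nearest representable bound.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not pugixml's code): parses a decimal unsigned integer
// without calling strtoul and clamps out-of-range input instead of wrapping.
unsigned int parse_uint_saturating(const char* text)
{
    assert(text);

    const char* s = text;

    // skip leading whitespace
    while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r') ++s;

    bool negative = (*s == '-');
    if (*s == '-' || *s == '+') ++s;

    unsigned int result = 0;
    bool overflow = false;

    for (; *s >= '0' && *s <= '9'; ++s)
    {
        unsigned int digit = static_cast<unsigned int>(*s - '0');

        // result * 10 + digit would exceed UINT_MAX
        if (result > (UINT_MAX - digit) / 10)
            overflow = true;
        else
            result = result * 10 + digit;
    }

    // negative input underflows an unsigned type, so clamp it to 0
    if (negative) return 0;

    return overflow ? UINT_MAX : result;
}
```

A wrapping conversion would turn "-1" into UINT_MAX; the sketch returns 0 for it, which matches the new as_uint() result described above.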
-
-
New features:
-
XPath objects (xpath_query, xpath_node_set, xpath_variable_set) are now movable if your compiler supports C++11. Additionally, xpath_variable_set is copyable.
-
Added format_indent_attributes that makes the resulting XML friendlier to line diff/merge tools.
-
Added a variant of xml_node::attribute function with a hint that can improve lookup performance.
-
Custom allocation functions are now allowed (but not required) to throw instead of returning a null pointer.
-
-
Bug fixes:
-
Fix Clang 3.7 crashes in out-of-memory cases (C++ DR 1748)
-
Fix XPath crashes on SPARC64 (and other 32-bit architectures where doubles have to be aligned to 8 bytes)
-
Fix xpath_node_set assignment to provide strong exception guarantee
-
Fix saving for custom xml_writer implementations that can throw from write()
-
v1.6 2015-04-10
Maintenance release. Changes:
-
Specification changes:
-
Attribute/text values now use more digits when printing floating point numbers to guarantee round-tripping.
-
Text nodes no longer get extra surrounding whitespace when pretty-printing nodes with mixed contents
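To illustrate why more digits are needed for round-tripping: with IEEE 754 types, 9 significant digits are enough to recover any float and 17 are enough to recover any double. The sketch below assumes IEEE 754 floating point and a correctly rounding C library; the function names are hypothetical and the exact format strings are an assumption rather than a statement about pugixml internals.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: prints a double with 17 significant digits so that
// parsing the text yields the same value again.
std::string format_double_roundtrip(double value)
{
    char buf[64];
    int written = std::snprintf(buf, sizeof(buf), "%.17g", value);
    assert(written > 0 && written < static_cast<int>(sizeof(buf)));
    return std::string(buf);
}

// Hypothetical helper: the same idea for float, where 9 significant digits
// are sufficient.
std::string format_float_roundtrip(float value)
{
    char buf[64];
    int written = std::snprintf(buf, sizeof(buf), "%.9g", static_cast<double>(value));
    assert(written > 0 && written < static_cast<int>(sizeof(buf)));
    return std::string(buf);
}
```

Parsing the result with std::strtod or std::strtof should return the original value, whereas a shorter format such as "%g" (6 significant digits) can lose information.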
-
-
Bug fixes: