NDT-6D for color registration in agri-robotic applications
Himanshu Gupta | Achim J. Lilienthal | Henrik Andreasson | Polina Kurtser
Centre for Applied Autonomous Sensor Systems, Institutionen för naturvetenskap & teknik, Örebro University, Örebro, Sweden
Perception for Intelligent Systems, Technical University of Munich, Munich, Germany
Department of Radiation Science, Radiation Physics, Umeå University, Umeå, Sweden
Registration of point cloud data containing both depth and color information is critical for a variety of applications, including in-field robotic plant manipulation, crop growth modeling, and autonomous navigation. However, current state-of-the-art registration methods often fail in challenging agricultural field conditions due to factors such as occlusions, plant density, and variable illumination. To address these issues, we propose the NDT-6D registration method, a color-based variation of the Normal Distribution Transform (NDT) registration approach for point clouds. Our method computes correspondences between pointclouds using both geometric and color information and minimizes the distance between these correspondences using only the three-dimensional (3D) geometric dimensions. We evaluate the method using the GRAPES3D data set, collected with a commercial-grade RGB-D sensor mounted on a mobile platform in a vineyard. Results show that registration methods that rely only on depth information fail to provide quality registration for the tested data set. The proposed color-based variation outperforms state-of-the-art methods with a root mean square error (RMSE) of […] for NDT-6D compared with 1.1-2.3 cm for other color-information-based methods and 1.2-13.7 cm for noncolor-information-based methods. The proposed method is shown to be robust against noise using the TUM RGBD data set, to which noise present in an outdoor scenario was added artificially. The relative pose error (RPE) increased […] for our method compared to an increase of […] for the best-performing registration method. The obtained average accuracy suggests that the NDT-6D registration method can be used for in-field precision agriculture applications, for example, crop detection, size-based maturity estimation, and growth modeling.
KEYWORDS
agricultural robotics, color pointcloud, in-field sensing, machine perception, RGB-D registration, stereo IR, vineyard
1 | INTRODUCTION
Automation in the agricultural domain is a fast-growing application of outdoor robotics, mostly due to the lack of human labor and the resulting increase in the cost of manual field operations such as harvesting, planting, pruning, and trimming (Oliveira et al., 2021). With recent advancements in visual detection, three-dimensional (3D) reconstruction, and positioning using analytical and artificial intelligence-based methods, these labor-intensive tasks are being automated using robots (Bac et al., 2014; Bakker et al., 2006; Bawden et al., 2017). These systems have the potential to reduce costs and increase field productivity. They employ machine vision algorithms (Kamilaris & Prenafeta-Boldú, 2018; Tian et al., 2020) for the detection and positioning of target crops, mainly relying on color images.
For the calculation of the target crops' spatial position or morphological aspects, depth information is often required (Arad et al., 2020; Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020; Vit & Shani, 2018). RGB-D cameras are well suited for this purpose, as these sensors provide colored 2D images and 3D point clouds (Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020; Ringdahl et al., 2019) in a single rigid package capable of sustaining the harsh environmental conditions often encountered in these applications. The colored images can be used for color-based detection, plant growth monitoring, and ripeness estimation. The 3D point clouds can be used to estimate the physical crop size, shape, and target localization. Commercial-grade RGB-D sensors operating in field conditions have only recently become available in the market (Ringdahl et al., 2019; Vit & Shani, 2018). Until recently, point clouds were employed almost exclusively for the navigation of the robot in the field using 2D and 3D LiDARs, an application that often requires neither color data (Biber et al., 2012; Malavazi et al., 2018) nor high pointcloud density.
With the penetration of RGB-D sensors into the agricultural robotics domain, algorithms for in-field extraction of crop size, shape, ripeness, and position were developed. These algorithms rely mainly on previous work in indoor conditions, where detailed 3D plant models can be extracted using hand-held 3D scanners (Schunck et al., 2021) that acquire data from multiple viewpoints. Despite previous work showing that employing multiple viewpoints can significantly improve precision (e.g., Harel et al., 2016; Kurtser & Edan, 2018b), most outdoor algorithms rely on pointclouds acquired from a single location. This can be attributed to a working assumption often voiced in the field: that state-of-the-art registration algorithms generally fail to provide accurate registration results for the noisy outdoor sensory data acquired from RGB-D cameras and for the dense, repetitive, soft, and dynamic foliage present in the agricultural domain. While it was claimed in our previous work (Kurtser, Ringdahl, Rotstein, & Andreasson, 2020; Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020) that single-frame detection can be sufficient for some applications, it is apparent that higher precision can be obtained by registering data from several viewpoints before analysis. For example, algorithms relying on a single RGB-D frame are often prone to additional error due to a significant number of overflowing points near the boundary of the objects. This problem can be mitigated by combining consecutive point clouds using registration algorithms and 3D reconstruction of the registered point cloud. Similarly, a single RGB-D frame captures only one object surface, the one facing the camera, which biases the estimation of volume and location.
Beyond very close-range applications such as plant morphological modeling and localization, registration of multiple pointclouds originating from consecutive frames acquired in field conditions by commercial-grade RGB-D cameras can also potentially replace or supplement LiDARs in close-range navigation. Enriching the maps generated from the aggregation of consecutive pointclouds acquired by LiDARs for navigation purposes with dense close-range information can lead to a variety of possible applications, such as field monitoring and the acquisition of measures such as yield.
Since pointcloud registration in field conditions is a well-researched topic in many domains, in this paper we aim to investigate the reasons for the failure of state-of-the-art registration algorithms given the field conditions in which agricultural robots are to operate. We do this by comparing a range of commonly used registration algorithms on a data set acquired in commercial vineyard conditions (GRAPES3D data set [Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020]). Once the weaknesses are identified, we propose our registration method, which is shown to be more robust in these conditions. We show our algorithm's robustness using a benchmark data set for RGB-D registration and SLAM methods, the TUM RGBD data set (Sturm et al., 2012), together with the evaluation metrics of that data set.
1.1 | Contribution
Given the outlined need for RGB-D data registration in this specific setting, our contribution is as follows:
We introduce a novel registration method (NDT-6D) that successfully registers the collected data and is shown to be more robust to sensory noise than the state-of-the-art registration methods. The method is accompanied by a supplementary code release.
We present evaluation results for the current state-of-the-art registration methods on prototypical agri-robotics RGB-D data collected from a mobile robot in a vineyard setting. We compare these results to the evaluation of the same algorithms on a typical indoor benchmark data set.
We evaluate in detail the contribution of color cues for scan registration in the agricultural setting.
We provide an evaluation methodology that focuses on measures specifically relevant to agri-robotics applications.
Code available here (last accessed Oct 2021): https://github.com/hgupta01/ndt-6d.git
The rest of the paper is structured as follows. First, we provide an overview of the use of RGB-D data in the agri-robotics domain and the challenges in data registration and multi-view analysis. Next, we provide an overview of the current methods of point cloud registration to which we compare our work, as well as some standard notations. In Section 4, we first introduce the empirical data used to evaluate the various registration algorithms, followed by a description of the proposed NDT-6D method. Finally, in the results section, we present and discuss the detailed results obtained from applying the registration algorithms to the mentioned data sets, followed by a short conclusion.
2 | LITERATURE OVERVIEW
2.1 | Visual sensors in agricultural robotics
The most common sensors employed in agricultural robotics operating in field conditions are imaging cameras (Bac et al., 2014; Kamilaris & Prenafeta-Boldú, 2018). Factors such as robustness, low cost, low weight and size, and the fact that humans rely greatly on vision to perform manual crop monitoring and manipulation all contribute to the widespread use of the RGB camera in crop monitoring applications. Detection of the crop, as well as of diseases and pests in field conditions, is most often achieved using a color camera placed facing the foliage (Al-Hiary et al., 2011; Bac et al., 2014; Kamilaris & Prenafeta-Boldú, 2018; Singh & Misra, 2017). The algorithms developed aim to detect abnormalities and foreign objects in imagery data. The main obstacles affecting detection performance directly on foliage are most often high occlusion rates and variable lighting conditions. Proposed solutions include multispectral or hyperspectral cameras (Dale et al., 2013), thermal imaging (Vadivambal & Jayas, 2011), and light-resilient adaptive algorithms (Arad et al., 2019; Zemmour et al., 2017).
Despite the clear advantages of relying on imagery data for most crop monitoring operations, not all field operations can rely solely on color or spectral data. Specifically, in agricultural robotics, the physical dimension and location of the detected crop can be crucial for continuous operation. Popular examples in agri-robotics include operations that require physical manipulation of the plant and therefore localization of the target in world coordinates, such as harvesting (Arad et al., 2020; Bac et al., 2014), weeding (Bakker et al., 2006; Bawden et al., 2017), and pruning (Botterill et al., 2017). Until recently, most commercially available range-measuring sensors combined with RGB cameras failed to provide the sensory data quality necessary for use in outdoor conditions, and therefore technical solutions such as visual servoing (e.g., for harvesting [Arad et al., 2020; Barth et al., 2016]) or the assumption of a constant distance to the target (e.g., for top-down weeding [Tillett et al., 2008]) are often employed. With the recent developments in commercial-grade RGB-D sensors, the acquisition of colored point clouds of acceptable quality is now possible in outdoor conditions as well (Ringdahl et al., 2019; Vit & Shani, 2018).
These capabilities open the door for close sensing applications that monitor the crop's physical size and location. Applications such as growth monitoring, maturity estimation based on physical size, and phenotypic feature extraction have so far been tested mostly in laboratory and postharvest controlled photo chamber conditions (Hacking et al., 2019; Kirk et al., 2020; Nandi et al., 2016). The availability of such sensors is now enabling in-field size-based phenotype acquisition (Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020; Milella et al., 2019; Vit & Shani, 2018). All of these operations require depth sensors.
2.2 | Mapping and data fusion in orchard and vineyard settings
Several recent projects in autonomous monitoring of vineyards and orchards have focused on the need to fuse and aggregate information collected from ground mobile robots in the form of a semantically enriched map. The VineScout autonomous ground vehicle (Fernández-Novales et al., 2021), equipped with an IR sensor, was used to monitor grapevine water status. The information is aggregated into maps of the entire vineyard. More classic simultaneous localization and mapping (SLAM) algorithms were tested in vineyard conditions by the Bacchus project (Papadimitriou et al., 2022) with the aim of generating navigation maps. Wang et al. (2020) generated maps of flower density in apple orchards using a ground robot equipped with RGB and RGBD sensors. Although the semantic data are aggregated into a map using geolocation extracted from the GPS unit, the authors do not register the pointclouds from the RGBD camera but perform single-frame detection.
2.3 | Multi-view and point cloud registration in the agricultural automation domain
Multiple-viewpoint planning for an eye-in-hand robotic configuration or drone field monitoring is a widely discussed issue in agri-robotic vision applications (Barth et al., 2016; Bulanon et al., 2009; Hemming et al., 2014; Kurtser & Edan, 2018a, 2018b; Zaenker et al., 2020, 2021). The discussion often focuses on target visibility, since the high occlusion levels require multiple viewpoints to overcome the problem. Sensor viewpoint planning methods often focus on the need to plan the sensing strategy under time constraints and expected information content. These methods often do not register the point clouds but rather plan the next optimal viewpoint. Attempts to register RGB-D point clouds acquired from on-ground robots in agricultural settings often focus on grasping pose calculation for fruit harvesting (Guo et al., 2020; Lehnert et al., 2016) or growth modeling (Alenya et al., 2011; Chebrolu et al., 2020).
To the best of the authors' knowledge, these applications have so far been tested only in indoor laboratory conditions and do not deal with issues of data registration under challenging illumination, occlusions, and plant density. Point cloud registration in field conditions has so far been implemented exclusively in navigation and mapping applications of the mobile agri-robot, acquiring 3D point clouds using laser scanners and LiDARs, which are more resilient to outdoor illumination conditions (Gao et al., 2018; Shalal et al., 2013). Therefore, registration methods applied so far in field conditions have mostly overlooked the possible added value of color information for registration purposes.
An exception to this is the work of Dong et al. (2020), who performed tree row mapping using registration of pointclouds acquired from an RGB-D camera. To register the pointclouds, the authors proposed a tailor-made algorithm that relies on domain knowledge in the form of semantic constraints, such as the presence of tree trunks and their expected orientation.
In our previous work, we have shown how the acquisition of colored point clouds can be used for both detection (Kurtser, Ringdahl, Rotstein, & Andreasson, 2020a) and volume estimation (Kurtser, Ringdahl, Rotstein, Berenstein, et al., 2020) of grapes in vineyard conditions with the goal of yield prediction. In both previous applications, we employed single-frame non-registered point clouds under the assumption that state-of-the-art registration algorithms generally fail to provide accurate registration results for both the noisy outdoor sensory data acquired from RGB-D cameras and the dense, repetitive, feature-lacking, soft, and dynamic foliage present in the agricultural domain.
In this paper, we aim to challenge this assumption through the evaluation of several state-of-the-art registration algorithms and propose our own registration method.
3 | POINT SET REGISTRATION
In this section, we review the state-of-the-art registration methods that are used in this work. We start by defining the registration problem mathematically and discuss the registration methods briefly.
Registration of two point clouds, a source cloud $\mathcal{P}_s$ and a target cloud $\mathcal{P}_t$, means finding the transformation matrix $T$ that aligns the point clouds. It is an iterative optimization problem in which a registration loss is minimized. The registration problem can be mathematically expressed as in the following equation:

$$T^{*} = \underset{T}{\arg\min}\; \mathcal{L}\left(T(\mathcal{P}_s), \mathcal{P}_t\right) \tag{1}$$

where $T$ is the rigid transformation matrix, $T \in SE(3)$, represented using rotation matrix $R$ and translation vector $t$. $\mathcal{L}$ is the registration loss function and depends on the registration algorithm used.
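The registration problem stated above can be illustrated with a toy example. The following is a minimal sketch, not any of the evaluated methods: it assumes a rotation restricted to the z-axis, a known translation, index-paired points, and a coarse grid search in place of an iterative optimizer. The function names `rigid_transform` and `paired_loss` are hypothetical.

```python
import math

def rigid_transform(points, yaw, t):
    """Rotate (x, y, z) points about the z-axis by `yaw` radians, then translate by `t`."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + t[0], s * x + c * y + t[1], z + t[2]) for x, y, z in points]

def paired_loss(moved_source, target):
    """Stand-in registration loss: sum of squared distances between index-paired points."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(moved_source, target)
    )

# Toy data: the target is the source moved by a known rigid transformation.
source = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 0.5)]
true_yaw, true_t = 0.3, (0.2, -0.1, 0.05)
target = rigid_transform(source, true_yaw, true_t)

# Coarse search over candidate rotations (translation assumed known for brevity).
candidates = [i / 10 for i in range(7)]  # 0.0, 0.1, ..., 0.6 rad
best_yaw = min(
    candidates,
    key=lambda yaw: paired_loss(rigid_transform(source, yaw, true_t), target),
)
print(best_yaw)
```

In practice the correspondences are unknown and the full six-degree-of-freedom transformation is estimated iteratively; the grid search here only makes the "minimize a loss over transformations" idea concrete.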
In this paper, we build upon the well-known and most widely used registration algorithms: iterative closest point (ICP) registration (Korn et al., 2014) and Normal Distribution Transform (NDT) registration (Stoyanov et al., 2012).
Since Besl and McKay (1992) first used the term ICP, several variations have been proposed. According to the review by Pomerleau et al. (2015), ICP algorithms vary mainly in: (1) transformation functions; (2) data filters; and (3) distance functions.
In the agri-robotics application described above, the scans are not expected to scale significantly. As a result, the ICP variations presented focus only on rigid transformation functions, which include translation and rotation changes only. Data filters in the case of point clouds are used to reduce noise through feature enhancement (e.g., calculating point normals, extracting corner or surface points) and feature reduction (e.g., point density reduction, ground removal). Besides the basic point cloud data filters, image-based data filtering methods are also used in this work, as described in Section 4. In this work, we used the Euclidean distance function with point-to-point and point-to-plane distances for ICP point-to-point and ICP point-to-plane registration, respectively.
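As an illustration of the point density reduction mentioned above, the following minimal sketch applies a voxel-grid filter that replaces all points falling in the same voxel with their centroid. This is a common generic filter and is shown only as an assumption about what such a step can look like; the function name `voxel_downsample` and the voxel size are hypothetical and not taken from the evaluated pipeline.

```python
import math
from collections import defaultdict

def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index of the point along each axis.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    # One centroid per occupied voxel, in order of first occurrence.
    return [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in buckets.values()]

points = [(0.25, 0.25, 0.25), (0.75, 0.25, 0.25), (1.5, 0.5, 0.5)]
reduced = voxel_downsample(points, voxel_size=1.0)
print(reduced)  # the first two points share a voxel and are merged
```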
Published variations of NDT algorithms are scarcer and often do not differ significantly in concept from the method described below. Nevertheless, some variations are available (Das & Waslander, 2014; Magnusson et al., 2009; Stoyanov et al., 2012; Valencia et al., 2014). Therefore, we chose the most common ones, which rely on point-to-distribution and distribution-to-distribution distances.
Finally, some recent deep-learning-based registration methods have gained popularity by training networks for feature extraction and registration (Villena-Martinez et al., 2020). These methods appear promising in data-abundant applications but can be expected to require large amounts of data, a common bottleneck in the agricultural robotics domain (Kamilaris & Prenafeta-Boldú, 2018; Kurtser, Ringdahl, Rotstein, & Andreasson, 2020). The designed networks often rely on the same conceptual approach as ICP and NDT of searching key points using images and could be viewed as an extension to the SIFT-ICP method we evaluate.
3.1 | ICP registration
The ICP registration loss function can be defined as the sum of the squared distances between the entities in the source cloud $\mathcal{P}_s$ and the corresponding entities in the target point cloud $\mathcal{P}_t$. Here, the entity could be a point, a plane, or a shape, and the corresponding entity is usually the nearest neighbor of the transformed entity or point in the target point cloud. A detailed description of the ICP registration loss functions is presented by Tavares et al. (2020). The generalized ICP loss function as per Tavares et al. (2020) can be defined as the sum of the distances between the matching features in the point clouds and can be written as in Equation 2.

$$\mathcal{L}_{ICP}(T) = \sum_{i=1}^{|\mathcal{P}_s|} w_i \, d\left(T(p_i), \phi(p_i)\right) \tag{2}$$

where $d(\cdot,\cdot)$ is the distance function, $w_i$ is the optional weight for the entity pair, and $\phi(p_i)$ defines the corresponding entity of the source point cloud entity $p_i \in \mathcal{P}_s$ in the target point cloud $\mathcal{P}_t$. When the entities are points, the loss function is the sum of the Euclidean distances between the corresponding points (Besl & McKay, 1992), and the correspondence is the nearest neighbor based on Euclidean distance. In Chen and Medioni (1992), a point-to-plane correspondence is established, where the loss function is defined such that the distance between a point in the source point cloud and the corresponding plane of points in the target point cloud is reduced along the normal of that plane.
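The two ICP loss variants described above can be sketched as follows. This is a minimal illustration under stated assumptions: brute-force nearest-neighbor correspondences, squared distances, unit weights by default, and target normals supplied by the caller. The function names are hypothetical and the transformation is assumed to have been applied to the source cloud already.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, cloud):
    """Index of the nearest neighbour of p in cloud (brute force)."""
    return min(range(len(cloud)), key=lambda j: sq_dist(p, cloud[j]))

def point_to_point_loss(source, target, weights=None):
    """Weighted sum of squared distances to the nearest target point."""
    weights = weights or [1.0] * len(source)
    return sum(w * sq_dist(p, target[nearest(p, target)]) for w, p in zip(weights, source))

def point_to_plane_loss(source, target, target_normals):
    """Sum of squared distances measured along the normal of the matched target point."""
    total = 0.0
    for p in source:
        j = nearest(p, target)
        q, n = target[j], target_normals[j]
        total += sum((a - b) * c for a, b, c in zip(p, q, n)) ** 2
    return total

# Target: a flat 3 x 3 patch on the plane z = 0 with normals along +z.
target = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
normals = [(0.0, 0.0, 1.0)] * len(target)
# Source: the same patch shifted by 0.25 within the plane and 0.5 off the plane.
source = [(x + 0.25, y, z + 0.5) for x, y, z in target]

print(point_to_point_loss(source, target))           # penalizes both shifts
print(point_to_plane_loss(source, target, normals))  # penalizes only the off-plane shift
```

The point-to-plane loss ignores the in-plane shift, which is why it lets flat surfaces slide along each other during alignment.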
3.2 | NDT registration
NDT registration is a point set registration algorithm that uses NDT maps. The NDT maps are constructed by dividing the point cloud into grids called NDT cells. For each NDT cell, a normal distribution $\mathcal{N}(\mu, \Sigma)$ is calculated using the points that fall in the grid. There are two types of NDT registration algorithms, point-to-distribution (P2D) and distribution-to-distribution (D2D). In NDT registration, the cost function is minimized iteratively with respect to the rigid transformation matrix $T$. The NDT P2D registration cost function between a point cloud $\mathcal{P}_s$ and an NDT map $\mathcal{M}_t$ (where $\mathcal{M}_t$ is the NDT map of point cloud $\mathcal{P}_t$) is defined as the negative likelihood of the points in $\mathcal{P}_s$ belonging to the NDT cells in map $\mathcal{M}_t$. The cost function for NDT P2D registration is given in Equation 3:

$$f_{P2D}(T) = -\sum_{i=1}^{|\mathcal{P}_s|} \sum_{j=1}^{|\mathcal{M}_t|} d_1 \exp\left(-\frac{d_2}{2}\left(T(p_i) - \mu_j\right)^{\top} \Sigma_j^{-1} \left(T(p_i) - \mu_j\right)\right) \tag{3}$$

where $d_1$ and $d_2$ are positive regularization factors mentioned by Magnusson et al. (2007), $i$ iterates over the points in point cloud $\mathcal{P}_s$, and $j$ iterates over the NDT cells of map $\mathcal{M}_t$. To make the cost calculation computationally less expensive, only the parameters ($\mu_j$ and $\Sigma_j$) of the NDT cell closest to the transformed point are used.
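The NDT map construction and the point-to-distribution cost described above can be sketched in a few lines. This is a simplified illustration, not the evaluated implementation: the cell size, the minimum number of points per cell, the covariance regularizer `eps`, and the values of the two scaling factors are assumptions, and only the cell with the closest mean is used for each point.

```python
import math
import random
from collections import defaultdict

def mean_and_cov(pts, eps=1e-3):
    """Sample mean and covariance of 3D points; eps on the diagonal keeps it invertible."""
    n = len(pts)
    mu = [sum(p[k] for p in pts) / n for k in range(3)]
    cov = [[sum((p[a] - mu[a]) * (p[b] - mu[b]) for p in pts) / (n - 1)
            for b in range(3)] for a in range(3)]
    for k in range(3):
        cov[k][k] += eps
    return mu, cov

def inverse_3x3(m):
    """Inverse of a 3 x 3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def build_ndt_map(points, cell_size, min_points=5):
    """Group points into cubic cells and fit one normal distribution per cell."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(c / cell_size) for c in p)].append(p)
    return [(mu, inverse_3x3(cov))
            for mu, cov in (mean_and_cov(pts) for pts in cells.values()
                            if len(pts) >= min_points)]

def p2d_cost(points, ndt, d1=1.0, d2=0.5):
    """Negative sum of Gaussian scores of each point under its closest NDT cell."""
    cost = 0.0
    for p in points:
        mu, inv = min(ndt, key=lambda cell: sum((p[k] - cell[0][k]) ** 2 for k in range(3)))
        diff = [p[k] - mu[k] for k in range(3)]
        maha = sum(diff[a] * inv[a][b] * diff[b] for a in range(3) for b in range(3))
        cost -= d1 * math.exp(-0.5 * d2 * maha)
    return cost

# Toy scene: a noisy planar patch at z = 0.5 covering four 1 m cells.
rng = random.Random(0)
cloud = [(rng.uniform(0, 2), rng.uniform(0, 2), 0.5 + rng.gauss(0, 0.02)) for _ in range(400)]
ndt_map = build_ndt_map(cloud, cell_size=1.0)
aligned = p2d_cost(cloud, ndt_map)
shifted = p2d_cost([(x, y, z + 0.3) for x, y, z in cloud], ndt_map)
print(aligned < shifted)  # the aligned cloud has the lower (better) cost
```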
The NDT D2D registration cost function is defined between two NDT maps (a source map and a target map) and represents the dissimilarity between the maps. There are two types of NDT D2D registration cost function. The first cost function, Equation 4, is defined as the sum of the distances between the NDT cells of the source and target maps (Andreasson & Stoyanov, 2012). The second cost function, Equation 5, is based on fuzzy logic (Liao et al., 2022).
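For orientation, a distance-based D2D objective of the kind described above is commonly written in the NDT literature (e.g., Stoyanov et al., 2012) in the form below. The notation is an assumption made for this sketch and may differ from the paper's own Equation 4: $(\mu_i, \Sigma_i)$ are the cell parameters of the source map, $(\mu_j, \Sigma_j)$ those of the target map, $R$ and $t$ are the rotation and translation of the transformation, and $d_1$, $d_2$ are positive scaling factors.

$$f_{D2D}(R, t) = -\sum_{i=1}^{n_s} \sum_{j=1}^{n_t} d_1 \exp\left(-\frac{d_2}{2}\, \mu_{ij}^{\top} \left(R\,\Sigma_i R^{\top} + \Sigma_j\right)^{-1} \mu_{ij}\right), \qquad \mu_{ij} = R\,\mu_i + t - \mu_j$$

Here $R\,\Sigma_i R^{\top}$ is the covariance of the source cell after rotation, so each term scores how well a transformed source distribution overlaps a target distribution.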
Both ICP and NDT-based registration algorithms are powerful registration tools that have been employed actively in the current literature, but more robust and faster registration algorithms have been introduced recently. A representative example is the TEASER++ algorithm (Yang et al., 2020). This algorithm is specifically designed to provide robust pointcloud registration in the presence of large amounts of outlier correspondences, a condition expected in the outdoor agricultural data set. The registration algorithm utilizes correspondences between points (Fast Point Feature Histograms [FPFH] point features are used for the color point cloud in the original paper and also in this work) and uses a graph-based method of finding the maximum clique to reject most of the outliers. In addition, the registration cost function is decoupled for translation, rotation, and scale estimation and is based on the Truncated Least Squares (TLS) cost, which is robust to a large fraction of outlier correspondences.
The algorithm is also supplemented with easily implementable code, which makes it a great candidate for comparison. Given the complexity of the algorithm and space considerations in this article, we refer the interested reader to the original paper (Yang et al., 2020).
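The robustness property of a truncated least squares cost can be seen in a small numeric example. The sketch below only contrasts the generic TLS form, in which each squared residual is capped at a threshold, with ordinary least squares; it is not the TEASER++ solver, and the residual values and the threshold `c_bar` are made up for illustration.

```python
def least_squares_cost(residuals):
    """Ordinary least squares: every residual contributes its full square."""
    return sum(r * r for r in residuals)

def truncated_least_squares_cost(residuals, c_bar):
    """TLS: each residual contributes at most c_bar**2, bounding the effect of outliers."""
    return sum(min(r * r, c_bar * c_bar) for r in residuals)

# Three inlier residuals and one gross outlier correspondence.
residuals = [0.5, -0.25, 0.25, 8.0]

print(least_squares_cost(residuals))                       # dominated by the outlier
print(truncated_least_squares_cost(residuals, c_bar=1.0))  # outlier contribution capped
```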
3.4 | Introduction of color information
The ICP and NDT methods described above do not use color information, and the cost function is based solely on the geometrical information of the points. With the introduction of RGB-D data, the cost functions can be adjusted either by using the color information of every single point or by using image features to find the correspondences. Korn et al. (2014) find the correspondences between the point clouds using color points and register them using ICP registration. In Huhle et al. (2008), colored NDT cells are used for registration; these are defined as Gaussian mixture models (GMM) in color space with corresponding weighted spatial means and covariances. The color NDT registration is derived from the NDT P2D registration cost function, and the cost is calculated as the weighted negative likelihood of a spatial point in the GMM of the NDT cells, where the weights are likelihoods in the color space. Andreasson and Stoyanov (2012) used SURF image features to find the correspondences between points in two RGB-D frames, with registration done using the NDT D2D registration method. Our method is derived from the work of Korn et al. (2014) and Stoyanov et al. (2012) and introduces a novel approach that utilizes the color and geometric information to find the correspondences between the colored NDT cells. The NDT D2D registration cost function is used for minimizing the distance between correspondences.
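The idea of matching cells using both geometry and color while measuring the residual in 3D only can be sketched as follows. This is an illustrative toy, not the released NDT-6D code: cells are reduced to a mean position and a mean color, colors are assumed to be normalized to [0, 1], the `color_weight` factor is a hypothetical parameter, and a plain squared distance stands in for the D2D cost.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_3d(cell, targets):
    """Correspondence from geometry only."""
    return min(range(len(targets)), key=lambda j: sq_dist(cell["mean"], targets[j]["mean"]))

def nearest_6d(cell, targets, color_weight=1.0):
    """Correspondence from geometry plus (weighted) color distance."""
    def dist6(t):
        return sq_dist(cell["mean"], t["mean"]) + color_weight * sq_dist(cell["color"], t["color"])
    return min(range(len(targets)), key=lambda j: dist6(targets[j]))

def geometric_residual(cell, target):
    """Residual to be minimized: uses the 3D means only, color plays no role here."""
    return sq_dist(cell["mean"], target["mean"])

target_cells = [
    {"mean": (0.0, 0.0, 1.0), "color": (0.1, 0.6, 0.1)},  # greenish cell (e.g., a leaf)
    {"mean": (0.2, 0.0, 1.0), "color": (0.4, 0.1, 0.5)},  # purplish cell (e.g., a grape)
]
source_cell = {"mean": (0.08, 0.0, 1.0), "color": (0.4, 0.1, 0.5)}

j_geo = nearest_3d(source_cell, target_cells)  # geometrically closest: the leaf cell
j_col = nearest_6d(source_cell, target_cells)  # color-aware match: the grape cell
print(j_geo, j_col, geometric_residual(source_cell, target_cells[j_col]))
```

In dense, repetitive foliage a purely geometric nearest neighbor is often the wrong surface; adding color to the correspondence search changes which pairs are formed, while the optimized distance remains geometric.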
A schematic representation of the inclusion of color information in point cloud registration is presented in Figure 1. The pipeline from RGB-D images to the registered point cloud includes getting the point cloud from RGB-D images, filtering the point cloud, getting the correspondences based on either distance or image features, and then performing the registration.
FIGURE 1 Flowchart for the scene registration using RGB-D images. Registration using 3D points and/or color information, and registration using image features and 3D points.
4 | METHODS AND MATERIALS
4.1 | Evaluation data sets
To evaluate the registration methods outlined above, we use two data sets: the GRAPES3D data set (Kurtser & Edan, 2018b) and the TUM RGBD data set (Sturm et al., 2012). The data sets were chosen according to the following criteria: (1) the data set must be acquired by a commercial-grade RGB-D camera; (2) it must include a view towards the target with significant variability in distance to the targets and other objects in the scene; (3) consecutive frames must be acquired with significant overlap (i.e., the acquisition frequency must be reasonable relative to the speed of motion of the camera); and (4) the point cloud and color images should be acquired in a feature-rich environment. The criteria were chosen to match the intended application of the proposed algorithms: point cloud registration for better crop monitoring in orchards, vineyards, and greenhouse conditions.
Both selected data sets were collected using RGB-D cameras to acquire colored point clouds. The GRAPES3D data set represents the vineyard and orchard conditions in which the algorithms are intended to be used. The TUM data set provides valuable benchmark data for deeper insights into algorithm performance in noise-free indoor conditions. The high-quality data of the TUM data set additionally make it possible to compare registration algorithms under artificial noise and thereby gain insights into their stability and robustness.
Unfortunately, the number of RGB-D data sets publicly available for benchmarking is currently limited, and those available do not adhere to the criteria mentioned above. Well-established benchmarks such as SugarBeats 2016 (Chebrolu et al., 2017) and Rosario (Pire et al., 2019) were acquired for the use case of aerial crop monitoring in open fields. While these data sets include a detailed ground truth using GPS-RTK, the acquisition protocol differs significantly. The point clouds are typically acquired from a top-down viewpoint, which occludes illumination disturbances and provides a rather constant distance to the target in field conditions. Additionally, the acquisition is often performed at a low frequency with limited overlap between frames. While the TUM data set was not acquired in agricultural conditions, its acquisition protocol adheres to the criteria mentioned above and provides additional insights. In Appendix B, we provide additional results of applying the algorithms to the SugarBeats 2016 data set (Chebrolu et al., 2017) and further explain the limitations in translating the suggested methods to aerial crop monitoring applications.
4.1.1 | GRAPES3D
FIGURE 2 Outlier points example from the GRAPES3D data set and point cloud example for TUM RGBD. (a) Overflowing points at the edge of an object, (b) wavy surface points for a flat surface, and (c) noise-free point cloud example from the TUM data set.
The RGB-D point clouds were collected using an Intel RealSense D435 camera mounted on the Greenhouse Spraying Robot (GSR) platform in two different conditions: a controlled outdoor setup with potted grape plants and a commercial vineyard setup. The RealSense D435 has a field of view of and an active stereo depth resolution of . The data were collected by teleoperating the GSR robot in a straight line with the camera mounted in two different configurations: (1) facing the growing row at ; and (2) facing the growing row at horizontal angle with respect to the moving direction. The data set contains bag files with color images, depth images, and camera info. The color and depth images are extracted from the bag files and aligned using the rs-convert tool of the librealsense library. These aligned RGB-D images are converted to 3D colored point clouds using the following equation:
Here, the 3D point in the camera frame corresponding to a pair of pixel coordinates is computed from the depth value at those pixel coordinates in the depth image, the scale factor, the pixel center of the camera sensor, and the focal lengths of the camera.
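The conversion described above corresponds to the standard pinhole back-projection, which can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the data set: the function name and argument names are hypothetical, the scale factor is assumed to divide the raw depth value (a multiplicative depth scale would invert this), and a raw depth of zero is assumed to mark a missing measurement.

```python
import numpy as np

def depth_to_colored_points(depth, color, fx, fy, cx, cy, depth_scale):
    """Back-project an aligned depth/color image pair into an (N, 6) array
    of x, y, z, r, g, b rows, expressed in the camera frame.

    depth: (H, W) raw depth image; color: (H, W, 3) aligned color image.
    fx, fy: focal lengths in pixels; cx, cy: pixel center of the sensor.
    depth_scale: assumed divisor turning raw depth into metric depth.
    """
    h, w = depth.shape
    # u indexes pixel columns, v indexes pixel rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64) / depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    valid = z > 0  # assumption: zero depth means no measurement
    xyz = np.stack([x[valid], y[valid], z[valid]], axis=1)
    rgb = color[valid].astype(np.float64)
    return np.hstack([xyz, rgb])

# Tiny 2x2 example with two valid depth pixels.
depth = np.array([[0, 1000], [2000, 0]], dtype=np.uint16)
color = np.zeros((2, 2, 3), dtype=np.uint8)
color[0, 1] = [255, 0, 0]
color[1, 0] = [0, 255, 0]
points = depth_to_colored_points(depth, color, fx=1.0, fy=1.0,
                                 cx=0.5, cy=0.5, depth_scale=1000.0)
```

Each valid pixel keeps its color, so the output can be fed directly to a registration method that uses both geometry and color.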
4.1.2 | TUM data sets
The sequences of the TUM RGBD data set used in this work were collected using a Microsoft Kinect sensor in indoor scenarios such as offices and rooms. A high-accuracy motion-capture system with eight high-speed tracking cameras was used to collect the ground-truth trajectory. The data set includes scans with significantly less noise than expected in scans collected in outdoor agricultural conditions, as seen in Figure 2. As a result, the evaluation using the TUM data set aims to provide insights into our method's robustness to noise. Specifically, we evaluate the following scenarios: (1) noisy blurred images and (2) sparse point clouds. We used the "freiburg1_desk," "freiburg1_room," and "freiburg1_xyz" data sequences, as these are long sequences with features present in each scan for registration methods and SIFT feature matching.
Additionally, due to the availability of ground-truth information in the TUM data sets, we are able to provide registration error measures compared to ground truth, as described in Section 4.4.
4.2 | Preprocessing
In the GRAPES3D data set, the point cloud obtained from an RGB-D image contains many outlier points that need to be removed. Outlier points are points that do not belong to the surface of any object and arise from sensor noise or ambient conditions, as seen in Figure 2. Common causes of noise and outlier points in a point cloud generated with an RGB-D camera are varying viewing angles, light intensities, and reflection properties of the objects, as well as vibration or jerk in the camera position. Since the RGB-D camera is calibrated for near objects, we observed a reduction in depth accuracy with increased object distance. Therefore, the point cloud must first be filtered based on the distance from the camera origin. Other outlier points, such as overflowing points near edges, wavy surfaces, or points due to sensor noise, can be partially removed using analytical methods such as radius outlier removal or statistical outlier rejection. In this work, the radius outlier removal method is used to filter the outliers. According to this method, the points which satisfy the condition in Equation (7) are filtered from the point cloud, where is the neighborhood function that returns the number of points in the radius of point