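Global land cover provides critical baseline information for agricultural production and has been used in the United Nations Millennium Ecosystem Assessment (MA), the Biodiversity Report, and the Global Environment Outlook (GEO) (Meng et al., 2023; Zabel et al., 2019). In particular, the distinctive influence of agricultural systems on climate change shows that cropland management can markedly affect greenhouse gas emissions and biogeochemical cycles (Akpoti, Kabo-bah, & Zwart, 2019; Hatfield et al., 2018; Wang et al., 2022). As an irreplaceable agricultural resource and factor of production, cropland is vital to socio-economic sustainability (Duan et al., 2021). According to World Population Prospects 2022, the global population is projected to exceed 9 billion by 2050, and rising food demand implies that the arable land needed to sustain per capita diets will continue to increase (Fritz et al., 2013; Liang et al., 2023), placing enormous pressure on already overburdened farmland (Duro et al., 2020).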
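Remote sensing has become a powerful tool for collecting information on global and regional land use/cover change (LUCC), and the datasets derived from it are widely used to identify cropland distribution and to determine areas suitable for growing cereal and non-cereal crops (Fritz et al., 2013; Weiss et al., 2020). Over the past decades, multiple continental-scale cropland products have been generated from satellite and aerial imagery and shared freely with the public. However, early land cover mapping typically relied on low-resolution imagery, which limited its application at national or regional levels (Benhammou et al., 2022). Since the early 2000s, finer global land cover data have been part of the international agenda. These mapping products have improved with advances in geophysical instruments and techniques, although this has increased the time required to collect and update high-resolution, real-time ground training samples. In particular, with the opening of the Landsat archive, high-resolution cropland mapping has received growing attention (Gumma et al., 2020; Teluguntla et al., 2018), making it possible to create cropland datasets at 30-m resolution (Hu et al., 2020).
Each product shows acceptable overall accuracy and can be used to detect and document cropland change at different scales (Waldner et al., 2015). However, notable discrepancies among these products have been widely reported (Fritz et al., 2013; Lu et al., 2017; Nabil et al., 2020), especially in transition zones and in areas with fragmented, complex landscapes (Lu et al., 2020). These discrepancies are mainly attributed to differences in sensors, classifiers, acquisition methods, update frequency, subjective interpretation, validation techniques, and ground-truth references (Hua et al., 2018), which hinder reliable comparison and complicate application in specific regions. Compatibility among products has therefore become critical (Zhang et al., 2022). Useful efforts have been made to harmonize these inconsistencies, facilitating performance comparison and validation of the growing number of existing land cover maps (Fritz et al., 2011).
Nevertheless, users still face confusion when choosing a suitable dataset, because they often struggle to find a product that matches the required LUCC level or geographic area of interest (Chen et al., 2019). Moreover, these products usually offer limited detail and a limited number of classes (Weiss et al., 2020).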
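Meanwhile, the cropland area estimated from these products is inconsistent with official statistics because of the unavoidable pixel mixing in remote sensing imagery (Claverie et al., 2013). This discrepancy hinders their uptake in food policy and agricultural economics. Academic studies have shown that differences between satellite-derived estimates and sector-specific statistics stem from differing definitions of cropland (Potapov et al., 2021). Most mapping products tend to overestimate or underestimate cropland area, depending on how mixed types, including mosaics, are counted (Lu et al., 2017). In addition, existing datasets usually emphasize land cover rather than land use, because aerial and satellite observations capture surface cover directly; as a result, satellite-based maps may not adequately capture cropland characteristics associated with land use change (Liu et al., 2023). As a coexisting assemblage, cropland is defined not only by the crops covering the land surface but also by food production linked to human activities (Paz et al., 2020).
Agricultural statistics are usually collected through sample surveys and interviews and then computed by combining these data with administrative records (Fritz et al., 2013). Notably, these sectoral statistics provide valuable practical information that cannot be obtained from Landsat data. However, because they are collected at the level of administrative divisions, they usually lack spatial detail (Nabil et al., 2020).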
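Synergy approaches, which combine existing land cover maps with authoritative statistics, offer an alternative for addressing these gaps. Different existing mapping products can be used to create hybrid cropland percentage maps, which are generally more accurate (Chen et al., 2019; Zhang et al., 2022). Current map synergy algorithms can broadly be divided into regression and agreement scoring methods (Lu et al., 2017). Among them, geographically weighted regression (GWR), a technique with spatially varying regression parameters, has been widely applied to build hybrid maps at the global scale using crowdsourced databases (See et al., 2015). Schepaschenko et al. (2015) integrated forest products from a wide range of sources and used a GWR model to generate a global hybrid forest cover map at 1-km resolution. However, this procedure requires a large number of training samples and can become unstable at low sampling densities (Lu et al., 2020). Agreement scoring methods assign scores according to the level of agreement among the input data and are more commonly used because of their operability and simplicity (Chen et al., 2019).
Using this approach, Jung, Henkel, Herold, and Churkina (2006) and Ramankutty, Evan, Monfreda, and Foley (2008) created a global land cover map for carbon cycle modeling and a global map of pasture area, respectively. However, such studies often overlooked quality differences among the input datasets. To address this, Fritz et al. (2011) ranked the input products and assigned weights based on expert judgment. Traditionally, score assignment is a key part of this process (See et al., 2015), but creating a scoring table can be tedious when handling a large number of input datasets (Gumma et al., 2020).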
of cropland changes. The CLCD datasets have been rigorously compared to state-of-the-art 30-m resolution thematic products, including forest, surface water, and impervious surface area (ISA), to comprehensively assess their properties, but few comparisons have addressed cropland (Zhang et al., 2022). Another noteworthy dataset is the Global Land Cover-FCS30 (GLC_FCS30), a global land cover dynamic monitoring product. It adopts a classification system derived from surface reflectance (SR) images and local adaptive modeling, integrating the Food and Agriculture Organization of the United Nations (FAO) classification system to categorize land cover into 29 classes (Zhang et al., 2021). Globeland30, the world’s first 30-m resolution global land cover dataset, combines multispectral imagery from the US Landsat and the China Disaster Monitoring Constellation (DMC). It employs the Pixel-Object-Knowledge (POK) hierarchical classification method to classify land cover into 10 categories (Brovelli et al., 2015). GlobalCrop, in contrast, uses normalized SR data from Landsat Analysis Ready Data (ARD) as mapping input and generates a probability layer for each pixel via a bagged decision tree ensemble to create a global cropland map (Potapov et al., 2021). Although GlobalCrop is one year apart from the other datasets, careful studies have demonstrated that large-scale land use and land cover changes over such a period are almost negligible compared with classification error. The European Space Agency (ESA) Climate Change Initiative (CCI) project has contributed a suite of satellite-based products that use medium-resolution imaging spectrometer (MERIS) SR time series as input to generate land cover maps, enabling long-term observation and analysis of global land cover dynamics since the 1990s (Zhong et al., 2022).
A unified classification system is a prerequisite for comparing data products from various sources (Hua et al., 2018). Opting for a simplified classification helps mitigate the uncertainty that can arise from the wide variety of detailed land cover types (Nabil et al., 2020). To achieve this, we reclassified the datasets into cropland and non-cropland while retaining the cropland category’s information. We excluded the effects of other classes and selected the relevant types in each dataset for merging, adhering to the FAO definition of cropland. Under this definition, cropland includes arable lands, which encompass areas under temporary agricultural crops, land used for market and kitchen gardens, temporary meadows for mowing or grazing, and temporarily fallow land. Permanent crops encompass long-term crops that are sown above ground and do not require replanting for many years, flower crops that can grow under trees and shrubs, and nurseries. Additionally, cropland-related classes in each dataset were extracted and assigned percentage weights in accordance with their definitions (i.e., mosaic cropland classes received lower weights, while pure cropland classes were given higher percentage weights). This process generated percentage maps of cropland at a 300-m resolution, all in the same coordinate system, for each satellite-based dataset.
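The reclassification and aggregation step can be illustrated with a minimal Python sketch. The class codes and percentage weights below are hypothetical placeholders (the text does not list the values actually used), and the small 2 x 2 block only stands in for the aggregation of fine pixels to the coarser percentage grid.

```python
# Hypothetical legend: class code -> cropland fraction.
# Pure cropland classes get a high weight, mosaic classes a lower one;
# any code not listed is treated as non-cropland (0.0).
ILLUSTRATIVE_WEIGHTS = {10: 1.0, 20: 1.0, 30: 0.6, 40: 0.3}


def to_cropland_fraction(class_grid, weights):
    """Map each land cover class code to its assumed cropland fraction."""
    return [[weights.get(code, 0.0) for code in row] for row in class_grid]


def block_mean(grid, factor):
    """Aggregate a fine grid to a coarser one by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# One coarse cell made of four fine pixels: pure cropland, two mosaics, and forest.
fine_classes = [[10, 30], [40, 210]]
fractions = to_cropland_fraction(fine_classes, ILLUSTRATIVE_WEIGHTS)
percentage_map = block_mean(fractions, factor=2)
```

With real data, the weights would follow each product's legend definitions, and the aggregation factor would follow the ratio between the source resolution and the target grid.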
In Fig. 2, we present the geographical distribution of China’s cropland around 2020, which has been extracted from five existing datasets, along with the training samples obtained from the Second National Land Survey (SNLS) database. SNLS implemented a unified organizational model that combined government coordination, local field surveys, and national quality control. It was established on the foundation of a survey base map that encompassed the entire range of remote sensing images, ensuring consistency between maps, figures, and real-world observations (Zhong et al., 2022). To maintain the accuracy and relevance of geospatial information, the Ministry of Natural Resources has overseen an annual survey of territorial changes and updates to the SNLS results since 2010. These map delineation data were meticulously verified by local experts who relied on both remote sensing images and field surveys (Chen et al., 2023; Liu et al., 2015). The existence of independent operating systems and a substantial financial assistance budget have further bolstered the credibility of accurate, comparable, systematic,
Additionally, the geographic distribution of cropland is regularly constrained by natural factors, such as topography, which complicates remote sensing mapping (Zhang et al., 2022). We therefore further examined the consistency of the datasets with respect to altitude and slope within each landform interval. Altitudes and slopes were calculated from digital elevation models (DEM) derived from the integration of SRTM (Shuttle Radar Topography Mission) processing upgrades, vertical control, void filling, and merging with data unavailable at the time of the original SRTM production (https://www.earthdata.nasa.gov/). The cropland statistics were obtained from the China Statistical Yearbook, published by the National Bureau of Statistics (NBS) (http://www.stats.gov.cn/), which reports the cropland acreage in each province, municipality, and autonomous region and its proportion of the national total. The cropland training samples were sourced from the foundational data of the SNLS, which was collected and compiled by the Ministry of Natural Resources of the People’s
Consistency assessment is a method used to evaluate the reliability of multi-source products in the absence of objective reference criteria. It is typically employed for mutual validation between datasets (Hua et al., 2018). This approach operates under the assumption that the data and methods employed in each land cover dataset are reasonable. Each data product is expected to approximate the true value of land cover to varying degrees. In other words, a significant portion of the data aligns with, or closely represents, the actual situation, while there is also a fraction of misjudged data (Lu et al., 2017). In this approach, each set of products is treated as an expert judgment of the actual status. The agreement level between the judgments of different datasets concerning the land cover type or quantitative characteristics of a given spatial unit or period is used as an indicator to assess the likelihood of the data’s reliability (Chen et al., 2019). The consistency ratio (CR) is defined as follows:
$$CR_{j}=\frac{M_{j}}{\frac{1}{k} \sum_{j=1}^{k} N_{j}} \times 100\%$$
where $M_{j}$ denotes the number of pixels of product $j$ whose cover type is cropland at the same position; $N_{j}$ denotes the number of cropland pixels in product $j$; and $k$ denotes the number of cropland products. The higher the consistency, the more likely the data product is to be robust (Fritz et al., 2011).
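As a worked illustration of the consistency ratio, the sketch below uses plain Python lists as co-registered binary cropland maps. It assumes one particular reading of the definition, namely that the numerator counts pixels labeled cropland at the same position by every product; the definition in the text is terse, so this reading, the function name, and the toy maps are all assumptions.

```python
def consistency_ratio(binary_maps):
    """Consistency ratio (%) for co-registered binary maps (1 = cropland).

    Assumed reading: the numerator is the number of positions that all
    products label as cropland; the denominator is the mean cropland
    pixel count over the k products.
    """
    k = len(binary_maps)
    cropland_counts = [sum(m) for m in binary_maps]           # N_j for each product
    common = sum(1 for px in zip(*binary_maps) if all(px))    # agreed cropland pixels
    mean_count = sum(cropland_counts) / k
    return 100.0 * common / mean_count


# Three toy products over four pixel positions.
maps = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
cr = consistency_ratio(maps)  # 2 agreed pixels / mean of (3, 2, 4) pixels
```

Here two positions are cropland in all three maps and the mean cropland count is 3, so the ratio is about 66.7%.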
To evaluate the agreement and discrepancies between the cropland products and the survey data, we computed the Root Mean Square Error (RMSE) for each dataset compared to the official statistics based on cropland proportion. Simultaneously, we conducted a correlation analysis between the datasets and the survey data. The RMSE and correlation coefficient ($R$) were calculated as follows (Pérez-Hoyos et al., 2017):
$$
\begin{aligned}
& \mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}{n}} \\
& R=\frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2} \cdot \sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}}
\end{aligned}
$$
where $x_{i}$ and $\bar{x}$ are the area ratio of unit $i$ and the average area ratio of all units computed in the cropland datasets; $y_{i}$ and $\bar{y}$ are the statistical area ratio of unit $i$ and the average of the acreage ratio of the statistics, respectively; and $n$ denotes the number of units. The larger the RMSE, the higher the dispersion, while a larger $R$ indicates a higher goodness of fit (Claverie et al., 2013).
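The two metrics can be written out directly from the formulas above. This is a minimal standard-library sketch; the function names and the toy ratios are illustrative and not taken from the study.

```python
import math


def rmse(x, y):
    """Root mean square error between estimated (x) and statistical (y) ratios."""
    n = len(x)
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / n)


def correlation(x, y):
    """Pearson correlation coefficient R between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    spread_x = sum((xi - mean_x) ** 2 for xi in x)
    spread_y = sum((yi - mean_y) ** 2 for yi in y)
    return covariance / math.sqrt(spread_x * spread_y)


# Toy cropland ratios (%) for three units: dataset estimate vs. statistics.
estimated = [1.0, 2.0, 3.0]
statistics = [2.0, 4.0, 6.0]
error = rmse(estimated, statistics)
r_value = correlation(estimated, statistics)
```

The toy data are perfectly correlated yet biased, which shows why the two metrics are reported together: $R$ alone would hide the systematic underestimation that the RMSE exposes.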
2.3.2. Confusion matrix
The confusion matrix, one of the most commonly employed evaluation methods in satellite mapping, serves as a fundamental metric for assessing the accuracy of different products (Brovelli et al., 2015). The evaluation of classification accuracy in China is based on the confusion matrix of five sets of global land cover data and test samples, resulting in the computation of the Kappa coefficient, overall accuracy ($OA$), omission error, and commission error.
$$OA=\frac{\sum_{i=1}^{n} x_{ii}}{N}$$
$$\mathrm{Kappa}=\frac{N \sum_{i=1}^{n} x_{ii}-\sum_{i=1}^{n} x_{i} x_{j}}{N^{2}-\sum_{i=1}^{n} x_{i} x_{j}}$$
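where $N$ denotes the total number of participating samples; $n$ is the dimension of the confusion matrix; $x_{ii}$ is the number of samples on the diagonal; and $x_{i}$ and $x_{j}$ denote the total number of samples in row $i$ and column $j$, respectively (in each term of the sum, the row and column totals of the same class are multiplied, i.e., $j=i$).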
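A short sketch of these accuracy measures follows, using only the Python standard library. The row and column convention (rows as reference samples, columns as mapped classes) and the toy matrix are assumptions made for illustration; omission and commission errors are computed in their usual sense under that convention.

```python
def accuracy_metrics(matrix):
    """Return (OA, Kappa, omission errors, commission errors) for a square matrix.

    Assumed layout: rows are reference (sample) classes, columns are map classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)                        # N
    diagonal = sum(matrix[i][i] for i in range(n))                 # sum of x_ii
    row_totals = [sum(matrix[i]) for i in range(n)]                # x_i
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # x_j
    chance = sum(row_totals[i] * col_totals[i] for i in range(n))

    overall_accuracy = diagonal / total
    kappa = (total * diagonal - chance) / (total ** 2 - chance)
    omission = [1 - matrix[i][i] / row_totals[i] for i in range(n)]
    commission = [1 - matrix[i][i] / col_totals[i] for i in range(n)]
    return overall_accuracy, kappa, omission, commission


# Toy 2 x 2 matrix: class 0 = cropland, class 1 = non-cropland.
toy_matrix = [
    [40, 10],
    [5, 45],
]
oa, kappa, omission, commission = accuracy_metrics(toy_matrix)
```

For this toy matrix, 85 of 100 samples lie on the diagonal, and the chance-agreement term is 50 × 45 + 50 × 55 = 5000, so Kappa is (8500 − 5000) / (10000 − 5000) = 0.7.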
The agreement level plays a critical role in the integration of various remote sensing maps to develop an enriched dataset (See et al., 2015). Initially, subnational statistics were employed to assess accuracy and establish weights for the input cropland maps. It is worth noting that the quality of the input products being evaluated can significantly affect the confidence of the synergy. Subsequently, agreement ranking scores were determined based on this accuracy and agreement. For each input product, the cropland acreage in each unit is calculated as follows:
$$a_{i, j}=\sum_{m=1}^{N}\left(S_{m} \times P_{m}\right)$$
where $a_{i, j}$ denotes the cropland acreage of unit $j$ estimated by input product $i$; $P_{m}$ denotes the cropland percentage in pixel $m$ after data processing; $S_{m}$ denotes the pixel area computed by equal-area projection; and $m$ denotes a pixel labeled as cropland. In addition, the absolute difference $AD_{i, j}$ between the statistics and the cropland acreage estimated from input product $i$ is computed to evaluate the accuracy of the input maps:
$$AD_{i, j}=\operatorname{abs}\left(\frac{a_{\mathrm{NBS}, j}-a_{i, j}}{a_{\mathrm{NBS}, j}}\right)$$
where $a_{\mathrm{NBS}, j}$ is the cropland acreage statistic of unit $j$ derived from NBS. A lower value of $AD_{i, j}$ signifies closer agreement with the official statistics and a higher ranking for the input map.
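The acreage and absolute-difference calculations, and the ranking that follows from them, can be sketched as below. The product labels, acreage figures, and function names are hypothetical; cropland percentages are expressed as fractions between 0 and 1.

```python
def cropland_acreage(pixel_areas, cropland_fractions):
    """Acreage of one unit: sum of pixel area times cropland fraction."""
    return sum(s * p for s, p in zip(pixel_areas, cropland_fractions))


def absolute_difference(statistic, estimate):
    """Relative absolute difference between the statistic and an estimate."""
    return abs((statistic - estimate) / statistic)


def rank_products(estimates, statistic):
    """Order product names from lowest to highest absolute difference."""
    return sorted(estimates, key=lambda name: absolute_difference(statistic, estimates[name]))


# Three 300-m pixels of 9 ha each with different cropland fractions.
unit_acreage = cropland_acreage([9.0, 9.0, 9.0], [1.0, 0.5, 0.25])

# Hypothetical acreage estimates (ha) for one unit from three products.
estimates = {"product_1": 95.0, "product_2": 120.0, "product_3": 101.0}
ranking = rank_products(estimates, statistic=100.0)
```

In this toy case the differences are 0.05, 0.20, and 0.01, so `product_3` ranks first and `product_2` last for that unit.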
Agreement-ranking scores are generated using tables that depict the agreement and ranking of the input products. When working with five input products, they are typically labeled as A, B, C, D, and E, ranked from highest to lowest. Agreement levels, ranging from 0 to 5, represent the number of input products that identify a pixel as cropland. Since there are 32 possible combinations ($2^{5}=32$) for five input products, the scores range from 0 to 31 (Table 3). A higher score indicates a higher likelihood of a pixel being cropland. An agreement level of 5 signifies that all input products classify the pixel as cropland, giving the pixel the highest score of 31. Conversely, an agreement level of 0 indicates that all the products categorize the pixel as non-cropland, resulting in the lowest score of 0. For other agreement levels, there are various combinations of products. For instance, when the agreement level is 3, there are ten possible combinations with score values ranging from 16 to 25. Because A, B, and C have the highest rankings, the score value is set to 25 if these three indicate cropland, which is higher than for the other combinations at that level. By following these guidelines, values for a complete scoring table with five input products were determined, and these values were subsequently used to transform the input cropland layers into an agreement-ranking map.
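One way to generate a scoring table with these properties is sketched below. Combinations are ordered first by agreement level and then, within a level, so that combinations involving higher-ranked products score higher. The within-level tie-break used here (lexicographic order with product A as the most significant position) is an assumption, since Table 3 is not reproduced in this section; it does reproduce the stated values of 0, 31, the 16 to 25 range at level 3, and 25 for the A, B, C combination.

```python
from itertools import product


def build_score_table(n_products=5):
    """Map each cropland/non-cropland combination to an agreement-ranking score.

    Each key is a tuple of 0/1 flags ordered from the highest-ranked
    product (A) to the lowest-ranked one. Sorting by (agreement level,
    flags) is an assumed tie-break, not necessarily the published table.
    """
    combos = list(product((0, 1), repeat=n_products))
    combos.sort(key=lambda flags: (sum(flags), flags))
    return {flags: score for score, flags in enumerate(combos)}


score_table = build_score_table()
all_agree = score_table[(1, 1, 1, 1, 1)]      # agreement level 5
none_agree = score_table[(0, 0, 0, 0, 0)]     # agreement level 0
abc_only = score_table[(1, 1, 1, 0, 0)]       # A, B, C say cropland
level_three = sorted(s for flags, s in score_table.items() if sum(flags) == 3)
```

Generating the table programmatically keeps it consistent when the number of inputs changes, since the table size grows as $2^{n}$.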
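Fig. 3 shows the flowchart of statistical allocation for the five input products. Initially, pixels with the highest score of 31 were selected, and their total acreage was calculated:
$$A_{31}=\sum\left(S_{31, m} \times P_{31, m}\right)$$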
where $P_{31, m}$ and $S_{31, m}$ are the average cropland percentage and the pixel area of pixel $m$ with a score of 31. If this acreage was far less than the statistics, cropland pixels with the second-highest agreement rank, i.e., a score of 30, were selected, and the total acreage was then computed. The cumulative cropland acreage with scores of 30 and above was compared with the statistics. Pixels with scores of 31 and 30 were designated as cropland pixels if the cumulative acreage was close to the statistics; otherwise, pixels with lower scores were included until the accumulated acreage matched the statistics. As sketched in Fig. 3, pixels with score values ranging from 29 to 31 were considered cropland when the cumulative acreage down to a score of 29 was the closest to the statistics. Overall, the scoring values signaled the agreement among input products, reflecting the confidence level of cropland pixels. To bring the scores to a common scale, a normalization process was adopted, resulting in confidence levels ranging from 0% to 100%.
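The allocation procedure can be sketched as a cumulative search from the highest score downward. This is a simplified illustration under stated assumptions: the stopping rule picks the score threshold whose cumulative acreage is closest to the statistic, the normalization is taken to be a linear rescaling of the score (the text does not specify the formula), and the pixel values are invented.

```python
def allocate_by_statistics(pixels, target_area, max_score=31):
    """Find the score threshold whose cumulative acreage best matches the statistic.

    pixels: iterable of (score, pixel_area, cropland_fraction) tuples.
    Returns (threshold_score, cumulative_acreage_at_threshold).
    """
    area_by_score = {}
    for score, area, fraction in pixels:
        area_by_score[score] = area_by_score.get(score, 0.0) + area * fraction

    best_score, best_area, best_gap = None, 0.0, float("inf")
    cumulative = 0.0
    for score in range(max_score, 0, -1):      # score 0 means no product maps cropland
        cumulative += area_by_score.get(score, 0.0)
        gap = abs(cumulative - target_area)
        if gap < best_gap:
            best_score, best_area, best_gap = score, cumulative, gap
        if cumulative >= target_area:          # adding more pixels only widens the gap
            break
    return best_score, best_area


def confidence_percent(score, max_score=31):
    """Assumed linear normalization of a score to a 0-100% confidence level."""
    return 100.0 * score / max_score


# Invented pixels: (score, area in ha, cropland fraction).
pixels = (
    [(31, 10.0, 1.0)] * 5
    + [(30, 10.0, 0.75)] * 4
    + [(29, 10.0, 0.5)] * 3
    + [(28, 10.0, 1.0)] * 4
)
threshold, allocated_area = allocate_by_statistics(pixels, target_area=100.0)
```

With a statistic of 100 ha, the cumulative acreage is 50, 80, 95, and 135 ha for thresholds 31, 30, 29, and 28, so pixels scoring 29 and above are kept, mirroring the case described for Fig. 3.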
Based on the physical geographic zoning of the CAS, the study area was divided into seven parts, as shown in Fig. 4, to investigate regional disparities in the consistency of cropland datasets. Of these, the Northeast exhibited the highest percentage of complete agreement (65%), followed by East and Central China, both with rates exceeding 50%, whereas South and Southwest China had a lower agreement percentage, at around 20% (Fig. 4$a_{1}$, $b_{1}$, and $c_{1}$). Horizontally, the Northeast accounted for approximately 30% of the total, followed by East China, which represented more than 20%, and South China, which contributed less than 5%. Furthermore, the spatially consistent fractions at the provincial level varied across the years (Fig. 4$a_{2}$, $b_{2}$, and $c_{2}$). In 2000, Heilongjiang, Henan, and Shandong provinces exhibited the highest agreement, constituting about 70% of the total, while Tibet, Guizhou, and Fujian provinces had the lowest agreement, comprising less than 10%. By 2020, Heilongjiang and Jilin had the highest agreement, though their share had decreased. The spectral and textural features of cropland in satellite images were challenging to differentiate in regions
Fig. 4. Consistency distribution of the input mapping datasets across regions of China.
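with irregular terrain, fragmented landscapes, and cropland interspersed with other land features, mainly located in the southern regions (Lu et al., 2017). Consequently, the classification of remote sensing imagery in these areas proved complex, leading to lower agreement among these cropland products.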
Fig. 4d illustrates the variation in spatial consistency concerning elevation and slope for the five sets of cropland data. The ratio of high and full agreement was more pronounced in plains with elevations below 20 m and hilly areas between 20 and 200 m, indicating strong consistency among the datasets in these regions. Consistency decreased with increasing altitude, with elevations ranging from 500 to 1500 m primarily found on the Mongolian Plateau, Tarim Basin, Loess Plateau, and Yunnan-Guizhou Plateau. These areas were characterized by mountainous terrain and fragmented topography, making cropland extraction challenging and resulting in a high level of inconsistency (37.5%). Higher altitudes above 1500 m were mainly situated in the northwestern Tibetan Plateau, exhibiting an inconsistency rate of up to 50.6%. Similarly, slopes of less than 2° were primarily scattered across flat plains and basins with relatively simple geographical landscapes, suitable for agricultural cultivation. This led to better agreement in cropland extraction, with 20.1% and 44.2% showing high and complete agreement. In the slope range of 2-6°, the proportion of inconsistency increased to 22.9%, while complete consistency decreased to 19.4%. Slopes of 15-25° and above were chiefly distributed in the Qinghai-Tibet Plateau and its periphery, with inconsistency accounting for 58.5% and 74.0%, respectively. Overall, the accuracy of land cover classification in
the Southwest, Northwest, and South China, where terrain slope was significant, was heavily influenced by relief and roughness. However, when relief and roughness reached a certain level, the land cover type became relatively simple and mainly non-cropland due to its unsuitability for human exploitation (Zhang et al., 2015). This stability in land cover classification agreement resulted from the land’s limited suitability for agricultural use.
3.2. Accuracy assessment
Fig. 5a shows the deviation between the cropland acreage retrieved from the five datasets and the statistics. The estimated cropland ratio differed among data products, and most products overestimated or underestimated cropland to varying degrees. Specifically, the CLCD underestimated cropland acreage in the Northeast, Northwest, and North China by 4.0%, 2.1%, and 2.0%, respectively, while the estimates for Central and South China were relatively consistent. GLC_FCS30 also underestimated cropland in Northeast and North China by 4.4% and 2.8% but overestimated it by 3.3%, 2.1%, and 2.6% in East, Central, and South China, while broadly aligning with the statistics for the Northwest. Globeland30 exhibited a bias of approximately 5.9% and 3.1% in its estimates of cropland in Northeast and North China, but was in line with the statistics for Central China. GlobalCrop overestimated cropland in Northeast and East China by 2.1% and 2.6% while underestimating it in Southwest and South China. In East, Central, and South China, ESA_CCI’s estimated cropland area was 3.8%, 2.0%, and 2.6% higher than the statistical data, while there was a significant underestimation in Northeast and North China, with biases of 3.5% and 4.5%, respectively. Fig. 5b presents the overall accuracy estimates, using an error matrix based on training samples in each district. In North and Northeast China, GlobalCrop achieved the highest overall accuracy (84.0% and 85.1%). In Central and South China, CLCD achieved the highest accuracy (89.5% and 87.7%), and Globeland30 performed exceptionally well in the northwest with an overall accuracy of 86.1%. These input mapping datasets were ranked in each district to create the corresponding scoring sheet based on overall accuracy.
Fig. 7 presents the relationship between the percentage of cropland acreage derived from agricultural statistics and the estimates obtained from the province-by-province cropland datasets. A comparison with the individual cropland datasets reveals that the synergy maps exhibit the highest level of agreement with the statistics, with the lowest RMSE (0.21%) and a high coefficient of determination ($R^{2}$) of 0.97. It is worth noting that CLCD achieved a higher $R^{2}$ (0.93) and a lower RMSE (0.68%) than the other four datasets. This could be attributed to the fact that CLCD’s classification results are more closely tailored to China than those of global land cover mappings. In contrast, the data points for GlobalCrop noticeably deviate from the 1:1 line, with a higher RMSE (0.81%) and a lower $R^{2}$ (0.86). This divergence may be attributed to GlobalCrop’s definition of cropland, which includes various mosaic types. For instance, GlobalCrop encompasses land used for planting year-round and perennial herbaceous crops for feed, biofuels, and human consumption (Potapov et al., 2021). The presence of mixed pixels resulting from fragmented landscape patches leads to variations in classifications and discrepancies in cropland area estimates across different products and in comparison to the statistics. In summary, the synergy maps exhibit better agreement with the statistics, demonstrating the effectiveness of this approach in tackling inconsistencies between satellite-based cropland maps and departmental statistics.
Table 4 lists the accuracy evaluation results for the five cropland datasets and their synergy maps within the training sample zone. GlobalCrop exhibited the highest spatial location reliability among these input datasets, achieving a Kappa coefficient of 0.64 and an overall accuracy of 87.7%. CLCD and GlobeLand30 followed closely with overall accuracies of 86.1% and 85.6%, and Kappa coefficients of 0.64 and 0.63, respectively. In contrast, ESA_CCI and GLC_FCS30 demonstrated lower positional reliability, with overall accuracies of 84.1% and 84.0%, and Kappa coefficients of 0.59 and 0.60, respectively. Specifically, CLCD exhibited a misclassification rate of 34.6% and an omission
Fig. 8. Spatial distribution of omitted and misclassified cropland within the sample area.
predominantly in southwest China (Fig. 8c). GlobalCrop exhibited the lowest rate of cropland misclassification but a higher rate of omission, mostly in western China (Fig. 8d). Additionally, ESA_CCI had a cropland commission rate of 38.2% and a cropland omission rate of 21.6%
Fig. 9. Spatiotemporal dynamics of cropland distribution in China, 2000-2020.
validation samples (Fig. 8f), and the validation results are presented in Table 4. The overall accuracy of the synergistic maps reached 93.3%, with a Kappa coefficient of 0.82. The commission rate and omission rate for cropland were 12.9% and 7.8%, both lower than those observed for the individual datasets mentioned earlier. This indicates that the synergistic approach can harness multiple datasets to create more accurate hybrid maps. The enhanced overall accuracy of SOSA reflects the feasibility and traceability of the fusion method employed in these experiments and signifies the successful integration of multi-source cropland data into a newly refined dataset with scientifically improved features.
3.5. Spatiotemporal dynamics of cropland distribution in China
Spatial analysis of input products and their synergy maps was employed to discern the evolving features and patterns of China’s cropland at the raster level (Fig. 9). The overall cropland distribution has remained relatively stable over the past two decades, with dynamics characterized by regional variations and subtle shifts. On a macroscopic scale, the center of cropland gravity has shifted towards the northwest and northeast, accelerated by the decline of high-quality arable land in the south, where water and temperature conditions are better suited for other land uses. On a micro scale, this shift is reflected in the reallocation of cropland resources between urban and rural areas, as well as the reclamation of sloped wasteland to expand cropland area. The total cropland area continued to decline rapidly until 2005, with an annual decrease of 278,000 ha. Factors contributing to this decline included ecological reclamation, land occupation for construction, damage from natural disasters, and agricultural restructuring (Duan et al., 2021). Subsequently, the central government implemented a series of stringent measures and protection systems. High-quality cropland was established, and local governments were held responsible for maintaining the cropland area and the protected basic farmland within their administrative districts, as outlined in the current territorial spatial planning. Specific policies included the demarcation of redlines for cropland, the abolition of agricultural taxes, and the establishment of a permanent mechanism for the protection of basic farmland (Tian et al., 2021). These measures have to some extent effectively curbed the trend of cropland reduction resulting from land misuse or unlawful appropriation. However, the decrease in cropland area between 2010 and 2020, despite stringent countermeasures against non-agricultural land encroachment, was due to land greening efforts and agricultural restructuring (Hu et al., 2020). 
With the deepening of conservation policies, continuous improvements and enhancements in land management models, methods, and technical measures, and the rigorous implementation of protection systems such as cropland balance (Zhong et al., 2022), China’s total cropland area has generally maintained a dynamic equilibrium.
Regarding the shift of cropland within each province (Fig. 10), it was observed that 70.9% of provinces experienced a decrease in cropland area during the study period. Notably, Shandong, Henan, Jiangsu, Sichuan, and Hebei provinces exhibited a significant decline. These provinces are characterized by extensive agriculture and high population density, but they face challenges such as low per capita cropland availability and limited reserve resources. Recent urban expansion has resulted in increased agricultural land expropriation, intensifying the risk of cropland loss. Furthermore, anthropogenic cropland abandonment, ecological reclamation efforts, and disaster-related damage have compounded the difficulties of preserving local cropland (Zeng et al., 2023). In contrast, Xinjiang, Heilongjiang, Inner Mongolia, Shanxi, and Ningxia provinces witnessed a substantial increase in cropland area. Xinjiang, in particular, added 3.19 million hectares of cropland, which accounted for approximately 40% of the country’s total increase. Inner Mongolia and Heilongjiang also contributed significantly, with more than 25% and 15% of the total increase, respectively. These provinces, with significant influence over China’s cropland changes, have gradually achieved a balance between reducing and expanding cropland. This has been made possible through improved irrigation and water conservation infrastructure, increased investment in agricultural technology, large-scale land revitalization efforts, and pilot programs for cropland rotation. These initiatives have played a pivotal role in maintaining overall cropland stability across the country.
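Significantly, scoring assignment played a pivotal role in determining combinations of remotely sensed map products. SOSA streamlined the scoring method, eliminating the need for training samples, which sets it apart from other harmonized methods. Previous approaches involved the creation of extensive static score sheets to distinguish cropland from non-cropland, with $2^{n}$ possible score combinations for $n$ input products, making the process time-consuming and labor-intensive (Fritz et al., 2013). In contrast, SOSA identified the optimal agreement level and built a dynamic scoring table for each unit. The scoring table is discrete, and its allocation rules can be adapted to a variety of cropland mapping applications (Lu et al., 2017). The optimal agreement level for most provincial units ranged from 2 to 4, which formed the basis for constructing the scoring table, substantially reducing its size and improving efficiency. Although some web-based solutions and online platforms that crowdsource samples, such as LACO-Wiki (https://laco-wiki.net/), Collect Earth (https://www.collect.earth/), and Geo-Wiki (https://www.geo-wiki.org/), have accumulated large numbers of ground-truth samples, assessing their reliability remains a challenging task (Schepaschenko et al., 2015; See et al., 2015). More importantly, issues related to sample quality and uncertainty cannot be ignored, because these samples are mainly collected by volunteers (Chen et al., 2019), and even where validation methods are well documented, applying them requires considerable expertise.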
Our study combines existing global land cover products synergistically to enhance the accuracy of Chinese cropland estimates. This approach is well-suited to reduce local discrepancies between remote sensing data and national cropland statistics. We introduce a new calibration method based on remote sensing images and statistics and validate the resulting cropland map. This method has significant potential to streamline the process of achieving dataset synergies compared to conventional scoring assignment methods. It is also adaptable, allowing for the seamless integration of new products as they become available. Initial evidence and comparisons demonstrate that the synergistic maps exhibit higher accuracy and better statistical consistency than the original individual mapping products. Furthermore, this approach can be extended beyond its current focus on cropland mapping to cover entire continents and even global scales. It is applicable to a wide variety of land cover types, including forests, grasslands, and water bodies. However, our evaluation also suggests that there is still room for improvement, as the method described here heavily relies on the accuracy and coverage of the input layer, as well as the reliability of sub-national statistics. This mapping is expected to be further refined as more high-quality data becomes available in the foreseeable future.
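Akpoti, K., Kabo-bah, A., & Zwart, S. (2019). Agricultural land suitability analysis: State-of-the-art and outlooks for integration of climate change analysis. Agricultural Systems, 173, 172-208.
Becker-Reshef, I., Justice, C., Barker, B., Humber, M., Rembold, F., Bonifacio, R., Zappacosta, M., Budde, M., Magadzire, T., & Shitote, C. (2020). Strengthening agricultural decisions in countries at risk of food insecurity: The GEOGLAM Crop Monitor for Early Warning. Remote Sensing of Environment, 237, Article 111553.
Benhammou, Y., Alcaraz-Segura, D., Guirado, E., Khaldi, R., Achchab, B., Herrera, F., & Tabik, S. (2022). Sentinel2GlobalLULC: A Sentinel-2 RGB image tile dataset for global land use/cover mapping with deep learning. Scientific Data, 9, 681.
Brovelli, M., Molinari, M., Hussein, E., Chen, J., & Li, R. (2015). The first comprehensive accuracy assessment of GlobeLand30 at a national level: Methodology and results. Remote Sensing, 7, 4197-4212.
Chen, D., Lu, M., Zhou, Q., Xiao, J., Ru, Y., Wei, Y., & Wu, W. (2019). Comparison of two synergy approaches for hybrid cropland mapping. Remote Sensing, 11(3), 213.
Duro, J., Lauk, C., Kastner, T., Erb, K., & Haberl, H. (2020). Global inequalities in food consumption, cropland demand and land-use efficiency: A decomposition analysis. Global Environmental Change, 64, Article 102124.
Fritz, S., See, L., You, L., Justice, C., Becker-Reshef, I., Bydekerke, L., Cumani, R., Defourny, P., Erb, K., Foley, J., & Gilliams, S. (2013). The need for improved maps of global cropland. Eos, Transactions American Geophysical Union, 94, 31-32.
Fritz, S., You, L., Bun, A., See, L., McCallum, I., Schill, C., Perger, C., Liu, J., Hansen, M., & Obersteiner, M. (2011). Cropland for sub-Saharan Africa: A synergistic approach using five land cover data sets. Geophysical Research Letters, 38, Article L04404.
Gumma, M., Thenkabail, P., Teluguntla, P., Oliphant, A., Xiong, J., Giri, C., Pyla, V., Dixit, S., & Whitbread, A. (2020). Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud. GIScience and Remote Sensing, 57, 302-322.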