
Meta-learning strategy based on user preferences and a machine recommendation system for real-time cooling load and COP forecasting

Wenqiang, Guangcai Gong, Houhua Fan, Pei Peng, Liang Chun
a School of Civil Engineering, Hunan Univ., Changsha 410082, China
Research and Design Center, Hubei Huayu Hi-Tech Architectural Design Consulting Co., Ltd., Yichang Three Gorges Branch, Yichang 443000, China
Hunan Tianyu Energy Technology Co., Ltd., Changsha 410082, China

HIGHLIGHTS

  • A new meta-learning recommendation system is proposed.
  • The new system considers two-stage (subjective and objective) user preferences.
  • Multi-objective decision-making algorithms (MODMA) are used in option optimization.
  • A new "walking slide method" aimed at some extremely special cases is proposed.
  • The new system is validated on real buildings, and its generalizability is demonstrated.

ARTICLE INFO

Keywords:

Cooling load prediction
Meta-learning
Artificial neural network (ANN)
Recommendation system
User preferences

Abstract

Building data forecasting plays an increasingly important role in building energy savings. However, a one-fits-all model cannot satisfy all the requirements of multiple application scenarios and user preferences. Motivated by the need to bridge the research gap between different user preferences (application scenarios) and energy prediction model recommendation systems, this paper proposes a novel meta-learning strategy based on an artificial neural network recommendation system. This strategy is employed for real-time cooling load and coefficient of performance (COP) prediction and for optimal prediction model recommendation. The data set is composed of 40 cases from five factory buildings. After the predictions and recommendations are obtained for all cases, the two-stage user preferences are considered based on multi-objective decision-making algorithms. Then, a new model, termed the "walking slide method", is proposed to predict some special cases. This study shows that the seasonal autoregressive integrated moving average model and the random forest model achieve the best prediction accuracy and the minimum computation cost, respectively, for most cases, while the long short-term memory model is the best when both criteria are considered. The variances between the different cases lead to a lower cross-validation score (approximately 65%) but a higher success rate (over 99%) for the recommendation performance. In addition, more complex application scenarios yield a lower prediction accuracy and a lower recommendation success rate. In most cases, the use of prediction combined with a monitoring system is the best choice. Last, the reliability of the results is verified by application studies. This work provides a scientific basis for energy prediction applications based on user preferences.

1. Introduction

The opposing demands of increasing energy consumption and decreasing total energy storage require more detailed management and use of energy [1]. Energy usage predictions are of great significance for energy savings in buildings, which have become one of the largest energy consumers in the world [2], especially buildings with large energy consumption (e.g., factories). In heating, ventilation, and air conditioning (HVAC) systems, energy consumption prediction can address the time delay between the air-conditioning cooling (heating) load and heat extraction by allowing HVAC equipment to respond in advance, which reduces the maximum air-conditioning load demand. Additionally, energy consumption prediction is an important tool for fault detection, diagnosis, and energy control optimization. As a result, many prediction models have been proposed, including meta-learning strategies that automatically learn to select optimal prediction methods. In these applications, user preferences are expected to be very important, because a good optimization strategy is based on the actual situations of the users. The remainder of this section reviews load prediction models and user preferences.
In general, the modeling methods for short-term building energy prediction can be broadly categorized into three types: physical, statistical, and data-driven models. Physical models (e.g., EnergyPlus [3], TRNSYS [4], ESP-r [5]) can also be thought of as "white-box" models, because they require clear physical principles and a large amount of detailed building information (such as the thermophysical parameters of envelopes and building dimensions). Physical models have the advantage of a clear calculation process but often require too much relevant information; hence, they are very difficult to use in building energy management (BEM) systems. In addition, there is another type of physically based "variant" model [6], whose parameters are adjusted using the available measured (monitored) data. Statistical models are mainly based on statistical principles and include multiple linear regression [7], Kalman filtering [8], Box-Jenkins [9], autoregressive integrated moving average (ARIMA) [10], wavelet, and other relevant models. Similar models include the seasonal autoregressive integrated moving average (SARIMA) [11] and wave recurrent neural network (WaveRNN) [12] models. In [13], an ARIMA model and a support vector machine (SVM) model were combined. Such hybrid models are also an important part of the statistical models. Commonly, HVAC system data can be regarded as a type of time series [14]. Box and Jenkins discussed this concept in detail in [15] as a basic idea for statistical models. However, statistical models are very sensitive to the intrinsic linkages and laws in the data.
The final (and very important) type of prediction model is the data-driven model, which mainly relies on the operation data of the building HVAC system. The correlations between the inputs (e.g., outdoor meteorological parameters) and outputs (e.g., cooling load and coefficient of performance (COP)) can be determined without studying the internal physical laws affecting the various elements. Thus, data-driven models are also known as "black-box" models. Furthermore, these models can achieve higher accuracy, are more flexible to use, and have higher generalization ability for complex nonlinear correlations than the physical and statistical models. The literature on data-driven prediction models for HVAC systems covers three main aspects. The first aspect regards the application of some specific data-driven models. The extreme gradient boosting (XGB) method showed superiority for prediction, as described in [16]. Xu [17] proposed a novel long short-term memory (LSTM) model, and the results showed that this model had slight advantages in indoor temperature prediction performance as compared with the SVM model, decision tree model, back-propagation neural network (BPNN) model, and original LSTM model. Fan et al. [18] presented some one-step-ahead prediction models (i.e., a direct approach based on recurrent models and RNN, LSTM, gated recurrent unit (GRU), and multi-input and multi-output (MIMO) approaches). This research indicated that the direct approach was the most accurate model, while the GRU was the most cost-effective. Naji et al. [19] developed a method for predicting building energy consumption based on an extreme learning machine (ELM) and found that ELM predictions were superior to those of genetic programming (GP) and artificial neural networks (ANNs) by evaluating the root mean square error (RMSE) and related values. Some new and improved support vector regression (SVR) methods have also performed well. As discussed above, all of the specific models achieved ideal performances in specific energy prediction studies.

The second aspect of data-driven models regards ANN models [22], some revised ANN models, and ANN hybrid models. Fu [23] investigated a deep neural network for real-time cooling load forecasting (CLF). This model not only exhibited high stability and robustness but also performed better than the BPNN and SVM models. An ANN model and a case-based reasoning (CBR) model were compared in [24], and the results indicated that the ANN model obtained high levels of accuracy. For more accurate load forecasting, various ANN hybrid models were proposed. Among them, the model-based method was one of the most commonly used. Similar to the physically based "variant" models discussed above, this method has two parts: a mathematical part and an optimization algorithm. Thus, various optimization algorithms were combined with ANNs, including the clustering-based ANN [25] and BPNN [26]. These optimization algorithms also included the Bayesian optimization algorithm [27,28], uncertainty algorithms [29,30,31], probability density optimization algorithms [32], and supervisory optimization control algorithms [33], all of which obtained good performance in various HVAC system predictions. Many researchers have concentrated on input extraction [34] for the ANN model as a way to obtain more satisfactory evaluation scores. Principal component analysis (PCA) [35], t-distributed stochastic neighbor embedding (t-SNE) [36], and some other tools can provide substantial help for feature extraction.

Fig. 1. Framework of the proposed strategy.
The third aspect of data-driven models concerns meta-learning. The phrase "meta-learning" was first used in [37] in the context of a time series, and it was used to express the process of automatic learning to select prediction models for different tasks and purposes [38,39]. Lemke and Gabrys [40] studied meta-learning for time series prediction and found that the decision tree was an important tool for meta-learning. Cui [41] built a meta-learning model in the form of a building energy model recommendation (BEMR) system, and the results showed that the BEMR could identify the best-performing model for a share of the cases.
Note that the current investigations largely focus on real case predictions by various models to find the most appropriate model. This indicates that, in the machine learning field, the one-fits-all modeling approach is limited [42] in fulfilling all of the user requirements, and each model (even the traditional SARIMAX [43]) can play an irreplaceable role for a certain case. On the one hand, with the development of machine learning techniques, more new models will be invented, and the rapid developments in big data can offer opportunities for the effective use of these prediction models in HVAC systems. On the other hand, in building energy management (BEM) systems, building data is usually generated once every five minutes to one hour. The trial-and-error process of inputting data into various models is costly, especially when the amount of data is large. Thus, it is better to predict the "best performance model" in advance using the "meta-learning" strategy. The perspective of the users, the user requirements, the maximum accuracy that the metering equipment can provide, and the relationships between these factors have a high impact on the function of load prediction; thus, the user preferences [44,45] regarding load predictions should be considered. Two-stage user preferences are investigated in this study, comprising two aspects and one restriction: the two aspects are the subjective requirements for the prediction or monitoring approach and the accuracy requirements for the prediction system, and the restriction is the investment. To obtain the subjective requirements under the investment restriction, a questionnaire survey is conducted among the users before the monitoring system is installed. Some users may pay more attention to the whole cooling load provided by the HVAC system, and in this case, a prediction system with suitable accuracy can meet the demand. If energy saving is also expected by the users, the COP value and a high-precision prediction system should be considered. Moreover, the budgets of the users will impose some limits on these demands, even if most of the users wish to pay attention to the cooling load, COP, and monitoring systems. Therefore, five options that combine the forecasting and monitoring systems can be proposed based on practical experience: "cooling load prediction"; "COP prediction"; "cooling load/COP prediction"; "cooling load prediction and monitoring"; and "cooling load/COP prediction and monitoring". Based on the above, the main objective of this study can be summarized as posing and resolving two questions: (1) what type of model can achieve the best prediction performance, and does that model meet the requirements of the users? (2) which of the options mentioned above is the comprehensive optimum for users under an investment restriction? The meta-learning approach and multi-objective decision-making algorithms (MODMA) [46,47] were chosen to resolve these questions in this paper.
The contributions of this article include the following: (1) The research object (i.e., factories) and one of the prediction objects (i.e., COP) have rarely been analyzed in previous studies; most of the investigated building types are office buildings [16,17,20,24,27,36], and most of the prediction objects are cooling loads and electric loads. In addition, the electricity price is included in the feature pool for the data-driven models. (2) This study proposes a new meta-learning strategy based on a machine recommendation system, which accounts for the two-stage user preferences. (3) A new "walking slide method" aimed at predicting some special cases is proposed based on the recommendation system. (4) Five options that combine the forecasting and monitoring systems are proposed and optimized by key attributes (i.e., energy-saving potential, prediction accuracy, initial investment, prediction computational cost, and measurement cost), which makes the approach highly practical and provides valuable suggestions for the users.
This paper is organized as follows. Section 2 presents the detailed meta-learning strategy with two-stage user preferences. Section 3 includes five parts: (1) a brief introduction of the 40 cases; (2) the performance of the forecasting; (3) the performance of the meta-level learning; (4) the performance when considering the user preferences; and (5) a discussion. Section 4 presents the conclusions. Appendix A gives a brief introduction to the data-driven forecasting models.

2. Methodology

Following the existing research procedure on meta-learning [41], the meta-learning system proposed in this paper was based on an ANN recommendation system. The analysis of the user preferences, which had rarely been investigated before, was based on MODMA. There were three steps in this system, as shown in Fig. 1. The first step was inputting the data (the COP and cooling load) from cases 1 to 40 into the six selected common forecasting models; each case was run through all six models, for a total of 240 regression predictions. This step contained data pre-processing, data division, training, prediction, and, most importantly, the determination of the "best" performance model among the six prediction models. The second step was recommendation system modeling, which included the derivation of the meta-features (inputs), data division, training, and prediction. The outputs of the recommendation system were the "best" prediction performance models from step one. The third step was the comprehensive option selection process, in which the two-stage user preferences were considered and a new "walking slide method" was proposed to make predictions for those special cases that could not meet the user's accuracy requirements. Last, by taking both objective and subjective properties into account when using MODMA, the optimal options in terms of the comprehensive attributes could be selected, and the whole strategy could be built. This section presents two parts: (1) a description of the meta-learning strategy combined with user preferences, and (2) the "walking slide method".
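The labeling at the end of step one can be illustrated with a minimal sketch. It assumes that the "best" model for a case is simply the one with the lowest NRMSE; the score values, the `best_model` and `label_cases` helpers, and the two example cases are hypothetical and are not taken from the study.

```python
# Names of the six forecasting models listed in Section 2.1.1.
MODELS = ["WNN", "SARIMAX", "Elman RNN", "RF", "LSTM-RNN", "SVR"]


def best_model(nrmse_by_model):
    """Return the model name with the lowest NRMSE (assumption: lower is better)."""
    return min(nrmse_by_model, key=nrmse_by_model.get)


def label_cases(scores):
    """Map each case id to its best model; these labels are the recommender's targets."""
    return {case: best_model(row) for case, row in scores.items()}


# Hypothetical NRMSE scores for two cases (illustrative numbers only).
scores = {
    1: dict(zip(MODELS, [0.21, 0.12, 0.18, 0.15, 0.13, 0.19])),
    2: dict(zip(MODELS, [0.30, 0.25, 0.22, 0.17, 0.20, 0.24])),
}
labels = label_cases(scores)

# 40 cases, each run through 6 models, gives the 240 regression predictions.
n_predictions = 40 * len(MODELS)
```

A criterion that also weighs computation time would replace the `min` call with a combined score.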

2.1. Meta-learning strategy with user preferences

2.1.1. Forecasting models

(1) Statistical models
Under less complicated conditions, statistical forecasting models can be suitable for simple cooling load forecasting tasks. HVAC data usually show strong regularity, so possible simple application scenarios should be considered in step one. Thus, two traditional methods are introduced in this paper: the time series method [10] and the wavelet analysis method [48]. These two methods are introduced in Appendix A.
(2) Machine learning models
Machine learning can be very useful under conditions in which the connections between the inputs and outputs are highly nonlinear. Considering the possible complex application scenarios in terms of the user preferences, six common models were explored: WNN, SARIMAX, Elman recurrent neural network (Elman RNN) [49], random forest (RF), long short-term memory recurrent neural network (LSTM-RNN), and SVR, some of which are introduced in Appendix A.
Machine learning can be summarized as using the right features to build the right model to complete the given tasks. The first and most important part of a machine learning process is feature selection. In this work, PCA was used to extract the main inputs. By calculating the Pearson correlation coefficient ($r$) (as shown in Eq. (1) below and [50]) between the principal components and the original inputs, a coefficient color map was generated, from which the corresponding principal components could be identified. For example, if the top seven components explained at least a preset share of the entire original data set, the top seven components were deemed to be principal components. A detailed presentation of PCA can be found in [51].

$$ r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}} \qquad (1) $$

In the above equation, $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ and $\bar{\hat{y}}$ are their means, and $n$ is the number of data points.
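As a minimal sketch of the two calculations described above, the following computes the Pearson correlation coefficient between two sequences and counts how many leading components are needed to reach a cumulative explained-variance threshold. The function names and the 0.85 threshold in the example are assumptions for illustration; the threshold used in the study is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def components_needed(explained_ratio, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    total = 0.0
    for k, share in enumerate(explained_ratio, start=1):
        total += share
        if total >= threshold:
            return k
    return len(explained_ratio)


r_up = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])    # perfectly correlated
r_down = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])  # perfectly anti-correlated
k = components_needed([0.5, 0.3, 0.1, 0.1], threshold=0.85)
```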
The second part of machine learning is data training. In this study, a cross-validation method [52] was utilized to randomly split the data into training and testing sets in order to examine how the different prediction models performed on the data.
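A random k-fold split of this kind can be sketched as follows. The fold count, the seed, and the `k_fold_indices` helper are assumptions for illustration; the exact cross-validation settings of the study are those of [52].

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment of samples to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one test fold, so every model is scored on data it was not trained on. For strongly autocorrelated series, a random split can leak information between neighboring samples, which is worth keeping in mind when reading the scores.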
The final part of machine learning is testing (predicting). To evaluate the performance, the normalized root mean square error (NRMSE) and the computation time (e) of each case are used as the evaluation criteria. NRMSE is defined in Eq. (2), and the computation time (e) includes both the training and prediction processes.

$$ \mathrm{NRMSE} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}}{y_{\max} - y_{\min}} \qquad (2) $$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $n$ is the number of data points or instances. Here, $y_{\max}$ and $y_{\min}$ are the maximum and minimum values of $y$.
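The two evaluation criteria can be sketched as below, assuming that the NRMSE is normalized by the range of the true values and that the computation time is measured as wall-clock time. The `nrmse` and `timed` helpers are illustrative names, not part of the original study.

```python
import math
import time


def nrmse(y_true, y_pred):
    """RMSE divided by the range of the true values (assumed normalization)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("need two non-empty sequences of equal length")
    value_range = max(y_true) - min(y_true)
    if value_range == 0:
        raise ValueError("NRMSE is undefined when the true values are constant")
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return rmse / value_range


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


score, elapsed = timed(nrmse, [0.0, 10.0], [1.0, 9.0])
```

In practice the timed call would wrap the training and prediction of a forecasting model rather than the metric itself.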

2.1.2. Recommendation system

A recommendation system is the core component of a meta-learning strategy. The decision tree [40], XGB [16], and ANN [41] (called "meta-learners") can be utilized for meta-learning systems. The ANN-based meta-learning system is more common in HVAC systems.
(1) Modeling of the artificial neural network (ANN)
The parameter settings of the ANN were as follows. The number of hidden neurons was set by the rule in [53] in terms of three quantities: inputs, the number of input layer units; outputs, the number of output layer units; and the number of training patterns, i.e., the number of training samples. In this study, the number of hidden layers was 8. The hidden layer activation function was "ReLU", and the output layer activation function was "Softmax". There were six output neurons. The loss function was "categorical cross-entropy", and the optimizer was stochastic gradient descent (SGD) [54], which computes faster because it uses a single sample as the training unit. The ANN model was built in a Python environment using the TensorFlow library. The inputs of the proposed ANN were the meta-features selected by the PCA method, and the output was the best prediction performance model among the six regression models in this work. The "Label Encoder" tool was used to encode the prediction models with the best performance.
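The forward pass and loss of such a classifier can be sketched in plain Python, without TensorFlow, to make the roles of ReLU, Softmax, and categorical cross-entropy explicit. This is not the authors' network: the single hidden layer with 8 units, the three meta-features, the zero weights, and all function names are assumptions chosen only to keep the example small and checkable.

```python
import math

MODELS = ["WNN", "SARIMAX", "Elman RNN", "RF", "LSTM-RNN", "SVR"]
# Simple label encoding: each model name is mapped to an integer class index.
ENCODE = {name: index for index, name in enumerate(sorted(MODELS))}


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Fully connected layer; weights[j] holds the input weights of unit j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """ReLU hidden layer followed by a Softmax output layer."""
    hidden = relu(dense(x, w_hidden, b_hidden))
    return softmax(dense(hidden, w_out, b_out))


def categorical_cross_entropy(probs, target_index):
    return -math.log(probs[target_index])


# Untrained (all-zero) weights: 3 hypothetical meta-features, 8 hidden units, 6 classes.
x = [0.2, -0.1, 0.4]
w_hidden, b_hidden = [[0.0] * 3 for _ in range(8)], [0.0] * 8
w_out, b_out = [[0.0] * 8 for _ in range(6)], [0.0] * 6

probs = forward(x, w_hidden, b_hidden, w_out, b_out)
loss = categorical_cross_entropy(probs, ENCODE["RF"])
```

With untrained weights the six classes are equally likely, so the loss equals ln 6; SGD training would lower it by updating the weights one sample at a time.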
(2) The first stage of the user preferences
As discussed in Section 1, the user preferences in different application scenarios cannot be ignored before a single recommendation system is introduced to the users. For example, when the total cooling capacity is not too large, as in some small office buildings, the users might only be concerned with whether the cooling load of the HVAC system is stable. However, in some larger factories with large cooling load demands, the users must be concerned not only with the total cooling load but also with the COP of the HVAC system. In some rigorous environments (such as hospitals), users must also ensure the normal operation of the HVAC system; hence, a monitoring system is necessary. At the same time, the budget of the users will impose some limits on the demands. Based on these observations, five different options were provided for the first stage of the user preferences: Option 1, cooling load prediction; Option 2, COP prediction; Option 3, cooling load and COP prediction; Option 4, cooling load prediction and monitoring; and Option 5, cooling load/COP prediction and monitoring. These five options resemble five packages offered to the users, where the benefit and cost (calculated in Section 3.4.1) of each package are different. The most appropriate option for the users cannot be determined in a simple manner.
MODMA was utilized to connect the objective performances (NRMSE and e) of the models with the subjective user preferences, and was also combined with the recommendation system. A theoretical algorithm for MODMA with unknown weights is expressed briefly below:
Some assumptions were made, as presented in Table 1. After the scheme (option) set, properties set, attribute weights vector, and subjective utility preference information vector were built, a primitive decision matrix could be constructed. Because different properties have different attribute evaluation types (benefit type or cost type), a normalized decision matrix was established from the original decision information; the normalizing process is expressed in Eqs. (3) and (4). Last, the fuzzy comprehensive attribute value for every scheme (option) could be obtained by the additive weighting method.
Table 1
Introduction of parameters in MODMA.
Columns: Parameters, Expression, Remarks. The Remarks entries are listed below, one row per line.
set
set
scheme (option) set
properties set
attribute weights vector
subjective utility preference information vector
evaluation value
primitive decision matrix
normalized decision matrix
fuzzy comprehensive attribute value
attribute evaluation type: benefit type
attribute evaluation type: cost type
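The normalization by attribute type and the additive weighting step can be sketched as follows. The min-max form used here is one common choice for benefit-type and cost-type attributes and is an assumption; it is not claimed to be the exact form of Eqs. (3) and (4). The function names, the two-attribute example, and the weights are likewise hypothetical.

```python
def normalize(matrix, kinds):
    """Min-max normalize a decision matrix column by column.

    matrix[i][j] is the value of attribute j for option i;
    kinds[j] is "benefit" (larger is better) or "cost" (smaller is better).
    """
    columns = list(zip(*matrix))
    normalized = []
    for row in matrix:
        new_row = []
        for j, value in enumerate(row):
            low, high = min(columns[j]), max(columns[j])
            if high == low:
                new_row.append(1.0)  # attribute does not discriminate between options
            elif kinds[j] == "benefit":
                new_row.append((value - low) / (high - low))
            else:
                new_row.append((high - value) / (high - low))
        normalized.append(new_row)
    return normalized


def weighted_scores(normalized, weights):
    """Additive weighting: comprehensive attribute value of each option."""
    return [sum(w * r for w, r in zip(weights, row)) for row in normalized]


# Two hypothetical options described by NRMSE (cost type) and energy saving (benefit type).
matrix = [[0.10, 100.0], [0.20, 300.0]]
kinds = ["cost", "benefit"]
norm = normalize(matrix, kinds)
scores = weighted_scores(norm, weights=[0.7, 0.3])
```

Here the first option wins on accuracy and the second on energy saving, so the ranking depends entirely on the weights, which is why the weight selection matters.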
The establishment and solution of the decision model followed one principle: the preference values based on subjective and objective decision-making should conform to the consistency principle, so the weights should be selected to minimize the sum of the squared distances between the subjective and objective preference values. The following optimization model can then be given:
In Eq. (5), each scheme (option) has its own subjective preference value. Eq. (5) can be solved by the Lagrange multiplier method.
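One way to read this optimization is as a least-squares problem with a sum-to-one constraint on the weights, which the Lagrange multiplier method turns into a small linear system. The sketch below solves that system with Gaussian elimination. The objective form, the omission of the non-negativity constraint on the weights, and the example matrix are assumptions; they are consistent with the description above but are not claimed to reproduce Eq. (5) exactly.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_weights(r, theta):
    """Minimize sum_i (sum_j r[i][j] * w[j] - theta[i])**2 subject to sum_j w[j] = 1.

    Stationarity of the Lagrangian gives (R^T R) w + mu * 1 = R^T theta, 1^T w = 1.
    """
    n_attr = len(r[0])
    gram = [[sum(row[j] * row[k] for row in r) for k in range(n_attr)] for j in range(n_attr)]
    rhs = [sum(row[j] * t for row, t in zip(r, theta)) for j in range(n_attr)]
    kkt = [gram[j] + [1.0] for j in range(n_attr)] + [[1.0] * n_attr + [0.0]]
    solution = solve_linear(kkt, rhs + [1.0])
    return solution[:n_attr]  # the last entry is the multiplier


# Three hypothetical options, two attributes; theta holds the subjective preference values.
r = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
theta = [0.6, 0.4, 0.5]
weights = fit_weights(r, theta)
```

In this example the weights (0.6, 0.4) reproduce the subjective values exactly. If a solution contains a negative weight, a constrained solver that enforces non-negativity would be needed instead.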
Some key properties (i.e., NRMSE, e, N, initial investment, and energy-saving) were investigated to evaluate the advantages and disadvantages of the five options, and the attribute evaluation types for the key properties are presented in Table 2.
The energy savings for Options 1 to 5 are given by Eqs. (6)-(10), respectively:
Eqs. (6)-(10) use the real (tested) cooling load and COP of the HVAC system, the predicted cooling load and COP of the HVAC system, and two revised COP values based on the monitoring system, which are defined in Eq. (11). In Eqs. (6)-(10), the summation lower limit of 0 indicates the moment at which the measurement starts, and the summation upper limit indicates the moment at which the measurement ends.
Eq. (11) introduces a lower-limit value for the COP: owing to the monitoring system, neither of the two types of revised COP falls below this limit at any time.
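The effect of the lower limit can be illustrated with a short sketch. It assumes that the monitoring system simply holds the COP at the limit whenever the unrevised value would drop below it; the `revise_cop` helper and the numbers are hypothetical and are not a reproduction of Eq. (11).

```python
def revise_cop(cop_series, cop_lower_limit):
    """Return the COP series with every value held at or above the lower limit."""
    return [max(cop, cop_lower_limit) for cop in cop_series]


measured = [3.0, 4.5, 5.2, 3.8]          # illustrative COP readings
revised = revise_cop(measured, cop_lower_limit=4.0)
```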

It should be noted that the energy savings calculated by Eqs. (6)-(10) concern only the energy loss caused by fluctuations in the cooling/heating sources, not the terminal energy loss or the thermal comfort of the room. The thermal comfort of the room is considered to be the same before and after the prediction.
(3) The second stage of the user preferences
Not all application scenarios require high-precision prediction, and the required prediction accuracy can have a large impact on the initial investment and the measurement cost. A review of some classical literature showed that most studies obtained good prediction accuracy but gave little consideration to whether that accuracy matched the actual needs, and especially to the cost of obtaining it.
Thus, the second stage of the user preferences should be considered. When the minimum accuracy requirements were set, if the accuracy of the "best performance model" (i.e., the best NRMSE score) recommended by the meta-learning system could not meet the minimum requirement, the signal regression model and recommendation system discussed above were considered to have failed. For example, if the minimum requirement was 0.8 and the accuracy of the "recommended best model" was 0.7 , the recommendation system failed to meet the user's requirements. Therefore, a special regression model, namely, the "walking slide method", was proposed to address these special cases. A detailed description of this combination method will be discussed in Section 2.2.
因此,应考虑第二阶段的用户偏好。在设定最低精度要求时,如果元学习系统推荐的 "最佳性能模型"(即最佳 NRMSE 分数)的精度不能满足最低要求,则认为上述信号回归模型和推荐系统失败。例如,如果最低要求为 0.8,而 "推荐的最佳模型 "的准确度为 0.7,则推荐系统不能满足用户的要求。因此,针对这些特殊情况,我们提出了一种特殊的回归模型,即 "行走滑动法"。关于这种组合方法的详细介绍将在第 2.2 节中讨论。
This section proposes the new meta-learning strategy system, and two-stage user preferences are incorporated into the recommendation system by the MODMA method and the new "walking slide method".

2.2. Walking slide method for the special case

Based on the situation described in Section 2.1.2, the "walking slide method" is proposed in this section. The idea arose from the data splitting process [57] used to separate the testing and training parts. As expressed in Table 3, a data slider is created first, and the length of the slider defaults to the period. Then, the slider is incorporated into the well-trained recommendation system. While the length of the slider is increased, the recommended results (the best prediction performance model on the slider) before and after each increase are observed; once the stopping condition in Table 3 is met, the endpoint of the slider is confirmed. Next, the original time series (see Fig. 2) is separated into several fragments. Each fragment is predicted best by a certain model, and that model is determined by the well-trained recommendation system mentioned in Section 2.1. For example, as shown in Fig. 2, if the time series is separated into four fragments (this step might be more complicated in reality), the best scores could be achieved with the SVR, LSTM-RNN, Elman RNN and RF models, respectively. Finally, the four fragments are combined to calculate the accuracy of the entire time series. The detailed algorithm is presented in Table 3.
This section proposed the "walking slide method" for cases in which the recommendation system fails. The step size, the walking rate, the limit value at the start of step three in Table 3, and the other parameters used for this method are all set by the users. The meta-learning strategy, Eqs. (6)-(10), and the "walking slide method" shown in Fig. 2, all presented in Section 2, are proposed for the first time in this study.
Table 2
The attribute evaluation type for the five properties.
| Attribute evaluation type | Evaluation properties | Reasons |
| --- | --- | --- |
| cost type | NRMSE | smaller NRMSE could provide a larger benefit |
| | e | smaller e values could provide a larger benefit |
| | N | a larger N results in a larger acquisition cost for users |
| benefit type | initial investment | larger initial investment results in a larger cost for users |
| | energy saving | larger energy savings result in larger benefits for users |
Table 3
Algorithm of the walking slide method.
Algorithm: Walking slide method
Requirements: series, recommendation system, data slider, period, step size, walking rate
1: randomly initialize
2: while (incorporate the series into the recommendation system)
3: Update: (walking from the left to the right of the time series)
4: print (in step )
5: return (labeling the best performance period from to )
6: while
7: Update: (walking from the left to the right of the time series)
8: print: (while in step )
9: return: (labeling the best performance period from to )
10: end while
11: end while
Calculate the overall accuracy of the series
For each labeled period, the algorithm reports the result of the recommendation system (best prediction method) and the prediction accuracy for that period; the overall accuracy is the average value of these accuracies.
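The walking slide idea can be illustrated with a minimal Python sketch. It is one possible reading of Table 3 rather than the authors' implementation: the stopping rule (close a fragment as soon as the recommended model changes), the function name `walking_slide`, and the `recommend` callable standing in for the trained recommendation system are all assumptions made for illustration. The overall accuracy is taken as the plain average of the fragment scores.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a recommendation is (model name, accuracy score),
# a fragment is (start index, end index (exclusive), model name, score).
Recommendation = Tuple[str, float]
Fragment = Tuple[int, int, str, float]


def walking_slide(
    series: Sequence[float],
    recommend: Callable[[Sequence[float]], Recommendation],
    period: int,
    step: int,
) -> Tuple[List[Fragment], float]:
    """Split a series into fragments, each labeled with its recommended model.

    Assumption: a fragment ends when growing the slider by one step
    would change the recommended model.
    """
    if len(series) == 0 or period < 1 or step < 1:
        raise ValueError("series must be non-empty; period and step must be >= 1")

    fragments: List[Fragment] = []
    start, n = 0, len(series)
    while start < n:
        # The slider starts with a default length of one period.
        end = min(start + period, n)
        model, score = recommend(series[start:end])
        # Walk to the right: extend the slider while the recommendation is stable.
        while end < n:
            candidate_end = min(end + step, n)
            new_model, new_score = recommend(series[start:candidate_end])
            if new_model != model:
                break
            end, score = candidate_end, new_score
        fragments.append((start, end, model, score))
        start = end

    # Overall accuracy: average of the per-fragment scores.
    overall = sum(fragment[3] for fragment in fragments) / len(fragments)
    return fragments, overall
```

In practice, `recommend` would wrap the trained meta-learner of Section 2.1, and `period` and `step` would be set by the user as stated above.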

3. Results and discussion

All the case studies were performed on a personal computer configured as follows: CPU, Intel(R) Core(TM) i5-7400; RAM, 8.00 GB; operating system, Win7_64.

3.1. Case description

The operation data from five factory buildings (see Table 4) were used in this study. These factories were equipped with comfort HVAC systems. The operation data for the entire year of 2018 were collected by an online energy consumption monitoring system, and the data sampling interval was .
The data sets are listed in Table 5, and they include four categories: (1) time variables; (2) 17 types of outdoor meteorological parameters; (3) operating parameters of the HVAC systems; and (4) price.
In summary, the entire dataset was composed of 40 cases (five building types, two data types, four time types), which included 20 cases for COP and 20 cases for cooling load. The training and testing data accounted for and of the total data respectively.

3.1.1. Outlier handling

In an actual prediction process, the raw data obtained from the measurement equipment can contain many null or outlier values, which may be caused by shutdown periods or by other unexpected events that make the system work under abnormal conditions. These null and outlier values have a large impact on the regression performance of statistical models, especially time series regression methods such as SARIMAX that rely heavily on historical data rather than on the correspondence between inputs and outputs. The imputation of data in this paper consisted of two parts: outlier data elimination and missing value regression. Data with values from 1.5 to 3 times the quartile spacing (interquartile range) were selected as outliers and considered to be abnormal values [58]. The missing (null) values were regressed by the expectation-maximization (EM) algorithm and by an RF algorithm. RF missing data algorithms are a useful approach for imputing missing data: they can handle interactions, nonlinearity and mixed types of missing data, and they process large data sets quickly [59]. The two most significant advantages of the EM algorithm are simplicity and stability [60], which make it suitable for regressing missing data with strong regularity.
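The outlier elimination step can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the procedure of [58]: it uses the common fence rule (values beyond 1.5 times the interquartile range from the quartiles), the function name `mask_outliers` is hypothetical, and flagged values are turned into `None` so that a later imputation step (EM or RF in the paper) can fill them.

```python
import statistics
from typing import List, Optional


def mask_outliers(
    values: List[Optional[float]], k: float = 1.5
) -> List[Optional[float]]:
    """Replace values outside the k * IQR fences with None.

    Existing None entries (null readings) are kept as None.
    Assumption: k = 1.5 is the usual fence multiplier, not a value from the paper.
    """
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        return list(values)  # not enough data to estimate quartiles

    # Quartiles of the observed data (inclusive method treats data as a population).
    q1, _median, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    return [v if v is not None and low <= v <= high else None for v in values]
```

After masking, the null and outlier positions share one representation, so a single imputation routine can process both.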

3.1.2. Missing value regression

As shown in Figs. 3 and 4, the null values in the raw data (hourly data of building 1) were regressed using an EM algorithm and an RF algorithm separately. The variances of the COP and cooling load data regressed by the RF algorithm were lower than those regressed by the EM algorithm, and it was determined that the null and missing values could be regressed more smoothly by the RF algorithm, without introducing new data. The variance of the processed cooling load data was reduced from 416.407 (raw data) to 139.967 (RF algorithm), and that of the processed COP data was reduced from 1.881 (raw data) to 1.248 (RF algorithm). Thus, the raw data from the other 39 cases were regressed by the RF algorithm.
Fig. 2. Concept map for the walking slide method.
Table 4
Buildings and data investigated in this study.
| Factory buildings | Area | Types of compressors | System capacity (kW) | Chilled/Cooling water pump operation strategy | Location |
| --- | --- | --- | --- | --- | --- |
| B1 | 6880 | Screw chiller | 1204 | Fixed frequency, the flow matches with the compressors | Guangzhou |
| B2 | 32,000 | Centrifugal water cooling unit | 5767.88 | | Shenzhen |
| B3 | 10,122 | Screw chiller | 1728.8 | | Shenzhen |
| B4 | 16,000 | Centrifugal water cooling unit | 2813.6 | | Shenzhen |
| B5 | 40,000 | Centrifugal water cooling unit | 7034.0 | | Shenzhen |
Note: All of the buildings are located in hot summer/warm winter climate regions.
Table 5
Data sets.
| Types | Parameters | Abbreviations of parameters |
| --- | --- | --- |
| Time variables | Week | W |
| | Weekend | K |
| | Day | D |
| | Hour | H |
| Outdoor meteorological parameters | Air pressure | PRS |
| | Sea-level atmospheric pressure | PRS_Sea |
| | Maximum pressure | PRS_Max |
| | Minimum pressure | PRS_Min |
| | Maximum wind velocity | WIN_S_Max |
| | Extreme wind speed | WIN_S_Inst_Max |
| | Maximum wind speed wind direction | WIN_D_INST_Max |
| | 2-minute average wind direction | WIN_D_Avg_2mi |
| | 2-minute average wind speed | WIN_S_Avg_2mi |
| | Maximum wind speed wind direction | WIN_D_S_Max |
| | Outdoor dry-bulb temperature | TEM |
| | Maximum air temperature | TEM_Max |
| | Minimum air temperature | TEM_Min |
| | Relative humidity | RHU |
| | Vapor pressure | VAP |
| | Minimum relative humidity | RHU_Min |
| | One hour precipitation | PRE_1h |
| Operating parameters of heating, ventilation, and air conditioning (HVAC) systems | Cooling load | / |
| | Coefficient of performance (COP) of the HVAC system | COP |
| | The supply and return of chilled water temperature | 1 |
| Price | Electricity price | Price |
Note: The 40 cases are all in 24-hour working factories; thus, the working schedule is treated as constant.
Fig. 3. Performance of the imputation of missing data (cooling load); (a) processed data set; (b) missing data region 1; (c) missing data region 2.
Fig. 4. Performance of the imputation of missing data (COP); (a) processed data set; (b) missing data region 1; (c) missing data region 2.

3.2. Performance of the forecasting in step one

3.2.1. Basic features extraction
As mentioned before, the PCA method was used to extract the features from the raw data. In particular, the electricity price was considered in the regression for buildings , and 5, because ladder-type pricing was introduced and implemented in those cases. A coefficient color map for basic feature extraction (Fig. 5) shows that the outdoor dry-bulb temperature (TEM), relative humidity (RHU), extreme wind speed (WIN_S_Inst_Max), price, and vapor pressure (VAP) are the principal components. Those components explained at least of the entire original data set.

3.2.2. Performance evaluation of forecasting for different cases

Figs. 6 and 7 present the 1-hour-ahead prediction accuracies (NRMSE) and computation times (e) of six different regression models for the 40 cases. SARIMAX has the best accuracy of the six strategies, followed by SVR, LSTM-RNN, and RF. RF has the lowest computation cost, followed by LSTM-RNN. In addition, the Elman RNN approach gives the worst accuracy, and the SARIMAX approach gives the worst computation time. LSTM-RNN is an appropriate approach that achieves a balance between accuracy and computation time. Nevertheless, if the user does not require high accuracy, RF is the most efficient model.
To compare the performance of the six regression models on the COP and cooling load data, the two data types were analyzed separately, as shown in Figs. 8 and 9. The SARIMAX approach also performs best both on the cases for COP and on the cases for cooling load. However, the performances for the cooling load cases are more stable than those for the COP cases, because the gap between the upper and lower quartiles is smaller for the cooling load data. For the e values of the 20 cases for COP and the 20 cases for cooling load, the overall trends are the same for both groups, but the models consume less time on the COP data than on the cooling load data. In addition, data of the same type show better performance than the mixed data set, as compared in Fig. 7. The accuracy and computation cost of the six prediction models are presented in Figs. 6-9.
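The two evaluation criteria used above, NRMSE and computation time e, can be computed as in the following sketch. The normalization is an assumption: the RMSE is divided by the range of the measured values, which is one common convention and may differ from the definition used elsewhere in this paper. The helper names `nrmse` and `timed` are hypothetical.

```python
import math
import time
from typing import Any, Callable, Sequence, Tuple


def nrmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error normalized by the range of the actual values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")
    span = max(actual) - min(actual)
    if span == 0:
        raise ValueError("actual values are constant; the range is zero")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / span


def timed(func: Callable[..., Any], *args: Any) -> Tuple[Any, float]:
    """Run func(*args) and return its result with the elapsed time in seconds."""
    started = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - started
```

Wrapping each model's fit-and-predict call in `timed` gives the pair (NRMSE, e) for one model on one case.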

3.2.3. Performance evaluation of forecasting with different time bases

In this set of forecasting data, the performances with different time bases are tested, as shown in Figs. 10 and 11. Data from the same source at four different time intervals were input into the six forecasting models (WNN, SARIMAX, Elman RNN, RF, LSTM-RNN, and SVR). The letter "D" represents the day, "H" represents the hour, "W" represents the week (from Monday to Friday), and "K" represents the weekend (Saturday and Sunday). B1-B5 represent buildings 1 to 5. For example, B1-1 indicates the case of COP in building 1, and B1-2 indicates the case of the cooling load in building 1. The hourly data achieve the best performance and the weekly data achieve the worst performance, as seen in Figs. 10 and 11. This phenomenon can be explained by Shannon's sampling theorem: the higher the sampling rate is, the closer the recovered waveform is to the original signal. The mean NRMSEs of the hourly, daily, weekly, and weekend data are 0.084, , and 0.240, respectively. It is noteworthy that the data from B2-1, B3-1, B3-2, B4-2, B5-1 and B5-2 are too short to meet the requirement for the set training accuracy, and thus SARIMAX fails to predict these cases. To avoid misunderstandings, these values are set to 1.0 in Fig. 11(a). This section thus summarizes the forecasting performance for the 40 cases with four time bases.
Fig. 5. Coefficient color map for basic features extraction.
Fig. 6. Box plot of the mean of the NRMSE for 40 cases.
Fig. 7. Box plot of the computation time (e) for 40 cases.
Fig. 8. Box plot of the mean of the NRMSE for 20 cases for COP and 20 cases for cooling load.
Fig. 9. Box plot of the mean of the computation time (e) for 20 cases for COP and 20 cases for cooling load.

3.3. Performance of meta-level learning in step two

3.3.1. Meta features

The same method (PCA) used in Section 3.2.1 was adopted to select principal meta-features from a meta-features pool (summarized from [40,41]; see Table 6). Notably, the data of each case can be treated as a time series; as such, the power spectrum and average period of each series can be calculated by a fast Fourier transform.
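The spectral meta-features can be illustrated with a short, dependency-free sketch. Two points are assumptions rather than the paper's procedure: a naive discrete Fourier transform is used in place of a fast Fourier transform so that the example needs only the standard library (an FFT gives the same spectrum, only faster), and the period is taken as the period of the strongest spectral peak, since the exact definition of the average period is not given here. The function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def power_spectrum(series: Sequence[float]) -> List[float]:
    """Power at frequency indices k = 1 .. n // 2 of the mean-removed series."""
    n = len(series)
    if n < 2:
        raise ValueError("series must contain at least two samples")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient at frequency index k (O(n) per coefficient).
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_meta_features(series: Sequence[float]) -> Tuple[float, float]:
    """Return (maximal power, period of the strongest peak in samples)."""
    spectrum = power_spectrum(series)
    max_power = max(spectrum)
    k_peak = spectrum.index(max_power) + 1  # spectrum[0] corresponds to k = 1
    return max_power, len(series) / k_peak
```

For hourly data with a daily cycle, the returned period would be expected near 24 samples.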
Fig. 12 presents a coefficient color map for meta-features extraction. The meta-features "N_of_Max", "Min", "Mean", "f", "N_Loss", "ta", "N", and "Max_P_S" are selected as inputs. Those inputs explain at least of the entire data set in the meta-features pool.

3.3.2. Meta-learning performance

In this section, the meta-learning performance is evaluated by the cross-validation score and the success rate. The basic idea of cross-validation is to create some restrictions between the training data and validation data [41]. Additionally, the cross-validation scores can characterize the variance between the cases. The success rate is the ratio of correctly recommended cases to the entire set of cases in the testing data. Table 7 presents the meta-learning performance in terms of the success rate, the cross-validation score, and some of the parameters under the best performance scenario. The 40 cases are grouped into three types, which represent different application scenarios. The first type concerns the cases for COP data, the second type concerns the cases for both COP and cooling load data, and the third type concerns the cases for cooling load data alone. Furthermore, the "GridSearchCV" [61] tool is used to search for the best performance parameters in the training process.
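The two evaluation measures can be sketched in a few lines. The success rate follows the definition given above. The fold construction is a generic k-fold split written for illustration only; the paper relies on GridSearchCV [61] for the actual search, and the names `success_rate` and `k_fold_indices` are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def success_rate(recommended: Sequence[str], best: Sequence[str]) -> float:
    """Share of test cases whose recommended model equals the truly best model."""
    if len(recommended) != len(best) or len(best) == 0:
        raise ValueError("inputs must be non-empty and equally long")
    hits = sum(1 for rec, truth in zip(recommended, best) if rec == truth)
    return hits / len(best)


def k_fold_indices(n_cases: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training indices, validation indices) for each of k contiguous folds."""
    if not 2 <= k <= n_cases:
        raise ValueError("k must satisfy 2 <= k <= n_cases")
    base, extra = divmod(n_cases, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional case.
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        training = [i for i in range(n_cases) if i < start or i >= start + size]
        yield training, validation
        start += size
```

With few cases per group, each validation fold is small, so the score of a single fold can swing widely between cases.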
Table 7 shows that the cases for COP alone and for cooling load alone achieve good results, with success rates of about 99% (0.990). However, the success rate for cases that combine COP and cooling load is only 62.5%. Moreover, the cross-validation scores in the three conditions are very low, at approximately 65%. This phenomenon reflects the inherent differences between the COP and cooling load data. Moreover, the cooling load data can result in a higher cross-validation score than the COP data. Thus, the cases for COP have a higher variance.

3.3.3. Experiment on extreme gradient boosting (XGB) for recommendation exploration

As described in Section 3.3.2, the cross-validation scores in the three types of cases are very low. For this reason, an experiment using XGB for recommendation exploration is discussed in this section. Table 8 presents the success rates for the three types of cases. Compared with the proposed ANN, the XGB model did not achieve good scores, and the success rate of the cases with mixed COP and cooling load data was again lower than that of the cases with the other two types of data, which agrees with the result in Table 7.

3.4. Performance of the user preferences in step three

For the convenience of expression, the performances of the first and the second stage of the user preferences (in step three, as expressed in Fig. 1) are presented together in this section.

3.4.1. Performance of the first stage of the user preferences

According to the data set and the forecasting performance, a normalized decision matrix can be built. For example, using the hourly and daily data in "the best prediction score model" of B1 and B2, normalized decision matrices and fuzzy comprehensive attribute values can be generated; they are presented in Tables 9-12. Some reasonable assumptions are made in the calculation process, as follows:
  • The preference information is based on a survey of the owners of B1.
  • According to [62], the lower-limit value for COP differs between types of HVAC systems. Two lower-limit values are used here: 3.52 and 2.3.
  • For the options that include both COP and cooling load prediction (options 3 and 5), the NRMSE is the mean value of the COP case and cooling load case, and the e value is the sum of the COP case and cooling load case.
Fig. 10. Bar graph of NRMSE; (a) daily basis forecasting; (b) hourly basis forecasting.
Tables 9-12 display the normalized decision matrices and the ranking of the different options. Option 5 achieves the best fuzzy comprehensive attribute value not only for the hourly case in B1 but also for the hourly and daily cases in B2. It is noteworthy that the four results are based on the same preference vector. This phenomenon shows that a specific building (case) has a specific comprehensive attribute for COP or cooling load forecasting, even for the same user preference. Additionally, a prediction system combined with a monitoring system is most often the best choice for users in terms of comprehensive properties. To further verify this, the daily and hourly load cases for B3-B5 were used to calculate the comprehensive attribute values, with the results presented in Fig. 13 (B3-H represents the hourly case for B3). This figure shows that option 5 (prediction system combined with a monitoring system) remains the best choice for most cases and that option 3 is the second-best choice, while option 3 achieves the best attribute value for case B5-H.
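How a normalized decision matrix turns into a ranking of options can be sketched as follows. This is a simplified stand-in, not the MODMA procedure of this paper: each property is min-max normalized according to its attribute type, and the comprehensive value of an option is assumed to be a plain weighted sum with a user preference vector. The function names, the weights, and the aggregation rule are all assumptions made for illustration.

```python
from typing import Dict, List


def normalize_row(values: List[float], kind: str) -> List[float]:
    """Min-max normalize one property across options; 1.0 is always the best."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)  # the property does not discriminate
    if kind == "benefit":  # larger raw value is better
        return [(v - low) / (high - low) for v in values]
    if kind == "cost":  # smaller raw value is better
        return [(high - v) / (high - low) for v in values]
    raise ValueError("kind must be 'benefit' or 'cost'")


def comprehensive_values(
    raw: Dict[str, List[float]],
    kinds: Dict[str, str],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted sum of normalized properties for each option (assumed aggregation)."""
    rows = {name: normalize_row(vals, kinds[name]) for name, vals in raw.items()}
    n_options = len(next(iter(rows.values())))
    return [
        sum(weights[name] * rows[name][j] for name in rows) for j in range(n_options)
    ]
```

The option with the largest comprehensive value is the one recommended for the given preference vector; changing the weights can change the ranking.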
Fig. 11. Bar graph of NRMSE; (a) weekend basis forecasting; (b) weekly basis forecasting.
Table 6
Meta-features pool.
| Abbreviation | Description |
| --- | --- |
| N | Length of series |
| | Standard deviation of series |
| | Kurtosis of series |
| | Skewness of series |
| loss | Missing data ratio of series |
| Min | Minimum value of series |
| Max | Maximum value of series |
| N_of_Min | Number of minimal value of series |
| N_of_Max | Number of maximal value of series |
| Mean | Average value of series |
| Max_P_S | Power spectrum: maximal value (by Fast Fourier Transform) |
| Average_P | Average period of series (by Fast Fourier Transform) |
| ta | Differential number to stationary series |
| | Order of the moving average part of the model |
| | Autoregressive model order |
| Sta | Seasonal differential number to stationary series |
| sq | Seasonal order of the moving average part of the model |
| | Seasonal autoregressive model order |
3.4.2. Performance of the second stage of the user preferences and walking slide method
The second stage of the user preferences corresponds to step three, as shown in Fig. 1. Fig. 14 presents the performance of the best prediction model for the 40 cases. If the minimum accuracy requirement for the NRMSE is set to 0.2 (e.g., a limit value determined by the users), cases 25 and 27 cannot meet it. After some attempts on cases 25 and 27 using the "walking slide method", the best NRMSE value can be reduced to less than 0.2.
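The second-stage check amounts to a threshold test on the best NRMSE of each case. The sketch below is illustrative: the function name `cases_needing_walking_slide` is hypothetical, and the rule that a case fails when its best NRMSE is strictly above the user limit is an assumption about how the boundary is treated.

```python
from typing import Dict, Hashable, List


def cases_needing_walking_slide(
    best_nrmse: Dict[Hashable, float], limit: float = 0.2
) -> List[Hashable]:
    """Return the cases whose best single-model NRMSE exceeds the user limit.

    NRMSE is a cost-type score, so a case fails when its value is above the limit.
    The result preserves the insertion order of the input mapping.
    """
    return [case for case, score in best_nrmse.items() if score > limit]
```

Cases returned by this check are the ones handed to the walking slide method; all other cases keep the single model recommended by the meta-learner.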

3.5. Application case studies

To validate the performance on other types of buildings and show the physical significance of the proposed strategy, a new case was studied. The information for the building is presented in Fig. 15. The same settings used in Sections 3.1-3.4 were adopted in the new case, and the validation data were processed at four different frequencies: hourly, daily, weekly, and weekend. The forecasting performance in steps one and two is presented in Table 13. The NRMSE of the best performance prediction model for the validation case was lower than that shown in Figs. 10 and 11. The reason could be that the data recorded from an office building are characterized by greater uncertainty than those from factory buildings. However, this aspect did not affect the credibility of the recommendation system, because the average success rate is the same as the value in Table 7. This indicated that the proposed strategy provided reliable recommendations.
Table 7
Meta-learning performance.
Test data: Cases for COP prediction; Cases for COP and cooling load prediction; Cases for cooling load prediction
Success rate 0.990 0.625 0.990
Cross validation 0.650 0.575 0.650
score 0.400
Best learning rate 0.100 0.500 0.800
Best momentum 0.900 0.200 800
Best batch size 10 300 50
Best epochs 100 60
Note: The optimizer of the ANN in this paper is SGD [54].
Table 8
Recommendation performance for the extreme gradient boosting (XGB) model.
| Test data | Cases for COP prediction | Cases for COP and cooling load prediction | Cases for cooling load prediction |
| --- | --- | --- | --- |
| Success rate | 0.75 | 0.50 | 0.75 |
Note: parameters in the XGB model were optimized by GridSearchCV [61].
Table 9
Decision matrix S for hourly data of B1 in the "best prediction score model".
| Properties | Option 1 | Option 2 | Option 3 | Option 4 | Option 5 |
| --- | --- | --- | --- | --- | --- |
| NRMSE | 0.000 | 1.000 | 0.500 | 0.000 | 0.500 |
| e | 0.802 | 0.000 | 1.000 | 0.802 | 1.000 |
| N | 0.427 | 0.427 | 1.000 | 0.429 | 0.000 |
| Energy saving | 1.000 | 0.996 | 0.000 | 0.892 | 0.793 |
| Initial investment | 0.000 | 0.232 | 0.512 | 0.720 | 1.000 |
0.681 0.590
Table 10
Decision matrix S for daily data of B1 in the "best prediction score model".
| Properties | Option 1 | Option 2 | Option 3 | Option 4 | Option 5 |
| --- | --- | --- | --- | --- | --- |
| NRMSE | 1.000 | 0.000 | 0.500 | 1.000 | 0.500 |
| e | 0.000 | 0.031 | 0.609 | 0.000 | 1.000 |
| N | 0.000 | 0.000 | 0.941 | 0.059 | 1.000 |
| Energy saving | 0.084 | 1.000 | 0.708 | 0.000 | 0.704 |
| Initial investment | 0.000 | 0.232 | 0.512 | 0.720 | 1.000 |
Fig. 12. Coefficients color map for meta-features extraction.
Table 11
Decision matrix S for hourly data of B2 in the "best prediction score model".
| Properties | Option 1 | Option 2 | Option 3 | Option 4 | Option 5 |
| --- | --- | --- | --- | --- | --- |
| NRMSE | 0.000 | 1.000 | 0.500 | 0.000 | 0.500 |
| e | 0.466 | 0.000 | 1.000 | 0.466 | 1.000 |
| N | 0.000 | 0.000 | 0.990 | 0.010 | 1.000 |
| Energy saving | 1.000 | 0.960 | 0.677 | 0.000 | 0.231 |
| Initial investment | 0.000 | 0.232 | 0.512 | 0.720 | 1.000 |
0.819 0.719
Table 12
Decision matrix S for daily data of B2 in the "best prediction score model".
| Properties | Option 1 | Option 2 | Option 3 | Option 4 | Option 5 |
| --- | --- | --- | --- | --- | --- |
| NRMSE | 1.000 | 0.000 | 0.500 | 1.000 | 0.500 |
| e | 0.000 | 0.716 | 0.849 | 0.151 | 1.000 |
| N | 0.000 | 0.000 | 0.868 | 0.132 | 1.000 |
| Energy saving | 0.686 | 1.000 | 0.435 | 0.255 | 0.000 |
| Initial investment | 0.000 | 0.232 | 0.512 | 0.720 | 1.000 |
Fig. 13. Fuzzy comprehensive attribute values for typical cases.
In real applications, a shorter test time and the use of medium-precision sensor equipment would reduce the investment, and users should obtain some recommendations before installing large-scale sensor equipment. Therefore, the sensitivity of the attribute values in step three to the length of the data (N) should be analyzed. As shown in Fig. 16, taking the hourly data for case B1 as an example, the attribute values for the five options were calculated for several data lengths.
Fig. 16(a) shows that the attribute value difference between the original data set and the various data segments was almost stable, especially for options 2 and 5. Additionally, Fig. 16(b) and (c) show that the (approximately ) data segment obtained the best performance for RMSE.

3.6. Discussions

The strategy proposed in this paper is motivated by the need to fill the research gap between different user preferences (application
本文提出这一战略的动机是,需要填补不同用户偏好(应用软件、软件和服务)之间的研究空白。
Fig. 14. Performance of the best-predicted model for 40 cases.
图 14.40 个案例中最佳预测模型的性能。
scenarios) and recommendation systems. In this section, the forecasting results (Figs. 3-11), recommendation results (Tables 7 and 8), user preference results (Tables 9-12, Figs. 13 and 14), and validation results (Figs. 14-16) are discussed.

In the forecasting, the electricity price is an important input if a higher cumulative contribution rate (95%) is needed. Among the compared models (including LSTM), the SARIMAX model achieves the best performance for most cases. This finding differs from previous research (Ref. [17] shows that LSTM is better than the SVR and DF models but does not consider the computation cost). It indicates that appropriate data processing can help traditional models achieve better performance and that neural network models do not have advantages in all cases. However, when the computation cost and accuracy are considered together, LSTM is also a good prediction model in this study. The performance of the COP and cooling load cases is consistent (Figs. 8, 10 and 11) under the same grouping conditions, but data of a single type (simple application scenario) achieve better performance than mixed data (complex application scenario). This indicates that the COP and cooling load data have a clear internal mathematical connection but also differ, and that the more complex the application scenario is, the lower the prediction accuracy is.

For the recommendations, good recommendation accuracy (over 99%; the cooling load-related cases performed better than the COP-related cases) can be obtained when the data are of the same type (only COP or only cooling load, e.g., the simple applications of options 1, 2, and 4); however, mixed data (e.g., the complex application scenario of option 3) lead to relatively low accuracy. This result, discussed above, shows that the application scenario has an important impact on prediction accuracy.

The results of the cross-validation were not ideal for all cases. This phenomenon also appeared in [40], which reported that metamodels in a cross-validation methodology did not lead to convincing results when neural networks, decision trees and support vector machines were implemented. However, the cross-validation score in this study is much lower than the accuracy achieved in [41]. The data sets investigated in [41] and in this study are both based on real cases; the difference is that the data set in [41] was simulated by EnergyPlus. The lower cross-validation score may therefore arise because the data set in [41] is smoother. This conjecture can be partially validated in two respects. On the one hand, the rankings of the five options in the first stage of the user preferences differ evidently even for the same preference, which reflects the differences in the data. On the other hand, comparing the validation success rate (50%) in [41] (obtained with data from a real commercial building at the Iowa Energy Center) with the cross-validation scores in Table 7 of this study (COP prediction cases: 0.65; COP and cooling load prediction cases: 0.575; cooling load prediction cases: 0.65) shows that the recommendation results of this study are more reliable and slightly better than those in [41]. At the same time, a prediction system combined with a monitoring system is the best choice for users most of the time (three-quarters of the cases). In the validation part, the XGB model showed the same pattern as the ANNs: user preferences or application scenarios should be considered for HVAC data forecasting. Therefore, the meta-learning strategy and the proposed "walking slide method" are important tools to guarantee the accuracy of the recommended model and to meet the user requirements.

Fig. 15. Information on the validation case. Building type: office; HVAC system: air source heat pump; HVAC operation schedule: 8:00-18:30; monitoring period: 2018.01.01-2018.8.29; data type: hourly, daily, weekly, weekend. The figure also lists the building height, area, conditioned area, chilled and hot water temperature set points, zone temperature cooling and heating set points, and monitoring frequency (in minutes).
In the application study, another real office case was applied to the proposed strategy, and the average success rate showed that the recommendations were reliable. Data segments of different lengths were also applied to test the sensitivity of the strategy to the data length. The results showed that the strategy was stable in the applications and identified the shortest data segment that still ensures reasonable accuracy.
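The joint consideration of prediction accuracy and computation cost discussed in this section can be sketched with TOPSIS, one common multi-objective decision-making algorithm. This is a minimal illustration and not necessarily the formulation used in the paper: the function `topsis`, the helper `best_model`, and all numbers in `CRITERIA` are hypothetical and chosen only to show how the preferred model changes with the user's weights.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness score of each alternative (higher is better).

    matrix  : one row per alternative, one column per criterion
    weights : user-preference weight of each criterion
    benefit : True if larger values are better for that criterion, else False
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the preference weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal value of every criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_worst = math.dist(row, worst)
        total = d_ideal + d_worst
        # If all alternatives coincide, no ranking is possible; return a neutral score.
        scores.append(d_worst / total if total else 0.5)
    return scores


# Hypothetical values: [NRMSE, computation time in seconds]; both are cost criteria.
MODELS = ["SARIMAX", "RF", "LSTM"]
CRITERIA = [[0.40, 120.0], [0.70, 5.0], [0.45, 30.0]]


def best_model(w_accuracy, w_cost):
    """Name of the top-ranked model for the given preference weights."""
    scores = topsis(CRITERIA, [w_accuracy, w_cost], benefit=[False, False])
    return MODELS[scores.index(max(scores))]


for w in [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]:
    print(w, best_model(*w))
```

With these made-up numbers, an accuracy-only preference selects the most accurate model, a cost-only preference selects the cheapest one, and a balanced preference selects a compromise model, which mirrors the qualitative pattern described above.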

4. Conclusions

The meta-learning strategy proposed in this paper provides an effective method for recommending models for cooling load and COP prediction in an HVAC system based on user preferences. Forty cases are predicted by six different regression models. Users can select the comprehensive optimal option by using the first stage of the user preferences, and if the score of "the best model" in step two cannot meet the user requirements (second stage of the user preferences), the "walking slide method" can be utilized. The main conclusions are as follows:
(a) In general, the comprehensive optimal option can be obtained through the meta-learning strategy by using the first-stage preference of the users. Moreover, the preference information for a given case is determined by the users themselves.
(b) If the meta-learning strategy fails, the "walking slide method" has proven effective for some extremely special cases and can increase the prediction accuracy.
(c) The three-step meta-learning strategy (as shown in Fig. 1) improves on traditional ANN-based recommendation systems and enhances the generalizability of the system.
(d) The prediction accuracy and recommendation success rate decrease as the complexity of the application scenario increases.
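The two-stage flow summarized in the conclusions above can be sketched as a small decision routine. This is only an illustration under stated assumptions: the function `two_stage_selection`, its parameters, the score scale (higher is better), and all example values are hypothetical, and the `fallback` callable merely stands in for the "walking slide method", whose internals are not reproduced here.

```python
from typing import Callable, Dict, Tuple


def two_stage_selection(
    option_scores: Dict[str, float],
    model_scores: Dict[str, Dict[str, float]],
    min_score: float,
    fallback: Callable[[str], str],
) -> Tuple[str, str, str]:
    """Hypothetical sketch of the two-stage user-preference flow.

    option_scores : option -> comprehensive score under the first-stage preference
    model_scores  : option -> {model -> score of that model}, higher is better
    min_score     : second-stage requirement on the score of "the best model"
    fallback      : placeholder for the "walking slide method"
    """
    # First stage: pick the comprehensive optimal option.
    option = max(option_scores, key=option_scores.get)
    # Step two: take the best-scoring model for that option.
    candidates = model_scores[option]
    model = max(candidates, key=candidates.get)
    # Second stage: keep the recommendation only if it meets the requirement.
    if candidates[model] >= min_score:
        return option, model, "recommended model"
    return option, fallback(option), "walking slide method"


# Made-up scores for illustration only.
options = {"option 1": 0.62, "option 2": 0.81, "option 3": 0.40}
models = {
    "option 1": {"SARIMA": 0.70, "SVR": 0.65},
    "option 2": {"SARIMA": 0.55, "LSTM": 0.74},
    "option 3": {"RF": 0.30, "SVR": 0.35},
}
accepted = two_stage_selection(options, models, 0.70, lambda opt: "fallback model")
rejected = two_stage_selection(options, models, 0.90, lambda opt: "fallback model")
print(accepted)
print(rejected)
```

Keeping the fallback as a callable separates the acceptance test of the second stage from the method used for the special cases, so either part can be changed independently.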
This study improves ANN-based meta-learning recommendation systems and enhances their generalizability. Accordingly, it provides a scientific basis for energy prediction applications based on user preferences. A limitation of this study is that the combination of energy saving and thermal comfort is seldom considered in the option optimization. However, the proposed strategy and case studies adequately demonstrate the applicability of the recommendation system, and thermal comfort can easily be incorporated into the decision matrix. This limitation will be addressed in future work, which will provide a more flexible energy utilization strategy that considers the user demand side.

CRediT authorship contribution statement

Wenqiang Li: Conceptualization, Methodology, Software, Data curation, Investigation, Writing - original draft. Guangcai Gong: Conceptualization, Methodology, Supervision. Houhua Fan: Data curation, Investigation. Pei Peng: Writing - review & editing. Liang Chun: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table 13
Performances for the validation case in steps one and two.

| Validation case | Best-performing prediction model | NRMSE of the best model | Recommended best prediction model | Average success rate |
| --- | --- | --- | --- | --- |
| Hourly | SARIMA | 0.422 | SVR | |
| Daily | SVR | 1.376 | SVR | |
| Weekly | SVR | 1.102 | SVR | |
| Weekend | SARIMA | 0.871 | SARIMA | |

Fig. 16. The sensitivities tested for the scale of the data set: (a) attribute value differences between the original data and other data segments; (b) best-performing RMSE for the cooling load with different data segments; (c) best-performing RMSE for the COP with different data segments.
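The agreement between the best-performing and the recommended models listed in Table 13 can be checked with a few lines of code. The exact-match rate computed here is an assumption made for illustration; it is not necessarily how the "average success rate" of the paper is defined. The names `TABLE_13` and `agreement` are hypothetical.

```python
# Rows of Table 13: validation case -> (best-performing model, recommended model).
TABLE_13 = {
    "Hourly": ("SARIMA", "SVR"),
    "Daily": ("SVR", "SVR"),
    "Weekly": ("SVR", "SVR"),
    "Weekend": ("SARIMA", "SARIMA"),
}


def agreement(rows):
    """Return (share of cases where both models match, list of mismatching cases)."""
    mismatches = [case for case, (best, rec) in rows.items() if best != rec]
    rate = 1 - len(mismatches) / len(rows)
    return rate, mismatches


rate, mismatches = agreement(TABLE_13)
print(f"exact-match rate: {rate:.2f}, mismatching cases: {mismatches}")
```

A softer criterion, for example accepting a recommendation whose NRMSE is within a tolerance of the best model, would need the NRMSE of every candidate model and is therefore not shown.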

Acknowledgments

The National Natural Science Foundation of China (Grant no. ) and the National Key Technology Support Program (Grant no. 2015BAJ03B00) provided financial assistance for this study.

Appendix A. Data-driven models

A.1. ARIMA and SARIMA

Eqs. (A.1)-(A.3) express the algorithm for ARIMA.
$p$: denotes the order of the autoregressive part of the model.
$d$: defines the degree of differencing.
$q$: indicates the order of the moving average part of the model.
Fig. A1. The structural chart of the Elman recurrent neural network (RNN).

The equations above involve the true value of the time series and the forecasting error at each time step, the autocorrelation coefficients, and a constant term. $p$ is the order of the autoregressive part, $d$ is the number of differencing operations, and $q$ is the order of the moving average part; differencing yields a new time series. Compared with the ARIMA model, the SARIMA model has an additional seasonal part.
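For reference, the ARIMA($p$, $d$, $q$) model described in this subsection is commonly written in the textbook form below. The notation is an assumption chosen for this sketch and may differ from that of Eqs. (A.1)-(A.3): $y_t$ is the observed series, $B$ the backshift operator ($B y_t = y_{t-1}$), $w_t$ the series after differencing, $\varepsilon_t$ the forecasting error, $\phi_i$ and $\theta_j$ the autoregressive and moving-average coefficients, and $c$ a constant term.

```latex
\begin{align*}
w_t &= (1 - B)^{d}\, y_t, \\
w_t &= c + \sum_{i=1}^{p} \phi_i\, w_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}, \\
\Bigl(1 - \sum_{i=1}^{p} \phi_i B^{i}\Bigr)(1 - B)^{d}\, y_t &= c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^{j}\Bigr)\varepsilon_t .
\end{align*}
```

The third line is the operator form of the first two combined. In the same notation, a SARIMA model with seasonal period $s$ adds seasonal autoregressive, differencing, and moving-average factors in $B^{s}$ to both sides of the operator form.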

A.2. Support vector regression

Support vector regression (SVR) is a type of supervised learning method. More information about SVR can be found in [63].

A.3. Wavelet analysis and WNN

A detailed introduction to the wavelet analysis method can be found in [64]. Furthermore, some researchers combine the wavelet analysis method with a neural network to form a wavelet neural network (WNN). Detailed information about the WNN can be found in [65].

A.4. Elman

The structural chart of the Elman RNN is shown in Fig. A1. The Elman RNN is composed of the input, hidden, adherence (context), and output layers. The adherence layer remembers the previous output of the hidden layer. The whole workflow can be expressed by Eqs. (A.4)-(A.6).
These equations involve the output node vector, the median (hidden) node vector, the feedback state vector, and the input vector, together with the connection weights between the layers. One transfer function is used for the output neurons and another for the median neurons.
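The Elman workflow described above can be sketched as a single forward step. The formulation below is the commonly used one and is an assumption with respect to Eqs. (A.4)-(A.6): the symbols `u` (input), `x` (median-layer output), `x_c` (feedback state), `y` (output), the weights `w1`, `w2`, `w3`, and the choice of `tanh` and identity transfer functions are all illustrative.

```python
import math


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def elman_step(u_prev, x_prev, w1, w2, w3):
    """One time step of an Elman network (assumed standard form).

    x_c(k) = x(k-1)                      adherence layer stores the previous hidden output
    x(k)   = f(w1 x_c(k) + w2 u(k-1))    f = tanh for the median neurons
    y(k)   = g(w3 x(k))                  g = identity for the output neurons
    Returns (y, x) so that x can be fed back at the next step.
    """
    x_c = list(x_prev)
    pre = [a + b for a, b in zip(matvec(w1, x_c), matvec(w2, u_prev))]
    x = [math.tanh(p) for p in pre]
    y = matvec(w3, x)
    return y, x


# Tiny example: 1 input, 2 median neurons, 1 output (weights are made up).
w1 = [[1.0, 0.0], [0.0, 1.0]]   # adherence -> median
w2 = [[1.0], [0.0]]             # input -> median
w3 = [[1.0, 1.0]]               # median -> output
y1, x1 = elman_step([0.5], [0.0, 0.0], w1, w2, w3)
y2, x2 = elman_step([0.0], x1, w1, w2, w3)  # zero input: output driven by memory only
print(y1, y2)
```

In the second call the input is zero, so the output depends only on the state stored by the adherence layer, which is the memory effect mentioned in the text.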

A.5. LSTM

The essence of an RNN is that its hidden layers are connected across time steps. However, the traditional RNN suffers from the vanishing gradient problem. The LSTM model proposed in [66] was therefore introduced (see Fig. A2).
The working principle of LSTM is as follows:
(a) Forget gate: determines how much of the cell state of the previous time step, $c_{t-1}$, is retained in the current cell state $c_t$.
(b) Input gate: determines how much of the current network input $x_t$ is saved to the cell state $c_t$.
(c) Cell update: the previous cell state $c_{t-1}$ is updated to the new cell state $c_t$.
(d) Output gate: controls how much of the cell state $c_t$ is passed to the current output value $h_t$ of the LSTM.
where $W$ are weight matrices, $b$ are bias vectors, $\sigma(\cdot)$ is the sigmoid function, and $\tanh(\cdot)$ is the hyperbolic tangent function.
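The gate descriptions (a)-(d) above correspond to the widely used LSTM formulation written out below. This is the standard textbook form and is given as an assumption; the subscripts on $W$ and $b$, the candidate state $\tilde{c}_t$, the concatenation $[h_{t-1}, x_t]$, and the element-wise product $\odot$ are notation introduced for this sketch.

```latex
\begin{align*}
f_t &= \sigma\bigl(W_f\,[h_{t-1}, x_t] + b_f\bigr) && \text{(a) forget gate} \\
i_t &= \sigma\bigl(W_i\,[h_{t-1}, x_t] + b_i\bigr) && \text{(b) input gate} \\
\tilde{c}_t &= \tanh\bigl(W_c\,[h_{t-1}, x_t] + b_c\bigr) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(c) cell state update} \\
o_t &= \sigma\bigl(W_o\,[h_{t-1}, x_t] + b_o\bigr) && \text{(d) output gate} \\
h_t &= o_t \odot \tanh(c_t) && \text{current output}
\end{align*}
```

Because $c_t$ is updated additively through $f_t \odot c_{t-1}$, gradients can flow across many time steps, which is how the LSTM mitigates the vanishing gradient problem of the traditional RNN.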
Fig. A2. Structure of LSTM.

References

[1] Wang H, Zhang R, Peng J, Wang G, Liu Y, Jiang H, et al. GPNBI-inspired MOSFA for Pareto operation optimization of integrated energy system. Energy Convers Manag 2017;151:524-37.
[2] Ürge-Vorsatz D, Cabeza LF, Serrano S, Barreneche C, Petrichenko K. Heating and cooling energy trends and drivers in buildings. Renew Sust Energy Rev 2015;41:85-98.
[3] Dahanayake KWDKC, Chow CL. Studying the potential of energy saving through vertical greenery systems: Using EnergyPlus simulation program. Energy Build 2017;138:47-59.
[4] Alibabaei N, Fung AS, Raahemifar K. Development of Matlab-TRNSYS co-simulator for applying predictive strategy planning models on residential house HVAC system. Energy Build 2016;128:81-98.
[5] Kokogiannakis G. History and development of validation with the ESP-r simulation program. Build Enviro 2008;43:601-9.
[6] Li Z, Liu T. Improved particle filter based soft sensing of room cooling load. Energy Build 2017;142:56-61.
[7] Amral N, Ozveren CS, King D. Short term load forecasting using Multiple Linear Regression. In: International universities power engineering conference; 2007.
[8] Paliwal KK, Basu A. A speech enhancement method based on Kalman filtering. In: IEEE international conference on acoustics, speech, & signal processing; 1987.
[9] Pankratz A. Forecasting with univariate Box-Jenkins models: concepts and cases; 2008.
[10] Box GEP, Pierce D. Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Publ Am Stat Assoc 1970;65:1509-26.
[11] Conejo AJ, Plazas MA, Espinola R, Molina AB. Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE T Power Syst 2005;20:1035-42.
[12] Zhang BL, Dong ZY. An adaptive neural-wavelet model for short term load forecasting. Electr Pow Syst Res 2001;59:121-9.
[13] Nie H, Liu G, Liu X, Yong W. Hybrid of ARIMA and SVMs for short-term load forecasting. Energy Procedia 2012;16:1455-60.
[14] Kotzur L, Markewitz P, Robinius M, Stolten D. Impact of different time series aggregation methods on optimal energy system design. Renew Energ 2018;117:474-87.
[15] Box G. Box and Jenkins: Time Series Analysis, Forecasting and Control; 2013.
[16] Fan C, Xiao F, Zhao Y. A short-term building cooling load prediction method using deep learning algorithms. Appl Energy 2017;195:222-33.
[17] Xu C, Chen H, Wang J, Guo Y, Yuan Y. Improving prediction performance for indoor temperature in public buildings based on a novel deep learning method. Build Enviro 2019;148:128-35.
[18] Fan C, Wang J, Gang W, Li S. Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Appl Energy 2019;236:700-10.
[19] Naji S, Keivani A, Shamshirband S, Alengaram UJ, Jumaat MZ, Mansor Z, et al. Estimating building energy consumption using extreme learning machine method. Energy 2016;97:506-16.
[20] Chen Y, Peng X, Chu Y, Li W, Wu Y, Ni L, et al. Short-term electrical load forecasting using the Support Vector Regression (SVR) model to calculate the demand response baseline for office buildings. Appl Energy 2017;195:659-70.
[21] Jain RK, Smith KM, Culligan PJ, Taylor JE. Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy. Appl Energy 2014;123:168-78.
[22] Raza MQ, Khosravi A. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew Sust Energy Rev 2015;50:1352-72.
[23] Fu G. Deep belief network based ensemble approach for cooling load forecasting of air-conditioning system. Energy 2018;148. S0360544218302081.
[24] Platon R, Dehkordi VR, Martel J. Hourly prediction of a building's electricity consumption using case-based reasoning, artificial neural networks and principal component analysis. Energy Build 2015;92:10-8.
[25] Deb C, Eang LS, Yang J, Santamouris M. Forecasting diurnal cooling energy load for institutional buildings using Artificial Neural Networks. Energy Build 2016;121:284-97.
[26] Feng YU, Xiaozhong XU. A short-term load forecasting model of natural gas based on optimized genetic algorithm and improved BP neural network. Appl Energy 2014;134:102-13.
[27] He FF, Zhou JZ, Feng ZK, Liu GB, Yang YQ. A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Appl Energy 2019;237:103-16.
[28] He F, Zhou J, Feng Z-k, Liu G, Yang Y. A hybrid short-term load forecasting model based on variational mode decomposition and long short-term memory networks considering relevant factors with Bayesian optimization algorithm. Appl Energy 2019.
[29] He F, Zhou J, Mo L, Feng K, Liu G, He Z. Day-ahead short-term load probability density forecasting method with a decomposition-based quantile regression forest. Appl Energy 2020;262:114396.
[30] Wate P, Iglesias M, Coors V, Robinson D. Framework for emulation and uncertainty quantification of a stochastic building performance simulator. Appl Energy 2020;258.
[31] Zhou Y, Zheng S. Machine-learning based hybrid demand-side controller for high-rise office buildings with high energy flexibilities. Appl Energy 2020;262.
[32] He Y, Qin Y, Wang S, Wang X, Wang C. Electricity consumption probability density forecasting method based on LASSO-Quantile Regression Neural Network. Appl Energy 2019;233:565-75.
[33] Wang L, Lee EWM, Yuen RKK, Feng W. Cooling load forecasting-based predictive optimisation for chiller plants. Energy Build 2019;198:261-74.
[34] May R, Dandy G, Maier H. Review of input variable selection methods for artificial neural networks. Artificial Neural Networks-Methodological Advances and Biomedical Applications; 2011:19-44.
[35] Timmerman ME. Principal component analysis, 2nd ed. In: Jolliffe IT, editor. J Am Stat Assoc, vol. 98; 2003. p. 1082-3.
[36] Markovic R, Grintal E, Woelki D, Frisch J, van Treeck C. Window opening model using deep learning methods. Build Enviro 2018;145:319-29.
[37] Ludermir T. Using machine learning techniques to combine forecasting methods. In: Australian joint conference on advances in artificial intelligence; 2004.
[38] Heidelberg SB. Meta-learning in computational intelligence; 2011.
[39] Vilalta R, Drissi Y. A perspective view and survey of meta-learning; 2002.
[40] Lemke C, Gabrys B. Meta-learning for time series forecasting and forecast combination. Neurocomputing 2010;73:2006-16.
[41] Cui C, Wu T, Hu M, Weir JD, Li X. Short-term building energy model recommendation system: a meta-learning approach. Appl Energy 2016;172:251-63.
[42] Chirarattananon Surapong, Taveekun Juntakan. An OTTV-based energy estimation model for commercial buildings in Thailand. Energy Build 2004;36:680-9.
[43] Tarsitano A, Amerise IL. Short-term load forecasting using a two-stage sarimax model. Energy 2017;133:108-14.
[44] Avci M. Demand response-enabled model predictive HVAC load control in buildings using real-time electricity pricing. Dissertations & Theses - Gradworks; 2013.
[45] Nguyen HT, Nguyen D, Le LB. Home energy management with generic thermal dynamics and user temperature preference. In: IEEE international conference on smart grid communications; 2013.
[46] Chan PML, Hu YF, Sheriff RE. Implementation of fuzzy multiple objective decision making algorithm in a heterogeneous mobile environment. In: Wireless Communications & networking conference; 2002.
[47] Baky IA. Interactive TOPSIS algorithms for solving multi-level non-linear multi-objective decision-making problems. Appl Math Model 2014;38:1417-33.
[48] Niu D. A study on wavelet neural network prediction model of time series. Syst Eng Theory Practice 1999:89-92.
[49] Krichene E, Masmoudi Y, Alimi AM, Abraham A, Chabchoub H. Forecasting using Elman recurrent neural network; 2016.
[50] Louangrath PI. Correlation coefficient according to data classification. Social Science Electronic Publishing; 2014.
[51] Jolliffe IT. Principal component analysis. J Marketing Res 2002;87:513.
[52] Racine J. Consistent cross-validatory model-selection for dependent data: hv-block cross-validation. J Economet 2004;99:39-61.
[53] Kalogirou SA. Artificial neural networks in renewable energy systems applications: a review. Renew Sust Energy Rev 2001;5:373-401.
[54] Neyshabur B, Salakhutdinov R, Srebro N. Path-SGD: path-normalized optimization in deep neural networks. In: International conference on neural information processing systems; 2015.
[55] Cosma AC, Simha R. Machine learning method for real-time non-invasive prediction of individual thermal preference in transient conditions. Build Enviro 2019;148:372-83.
[56] Cai M, Pipattanasomporn M, Rahman S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl Energy 2019;236:1078-88.
[57] Picard RR, Berk KN. Data splitting. Am Stat 1990;44:140-7.
[58] Clark CH, Hussein M, Tsang Y, Thomas R, Nisbet A. A multi-institutional dosimetry audit of rotational intensity-modulated radiotherapy. Radiother Oncol 2014;113.
[59] Carranza EJM, Laborte AG. Random forest predictive modeling of mineral prospectivity with small number of prospects and data with missing values in Abra (Philippines). Comput Geosci 2015;74:60-70.
[60] Lauritzen SL. The EM algorithm for graphical association models with missing data; 1995.
[61] Lavalle SM, Branicky MS. On the relationship between classical grid search and probabilistic roadmaps. Int J Robot Res 2003;23:673-92.
[62] Chinese standard. GB/T 50785-2012 Evaluation standard for indoor thermal environment in civil building. Beijing: China Architecture and Building Press; 2012.
[63] Joachims T. Text categorization with support vector machines: learning with many relevant features. In: Proc conference on machine learning; 1998.
[64] Resnikoff HL, Jr ROW. Wavelet analysis; 1998.
[65] Chen Y, Bo Y, Dong J. Time-series prediction using a local linear wavelet neural network. Neurocomputing 2006;69:449-65.
[66] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997:1735-80.

    • Corresponding author.
    E-mail address: gcgong@hnu.edu.cn (G. Gong).