
Reinforcement learning framework for freight demand forecasting to support operational planning decisions

Lama Al Hajj Hassan, Hani S. Mahmassani*, Ying Chen
Transportation Center, Northwestern University, Chambers Hall, 600 Foster Street, Evanston, IL 60208, United States

ARTICLE INFO

Keywords:
Freight demand forecasting
Time series
Reinforcement learning
Rolling horizon

ABSTRACT

Freight forecasting is essential for managing, planning, operating, and optimizing the use of resources. Multiple market factors contribute to the highly variable nature of freight flows, which calls for adaptive and responsive forecasting models. This paper presents a demand forecasting methodology that supports freight operation planning over short- to long-term horizons. The method combines time series models and machine learning algorithms in a Reinforcement Learning framework applied over a rolling horizon. The objective is to develop an efficient method that reduces the prediction error by taking full advantage of both traditional time series models and machine learning models. In a case study applied to container shipment data for a US intermodal company, the approach succeeded in reducing the forecast error margin. It also allowed predictions to closely follow recent trends and fluctuations in the market while minimizing the need for user intervention. The results indicate that the proposed approach is an effective method to predict freight demand. In addition to clustering and Reinforcement Learning, a method for converting monthly forecasts to long-term weekly forecasts was developed and tested. The results suggest that these monthly-to-weekly long-term forecasts outperform the direct long-term forecasts generated through typical time series approaches.

1. Introduction

Taking 2015 as the base year, total U.S. freight shipments are expected to increase 41 percent by 2045 (Assoc. of Am. Railroads, 2017). In the midst of this growing demand, intermodal operations, which combine truck drayage with rail long haul service, offer a viable option to serve various markets. These operations help reduce trucks on highways at a competitive price. Intermodal traffic is expected to grow as the trucking industry continues to face driver shortages and regulatory changes that raise costs and limit driver hours and miles. The projected cost increases in trucking coupled with growth in e-commerce will further push retailers to seek savings along the supply chain (Stephens, 2017).

To accommodate the projected growth, intermodal carriers need to understand and properly react to regulations, market forces, and evolving customer requirements in a competitive market. The response should be reliable through the coordination between freight and equipment flow (Dewitt and Clinger, 2000). Customers and retailers are expecting more shipping options and higher delivery speeds and reliability, lower prices, flexible destinations, and distribution terminals closer to urban agglomerations (Choe et al., 2017).

These challenges underscore the role of data in planning and managing the supply chain process. Regardless of the mode in question, operators need forecasting models that not only analyze and project based on historical trends and market information, but also adapt to recent changes in customer purchase patterns, policy and so on. Extracting analytical insights from the network data, developing demand projections on a real-time basis, and acting on them will affect carriers' market position (Choe et al., 2017). This dynamic insight-response mechanism promises improved fleet utilization and business growth.

Freight demand has been modeled using regression and time series techniques, behavioral models (Regan and Garrido, 2001), commodity-based input-output models, inventory theory, truck route optimization and simulation approaches (Mahmassani et al., 2007). One critical limitation in existing freight demand models is the inability to adapt quickly to new conditions (Chow et al., 2010). In addition, some models require extensive demand and economic datasets, making implementation time- and resource-consuming (Cambridge Systematics, 2010).

This paper presents a demand forecasting methodology to support operational planning over short- to long-term horizons. The approach combines time series models ("forecasters") in a Reinforcement Learning (RL) framework implemented over a rolling horizon. In the proposed methodology, predictions for each market are generated after clustering the market lanes (different origin-destination pairs) on the basis of observed container demand patterns. The contribution of this paper lies in developing a structured responsive approach that starts by identifying market clusters in a large dataset, and then establishing a suitable subset (committee) of forecasters for each cluster. Constructing a committee of forecasters instead of relying on a single predictor helps improve the prediction accuracy, since no one model is always the best performer (Newbold and Granger, 1974; Granger and Newbold, 1976; Granger and Jeon, 2004; Yang, 2004; Bichpuriya et al., 2016). The approach uses RL logic to overcome the responsiveness limitation of time series models by learning from the recent performance of the committee members. Reinforcement Learning is a computational approach to learning in which the agent learns and adapts through continuous experimentation. An agent's goal is to maximize a suitably defined reward function, so the agent explores several options, discovers which action yields the highest return, and as a result learns what to do (Sutton and Barto, 1998).

In the following section, a review of the relevant literature is presented, followed by a presentation of the methodology in Section 3. The model is then tested, and the results are discussed in Section 4. Finally, Section 5 concludes the paper and suggests future extensions of this work.

2. Literature review

2.1. Freight demand modelling

Several tools have been developed for freight demand modeling to support public and private sector decision making processes (Chow et al., 2010; Tavasszy and De Jong, 2013). Freight demand has been modeled using regression and time series techniques (Fite et al., 2002; Farrington and Harris, 2011), behavioral models (Regan and Garrido, 2001; Ben-Akiva et al., 2008), spatial econometrics techniques (Garrido and Mahmassani, 1998, 2000), commodity-based input-output models (Black, 1999; Holguín-Veras and Patil, 2008; Cascetta et al., 2013), inventory and supply chain theories (Winston, 1983), truck route optimization and simulation approaches (Zhang et al., 2008; Cambridge Systematics, 2010; Nuzzolo et al., 2013).

Behavioral models simulate shippers' decision processes and carrier choices. These can be implemented if survey data are collected from shippers and carriers; however the approach can be time-consuming and not readily available as shipper-carrier interaction data are typically proprietary (Cambridge Systematics, 2010). Input-output models are used to determine the amount of freight by commodity consumed and produced in a region as a result of economic mechanisms. Truck trips are then estimated based on the calculated freight flows. To develop such models, one would require historical and predicted freight flow data and economic indicators. Inventory models simulate the decision-making process between customers and suppliers while seeking to minimize the total transportation cost (shipping, inventory and ordering). Simulation and network models focus on freight mode choice and routing decisions (Mahmassani et al., 2007; Cambridge Systematics, 2010). Time series models are commonly used by public sector agencies and have fewer resource requirements compared to other approaches. However, time series models require a long history of observations and assume previous conditions remain stationary into the future. Therefore, on their own such models do not adapt to new situations and information quickly.

One of the critical issues facing existing tools is their lack of, or delay in, responsiveness to variations in consumer demand and to sudden changes in economic conditions (Cambridge Systematics, 2010). This misalignment between the freight market and the model adjustments could result in missed opportunities and inefficient operations. As a result, this paper uses time series and machine learning models within a reinforcement learning approach to leverage available data and respond quickly to new conditions. The freight demand modeling literature that uses time series analysis has relied on single models to generate point forecasts (Farrington and Harris, 2011) or on multiple models to generate bounds (Liu et al., 2006). The contribution of this paper lies in developing, and testing with actual carrier data, a structured responsive approach that starts by identifying clusters in the data and a suitable committee of forecasters for each. The approach then uses the RL logic to overcome the responsiveness limitation of time series models and learn from the recent performance of the committee members (model components).

2.2. Reinforcement learning

Inspired by behavioral psychology, Reinforcement Learning has been adopted and adapted by multiple disciplines such as game theory, control theory, and operations research, as a mechanism for integrating information from various sources and/or outcomes from repeated trials. In reinforcement learning, given a set of possible actions, the agent discovers the most rewarding decision by trying an action and reaping the reward either immediately or after a set of subsequent actions (i.e., a delayed reward). Because of the structure of this problem, the agent is required to make a tradeoff between exploration of new options that may have higher rewards and exploitation of past actions that led to rewards (Weigang et al., 2008).

In the field of game theory, Erev and Roth (1998) studied the descriptive (best fit) and predictive performance of reinforcement learning models in games with a unique equilibrium. They showed that static equilibrium predictions were outperformed by those from reinforcement learning models. They tested three games with similar equilibrium conditions and noted that the speed at which each experimental subject converged to the equilibrium prediction varied depending on their starting point. They further tested the long-term ramifications of initial conditions on the learning process and argued that although it is possible to improve the initial estimates their effects on the model were limited.

In earlier work, Roth and Erev (1995) introduced two psychological features in their reinforcement learning framework: experimentation and recency effects. By allowing experimentation in the model, a subject's choice set was not limited to past successful ones, thereby preventing subjects from being rapidly locked into one sub-optimal choice. Recency effects recognize that recent events might have a more significant impact on decisions compared to past experience. These two features allow the reinforcement learning model to be more responsive to changes in the environment, especially under partial and general conditions of information availability.

Feltovich (2000) designed a multi-stage game and compared Nash equilibrium predictions to those of reinforcement learning and belief based models. In reinforcement learning models, a player bases his/her choices only on payoffs of past actions whereas in belief based models choices depend on the expected payoff of an action given held beliefs of the likely actions of the other players. Using a set of pre-defined criteria, the author concluded that predictions generated from both learning models substantially outperform those of Nash equilibrium. In fact, in the tested games and after multiple repetitions, the predictions of both models started and remained far from equilibrium. The experimental results were inconclusive as to which learning model (Reinforcement or Belief-Based) outperformed the other, though both models resulted in similar behavioral patterns regardless of their different specifications, suggesting that there might not be a single perfect model that would capture behavior under all circumstances.

While there are numerous examples in the literature for the use of machine learning techniques for planning and management of airport operations, forecasting models for freight operations have been more inclined toward the use of time series and multiple regression models (Bodily and Freeland, 1988; Guerrero and Elizondo, 1997; Fite et al., 2002; Patil and Sahu, 2016). An overview of the use of reinforcement learning for forecasting demand for transportation services and the models used in forecasting freight flow are presented in Table 1.

Table 1 shows that reinforcement learning succeeded in guiding processes and improving decisions in the aviation sector. Review of freight-related literature revealed that there had been attempts to incorporate several machine learning techniques in the forecasting process although time series models are still adopted. This paper seeks to combine the already established time series models with reinforcement learning to advance freight forecasting models and their application to support operational planning decision processes.

3. Methodology

This section presents the overall approach to forecasting freight movement. Assume that we have $N$ oracles or forecasters who provide a forecast for each time period in a given market; in this case, the forecast is the number of containers to be moved in a given week in that market. No forecast is perfect, and sometimes a forecast will hit the mark or get very close, whereas at other times it may miss the mark by varying degrees. Not knowing the past performance of the oracles, the operator (using the forecasts as a basis for planning decisions) may take a simple average of the forecasts as the "consensus" forecast. However, recognizing that some forecasters outperform others in certain instances or for different periods of time, the operator will place different weights on the respective forecasts in forming the "final" forecast. Reinforcement Learning (RL) provides a mechanism for (1) dynamically learning about the respective performance of the different forecasters based on the quality of their past forecasts, placing greater emphasis on more recent instances, and (2) weighing the respective forecasts accordingly at each instance.

In this context, the forecasters, or agents, consist of different statistical or Machine Learning models developed using the same training (estimation) data set, and implemented in a rolling horizon framework to provide forecasts at different time scales. These are run in parallel, and compared against actual realizations once those have materialized, providing an automated basis for scoring the forecasters' respective performance for use in the RL weight updating mechanism. The challenge consists in specifying the reward function and formulating the updating mechanism. The mathematical details are provided in this section as part of the model formulation.

Before developing and applying the forecasting models, container demand originating in a given market is clustered into destination lane groups with similar spatio-temporal patterns, as described in Section 3.1. The overall approach is summarized in Fig. 1. The formulation of the continuous, self-correcting RL model is presented in Section 3.2, followed in Section 3.3 by a description of the time scales over which the forecasting models are applied. Recognizing the presence of seasonality and special events (e.g. holidays), Section 3.4 presents the approach followed to incorporate these effects in the models. Finally Section 3.5 describes the metrics used to evaluate the performance of the model in the case study application.

3.1. Market lane clustering

After correcting for missing data and data errors, the primary markets are identified along with their trends in time and space. The primary markets are defined as the origin terminals where all the containers are loaded. The primary market is the one for which predictions are generated. Primary vs. secondary notation is used only to differentiate between the origin market and the destination markets, respectively. It is important to examine the trends, since the intensity and volatility of the fluctuations in each market hint at the type of time series models that would best capture the different characteristics of demand, such as seasonality and periodicity.

Table 1
Selected freight forecasting methods and reinforcement learning applications.

| Reference | Method | Application | Findings |
| --- | --- | --- | --- |
| Gosavi et al. (2002) | Formulated the problem as a semi-Markov decision problem aiming to maximize the average reward. Objective values defined for each state-action pair were updated within a neural network scheme. | Revenue management problem for a single flight leg | The proposed method outperforms the Expected Marginal Seat Revenue (EMSR), a heuristic that is widely used in the industry. |
| Weigang et al. (2008) | Simulate future airspace demand to identify capacity requirements in each sector for different periods of time. Decision support process is designed as a Markov Decision Chain (MDC); the state information from the MDC is transferred to a reinforcement learning model in which an action is selected and executed, and its corresponding outcomes are used to update the learning process. | Air traffic flow management decisions | Using traffic flow information for Brazil, the model-suggested course of action was close to reality and even led to improvement in certain instances. |
| Tumer and Agogino (2009) | Multi-agent model that responded quickly to weather and airport conditions to limit the local delays. These agents learn continuously through reinforcement learning and provide air traffic controllers with recommendations and decisions. | Air traffic management systems (ATM) | A simulation based on US airspace showed that the proposed model could improve ATM while retaining current flow management procedures, without significant policy shifts. Comparison of simulation results to outputs of a Monte Carlo estimation procedure revealed that the adaptive agents outperformed the reference case. |
| Garrido and Mahmassani (2000) | Multinomial probit (MNP) model for freight demand analysis and flow distribution prediction that captures general spatial and temporal correlation patterns. Given order patterns and information regarding socioeconomic activity, the model forecasts freight flow over space and time for operational and tactical level planning. | Motor carrier company dataset of all shipments picked up in the state of Texas between June 1994 and July 1995 | While predicted probabilities differed from those in the forecasting sample, the modified probit model succeeded in ranking and identifying sites with a higher probability of generating shipments at a given time. |
| Moscoso-López et al. (2016) | Compared the performance of Artificial Neural Networks (ANN) and Support Vector Machines (SVMs) models in predicting freight volume. | Fresh vegetable transportation through RO-RO operations in the Port of Algeciras Bay | The SVMs models performed slightly better than ANN in forecasting the volume of fresh vegetables moved on each day. |

For each market, the corresponding demand dataset is clustered so as to identify groups of destination lanes. Each group consists of lanes with similar patterns, for which separate cluster-customized models are developed. Cluster analysis seeks to find groups in a dataset that increase homogeneity within a group while increasing heterogeneity between groups (Kaufman and Rousseeuw, 2009). There are various algorithms for clustering; here we employ the k-means clustering approach. K-means clustering is an unsupervised approach that starts by locating centroids in the dataset and then assigning each element in the data to its nearest centroid to create groups. The assignment seeks to minimize the sum of the squared distances between the elements and the centroid in a cluster (Schoier and Gregorio, 2017). Orlin et al. (2017) used clustering to reduce errors in sales forecasting for a retail company. Compared to the non-clustered approach, clustering the different items sold by the retail company lowered the average errors by at least 10% (Orlin et al., 2017). Hence, in this study, a market is broken down into various clusters, as described in the case study application in Section 4. For each cluster, a set of forecasting models is selected to comprise the components that are later used in the RL approach to generate the final forecast. Therefore, the predicted demand for a cluster is the weighted average of the predictions of the individual component models, and the weights are updated continuously using RL.
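The clustering step described above can be illustrated with a small sketch. This is not the authors' implementation: the lane names, the four-value seasonal profiles, and the choice to seed the centroids with the first k points are all assumptions made here to keep the example short and deterministic. It only shows the two alternating k-means steps, assignment to the nearest centroid and centroid update.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points
    (a simplifying assumption that keeps the sketch deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each lane joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical lanes: share of annual demand falling in each quarter.
lane_profiles = {
    "lane_A": [0.25, 0.25, 0.25, 0.25],   # flat demand
    "lane_B": [0.10, 0.15, 0.25, 0.50],   # strong year-end peak
    "lane_C": [0.24, 0.26, 0.26, 0.24],   # flat demand
    "lane_D": [0.12, 0.13, 0.30, 0.45],   # strong year-end peak
}

names = list(lane_profiles)
labels, centroids = k_means([lane_profiles[n] for n in names], k=2)
clusters = dict(zip(names, labels))
```

In this toy input the two flat lanes end up in one cluster and the two peaked lanes in the other, so each group could then receive its own committee of forecasters.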

3.2. Continuous and self-correcting model formulation

In developing the forecasting model, the longitudinal dataset is divided into two time periods. The first is used for training (estimation), and the second is used for testing. The estimation set is used for calibrating the statistical time series models and training the machine learning algorithms; forecasts are then produced using these models for the testing portion, along with forecasts for new time intervals, where each time interval is one week in this application.

Fig. 1. Stages in the presented methodology.

The demand forecasting process is designed to operate on a rolling horizon basis, whereby a forecasting horizon is updated (rolled) at the end of every week as new information becomes available. Adopting this rolling horizon approach, depicted in Fig. 2, allows updating the forecasts based on the most recent information. It is accomplished by adding to the estimation dataset the number of moves that occurred in the latest week, dropping the earliest week, and then forecasting. As a result, forecasting becomes a quasi-continuous process that is not only performed at the end of the forecasting horizon.
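The rolling update described above (add the latest week, drop the earliest, forecast again) can be sketched as follows. The weekly counts are made-up numbers, and `naive_mean_forecast` is a hypothetical stand-in for whichever time series or machine learning model would be refitted on the window; neither comes from the paper.

```python
from collections import deque


def naive_mean_forecast(history):
    """Placeholder forecaster (assumption): mean of the last four weeks."""
    recent = list(history)[-4:]
    return sum(recent) / len(recent)


def roll_and_forecast(window, new_observation, forecaster=naive_mean_forecast):
    """Append the latest week, drop the earliest, then re-forecast.

    `window` is a deque with a fixed maxlen, so appending a new week
    automatically discards the oldest one.
    """
    window.append(new_observation)
    return forecaster(window)


# Estimation window of six weekly container counts (made-up numbers).
window = deque([120, 135, 128, 140, 150, 145], maxlen=6)

forecasts = []
for observed in [155, 160, 148]:          # new weeks arriving one at a time
    forecasts.append(roll_and_forecast(window, observed))
```

A fixed-length deque keeps the estimation window at a constant size, which mirrors the add-one, drop-one behavior of the rolling horizon without any explicit bookkeeping.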

Deploying a continuous model might be expected to require greater monitoring effort and to introduce operational inefficiencies. However, the suggested framework incorporates reinforcement learning to maximize efficiency through better deployment and informed execution. Reinforcement learning (RL) allows the model to adapt to recent trends while eliminating the need for direct auditing. The RL notion is introduced in a manner that guides the forecasts generated in the proposed approach.

Since no time series model is capable of fully capturing the demand trend characteristics and one model may not be suitable for
由于没有时间序列模型能够完全捕捉需求趋势特征，并且一个模型可能不适合

Fig. 2. Rolling horizon implementation.
图 2.滚动地平线实现。

forecasting at all times, this proposed approach uses multiple time series models as RL agents. The time series or machine learning models (agents) are chosen based on the initial trend analysis performed at the level of clusters. The forecast of the individual component (agent) is then weighed through the reinforcement learning mechanism to obtain the final forecast. Suppose

time series models,

, are used for forecasting, where

is the

-th time step,

denote the prediction accuracy of component

, which is the fraction of time over the testing period in which component

gives a correct prediction.
在任何时候进行预测，所提出的方法使用多个时间序列模型作为RL代理。时间序列或机器学习模型（代理）是根据在聚类级别执行的初始趋势分析来选择的。然后通过强化学习机制对单个组件（智能体）的预测进行权衡，以获得最终预测结果。假设

时间序列模型

，用于预测，其中

-th 时间步长表示

组件

的预测精度，即组件

给出正确预测的测试周期内的时间分数。

The "correctness" or "accuracy" of a component j's prediction at a time

could be a [0,1] variable, equal to 1 if

's prediction is correct (within a pre-defined acceptable error range) for time t; 0 otherwise. It could also depend on the relative accuracy of the prediction, as implemented in the model presented here:
分量 j 的预测在一次

的“正确性”或“准确性”可以是一个 [0,1] 变量，如果

时间 t 的预测是正确的（在预定义的可接受误差范围内），则等于 1;否则为 0。它还可能取决于预测的相对准确性，如此处介绍的模型中所实现的那样：

Recognizing that each predictor is likely to be accurate with some probability, the "consensus" prediction will be a weighted sum of the values

, i.e.:
认识到每个预测变量在一定概率下可能是准确的，“共识”预测将是值

的加权总和，即：

where

is the weight associated with predictor

for the

-th interval. The weights account for the respective prediction accuracy fractions and the importance of performance in previous time steps. The weights are normalized and sum to

in case ties are allowed, i.e., two components could have the same accuracy at the same time.
其中

，是与

第 -th 个区间的预测变量

关联的权重。权重考虑了各自的预测准确性分数和先前时间步中性能的重要性。权重被归一化，并在允许平局的情况下求和，

即两个分量可以同时具有相同的精度。

The main feature of RL is the manner in which the weights are adjusted from one time step to the next based on the latest information derived from actual observed values. The basic idea is that the weight of each component for the next time step, t + 1, is set based on the outcome of the current time step, t. One might also prefer positive deviations (forecast above actual) to negative deviations (forecast below actual). It is usually better to over-predict than to under-predict, because in the former case it is possible to redeploy or give discounts to customers, while in the latter case customers and goodwill are lost. The preference for over-prediction is reflected by including a penalty coefficient in the weights applied to a forecast. The weight update involves the following quantities:

Deviation penalty for component model j
Forecast for period t from model j
Observation at period t
Weight given to the forecast from model j at period t
Weight given to past data
Number of fitting (component) models used
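The following Python sketch illustrates one way such a weight update could work. It is not the paper's formula: the scoring rule, the function name `update_weights`, and the parameters `memory`, `penalty_over` and `penalty_under` are all assumptions made for illustration. Each component is scored by its penalized relative error, the score is blended with the previous weight (the weight given to past data), and the results are renormalized.

```python
def update_weights(weights, forecasts, actual, memory=0.7,
                   penalty_over=1.0, penalty_under=2.0):
    """Illustrative RL-style weight update (assumed scheme, not the paper's).

    weights:   current normalized weights, one per component model.
    forecasts: component forecasts for the period just observed.
    actual:    observed value for that period (assumed positive).
    memory:    weight given to past data (hypothetical parameter).
    penalty_over / penalty_under: deviation penalties; a larger
        penalty_under expresses a preference for over-prediction.
    """
    scores = []
    for f in forecasts:
        rel_err = abs(f - actual) / actual
        penalty = penalty_under if f < actual else penalty_over
        # Score in [0, 1]: 1 for a perfect forecast, 0 for a large penalized error.
        scores.append(max(0.0, 1.0 - penalty * rel_err))
    raw = [memory * w + (1.0 - memory) * s for w, s in zip(weights, scores)]
    total = sum(raw)
    if total == 0:
        return [1.0 / len(raw)] * len(raw)  # fall back to equal weights
    return [r / total for r in raw]


# Two components with equal starting weights; one over- and one under-predicts.
new_weights = update_weights([0.5, 0.5], [110.0, 90.0], 100.0)
print(new_weights)
```

In this example both components miss by the same absolute amount, but the over-predicting component ends up with the larger weight because under-prediction is penalized more heavily.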

3.3. Forecasting over different time scales

As previously mentioned, the proposed forecasting scheme supports operational planning over both the short term and the longer term. Operations are planned on a weekly time scale, so both short-term and long-term forecasts represent the number of moves per week. Accordingly, the short-term model predicts the number of moves one week and two weeks in advance, whereas the long-term model forecasts the number of moves in each week of the months one and two months in advance.

For the short-term forecast, a suitable number of time series models are chosen for each cluster based on the trend characteristics found during the pre-processing stage. However, time series models have low accuracy when forecasting over long horizons (Nguyen and Chan, 2004), so an alternative to direct long-term weekly forecasting is devised. Instead of directly forecasting weeks far in advance, we forecast the number of moves on a monthly basis, i.e., the number of moves that will occur in each of the next two months separately. Afterwards, the number of moves in each week is calculated as a weighted average of two models: (a) the No Information/Equal Allocation (NIA) model, and (b) the Monthly to Weekly Mapping (MWM) model, with the weights updated using an RL mechanism similar to the one previously described. The final long-term weekly forecast for a given week, regardless of the month it falls in, is therefore the sum of the resultant forecast for that week from the NIA model and the resultant forecast for that week from the MWM model, each multiplied by the RL weight for that model and week.

In the NIA model, the monthly forecasts are allocated equally over the days of the month. The weekly values are then obtained by adding up the days of each week that belong to that month. The MWM model, on the other hand, requires three inputs: a matrix of week-month weights, defined as the proportion of days in each week that fall in month i; the monthly forecasts; and the long-term weekly forecasts generated using time series models. Although time series models are not optimal for long-term forecasting, they are incorporated in the overall model to take advantage of their ability to pick up trends and patterns in the data, albeit not with the best accuracy. Afterwards, the monthly forecasts are mapped into the weeks of each month using these weights. The mapping involves the following quantities:

The fraction of the number of moves in month i that fall in a given week
The proportion of days in that week that fall in month i
The weekly forecast for that week from the component models
The forecasted number of moves for month i
The adjusted forecast for that week based on the monthly forecasts and the weekly pattern
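The week-month day proportions and the equal-allocation (NIA) step can be sketched in Python as follows. The function names are hypothetical, a week is assumed to be the seven consecutive days starting at `week_start`, and the monthly forecast values in the example are made up for illustration.

```python
import calendar
from datetime import date, timedelta


def week_month_proportions(week_start):
    """Proportion of the 7 days starting at week_start that fall in each (year, month)."""
    counts = {}
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        key = (day.year, day.month)
        counts[key] = counts.get(key, 0) + 1
    return {key: n / 7 for key, n in counts.items()}


def nia_weekly_forecast(week_start, monthly_forecasts):
    """No Information / Equal Allocation: spread each monthly forecast evenly
    over the days of its month, then add up the days of the week."""
    total = 0.0
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        days_in_month = calendar.monthrange(day.year, day.month)[1]
        total += monthly_forecasts[(day.year, day.month)] / days_in_month
    return total


# Week starting December 30, 2019: two days in December, five in January.
start = date(2019, 12, 30)
print(week_month_proportions(start))
print(nia_weekly_forecast(start, {(2019, 12): 3100.0, (2020, 1): 1550.0}))
```

A week that lies entirely inside one month gets a proportion of 1 for that month and no entry (equivalently, 0) for every other month.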

By definition, the week-month weight is the proportion of days in a week that fall in month i; for example, the week from December 30, 2019 to January 5, 2020 has 2 days out of 7 in December. Therefore, if a week does not belong to month i, its weight for that month is 0. This is an important property: it ensures that the fractions of a month's moves allocated to its weeks sum to 1; in other words, the number of moves during one month is neither overestimated nor underestimated. This, in turn, ensures that the sum of the weekly forecasts matches the sum of the monthly forecasts over one year.
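A possible form of the monthly-to-weekly mapping is sketched below in Python. The exact formula is an assumption reconstructed from the quantities defined in the text: each week's share of a month is taken as its day proportion times its component-model weekly forecast, normalized over all weeks touching that month. The function name and the example numbers are hypothetical.

```python
def mwm_adjusted_forecasts(proportions, weekly_forecasts, monthly_forecasts):
    """Map monthly forecasts onto weeks (assumed formulation).

    proportions[w][i]:    proportion of days of week w that fall in month i.
    weekly_forecasts[w]:  long-term weekly forecast from the component models.
    monthly_forecasts[i]: forecasted number of moves for month i.
    Returns the adjusted weekly forecasts.
    """
    adjusted = {w: 0.0 for w in weekly_forecasts}
    for month, month_total in monthly_forecasts.items():
        # Unnormalized contribution of each week to this month.
        contrib = {w: proportions[w].get(month, 0.0) * weekly_forecasts[w]
                   for w in weekly_forecasts}
        denom = sum(contrib.values())
        if denom == 0:
            continue  # no week carries any pattern for this month
        for w, c in contrib.items():
            # The shares c / denom sum to 1, so the monthly total is preserved.
            adjusted[w] += month_total * c / denom
    return adjusted


proportions = {"w1": {"m1": 1.0}, "w2": {"m1": 3 / 7, "m2": 4 / 7}, "w3": {"m2": 1.0}}
weekly = {"w1": 100.0, "w2": 140.0, "w3": 120.0}
monthly = {"m1": 320.0, "m2": 400.0}
adjusted = mwm_adjusted_forecasts(proportions, weekly, monthly)
print(adjusted, sum(adjusted.values()), sum(monthly.values()))
```

Because the shares within each month sum to 1, the adjusted weekly forecasts add up to the total of the monthly forecasts, which is the conservation property described above.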

3.4. Accounting for calendar effects

The holiday shopping season is critical for suppliers and retailers who rely on supply chain companies to deliver their products to the right place at the right time. Therefore, it is necessary to account for holidays as covariates in the time series models whenever possible. Nevertheless, this introduces additional challenges known as the calendar effect.

Holidays such as Easter and Thanksgiving occur each year, but their exact timing shifts (the date for Thanksgiving, the day of the week for Christmas), and their effect on the time series may extend over more than one period (one week or one month). Therefore, the observations in the time series (in this case, the number of container moves) depend on the absolute length of the period between moving holidays. This is important because standard time series models, such as ARIMA, do not capture these lags and effects. These models do capture seasonality, but by definition the seasonal component occurs at the same time each year; since these periods and holidays shift by date or day of the week, their effect is not picked up by the models.

Cleveland and Grupe (1981) developed an approach to correct for calendar effects. Their approach was developed for monthly forecasts; in this framework, however, forecasts are generated on a weekly basis, so the approach is modified accordingly. The calendar effects of the different holidays are then added to the models as external covariates (regression variables). The regression variable (effect) of each calendar event is defined in terms of the following quantities:

The number of days before the holiday that are affected by it
The current day
The number of days in the week that belong to the month of the current day and are affected by the holiday
The total number of days in the week

The holidays included in this study are New Year's, Easter, Memorial Day, Fourth of July, Labor Day, Thanksgiving and Christmas. The number of affected days assumed for each holiday is listed as follows:

New Year's - from Dec. 25th until Jan. 2nd
Easter - 10 days
Memorial Day - 7 days
4th of July - 7 days
Labor Day - 7 days
Thanksgiving - 10 days
Christmas - variable; taken as the period starting the day after Thanksgiving up to and including Dec. 24th
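A simplified weekly holiday regressor can be sketched in Python as the share of a week's days that fall inside a holiday's affected window. This is an assumption-laden illustration rather than the modified Cleveland and Grupe formulation used in the paper: the function name is hypothetical, the window is taken to be the given number of days immediately before the holiday, and the per-month split of the affected days is ignored.

```python
from datetime import date, timedelta


def holiday_regressor(week_start, holiday, days_affected):
    """Fraction of the 7 days starting at week_start that lie in the
    window of days_affected days immediately before the holiday."""
    window_start = holiday - timedelta(days=days_affected)
    affected = 0
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        if window_start <= day < holiday:
            affected += 1
    return affected / 7


# Thanksgiving 2016 fell on November 24; a 10-day window is assumed.
thanksgiving = date(2016, 11, 24)
for start in (date(2016, 11, 7), date(2016, 11, 14), date(2016, 11, 21)):
    print(start, holiday_regressor(start, thanksgiving, 10))
```

The resulting values lie between 0 and 1 and can be passed as an external covariate to a model that accepts regressors, such as ARIMAX.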

3.5. Evaluation metrics

Several measures are used to assess the prediction accuracy of the proposed forecasting method. Two statistical measures are utilized: the Mean Absolute Percentage Error (MAPE) and the Mean Absolute Deviation (MAD). Additionally, the results are interpreted graphically using error frequency distributions as well as individual weekly and cumulative forecast graphs. Both statistical measures compare the actual number of moves during period t with the forecasted number of moves for period t: MAD is the average of the absolute differences between the two over all forecasted periods, and MAPE is the average of those absolute differences expressed as a percentage of the actual number of moves.

While MAPE measures the extent of the error in percentage units, MAD measures the error in units of the object forecasted.
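Both metrics are straightforward to compute; the following Python sketch uses the standard definitions. The function names and the sample values are illustrative only, and MAPE assumes that no actual value is zero.

```python
def mad(actual, forecast):
    """Mean Absolute Deviation, in the units of the forecasted quantity."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


actual_moves = [400.0, 500.0]
forecast_moves = [420.0, 450.0]
print(mad(actual_moves, forecast_moves), mape(actual_moves, forecast_moves))
```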

Error frequency distribution graphs allow examination of the performance of the model. Ideally, the distribution is bell-shaped and centered on 0% error, with most of the errors falling within a narrow range around zero. Moreover, the individual weekly graph helps examine how closely the forecasts follow the actual observation patterns as well as their proximity to the real values. The cumulative graphs are used for the long-term weekly pattern model to ensure that the forecasts are catching up with the actual number of moves, and therefore to check and verify the soundness of the proposed approach.

Fig. 3. Clustering market X at the lane level.

4. Case study

The proposed approach is tested on a dataset provided by an intermodal company operating in the USA. The dataset includes information for each move by origin and destination. Moreover, for each move, the dataset has information regarding the commodity type and the order placement, pickup and delivery times. The data used in the model is the weekly number of moves from 2013 to 2017. The information presented below is for a market referred to here as X (for data confidentiality purposes).

After pre-processing the dataset, the moves originating from market X are clustered by lane, and each lane group is then further clustered by commodity type. The clustering at the lane level results in three groups, shown in Fig. 3, with the member destinations of each cluster presented in green, blue and black respectively, and the centroids chosen by the k-means algorithm presented in red. For clarity of presentation, Figs. 3 and 4 show only the first 52 weeks of the training set; the clustering itself was done using the whole training set. The plot shows that the grouping is mainly driven by the frequency of moves between market X and the different destinations. Accordingly, the three clusters are referred to as "More Frequent" (green), "Frequent" (blue), and "Less Frequent" (black), with cluster sizes of one, four and six lanes respectively. Fig. 3 makes clear that the "More Frequent" group is distinct, whereas the other two groups have some overlapping elements.
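The lane-level grouping can be illustrated with a small k-means sketch in plain Python. The implementation, the function name `kmeans`, the fixed starting centroids and the toy weekly series are all assumptions for illustration; the study's actual clustering settings are not given here.

```python
def kmeans(series, centroids, iterations=20):
    """Basic k-means on equal-length series (one weekly-move series per lane).

    centroids: starting centroids, one list per cluster (assumed given).
    Returns (labels, centroids).
    """
    centroids = [list(c) for c in centroids]
    labels = [0] * len(series)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(s, centroids[k])))
            for s in series
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [s for s, lab in zip(series, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Four hypothetical lanes observed over three weeks.
lanes = [[100, 110, 105], [98, 102, 101], [10, 12, 11], [9, 8, 10]]
labels, centers = kmeans(lanes, [[100, 100, 100], [10, 10, 10]])
print(labels, centers)
```

With series of very different magnitudes, as in this toy example, the grouping is driven by the frequency of moves, which matches the behaviour described for Fig. 3.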

Fig. 4. Clustering market X at the commodity level.

Fig. 5. Weekly moves by cluster.

Commodities, categorized by a two-digit code, are clustered based on the number of weekly orders, as shown in Fig. 4. Starting with the original dataset, the weekly number of moves for each two-digit commodity ID is clustered. This results in two distinct groups, presented in blue and green, with cluster sizes of two and 23 commodities respectively. Fig. 4 shows that there is no overlap between the two clusters and that the primary distinction is again the number of weekly moves.

Fig. 5 demonstrates the two main steps in the clustering process. The blue numbers on the arrows indicate the average number of weekly moves by cluster.

As a result, the model is customized for each lane-commodity cluster; in other words, each lane cluster is divided into two commodity subsets, and the model is designed based on the specifications of each. The number of moves from January 2013 to June 2017 is extracted from the dataset and then aggregated to weekly moves for the short-term model and monthly moves for the long-term model. Afterwards, the container moves are divided into two sets: the "Training Dataset", covering January 2013 to June 2015, is used for estimation, and the "Testing Dataset" is predicted and introduced into the model gradually using the rolling horizon approach.

4.1. Short-term model

This model is used for forecasting one week and two weeks in advance. A "hybrid" model is used for the one-week-ahead forecast: data from lane clusters two and three are combined into one cluster, which is then forecast as a whole. The hybrid approach was introduced after the non-hybrid approach failed to produce acceptable results; compared with forecasting the clusters separately, the combination resulted in higher prediction accuracy. Therefore, we decided to combine the two clusters for this case only. The two-week-ahead forecasts, on the other hand, are generated using the three lane clusters previously discussed.

The time series components used in one-week-ahead forecasting are: autoregressive integrated moving average with explanatory variables (ARIMAX) with the holiday covariates, Seasonal and Trend decomposition using Loess (STL), and Simple Moving Average (SMA). The components used in two-week-ahead forecasting are ARIMAX, TBATS, and SMA. In the ARIMAX models, the dependent variable is regressed on its own lagged values and lagged forecasting errors as well as covariates. The model may also apply an initial differencing step to remove non-stationarity. STL identifies the seasonal, trend and irregular components using locally weighted smoothing (local regression). SMA forecasts are calculated as the average of a number of previous observations (Hyndman and Athanasopoulos, 2014). TBATS is a generalization of exponential smoothing that allows for automatic Box-Cox transformation and models the errors as an ARMA process (De Livera et al., 2011).
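As an illustration of the simplest component and of the rolling horizon, the Python sketch below produces SMA forecasts from a fixed-length training window that drops its oldest observation and adds the newest one at each step. The function names, window lengths and data are hypothetical; the other components (ARIMAX, STL, TBATS) would be fitted on the same rolling window using a forecasting library.

```python
def sma_forecast(history, window):
    """Simple Moving Average: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def rolling_sma_forecasts(series, train_len, window, horizon=1):
    """Rolling-horizon SMA forecasts.

    At each step the model sees only the latest train_len observations and
    forecasts the value `horizon` steps ahead (an SMA forecast is flat, so the
    same value is used for any horizon). Returns (target_index, forecast) pairs.
    """
    results = []
    for end in range(train_len, len(series) - horizon + 1):
        training = series[end - train_len:end]
        target_index = end + horizon - 1
        results.append((target_index, sma_forecast(training, window)))
    return results


weekly_moves = [10, 20, 30, 40, 50, 60]
print(rolling_sma_forecasts(weekly_moves, train_len=4, window=2, horizon=1))
print(rolling_sma_forecasts(weekly_moves, train_len=4, window=2, horizon=2))
```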

The training period is taken as 2.5 years and is kept at that length during the entire process of adding and dropping weeks. The initial forecasts were generated using a three-year period; however, using 2.5 years improves the accuracy of the results, which may be related to Erev and Roth's (1998) concept of recency discussed in Section 2. Finally, the forecasts are generated at the lane-commodity cluster level with the positive deviation penalty equal to the negative deviation penalty, as shown in Fig. 6. The evaluation metrics, MAD and MAPE, are presented in Figs. 7 and 8 for the one-week-ahead and two-week-ahead forecasts, as indicated on the axis. The figures also present the metrics as an average over all testing time intervals (All) and over testing time intervals in 2017 only (2017). The proposed approach results in one-week-ahead forecasts with a margin of error of 20 weekly moves on average (Fig. 7); the corresponding MAPE for the overall forecasting period and for 2017 forecasts only is given in Fig. 8. The errors for the two-week-ahead forecasts increase to 50 weekly moves on average and around 12-16% MAPE (Figs. 7 and 8). The MAD values for one and two weeks ahead are reasonable given that the demand in this market fluctuates around 450 moves per week (Fig. 6).

Fig. 6. Aggregate Short Term Forecasts for Market X - Forecasts starting 2017.

Fig. 7. MAD in one-week-ahead and two-week-ahead forecasting for market X.

Analyzing the error distributions (Fig. 9) shows that the relative errors of the one-week-ahead forecasts follow a favorable normal distribution centered on zero. The two-week-ahead errors (Fig. 9) are more spread out, but the majority of the forecasting errors are concentrated around zero.

Figs. 10 and 11 show the cluster-level forecasts for one week ahead. The forecasted values closely follow the actual observations and trends in demand, which is further supported by the low errors presented in Figs. 12 and 13. On the other hand, forecasts for two weeks ahead catch up with the fluctuations at a slower pace (Figs. 14 and 15). This is reflected in the errors in Figs. 16 and 17.

Two-week-ahead forecasts for the remaining clusters are still performed separately, as discussed before. Their forecasts are summed and presented as forecasts for "Lane Cluster 2" to maintain a format consistent with the one-week-ahead forecasts.

4.2. Long-term model

As previously mentioned, the long-term model combines the forecasts of the No Information model and the Monthly to Weekly Mapping model.

The monthly model forecasts the total number of moves that will occur one month and two months after the current month. The one-month-ahead forecasts are generated using the hybrid approach, in which lane clusters 1 and 2 are combined. For that combined cluster, STL and structural time series (StructTS) models are used with a rolling history of 2.5 years. Forecasts are generated at the lane-commodity level, with negative deviations penalized. For the third lane cluster, the same specifications apply except for the components, which consist of SMA, STL, complex exponential smoothing (CES), and StructTS. StructTS is a linear state-space model that decomposes the time series into components (trend, seasonality, etc.) and fits the model by maximum likelihood (Petris and Petrone, 2011). CES is a nonlinear method that introduces complex variables into the simple exponential smoothing method (Svetunkov et al., 2016).

Separate two-month-ahead forecasts are generated for each lane cluster. The components for lane cluster 1 are TBATS and CES. Lane cluster 2 forecasts are generated using STL and CES, whereas lane cluster 3 forecasts are generated using SMA, STL, CES, and StructTS. The previous specifications still apply.

Fig. 8. MAPE in one-week-ahead and two-week-ahead forecasting for market X.

Fig. 9. Error distribution in short-term forecasts (one-week-ahead in red, two-week-ahead in green) for aggregate market X. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 10. One-week-ahead forecasts for market X - Cluster 1.

Fig. 11. One-week-ahead forecasts for market X - hybrid Cluster 2.

Fig. 12. MAD in one-week-ahead forecasting for Lane Clusters 1 and 2 - market X.

Fig. 13. MAPE in one-week-ahead forecasting for Lane Clusters 1 and 2 - market X.

Fig. 14. Two-week-ahead forecasts for market X - Lane Cluster 1.

Based on the results presented above (Figs. 18-21), the monthly forecasts are accurate enough to be incorporated into the long-term weekly model. The calculations presented next are still performed within a rolling horizon framework. First, the weekly patterns are obtained for the "No Information / Equal Allocation" component of the model by allocating the monthly forecasts equally over the days of the month. Then, for each week, the daily moves that belong to that week and month are added up.

Afterwards, the "Monthly to Weekly Mapping" component is calculated as a function of the week-month weights matrix, the monthly forecasts, and the long-term weekly forecasts generated using time series models. The component models used for the long-term weekly forecasts are XGBoost, SMA, and Theta forecasting models. XGBoost is a machine learning algorithm based on tree models. For a given dataset of rows and features, a tree model uses a number of additive functions to predict the output. XGBoost finds the optimal weights for each leaf of a tree by minimizing a loss function that sums up the differences between the actual and predicted values (Xgboost.readthedocs.io, 2016). The Theta model introduces a parameter, theta, that represents the local curvature of the time series. The second-order differences of the observed time series are multiplied by theta, and forecasts are obtained through a weighted average of the modified time series for different values of theta (Hyndman and Billah, 2003).

Fig. 15. Two-week-ahead forecasts for market X - hybrid Lane Cluster 2.

Fig. 16. MAD in two-week-ahead forecasting for Lane Clusters 1 and 2 - market X.

Fig. 17. MAPE in two-week-ahead forecasting for Lane Clusters 1 and 2 - market X.

Fig. 22 presents the actual values and the long-term weekly forecasts generated through the proposed method. Both forecasts tend to stabilize the overall trend and present a more stable basis upon which medium-range operational decisions can be made. The evaluation metrics (Figs. 23-25) reveal that most of the errors fall within a limited range for the one-month-ahead weekly forecasts, with a more spread-out error distribution for the two-month-ahead forecasts (Fig. 25). When typical time series models were tested for long-term weekly forecasting, forecasting 8 and 12 weeks ahead, their MAPE range was higher than the MAPE range of the proposed approach shown in Fig. 24.

Fig. 18. Monthly forecasts for market X - forecasts starting 2016.

Fig. 19. Error distribution (one-month-ahead top, two-month-ahead bottom) of monthly forecasts for market X.

The forecasting errors for the one-month-ahead and two-month-ahead horizons are calculated by summing the forecasted moves of the weeks that fall in each month and comparing the sum to the actual number of moves. These errors are then compared to the errors of the monthly model, which forecasts monthly moves directly. Fig. 26 shows that, for the one-month-ahead forecasts, the proposed long-term weekly model outperforms the direct monthly model in 7 out of the 18 forecasted months. For the two-month-ahead forecasts, the proposed model outperforms the monthly model in half of the forecasting instances (Fig. 26).

4.3. Comparing individual approach to combined approach

As previously mentioned, the intermodal company plans its operations on a weekly horizon, which requires devising models that generate long-term weekly patterns. The suggested model provides these forecasts as a weighted average of the values obtained from an equal allocation of monthly forecasts over weeks and from short-term time series models applied with a long-term forecasting horizon. The weights are initialized and then updated through reinforcement learning. Fig. 27 helps assess the performance of the overall model and its components (No Information and Short Term). In several instances, the short-term model and the No Information model react differently to recent trends: the latter is more stable, whereas the short-term model fluctuates strongly. The results show that the RL weighted approach helps average out the over-prediction and under-prediction of the individual components. In fact, Fig. 28(a) and (b) show that the weekly long-term pattern model can closely follow the trend in actual observations over the long run for both the one-month-ahead and two-month-ahead forecasts.

Fig. 20. MAD in one-month-ahead and two-month-ahead forecasting - Market X.

Fig. 21. MAPE in one-month-ahead and two-month-ahead forecasting - Market X.

Fig. 22. Long-term weekly forecasts for market X - forecasts for 2017.

5. Conclusion

Freight service providers have relied on well-established time series models to forecast shipments and plan their operations in advance. However, with changing markets and policies and evolving customer preferences, these models appear to be losing effectiveness, or are at best a hit-or-miss proposition. Moreover, tightening error margins is becoming more critical, especially in freight applications, with the growing need for faster and more reliable networks. This highlights the need for models that can react and adjust quickly to fluctuations. This paper presents and tests a new approach to freight forecasting that builds on typical time series models and machine learning algorithms, and enhances their performance through a reinforcement learning framework applied over a rolling horizon. The advantage of the proposed approach compared to other models lies in its quick adaptability to recent events in the freight market and its reasonable data requirements. The proposed approach does, however, require a start-up period for RL calibration and training.

Fig. 23. MAD in forecasting one month and two months ahead using the long-term model.

Fig. 24. MAPE in forecasting one month and two months ahead using the long-term model.

The proposed freight forecasting framework is applied to a target market; it groups all the market's lanes and generates cluster-based forecasts. Multiple suitable time series models are chosen for each cluster, and their forecasts are weighted using a reinforcement learning formulation. Reinforcement learning updates the weights assigned to the forecasts of each time series component based on its performance in the latest forecasts. To aid the learning mechanism, forecasts are generated on a rolling horizon basis in which the estimation dataset drops the earliest week and adds the most recent one in each rolling time interval. In addition to clustering and reinforcement learning, we developed and tested a method for converting monthly forecasts to long-term weekly forecasts. The results suggest that these monthly-to-weekly long-term forecasts outperform the direct long-term forecasts generated through typical time series approaches.

The overall framework is tested using market data for a US intermodal company. For the short-term forecasts, the margin of error is about 20 weekly moves for the one-week-ahead forecasts and rises to about 50 weekly moves (around 12-16% MAPE) for the two-week-ahead forecasts. For the long-term weekly forecasts, the margin of error is of similar magnitude in all forecasting periods in 2017 and for both the one-month-ahead and two-month-ahead forecasts (Figs. 23 and 24). Furthermore, an analysis of the forecasted trends reveals that the predictions of the proposed framework can capture and adjust to recent fluctuations in the market. This indicates that RL coupled with a rolling horizon approach has potential benefits in improving forecast quality.

Improving freight forecasting accuracy can help carriers and logistics service providers improve route planning, equipment allocation, labor planning, competitive pricing, and investment decisions. It could also help public sector agencies mitigate congestion by generating adaptive predictions of truck traffic, and inform infrastructure decisions and regulatory policies (Nuzzolo et al., 2013). This study demonstrates a successful new methodology for forecasting freight demand, validated on a representative market (Market X) with short-term and long-term models. Additional work would extend the approach to multiple market lanes and an overall network. Another extension would automate the clustering and component model selection so that the committee of forecasters can be updated if and when needed with minimal intervention.

Fig. 25. Error distribution in forecasting long-term weekly forecasts for months ([...] in red, [...] in green) for market [...]. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 26. Monthly vs. sum of corresponding weekly errors: [...] forecasts (red), [...] forecasts (green). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 27. Comparing different approaches in forecasting the long-term weekly pattern: results for [...] forecasts starting 2017.

Fig. 28. Cumulative actual weekly values vs. cumulative monthly and weekly forecasts: (a) [...] forecasts starting 2016; (b) [...] forecasts starting 2016.

Acknowledgement

This paper is based on a study conducted by the Northwestern University Transportation Center (NUTC) in collaboration with an intermodal transportation company based in the USA, which prefers to remain anonymous. The analysis is based on real-world data provided by the company. The authors have benefited from helpful comments provided by analysts and managers of that company, as well as from the participation of NUTC Associate Director Breton Johnson in facilitating the project. The authors remain responsible for all content of the paper.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.tre.2020.101926.

References

Assoc. of American Railroads, 2017. Rail Intermodal Keeps America Moving. Available at https://www.aar.org/wp-content/uploads/2018/07/AAR-Rail-Intermodal.pdf.

Ben-Akiva, M., Bolduc, D., Park, J.Q., 2008. Discrete choice analysis of shippers' preferences. In: Recent Developments in Transport Modelling: Lessons for the Freight Sector. Emerald Group Publishing Limited, pp. 135-155.

Black, W.R., 1999. Commodity flow modeling. Report No: E-C011. National Research Council (US).

Bichpuriya, Y.K., Soman, S.A., Subramanyam, A., 2016. Combining forecasts in short term load forecasting: Empirical analysis and identification of robust forecaster. Sādhanā 41 (10), 1123-1133.

Bodily, S.E., Freeland, J.R., 1988. A simulation of techniques for forecasting shipments using firm orders-to-date. J. Oper. Res. Soc. 39 (9), 833-846.

Cambridge Systematics, GeoStats, LLP., 2010. Freight-demand Modeling to Support Public-sector Decision Making (Vol. 8). National Cooperative Freight Research Program, Transportation Research Board, Washington, DC.

Cascetta, E., Marzano, V., Papola, A., Vitillo, R., 2013. A multimodal elastic trade coefficients MRIO model for freight demand in Europe. In: Freight Transport Modelling. Emerald Group Publishing Limited, pp. 45-68.

Choe, T., Rosenberger, S., Garza, M., Woolfolk, J., 2017. The future of freight: how new technology and new thinking can transform how goods are moved. Deloitte Insights.

Chow, J.Y., Yang, C.H., Regan, A.C., 2010. State-of-the art of freight forecast modeling: lessons learned and the road ahead. Transportation 37 (6), 1011-1030.

Cleveland, W.P., Grupe, M.R., 1981. Modeling time series when calendar effects are present. Division of Research and Statistics, Federal Reserve Board.

De Livera, A.M., Hyndman, R.J., Snyder, R.D., 2011. Forecasting time series with complex seasonal patterns using exponential smoothing. J. Am. Stat. Assoc. 106 (496), 1513-1527.

Dewitt, W., Clinger, J., 2000. Intermodal freight transportation. Transportation in the New Millennium. Transportation Research Board, Washington, DC.

Erev, I., Roth, A.E., 1998. Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. Am. Econ. Rev.

Farrington, P.A., Harris, G.A., 2011. Methods for Forecasting Freight in Uncertainty: Time Series Analysis of Multiple Factors (Research report No. 930-768). University of Alabama.

Feltovich, N., 2000. Reinforcement-based vs. belief-based learning models in experimental asymmetric-information games. Econometrica 68 (3).

Fite, J.T., Don Taylor, G., Usher, J.S., English, J.R., Roberts, J.N., 2002. Forecasting freight demand using economic indices. Int. J. Phys. Distribut. Logist. Manage. 32 (4), 299-308.

Garrido, R.A., Mahmassani, H.S., 1998. Forecasting short-term freight transportation demand: Poisson STARMA model. Transp. Res. Rec. 1645 (1), 8-16.

Garrido, R.A., Mahmassani, H.S., 2000. Forecasting freight transportation demand with the space-time multinomial probit model. Transport. Res. Part B: Methodol. 34 (5), 403-418.

Granger, C.W., Newbold, P., 1976. Forecasting transformed series. J. Roy. Stat. Soc. B Methodol. 189-203.

Granger, C.W., Jeon, Y., 2004. Thick modeling. Econ. Model. 21 (2), 323-343.

Gosavi, A., Bandla, N., Das, T.K., 2002. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Trans. 34 (9), 729-742.

Guerrero, V.M., Elizondo, J.A., 1997. Forecasting a cumulative variable using its partially accumulated data. Manage. Sci. 43 (6), 879-889.

Holguín-Veras, J., Patil, G.R., 2008. A multicommodity integrated freight origin-destination synthesis model. Networks Spatial Econ. 8 (2-3), 309-326.

Hyndman, R.J., Athanasopoulos, G., 2014. Forecasting: principles and practice. OTexts.

Hyndman, R.J., Billah, B., 2003. Unmasking the Theta method. Int. J. Forecast. 19 (2), 287-290.

Kaufman, L., Rousseeuw, P.J., 2009. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, New York, NY.

Liu, F., Kaiser, R.G., Zekkos, M., Allison, C., 2006. Growth forecasting of vehicle miles of travel at county and statewide levels. Transp. Res. Rec. 1957 (1), 56-65.

Mahmassani, H.S., Zhang, K., Dong, J., Lu, C.C., Arcot, V.C., Miller-Hooks, E., 2007. Dynamic network simulation-assignment platform for multiproduct intermodal freight transportation analysis. Transp. Res. Rec. 2032 (1), 9-16.

Moscoso-López, J., Turias, I.T., Come, M., Ruiz-Aguilar, J., Cerbán, M., 2016. Short-term forecasting of intermodal freight using ANNs and SVR: case of the port of Algeciras bay. Transp. Res. Proc. 18, 108-114.

Newbold, P., Granger, C.W., 1974. Experience with forecasting univariate time series and the combination of forecasts. J. Roy. Statist. Soc. Ser. A General 131-165.

Nguyen, H.H., Chan, C.W., 2004. Multiple neural networks for a long term time series forecast. Neural Comput. Appl. 13 (1), 90-98.

Nuzzolo, A., Coppola, P., Comi, A., 2013. Freight transport modeling: review and future challenges. Int. J. Transport Econ./Rivista internazionale di economia dei trasporti 151-181.

Orlin, J.B., Kumar, M., Patel, N., Woo, J., 2017. Data clustering for forecasting. Available at: http://ebusiness.mit.edu/sponsors/common/2002-June-Wksp-DataM/orlin.pdf.

Patil, G.R., Sahu, P.K., 2016. Estimation of freight demand at Mumbai Port using regression and time series models. KSCE J. Civ. Eng. 20 (5).

Petris, G., Petrone, S., 2011. State space models in R. J. Statist. Software 41 (4), 1-25.

Regan, A.C., Garrido, R., 2001. Freight demand and shipper behavior modeling: state of the art, directions for the future. The leading edge of travel behavior research. Pergamon, New York.

Roth, A.E., Erev, I., 1995. Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term. Games Econ. Behavior 8 (1).

Schoier, G., Gregorio, C., 2017. Clustering algorithms for spatial big data. In: International Conference on Computational Science and Its Applications. Springer, pp. 571-583.

Stephens, B., 2017. Forecast: Intermodal Growth Trend to Continue Next Year. Trains Magazine.

Sutton, R.S., Barto, A.G., 1998. Reinforcement Learning: An Introduction. MIT Press, Cambridge.

Svetunkov, I., Kourentzes, N., Fildes, R., 2016. Complex Exponential Smoothing. Lancaster University.

Tavasszy, L., De Jong, G., 2013. Modelling Freight Transport. Elsevier.

Tumer, K., Agogino, A., 2009. Improving air traffic management with a learning multiagent system. IEEE Intell. Syst. 24 (1), 18-21.

Weigang, L., de Souza, B.B., Crespo, A.M.F., Alves, D.P., 2008. Decision support system in tactical air traffic flow management for air traffic flow controllers. J. Air Transport Manage. 14 (6), 329-336.

Winston, C., 1983. The demand for freight transportation: models and applications. Transport. Res. Part A: General 17 (6), 419-427.

Xgboost.readthedocs.io., 2016. Introduction to Boosted Trees - xgboost 0.90 documentation. [online] Available at: https://xgboost.readthedocs.io/en/latest/tutorials/model.html.

Yang, Y., 2004. Combining forecasting procedures: some theoretical results. Econometric Theory 20 (1), 176-222.

Zhang, K., Nair, R., Mahmassani, H.S., Miller-Hooks, E.D., Arcot, V.C., Kuo, A., Dong, J., Lu, C.C., 2008. Application and validation of dynamic freight simulation-assignment model to large-scale intermodal rail network: Pan-European case. Transp. Res. Rec. 2066 (1), 9-20.

* Corresponding author.
E-mail addresses: lamaalhajjhassan2021@u.northwestern.edu (L. Al Hajj Hassan), masmah@northwestern.edu (H.S. Mahmassani), y-chen@northwestern.edu (Y. Chen).
