The dataset provided to the teams represented an amazing collection of moment-by-moment information for every point after the first two rounds from all of Wimbledon’s 2023 men’s singles matches. Incredible advances in technology have enabled such detailed data capture. Although still small in comparison, data of this nature are a portal into big data, which clamor for data science and mathematical modeling approaches capable of uncovering important patterns that data volume obscures. These skills are of high interest to academia and industry alike and have a strong presence in undergraduate programs around the globe. Moreover, there is a high interest in sports among young people and growing interest in sports analytics. It is no surprise, then, that the teams selecting the 2024 Problem C vastly outnumbered the teams electing any other MCM or ICM problem ever. The many dimensions present in this year’s data afforded teams a wide array of modeling options whose use directly met the contest’s expectations.
Assumptions
The modeling assumptions expected by MCM judges are not just trivial statements presented to meet process expectations. Teams whose list included simplistic statements such as “We assume that players compete to win” or “We assume that the data are correct” were not looked upon favorably. Rather, mathematical statements are preferred; for instance, “We assume that the time series data for [data label] is distributed normally” with justification given perhaps as “The modeling approach requires this characteristic to hold, a point that we demonstrate as sound or well-founded later during validation.” Another example would be: “We assume that the time series data chosen for inclusion in our initial model are not strongly auto-correlated. … We later demonstrate to the contrary and adjust our initial model to compensate.”
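As an illustration of how a mathematical assumption of this kind might later be checked, the following is a minimal sketch that estimates the lag-1 autocorrelation of a series. The function name `autocorrelation`, the sample series `points_won`, and the rule-of-thumb threshold of 2/sqrt(n) are assumptions introduced here for illustration; none of them comes from the contest data or from any team's paper.

```python
from statistics import mean


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mu = mean(series)
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: products of deviations that are `lag` steps apart.
    numer = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return numer / denom


# Hypothetical per-point indicator: 1 if the server won the point, else 0.
points_won = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
r1 = autocorrelation(points_won, lag=1)
# Rough large-sample rule of thumb (an assumption, not a formal test):
# |r| above 2 / sqrt(n) hints at non-negligible autocorrelation.
threshold = 2 / len(points_won) ** 0.5
print(f"lag-1 autocorrelation = {r1:.3f}, rule-of-thumb threshold = {threshold:.3f}")
```

A team that stated the "not strongly auto-correlated" assumption could report a check of this kind in its validation section and, if the check fails, explain how the model was adjusted.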
Data Cleaning
Data cleaning issues are a priority that teams needed to address. The first need was simply to evaluate the validity of the data and deal with outliers and omissions. Teams also needed to clearly document the data source of games used for comparison. If new variables are introduced, they must be clearly defined, as variable names are often insufficient descriptions. The difficulty in interpreting the data is compounded by poorly defined variables. For example, a number of teams that used linear regression did not state clearly that this method does not determine which variables have the most significant impact on the outcome or dependent variable; it can only analyze the ones provided.
Artificial Intelligence
Perhaps as a landmark, this year’s contest was the first time that teams were explicitly allowed to employ artificial intelligence (AI) resources, if they chose to do so, as long as they reported both their queries and AI’s responses in a section (usually at the end of the paper). There is a good deal of controversy associated with AI use in education, mostly around the question of how to leverage its current capabilities to enhance student understanding and strengthen their cognitive skills without compromising or sacrificing their thought processes.
It is our opinion that AI can be appropriately leveraged in a manner somewhat similar to published research, to wit: “standing on the shoulders of Giants” [Newton in 1675 in a letter to Robert Hooke]. Unlike research that has successfully navigated a peer-review process that facilitates quality control and imbues confidence in the results presented, AI results are unchecked and untethered in this regard. Consequently, students are best advised beforehand that they will bear the responsibility for verification in addition to proper citation, which could prove to be a time trap within the already-tight contest time limits.
The teams that used AI this year did so mainly to
improve paragraph or sentence expression,
check or suggest small programming code sections,
assist them in identifying relevant literature sources, or
suggest ways that they might begin to model the concept of momentum in tennis.
It appeared evident from the query results that none of these uses abdicated student thinking or clever mathematical modeling, nor even suggested that AI was capable of supplanting human involvement in mathematical modeling as of yet.
Elements Considered in Judging
We focus on a select group (not an exhaustive list) of performance elements in papers, in order to provide observations and insights that might assist future MCM teams. These elements enabled triage and final judges to stratify papers into scoring categories and ultimately identify those papers that competed for Finalist and Outstanding award designations.
General
Of the 10,000+ papers for this year’s Problem C, an overwhelming majority demonstrated proper use and citation of credible research sources, along with an understanding of how such sources should support a modeling effort. It appears that most teams are dedicating time to this activity up front rather than rushing right into the mathematical modeling and analysis. This is commendable, since it has been a consistent judges’ recommendation for many years to help teams make more efficient use of the available time.
Moreover, the quality of writing, composition, and exposition this year was amazing, given the contest’s requirement for all papers to be submitted in English. Teams representing universities whose first language is not English are to be especially commended in this regard, as both triage and final judges found nearly all papers were a true pleasure to read.
Teams also by and large showed a mastery of dissecting, identifying, and organizing the stated and implied tasks needed to yield a comprehensive model sufficient to meet the problem’s requirements. The degree to which these tasks were accomplished formed a basis for discriminating between top papers and those that fell short. For the MCM contest, this is essentially saying that critical differences between papers appeared to be based on modeling prowess and not poor problem structuring.
Approach to the Data
A unique dataset, such as this year’s, is both a blessing and a curse. There is a richness to its many dimensions that encompasses nearly all potential time-series modeling approaches. Yet at the same time such richness forces teams to make many decisions. The first decision is whether to dive directly into data exploration or to first establish a base understanding of what the data represent and their possible relationship to the questions being asked. From a judge’s perspective, either choice could yield fruitful results, depending on a team’s mathematical skills. Experience generally drives teams to choose one option or the other.
Data Exploration First
Teams that dove directly into data exploration using all of the data in a naïve software-driven approach fared poorly in terms of impressing the judges. This “firehose” approach has been, and will continue to be, discouraged by the MCM judges, mostly because evidence of modeling creativity and critical thinking, features that are a foundation for mathematical modeling, is largely absent. Instead, the approach gives an impression that teams are hoping to discover an effective modeling approach through brute force and serendipity rather than through reflection and selective choice. Moreover, this approach appears to indicate a lack of technical understanding of potential software tools, which is clearly not in the best interest of student teams.
Teams that plugged the dataset elements into multiple machine learning techniques and subsequently rationalized one of the resulting patterns as representing momentum and its swings throughout a game fell short in impressing the judges. There are subtle but important differences between machine-learning algorithms, differences that are driven by underlying assumptions and mathematical data characteristics. Whether machine learning should be used at all depends on the nuances of the specific purpose for which each algorithm was developed. Many machine learning approaches were used in a “black box” fashion with little discussion of why the model parameters were selected; they were also used for cross-validation or sensitivity analysis at the end to show that the choices made sense. A far better approach would have been to choose a single machine-learning method that could be best applied with a small number of assumptions that could later be relaxed to improve the model’s fidelity.
Basic Modeling First
In contrast, teams that began by identifying informative elements that might constitute momentum, as described in credible professional sources, communicated an effort to understand and structure a needed foundation upon which strong modeling justifications could be supported. For many high-performing teams, this resulted in effective explicit mathematical expressions of momentum that enabled visualization of a tennis match’s flow and the identification and prediction of advantage swing. For example, Dynamic Time Warping (DTW) is a widely used technique for analyzing time series, offering a powerful tool to compare sequences that may differ in speed or length.
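To make the DTW idea concrete, here is a minimal sketch of the classic dynamic-programming formulation with an absolute-difference cost. The function name `dtw_distance` and the two sample momentum curves are hypothetical, chosen only to show that a time-stretched copy of a sequence aligns with the original at zero cost; teams may well have used library implementations with different cost functions or window constraints.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical momentum curves: match_b is a time-stretched copy of match_a.
match_a = [0.1, 0.4, 0.8, 0.5, 0.2]
match_b = [0.1, 0.1, 0.4, 0.8, 0.8, 0.5, 0.2]
print(dtw_distance(match_a, match_b))  # the stretched copy aligns at zero cost
```

A pointwise Euclidean comparison could not even be computed for these two sequences because their lengths differ, which is the situation DTW is designed to handle.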
Whether the dynamical expressions were linear or nonlinear depended on the teams’ choices, which were supported by research citations. Judges viewed this approach as more aligned with the philosophy of mathematical modeling than the approach of teams that relied on “black box” software programs. This preference was largely because teams employing dynamical expressions could more effectively explain the construction and significance of each term as it was incorporated into the final expression.
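As a purely hypothetical illustration of an explicit expression whose terms can each be explained, consider an exponentially weighted update M_t = alpha * x_t + (1 - alpha) * M_{t-1}, where x_t is +1 if the player wins point t and -1 otherwise. This is not a formula taken from any team's paper; the name `momentum_series` and the parameters `alpha` and `initial` are assumptions made for the sketch.

```python
def momentum_series(outcomes, alpha=0.3, initial=0.0):
    """Exponentially weighted momentum: M_t = alpha * x_t + (1 - alpha) * M_{t-1}.

    outcomes: sequence of +1 (point won) or -1 (point lost).
    alpha:    weight on the most recent point; larger values forget faster.
    initial:  momentum before the first point (assumed neutral by default).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    momentum = initial
    series = []
    for x in outcomes:
        # The first term rewards or penalizes the latest point; the second
        # term carries forward a discounted memory of earlier points.
        momentum = alpha * x + (1 - alpha) * momentum
        series.append(momentum)
    return series


# Hypothetical run of points for one player.
outcomes = [1, 1, 1, -1, 1, -1, -1, -1]
for t, m in enumerate(momentum_series(outcomes), start=1):
    print(f"point {t}: momentum = {m:+.3f}")
```

Because each term has a stated meaning, a team could justify the choice of `alpha` with a sensitivity analysis or extend the expression with further terms (for example, a serving advantage) and explain each addition.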
Choosing Data to Use
Once a team identified a modeling focus and proceeded into the data, they discovered that the large dataset presented them with an abundance of potential data elements to represent momentum and quantify its changes throughout a tennis game. How to reduce this to a manageable number emerged as the next hurdle. Teams that chose a subset without strong mathematical support did not fare well. Moreover, the traditional TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) model relies heavily on subjective weighting of indicators, which, as a result, can significantly influence the results. Therefore, principal component analysis can be an objective method to analyze the dataset and reduce dimensionality. More common, however, were teams that applied widely accepted data-dimension-reducing techniques that provided insight into the extent to which each element contributed to underlying variance, such as principal component analysis, linear discriminant analysis, singular value decomposition, autoregressive integrated moving average (ARIMA), and entropy. For example, an ARIMA model can be a reasonable and effective way to predict the trend of match momentum. The results, coupled with appropriate assumptions, enabled teams to pick a subset of data to analyze.
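One common way to replace subjective indicator weights with data-driven ones is the entropy weight method. The text above lists only "entropy," so reading it as this particular method is an assumption, as are the function name `entropy_weights` and the sample indicator matrix. The sketch assumes nonnegative indicator values in which larger is better.

```python
import math


def entropy_weights(rows):
    """Entropy weight method for an n-by-m matrix of nonnegative indicators.

    Indicators whose values vary more across observations carry more
    information (lower entropy) and therefore receive larger weights.
    """
    n, m = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two observations")
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        if any(v < 0 for v in column):
            raise ValueError("indicator values must be nonnegative")
        total = sum(column)
        if total == 0:
            raise ValueError("indicator column sums to zero")
        probs = [v / total for v in column]
        # Normalized Shannon entropy in [0, 1]; the term 0 * log(0) is taken as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm == 0:
        raise ValueError("all indicators are uniform; weights are undefined")
    return [d / norm for d in divergences]


# Hypothetical per-set indicators: [aces, winners, break points won].
indicators = [
    [3, 12, 1],
    [5, 10, 0],
    [2, 15, 3],
    [4, 11, 1],
]
print([round(w, 3) for w in entropy_weights(indicators)])
```

The resulting weights sum to one and could feed a TOPSIS-style ranking in place of weights chosen by hand.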
In addition, successful groups used a variety of methods for analysis, such as logistic regression, cubic spline interpolation, neural networks, XGBoost, Monte Carlo, sliding window, and random forest. Even the use of Bartlett’s test of sphericity can help determine if there is a high degree of internal correlation in the data. The goal with the data was to filter outliers and anomalies that would increase the variance, while preserving the momentum change trends so that nonstationary and nonlinear data values can be effectively processed. In sum, the goal of successful teams was to
predict momentum and the occurrence of swings without reducing the dimensionality of the original dataset, and
provide both coaches and players with a proper predictive interval.
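For reference, Bartlett’s test of sphericity, mentioned above, is usually stated as follows. The symbols are introduced here and do not appear in the commentary: n is the number of observations, p the number of variables, and R the p-by-p sample correlation matrix. The null hypothesis is that R is the identity matrix, that is, that the variables are uncorrelated.

```latex
\chi^2 = -\left( n - 1 - \frac{2p + 5}{6} \right) \ln \lvert R \rvert ,
\qquad
\text{df} = \frac{p(p - 1)}{2}
```

A large statistic relative to the chi-square distribution with the stated degrees of freedom suggests enough internal correlation for techniques such as principal component analysis to be worthwhile.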
Common Shortfalls
Since the MCM’s inception, teams have repeatedly been advised to identify and justify any (and all) modeling assumptions, as a necessary modeling process requirement. While still valid advice, evidence provided in team papers this year makes it apparent that more should be said regarding this advice, if for no other reason than to clarify what judges are looking for.
Model assumptions should not only be discussed but also rigorously assessed. Unfortunately, evaluating the appropriateness and adequacy of models is often overlooked or inadequately performed. An Outstanding paper distinguishes itself through thorough sensitivity analysis and comprehensive model testing.
Additionally, the problem underscored the significance of aligning modeling approaches with the goals of the analysis. For instance, linear models, such as regression and analysis of variance, benefit from residual analysis, which helps validate results. The insights gained from identifying issues can be invaluable for decision-makers.
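A minimal sketch of residual analysis for a one-variable linear regression follows. The helper names `fit_line`, `residuals`, and `durbin_watson`, and the sample data, are hypothetical. The Durbin-Watson statistic lies between 0 and 4, and values near 2 are commonly read as little first-order autocorrelation in the residuals.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x is constant; slope is undefined")
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def durbin_watson(res):
    """Sum of squared successive differences divided by sum of squares."""
    denom = sum(e * e for e in res)
    if denom == 0:
        raise ValueError("all residuals are zero")
    return sum((res[t] - res[t - 1]) ** 2 for t in range(1, len(res))) / denom


# Hypothetical data: x = game number, y = a momentum score for one player.
x = list(range(1, 11))
y = [0.9, 2.1, 2.8, 4.2, 4.9, 6.1, 7.2, 7.8, 9.1, 9.9]
intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"mean residual = {mean(res):.2e}, Durbin-Watson = {durbin_watson(res):.3f}")
```

Plotting `res` against `x` or against the fitted values would complement these numbers by revealing curvature or changing variance that a single statistic can hide.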
Ultimately, selecting and testing the right model, combined with clear explanations and well-crafted graphs, is crucial for a modeling paper to receive top marks.
Common Pluses
Verifying Assumptions
Unless a team is extraordinarily gifted, lucky, or has unlimited time and resources, they will inevitably face the challenge of bridging the gap between real-world phenomena and mathematical methods. To navigate this, teams often make assumptions that reduce their model’s ability to capture every detail, such as the complexities of momentum. These assumptions typically stem from the mathematical requirements that justify the choice of a particular analytical method familiar to the team. To make timely progress, teams should identify facts that need to be checked and verified as they arise. While temporarily accepting unverified assumptions allows the team to move forward, verifying and relaxing these assumptions ultimately strengthens the model. Conversely, leaving them unverified weakens it.
Actionable Recommendations
Recommendations provided by teams at the end of their papers need to be actionable advice associated with what the modeling effort accomplished. In the case of this problem, the advice was to be directed to tennis coaches. Recommendations such as “Players should practice serving…” and “Players should maintain momentum as long as possible…” communicate to judges that teams do not understand why recommendations are provided to the reader and greatly detract from the paper’s quality.
Informative Graphs
Outstanding papers present results in a way that is accessible to nontechnical readers. Such papers include graphs that go beyond simple illustrations of data, such as the oscillation of momentum or player performance records. Instead, the graphs possess explanatory power, reinforcing the findings and conclusions, thereby making the results more credible. The graphs are not only easy to interpret but also enhance the overall summary.
Conclusion
Despite the accessibility of phenomenal modeling tools and computational software, human reasoning and logic continue to be irreplaceable in the modeling process. While advanced tools can handle vast amounts of data and perform complex calculations with precision, they lack the intuitive understanding and contextual awareness that humans bring to the table. Human expertise is crucial for interpreting results, identifying anomalies, and making informed decisions based on the data. This blend of computational power and human insight ensures that models are not only accurate but also meaningful and applicable to real-world scenarios.
About the Authors
Mark Arvidson is a professor of mathematics at Azusa Pacific University as well as an experienced teacher, educator, consultant, and presenter. He is also the Director of the Math Fellows program and trains both elementary and secondary math teachers. Dr. Arvidson did his undergraduate work at Wheaton College in mathematics; his graduate studies were completed at the Claremont Graduate University with an emphasis in mathematics education. As an applied mathematician, he has been a final judge for the MCM for five years.
Patrick J. Driscoll is the former Director of the MCM and has been an MCM judge since 1991. Graduating from the US Military Academy at West Point, he served as an officer in the US Army for 22 years, during which time he obtained an M.S. in Operations Research and also an M.S. in Engineering Economic Systems from Stanford University, along with a Ph.D. in Systems Engineering from Virginia Tech. He served on the faculty of the Dept. of Mathematical Sciences and the Dept. of Systems Engineering at West Point, and was promoted to Emeritus Professor of Systems Engineering in 2022.