The Committee on Foundational Research Gaps and Future Directions for Digital Twins uses the following definition of a digital twin, modified from a definition published by the American Institute of Aeronautics and Astronautics (AIAA Digital Engineering Integration Committee 2020):
A digital twin is a set of virtual information constructs that mimics the structure, context, and behavior of a natural, engineered, or social system (or system-of-systems), is dynamically updated with data from its physical twin, has a predictive capability, and informs decisions that realize value. The bidirectional interaction between the virtual and the physical is central to the digital twin.
The study committee's refined definition refers to "a natural, engineered, or social system (or system-of-systems)" to describe digital twins of physical systems in the broadest sense possible, including the engineered world, natural phenomena, biological entities, and social systems. This definition introduces the phrase "predictive capability" to emphasize that a digital twin must be able to issue predictions beyond the available data to drive decisions that realize value. Finally, this definition highlights the bidirectional interaction, which comprises feedback flows of information from the physical system to the virtual representation and from the virtual back to the physical system to enable decision-making, either automatic or with humans-in-the-loop.
THE PROMISE OF DIGITAL TWINS
Digital twins hold immense promise in accelerating scientific discovery and revolutionizing industries. Digital twins can be a critical tool for decision-making based on a synergistic combination of models and data. The bidirectional interplay between a physical system and its virtual representation endows the digital twin with a dynamic nature that goes beyond what has been traditionally possible with modeling and simulation, creating a virtual representation that evolves with the system over time. By enabling predictive insights and effective optimizations, monitoring performance to detect anomalies and exceptional conditions, and simulating dynamic system behavior, digital twins have the capacity to revolutionize scientific research, enhance operational efficiency, optimize production strategies, reduce time-to-market, and unlock new avenues for scientific and industrial growth and innovation. The use cases for digital twins are diverse and proliferating, with applications across multiple areas of science, technology, and society, and their potential is wide-reaching. Yet key research needs remain to advance digital twins in several domains.
This report is the result of a study that addressed the following key topics:
Definitions of and use cases for digital twins;
Foundational mathematical, statistical, and computational gaps for digital twins;
Best practices for digital twin development and use; and
Opportunities to advance the use and practice of digital twins.
While there is significant enthusiasm around industry developments and applications of digital twins, the focus of this report is on identifying research gaps and opportunities. The report's recommendations are particularly targeted toward what agencies and researchers can do to advance mathematical, statistical, and computational foundations of digital twins.
ELEMENTS OF THE DIGITAL TWIN ECOSYSTEM
The notion of a digital twin goes beyond simulation to include tighter integration between models, data, and decisions. The dynamic, bidirectional interaction tailors the digital twin to a particular physical counterpart and supports the evolution of the virtual representation as the physical counterpart evolves. This bidirectional interaction is sometimes characterized as a feedback loop, where data from the physical counterpart are used to update the virtual models, and, in turn, the virtual models are used to drive changes in the physical system.
This feedback loop may occur in real time, such as for dynamic control of an autonomous vehicle or a wind farm, or it may occur on slower time scales, such as post-flight updating of a digital twin for aircraft engine predictive maintenance or post-imaging updating of a digital twin and subsequent treatment planning for a cancer patient.
The digital twin provides decision support when a human plays a decision-making role, or decision-making may be shared jointly between the digital twin and a human as a human-agent team. Human-digital twin interactions may also involve the human playing a crucial role in designing, managing, and operating elements of the digital twin, including selecting sensors and data sources, managing the models underlying the virtual representation, and implementing algorithms and analytics tools.
Finding 2-1: A digital twin is more than just simulation and modeling.
Conclusion 2-1: The key elements that comprise a digital twin include (1) modeling and simulation to create a virtual representation of a physical counterpart, and (2) a bidirectional interaction between the virtual and the physical. This bidirectional interaction forms a feedback loop that comprises dynamic data-driven model updating (e.g., sensor fusion, inversion, data assimilation) and optimal decision-making (e.g., control, sensor steering).
These elements are depicted in Figure S-1.
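The feedback loop in Conclusion 2-1 can be illustrated with a minimal sketch. It assumes a scalar state, a linear virtual model, and Gaussian noise; the function names (`assimilate`, `decide`, `run_loop`) and all numerical values are hypothetical choices for illustration and do not come from the report. The physical-to-virtual step is a scalar Kalman update, one simple form of data assimilation, and the virtual-to-physical step is a proportional controller.

```python
import random

def assimilate(mean, var, obs, obs_var):
    """Physical-to-virtual: scalar Kalman update of the twin's state estimate."""
    gain = var / (var + obs_var)
    new_mean = mean + gain * (obs - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

def decide(mean, target, k=0.5):
    """Virtual-to-physical: proportional control computed from the twin's estimate."""
    return k * (target - mean)

def run_loop(steps=50, target=10.0, seed=0):
    rng = random.Random(seed)
    true_state = 0.0            # physical counterpart (hidden from the twin)
    mean, var = 5.0, 4.0        # twin's initial belief (deliberately wrong)
    process_var, obs_var = 0.01, 0.25   # assumed noise levels
    for _ in range(steps):
        obs = true_state + rng.gauss(0.0, obs_var ** 0.5)      # sensor reading
        mean, var = assimilate(mean, var, obs, obs_var)        # update the twin
        u = decide(mean, target)                               # twin informs the action
        true_state += u + rng.gauss(0.0, process_var ** 0.5)   # physical system evolves
        mean, var = mean + u, var + process_var                # twin predicts forward
    return true_state, mean, var
```

Calling `run_loop()` returns the final physical state together with the twin's estimate and its variance. In this toy setting the loop runs at every step; the slower time scales mentioned above would correspond to calling `assimilate` only when new data arrive.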
An important theme that runs throughout this report is the notion that the digital twin virtual representation be "fit for purpose," meaning that the virtual representation (model types, fidelity, resolution, parameterization, and quantities of interest) be chosen, and in many cases dynamically adapted, to fit the particular decision task and computational constraints at hand.
Conclusion 3-1: A digital twin should be defined at a level of fidelity and resolution that makes it fit for purpose. Important considerations are the required level of fidelity for prediction of the quantities of interest, the available computational resources, and the acceptable cost. This may lead to the digital twin including high-fidelity, simplified, or surrogate models, as well as a mixture thereof. Furthermore, a digital twin may include the ability to represent and query the virtual models at variable levels of resolution and fidelity depending on the particular task at hand and the available resources (e.g., time, computing, bandwidth, data).
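One way to read "fit for purpose" operationally is as a selection among available models given an accuracy need and a resource budget. The sketch below is an assumption-laden illustration, not a method from the report: the `ModelOption` fields, the `select_model` rule, and the error and runtime figures in `OPTIONS` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelOption:
    name: str
    expected_error: float   # estimated error in the quantity of interest
    runtime_s: float        # estimated wall-clock cost per query

def select_model(options, max_error, time_budget_s):
    """Return the cheapest option that meets the accuracy need within the budget.

    Falls back to the most accurate affordable option, or None if nothing fits.
    """
    affordable = [m for m in options if m.runtime_s <= time_budget_s]
    if not affordable:
        return None
    adequate = [m for m in affordable if m.expected_error <= max_error]
    if adequate:
        return min(adequate, key=lambda m: m.runtime_s)
    return min(affordable, key=lambda m: m.expected_error)

# Hypothetical catalogue mixing surrogate, simplified, and high-fidelity models.
OPTIONS = [
    ModelOption("surrogate", expected_error=0.10, runtime_s=0.01),
    ModelOption("simplified", expected_error=0.03, runtime_s=5.0),
    ModelOption("high_fidelity", expected_error=0.005, runtime_s=3600.0),
]
```

For example, `select_model(OPTIONS, max_error=0.05, time_budget_s=60.0)` picks the simplified model, while loosening the tolerance to 0.2 picks the surrogate. A real digital twin would need error estimates that themselves come with uncertainty, which this sketch ignores.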
FIGURE S-1 Elements of the digital twin ecosystem.
NOTES: Information flows bidirectionally between the virtual representation and physical counterpart. These information flows may be through automated processes, human-driven processes, or a combination of the two.
An additional consideration is the complementary role of models and data: a digital twin is distinguished from traditional modeling and simulation in the way that models and data work together to drive decision-making. In cases in which an abundance of data exists and the decisions to be made fall largely within the realm of conditions represented by the data, a data-centric view of a digital twin is appropriate: the data form the core of the digital twin, the numerical model is likely heavily empirical, and analytics and decision-making wrap around this numerical model. In other cases that are data-poor and call on the digital twin to issue predictions in extrapolatory regimes that go well beyond the available data, a model-centric view of a digital twin is appropriate: a mathematical model and its associated numerical model form the core of the digital twin, and data are assimilated through the lens of these models. An important need is to advance hybrid modeling approaches that leverage the synergistic strengths of data-driven and model-driven digital twin formulations.
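A very small example of a hybrid formulation is a mechanistic model plus a data-driven correction fitted to its residuals. The sketch below assumes a linear mechanistic law with a known parameter and a quadratic discrepancy term; the functions, the `stiffness` parameter, and the synthetic data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def mechanistic(x, stiffness=2.0):
    """Model-driven part: an assumed linear law with a known parameter."""
    return stiffness * x

def fit_residual(xs, ys):
    """Data-driven part: least-squares fit of (y - mechanistic(x)) ~ c * x**2."""
    num = sum((y - mechanistic(x)) * x**2 for x, y in zip(xs, ys))
    den = sum(x**4 for x in xs)
    return num / den

def hybrid(x, c):
    """Hybrid prediction: mechanistic model plus the learned discrepancy term."""
    return mechanistic(x) + c * x**2

# Synthetic "observations" from a hypothetical truth y = 2x + 0.3x^2.
xs = [0.5 * i for i in range(1, 9)]
ys = [2.0 * x + 0.3 * x**2 for x in xs]
c = fit_residual(xs, ys)
```

Here the fitted coefficient recovers the missing quadratic term, so `hybrid(10.0, c)` extrapolates well beyond the data range. That only works because the assumed form of the correction happens to match the synthetic truth; in practice the discrepancy structure is unknown, which is part of why hybrid modeling is flagged as a research need.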
ADVANCING DIGITAL TWIN STATE OF THE ART REQUIRES AN INTEGRATED RESEARCH AGENDA
Despite the existence of examples of digital twins providing practical impact and value, the sentiment expressed across multiple committee information-gathering sessions is that the publicity around digital twins and digital twin solutions currently outweighs the evidence base of success.
Conclusion 2-5: Digital twins have been the subject of widespread interest and enthusiasm; it is challenging to separate what is true from what is merely aspirational, due to a lack of agreement across domains and sectors as well as misinformation. It is important to separate the aspirational from the actual to strengthen the credibility of the research in digital twins and to recognize that serious research questions remain in order to achieve the aspirational.
Conclusion 2-6: Realizing the potential of digital twins requires an integrated research agenda that advances each one of the key digital twin elements and, importantly, a holistic perspective of their interdependencies and interactions. This integrated research agenda includes foundational needs that span multiple domains as well as domain-specific needs.
Recommendation 1: Federal agencies should launch new crosscutting programs, such as those listed below, to advance mathematical, statistical, and computational foundations for digital twins. As these new digital twin-focused efforts are created and launched, federal agencies should identify opportunities for cross-agency interactions and facilitate cross-community collaborations where fruitful. An interagency working group may be helpful to ensure coordination.
National Science Foundation (NSF). NSF should launch a new program focused on mathematical, statistical, and computational foundations for digital twins that cuts across multiple application domains of science and engineering.
The scale and scope of this program should be in line with other multidisciplinary NSF programs (e.g., NSF Artificial Intelligence Institutes) to highlight the technical challenge being solved as well as the emphasis on theoretical foundations being grounded in practical use cases.
Ambitious new programs launched by NSF for digital twins should ensure that sufficient resources are allocated to the solicitation so that the technical advancements are evaluated using real-world use cases and testbeds.
NSF should encourage collaborations across industry and academia and develop mechanisms to ensure that small and medium-sized industrial and academic institutions can also compete and be successful leading such initiatives.
Ideally, this program should be administered and funded by multiple directorates at NSF, ensuring that from inception to sunset, real-world applications in multiple domains guide the theoretical components of the program.
Department of Energy (DOE). DOE should draw on its unique computational facilities and large instruments coupled with the breadth of its mission as it considers new crosscutting programs in support of digital twin research and development. It is well positioned and experienced in large, interdisciplinary, multi-institutional mathematical, statistical, and computational programs. Moreover, it has demonstrated the ability to advance common foundational capabilities while also maintaining a focus on specific use-driven requirements (e.g., predictive high-fidelity models for high-consequence decision support). This collective ability should be reflected in a digital twin grand challenge research and development vision for DOE that goes beyond the current investments in large-scale simulation to advance and integrate the other digital twin elements, including the physical/virtual bidirectional interaction and high-consequence decision support. This vision, in turn, should guide DOE's approach in establishing new crosscutting programs in mathematical, statistical, and computational foundations for digital twins.
National Institutes of Health (NIH). NIH should invest in filling the gaps in digital twin technology in areas that are particularly critical to biomedical sciences and medical systems. These include bioethics, handling of measurement errors and temporal variations in clinical measurements, capture of adequate metadata to enable effective data harmonization, complexities of clinical decision-making with digital twin interactions, safety of closed-loop systems, privacy, and many others. This could be done via new cross-institute programs and expansion of current programs such as the Interagency Modeling and Analysis Group.
Department of Defense (DoD). DoD's Office of the Under Secretary of Defense for Research and Engineering should advance the application of digital twins as an integral part of the digital engineering performed to support system design, performance analysis, developmental and operational testing, operator and force training, and operational maintenance prediction. DoD should also consider using mechanisms such as the Multidisciplinary University Research Initiative and Defense Acquisition University to support research efforts to develop and mature the tools and techniques for the application of digital twins as part of system digital engineering and model-based system engineering processes.
Other federal agencies. Many federal agencies and organizations beyond those listed above can play important roles in the advancement of digital twin research. For example, the National Oceanic and Atmospheric Administration, the National Institute of Standards and Technology, and the National Aeronautics and Space Administration should be included in the discussion of digital twin research and development, drawing on their unique missions and extensive capabilities in the areas of data assimilation and real-time decision support.
Verification, Validation, and Uncertainty Quantification: Foundational Research Needs and Opportunities
Verification, validation, and uncertainty quantification (VVUQ) is an area of particular need that necessitates collaborative and interdisciplinary investment to advance the responsible development, implementation, monitoring, and sustainability of digital twins. Evolution of the physical counterpart in real-world use conditions, changes in data collection, noisiness of data, addition and deletion of data sources, changes in the distribution of the data shared with the virtual twin, changes in the prediction and/or decision tasks posed to the digital twin, and evolution of the digital twin virtual models all have consequences for VVUQ.
VVUQ must play a role in all elements of the digital twin ecosystem. In the digital twin virtual representation, verification and validation play key roles in building trustworthiness, while uncertainty quantification gives measures of the quality of prediction. Many of the elements of VVUQ for digital twins are shared with VVUQ for computational models (NRC 2012), although digital twins bring some additional challenges. Common challenges arise from model discrepancies, unresolved scales, surrogate modeling, and the need to issue predictions in extrapolatory regimes. However, digital twin VVUQ must also address the uncertainties associated with the physical counterpart, including changes to sensors or data collection equipment, and the continual evolution of the physical counterpart's state. Data quality improvements may be prioritized based on the relative impacts of parameter uncertainties on the model uncertainties. VVUQ also plays a role in understanding the impact of mechanisms used to pass information between the physical and virtual. These include challenges arising from parameter uncertainty and ill-posed or indeterminate inverse problems, in addition to the uncertainty introduced by the inclusion of the human-in-the-loop.
Conclusion 2-2: Digital twins require VVUQ to be a continual process that must adapt to changes in the physical counterpart, digital twin virtual models, data, and the prediction/decision task at hand. A gap exists between the class of problems that has been considered in traditional modeling and simulation settings and the VVUQ problems that will arise for digital twins.
Conclusion 2-3: Despite the growing use of artificial intelligence, machine learning, and empirical modeling in engineering and scientific applications, there is a lack of standards in reporting VVUQ as well as a lack of consideration of confidence in modeling outputs.
Conclusion 2-4: Methods for ensuring continual VVUQ and monitoring of digital twins are required to establish trust. It is critical that VVUQ be deeply embedded in the design, creation, and deployment of digital twins. In future digital twin research developments, VVUQ should play a core role and tight integration should be emphasized. Particular areas of research need include continual verification, continual validation, VVUQ in extrapolatory conditions, and scalable algorithms for complex multiscale, multiphysics, and multi-code digital twin software efforts. There is a need to establish to what extent VVUQ approaches can be incorporated into automated online operations of digital twins and where new approaches to online VVUQ may be required.
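As one hedged illustration of what continual validation could look like in an online setting, the sketch below tracks how often incoming observations fall inside the twin's predictive interval over a rolling window and flags the twin for revalidation when that coverage drops. The `CoverageMonitor` class, its window size, and its thresholds are hypothetical and are not taken from the report; empirical interval coverage is only one of many possible validation metrics.

```python
from collections import deque

class CoverageMonitor:
    """Tracks how often observations fall inside the twin's predictive interval."""

    def __init__(self, window=50, z=1.96, min_coverage=0.85):
        self.hits = deque(maxlen=window)   # rolling record of inside/outside
        self.z = z                         # interval half-width in standard deviations
        self.min_coverage = min_coverage   # assumed alert threshold

    def update(self, pred_mean, pred_std, observed):
        """Record one prediction/observation pair and return current coverage."""
        inside = abs(observed - pred_mean) <= self.z * pred_std
        self.hits.append(inside)
        return self.coverage()

    def coverage(self):
        return sum(self.hits) / len(self.hits) if self.hits else None

    def needs_revalidation(self):
        """Flag only once the window is full, to avoid alarms on sparse data."""
        full = len(self.hits) == self.hits.maxlen
        return full and self.coverage() < self.min_coverage
```

A monitor like this responds to changes in either the physical counterpart or the sensors without distinguishing between them, so a flag is a prompt for investigation rather than a diagnosis.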
Recommendation 2: Federal agencies should ensure that verification, validation, and uncertainty quantification (VVUQ) is an integral part of new digital twin programs. In crafting programs to advance the digital twin VVUQ research agenda, federal agencies should pay attention to the importance of (1) overarching complex multiscale, multiphysics problems as catalysts to promote interdisciplinary cooperation; (2) the availability and effective use of data and computational resources; (3) collaborations between academia and mission-driven government laboratories and agencies; and (4) opportunities to include digital twin VVUQ in educational programs. Federal agencies should consider the Department of Energy Predictive Science Academic Alliance Program as a possible model to emulate.
Virtual Representation: Foundational Research Needs and Opportunities
A fundamental challenge for digital twins is the vast range of spatial and temporal scales that the virtual representation may need to address. In many applications, a gap remains between the scales that can be simulated and actionable scales. An additional challenge is that as finer scales are resolved and a given model achieves greater fidelity to the physical counterpart it simulates, the computational and data storage/analysis requirements increase. This limits the applicability of the model for some purposes, such as uncertainty quantification, probabilistic prediction, scenario testing, and visualization.
Finding 3-2: Different applications of digital twins drive different requirements for modeling fidelity, data, precision, accuracy, visualization, and time-to-solution, yet many of the potential uses of digital twins are currently intractable to realize with existing computational resources.
Recommendation 3: In crafting research programs to advance the foundations and applications of digital twins, federal agencies should create mechanisms to provide digital twin researchers with computational resources, recognizing the large existing gap between simulated and actionable scales and the differing levels of maturity of high-performance computing across communities.
Mathematical and algorithmic advances in data-driven modeling and multiscale physics-based modeling are necessary elements for closing the gap between simulated and actionable scales. Reductions in computational and data requirements achieved through algorithmic advances are an important complement to increased computing resources. Important areas to advance include hybrid modeling approaches (a synergistic combination of empirical and mechanistic modeling approaches that leverage the best of both data-driven and model-driven formulations) and surrogate modeling approaches. Key gaps, research needs, and opportunities include the following:
Combining data-driven models with mechanistic models requires effective coupling techniques to facilitate the flow of information (data, variables, etc.) between the models while understanding the inherent constraints and assumptions of each model.
Integration of component/subsystem digital twins is a pacing item for the digital twin representation of a complex system, especially if different fidelity models are used in the representation of its components/subsystems. There are key gaps in quantifying the uncertainty in digital twins of coupled complex systems, enhancing interoperability between digital twin models, and reconciling assumptions made between models.
Methods are needed to achieve VVUQ of hybrid and surrogate models, recognizing the uncertain conditions under which digital twins will be called on to make predictions, often in extrapolatory regimes where data are limited or models are untested. An additional challenge for VVUQ is the dynamic model updating and adaptation that is key to the digital twin concept.
Data quality, availability, and affordability are challenges. A particular challenge is the prohibitive cost of generating sufficient data for machine learning (ML) and surrogate model training.
Physical Counterpart: Foundational Research Needs and Opportunities
Digital twins rely on observation of the physical counterpart in conjunction with modeling to inform the virtual representation. In many applications, these data will be multimodal, from disparate sources, and of varying quality. While significant literature has been devoted to best practices around gathering and preparing data for use, several important gaps and opportunities are crucial for robust digital twins. Key gaps, research needs, and opportunities include the following:
Undersampling in complex systems with large spatiotemporal variability is a significant challenge for acquiring the data needed for digital twin development. Understanding and quantifying this uncertainty is vital for assessing the reliability and limitations of the digital twin, especially in safety-critical or high-stakes applications.
Tools are needed for data and metadata handling and management to ensure that data and metadata are gathered, recorded, stored, and processed efficiently.
Mathematical tools are needed for assessing data quality, determining appropriate utilization of all available information, and understanding how data quality affects the performance of digital twin systems.
Standards and governance policies are critical for data quality, accuracy, security, and integrity, and frameworks play an important role in providing standards and guidelines for data collection, management, and sharing while maintaining data security and privacy.
Physical-to-Virtual and Virtual-to-Physical Feedback Flows: Foundational Research Needs and Opportunities
In the digital twin feedback flow from physical to virtual, inverse problem methodologies and data assimilation are required to combine physical observations and virtual models in a rigorous, systematic, and scalable way. Specific challenges for digital twins such as calibration and updating on actionable time scales highlight foundational gaps in inverse problem and data assimilation theory, methodology, and computational approaches. ML and artificial intelligence (AI) have potential large roles to play in addressing these challenges, such as through the use of online learning techniques for continuously updating models using streaming data. In addition, in settings where data are limited due to data acquisition resource constraints, AI approaches such as active learning and reinforcement learning can help guide the collection of additional data most salient to the digital twin's objectives.
On the virtual-to-physical flowpath, the digital twin is used to drive changes in the physical counterpart itself, or in the observational systems associated with the physical counterpart through an automatic controller or a human.
Accordingly, the committee identified gaps associated with the use of digital twins for automated decision-making tasks, for providing decision support to a human decision-maker, and for decision tasks that are shared jointly within a human-agent team. There are additional challenges associated with the ethics and social implications of the use of digital twins in decision-making. Key gaps, research needs, and opportunities in the physical-to-virtual and virtual-to-physical feedback flows include the following:
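Online learning from streaming data can be sketched with recursive least squares, a classical technique used here purely as an illustration. The example assumes a one-parameter model y ≈ theta · x and an exponential forgetting factor so that a drifting parameter can be tracked; the class name `ScalarRLS`, the helper `track`, and the synthetic stream are hypothetical.

```python
class ScalarRLS:
    """Recursive least squares for y ~ theta * x with exponential forgetting."""

    def __init__(self, theta=0.0, p=1000.0, forgetting=0.98):
        self.theta = theta      # current parameter estimate
        self.p = p              # estimate variance (large value = weak prior)
        self.lam = forgetting   # < 1 discounts old data so drift can be tracked

    def update(self, x, y):
        gain = self.p * x / (self.lam + x * self.p * x)
        self.theta += gain * (y - self.theta * x)    # correct by prediction error
        self.p = (self.p - gain * x * self.p) / self.lam
        return self.theta

def track(stream):
    """Feed (x, y) pairs one at a time and return the estimate after each."""
    model = ScalarRLS()
    return [model.update(x, y) for x, y in stream]

# Hypothetical noise-free stream: the true parameter drifts from 2.0 to 3.0
# at sample 100, mimicking a change in the physical counterpart.
stream = [(1.0 + (i % 3), (2.0 if i < 100 else 3.0) * (1.0 + (i % 3)))
          for i in range(300)]
estimates = track(stream)
```

Each update costs constant time and memory, which is what makes this style of method attractive on actionable time scales. The forgetting factor trades responsiveness to drift against sensitivity to noise, and choosing it well is itself a calibration problem.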
Methods to incorporate state-of-the-art risk metrics and characterization of extreme events in digital twin decision-making are needed.
The assimilation of sensor data for using digital twins on actionable time scales will require advancements in data assimilation methods and tight coupling with the control or decision-support task at hand. Data assimilation techniques are needed for data from multiple sources at different scales and numerical models with different levels of uncertainty.
Methods and tools are needed to make sensitivity information more readily available for model-centric digital twins, including automatic differentiation capabilities that will be successful for multiphysics, multiscale digital twin virtual representations, including those that couple multiple codes, each simulating different components of a complex system. Scalable and efficient optimization and uncertainty quantification methods that handle non-differentiable functions that arise with many risk metrics are also lacking.
Scalable methods are needed for goal-oriented sensor steering and optimal experimental design that encompass the full sense-assimilate-predict-control-steer cycle while accounting for uncertainty.
Development of implementation science around digital twins, user-centered design of digital twins, and effective human-digital twin teaming is needed.
Research is needed on the impact of the content, context, and mode of human-digital twin interaction on the resulting decisions.
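The data assimilation need described in the list above (combining data sources and numerical models that carry different levels of uncertainty) can be illustrated with a minimal sketch. The Python example below applies a scalar Kalman-style update, one standard data assimilation technique and not a method prescribed by the committee; the `Estimate` class, the `assimilate` function, and all numerical values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A scalar state estimate with its uncertainty (illustrative only)."""
    mean: float
    variance: float


def assimilate(prior: Estimate, observation: float, obs_variance: float) -> Estimate:
    """Scalar Kalman update: blend prior and observation by their variances."""
    # The gain approaches 1 when the observation is far more certain than the prior.
    gain = prior.variance / (prior.variance + obs_variance)
    mean = prior.mean + gain * (observation - prior.mean)
    variance = (1.0 - gain) * prior.variance
    return Estimate(mean, variance)


# Hypothetical inputs: a model forecast plus two sensors of different quality.
forecast = Estimate(mean=20.0, variance=4.0)
after_precise = assimilate(forecast, observation=22.0, obs_variance=1.0)
posterior = assimilate(after_precise, observation=19.0, obs_variance=9.0)
print(posterior)
```

Because the update weights each source by its inverse variance, the precise sensor moves the estimate far more than the noisy one, and the posterior variance is smaller than that of any single source. Operational digital twins face high-dimensional, nonlinear, multiscale versions of this problem, which is where the research gaps listed above arise.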
Ethics, Privacy, Data Governance, and Security
Protecting individual privacy requires proactive ethical consideration at every phase of development and within each element of the digital twin ecosystem. Moreover, the tight integration between the physical system and its virtual representation has significant cybersecurity implications, beyond what has historically been needed, that must be considered in order to effectively safeguard and scale digital twins. While security issues with digital twins share common challenges with cybersecurity issues in other settings, the close relationship between cyber and physical in digital twins could make cybersecurity more challenging. Privacy, ownership, and responsibility for data accuracy in complex, heterogeneous digital twin environments are all areas with important open questions that require attention. While the committee noted that many data ethics and governance issues fall outside the study's charge, it is important to highlight the dangers of scaling digital twins without actionable standards for appropriate use and guidelines for identifying liability in the case of misuse. Furthermore, digital twins necessitate heightened levels of security, particularly around the transmission of data and information between the physical and virtual counterparts. Especially in sensitive or high-risk settings, malicious interactions could result in security risks for the physical system. Additional safeguard design is necessary for digital twins.
TOWARD SCALABLE AND SUSTAINABLE DIGITAL TWINS
Realizing the societal benefits of digital twins will require both incremental and more dramatic research advances in cross-disciplinary approaches. In addition to bridging fundamental research challenges in statistics, mathematics, and computing, bringing complex digital twins to fruition necessitates robust and reliable yet agile and adaptable integration of all these disparate pieces.
Evolution and Sustainability of a Digital Twin
Over time, the digital twin will likely need to meet new demands, incorporate new or updated models, and obtain new data from the physical system to maintain its accuracy. Model management is key for supporting the digital twin evolution. For a digital twin to faithfully reflect temporal and spatial changes where applicable in the physical counterpart, the resulting predictions must be reproducible, incorporate improvements in the virtual representation, and be reusable in scenarios not originally envisioned. This, in turn, requires a design approach to digital twin development and evolution that is holistic, robust, and enduring, yet flexible, composable, and adaptable. Digital twins require a foundational backbone that, in whole or in part, is reusable across multiple domains, supports multiple diverse activities, and serves the needs of multiple users. Digital twins must seamlessly operate in a heterogeneous and distributed infrastructure supporting a broad spectrum of operational environments, ranging from hand-held mobile devices accessing digital twins on-the-go to large-scale, centralized high-performance computing installations. Sustaining a robust, flexible, dynamic, accessible, and secure digital twin is a key consideration for creators, funders, and the diverse community of stakeholders.
Conclusion 7-1: The notion of a digital twin has inherent value because it gives an identity to the virtual representation. This makes the virtual representation (the mathematical, statistical, and computational models of the system and its data) an asset that should receive investment and sustainment in ways that parallel investment and sustainment in the physical counterpart.
Recommendation 4: Federal agencies should each conduct an assessment for their major use cases of digital twin needs to maintain and sustain data, software, sensors, and virtual models. These assessments should drive the definition and establishment of new programs similar to the National Science Foundation's Natural Hazards Engineering Research Infrastructure and Cyberinfrastructure for Sustained Scientific Innovation programs. These programs should target specific communities and provide support to sustain, maintain, and manage the life cycle of digital twins beyond their initial creation, recognizing that sustainability is critical to realizing the value of upstream investments in the virtual representations that underlie digital twins.
Translation and Collaborations Between Domains
There are domain-specific and even use-specific digital twin challenges, but there are also many elements that cut across domains and use cases. For digital twin virtual representations, advancing the models themselves is necessarily domain-specific, but advancing the digital twin enablers of hybrid modeling and surrogate modeling embodies shared challenges that crosscut domains. For the physical counterpart, many of the challenges around sensor technologies and data are domain-specific, but issues around handling and fusing multimodal data, enabling access to data, and advancing data curation practices embody shared challenges that crosscut domains. When it comes to the physical-to-virtual and virtual-to-physical flows, there is an opportunity to advance data assimilation, inverse methods, control, and sensor-steering methodologies that are applicable across domains, while at the same time recognizing domain-specific needs, especially as they relate to the domain-specific nature of decision-making. Finally, there is a significant opportunity to advance digital twin VVUQ methods and practices in ways that translate across domains.
As stakeholders consider architecting programs that balance these domain-specific needs with cross-domain opportunities, it is important to recognize that different domains have varying levels of maturity with respect to the different elements of the digital twin. For example, the Earth system science community is a leader in data assimilation; many fields of engineering are leaders in integrating VVUQ into simulation-based decision-making; and the medical community has a strong culture of prioritizing the role of a human decision-maker when advancing new technologies. Cross-domain interactions through the common lens of digital twins are opportunities to share, learn, and cross-fertilize.
Conclusion 7-2: As the foundations of digital twins are established, it is the ideal time to examine the architecture, interfaces, bidirectional workflows of the virtual twin with the physical counterpart, and community practices in order to make evolutionary advances that benefit all disciplinary communities.
Recommendation 5: Agencies should collaboratively and in a coordinated fashion provide cross-disciplinary workshops and venues to foster identification of those aspects of digital twin research and development that would benefit from a common approach and which specific research topics are shared. Such activities should encompass responsible use of digital twins and should necessarily include international collaborators.
Recommendation 6: Federal agencies should identify targeted areas relevant to their individual or collective missions where collaboration with industry would advance research and translation. Initial examples might include the following:
Department of Defense: asset management, incorporating the processes and practices employed in the commercial aviation industry for maintenance analysis.
Department of Energy: energy infrastructure security and improved (efficient and effective) emergency preparedness.
National Institutes of Health: in silico drug discovery, clinical trials, preventative health care and behavior modification programs, clinical team coordination, and pandemic emergency preparedness.
National Science Foundation: Directorate for Technology, Innovation and Partnerships programs.
There is a history of both sharing and coordination of models within the international climate research community as well as a consistent commitment to data exchange that is beneficial to digital twins. While other disciplines have open-source or shared models, few support the breadth in scale and the robust integration of uncertainty quantification that are found in Earth system models and workflows. A greater level of coordination among the multidisciplinary teams of other complex systems, such as biomedical systems, would benefit maturation and cultivate the adoption of digital twins.
Conclusion 7-4: Fostering a culture of collaborative exchange of data and models that incorporate context through metadata and provenance in digital twin-relevant disciplines could accelerate progress in the development and application of digital twins.
Recommendation 7: In defining new digital twin research efforts, federal agencies should, in the context of their current and future mission priorities, (1) seed the establishment of forums to facilitate good practices for effective collaborative exchange of data and models across disciplines and domains, while addressing the growing privacy and ethics demands of digital twins; (2) foster and/or require collaborative exchange of data and models; and (3) explicitly consider the role for collaboration and coordination with international bodies.
Preparing an Interdisciplinary Workforce for Digital Twins
The successful adoption and progress of digital twins hinge on the appropriate education and training of the workforce. This educational shift requires formalizing, nurturing, and growing critical computational, mathematical, and engineering skill sets at the intersection of disciplines such as biology, chemistry, and physics. These critical skill sets include but are not limited to systems engineering, systems thinking and architecting, data analytics, ML/AI, statistical/probabilistic modeling and simulation, uncertainty quantification, computational mathematics, and decision science. These disciplines are rarely taught within the same academic curriculum.
Recommendation 8: Within the next year, federal agencies should organize workshops with participants from industry and academia to identify barriers, explore potential implementation pathways, and incentivize the creation of interdisciplinary degrees at the bachelor's, master's, and doctoral levels.
REFERENCES
AIAA (American Institute of Aeronautics and Astronautics) Digital Engineering Integration Committee. 2020. "Digital Twin: Definition & Value." AIAA and AIA Position Paper, AIAA, Reston, VA.
NRC (National Research Council). 2012. Assessing the Reliability of Complex Models: Mathematical and Statistical Foundations of Verification, Validation, and Uncertainty Quantification. Washington, DC: The National Academies Press.
Introduction
Digital twins, which are virtual representations of natural, engineered, or social systems, hold immense promise in accelerating scientific discovery and revolutionizing industries. This report aims to shed light on the key research needs to advance digital twins in several domains, and the opportunities that can be realized by bridging the gaps that currently hinder the effective implementation of digital twins in scientific research and industrial processes. This report provides practical recommendations to bring the promise of digital twins to fruition, both today and in the future.
THE SIGNIFICANCE OF DIGITAL TWINS
Digital twins are being explored and implemented in various domains as tools to allow for deeper insights into the performance, behavior, and characteristics of natural, engineered, or social systems. A digital twin can be a critical tool for decision-making that uses a synergistic combination of models and data. The bidirectional interplay between models and data endows the digital twin with a dynamic nature that goes beyond what has been traditionally possible with modeling and simulation, creating a virtual representation that evolves with the system over time. The use cases for digital twins are diverse and proliferating, including applications in biomedical research, engineering, atmospheric science, and many more, and their potential is wide-reaching.
Digital twins are emerging as enablers for significant, sustainable progress across industries. With the potential to transform traditional scientific and industrial practices and enhance operational efficiency, digital twins have captured the attention and imagination of professionals across various disciplines and sectors. By simulating real-time behavior, monitoring performance to detect anomalies and exceptional conditions, and enabling predictive insights and effective optimizations, digital twins have the capacity to revolutionize scientific research, enhance operational efficiency, optimize production strategies, reduce time-to-market, and unlock new avenues for scientific and industrial growth and innovation.
Digital twins not only offer a means to capture the knowledge and expertise of experienced professionals but also provide a platform for knowledge transfer and continuity. By creating a digital representation of assets and systems, organizations can bridge the gap between generations, ensuring that critical knowledge is preserved and accessible to future workforces and economies.
In the present landscape, "digital twin" has become a buzzword, often associated with innovation and transformation. While there is significant enthusiasm around industry developments and applications of digital twins, the focus of this report is on identifying research gaps and opportunities. The report's recommendations are particularly targeted toward what agencies and researchers can do to advance mathematical, statistical, and computational foundations of digital twins. Scientific and industrial organizations are eager to explore the possibilities offered by digital twins, but gaps and challenges often arise that impede their implementation and hinder their ability to fully deliver the promised value. Organizations eager to use digital twins do not always understand how well the digital twins match reality and whether they can be relied on for critical decisions; much of this report is aimed at elucidating the foundational mathematical, statistical, and computational research needed to bridge those gaps. Other technological complexities pose challenges as well, such as network connectivity and edge computing capabilities, data integration issues and the lack of standardized frameworks or data structures, and interoperability among various systems. Additional challenges include organizational aspects, including workforce readiness, cultural shifts, and change management required to facilitate the successful adoption and integration of digital twins. Furthermore, ensuring data security, cybersecurity, privacy, and ethical practices remains a pressing concern as organizations delve into the realm of digital twins.
COMMITTEE TASK AND SCOPE OF WORK
This study was supported by the Department of Energy (Office of Advanced Scientific Computing Research and Office of Biological and Environmental Research), the Department of Defense (Air Force Office of Scientific Research and Defense Advanced Research Projects Agency), the National Institutes of Health (National Cancer Institute, National Institute of Biomedical Imaging and Bioengineering, National Library of Medicine, and Office of Data Science Strategy), and the National Science Foundation (Directorate for Engineering and Directorate for Mathematical and Physical Sciences). In collaboration with the National Academies of Sciences, Engineering, and Medicine, these agencies developed the study's statement of task (see Appendix A), which highlights important questions relating to the following:
Definitions of and use cases for digital twins;
Foundational mathematical, statistical, and computational gaps for digital twins;
Best practices for digital twin development and use; and
Opportunities to advance the use and practice of digital twins.
The National Academies appointed a committee of 16 members with expertise in mathematics, statistics, computer science, computational science, data science, uncertainty quantification, biomedicine, computational biology, other life sciences, engineering, atmospheric science and climate, privacy and ethics, industry, urban planning/smart cities, and defense. Committee biographies are provided in Appendix F.
The committee held several data-gathering meetings in support of this study, including three public workshops on the use of digital twins in atmospheric and climate sciences (NASEM 2023a), biomedical sciences (NASEM 2023b), and engineering (NASEM 2023c).
REPORT STRUCTURE
This report was written with the intention of informing the scientific and research community, academia, pertinent government agencies, digital twin practitioners, and those in relevant industries about open needs and foundational gaps to overcome to advance digital twins. While the range of challenges and open questions around digital twins is broad, it should be noted that the focus of this report is on foundational gaps. The report begins by defining a digital twin, outlining its elements and overarching themes, and articulating the need for an integrated research agenda in Chapter 2. The next four chapters expound on the four major elements of a digital twin as defined by the committee: the virtual representation, the physical counterpart, the feedback flow from the physical to the virtual, and the feedback flow from the virtual to the physical. In Chapter 3, fitness for purpose, modeling challenges, and integration of digital twin components for the virtual representation are discussed. Chapter 4 explores the needs and opportunities around data acquisition and data integration in preparation for inverse problem and data assimilation tasks, which are discussed in Chapter 5. Automated decision-making and human-digital twin interactions, as well as the ethical implications of making decisions using a digital twin or its outputs, are addressed in Chapter 6. Chapter 7 looks at some of the broader gaps and needs to be addressed in order to scale and sustain digital twins, including cross-community efforts and workforce challenges. Finally, Chapter 8 aggregates all of the findings, conclusions, gaps, and recommendations placed throughout the report.
REFERENCES
NASEM (National Academies of Sciences, Engineering, and Medicine). 2023a. Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences: Proceedings of a Workshop - in Brief. Washington, DC: The National Academies Press.
NASEM. 2023b. Opportunities and Challenges for Digital Twins in Biomedical Research: Proceedings of a Workshop - in Brief. Washington, DC: The National Academies Press.
NASEM. 2023c. Opportunities and Challenges for Digital Twins in Engineering: Proceedings of a Workshop - in Brief. Washington, DC: The National Academies Press.
2
The Digital Twin Landscape
This chapter lays the foundation for an understanding of the landscape of digital twins and the need for an integrated research agenda. The chapter begins by defining a digital twin. It then articulates the elements of the digital twin ecosystem, discussing how a digital twin is more than just a simulation and emphasizing the bidirectional interplay between a virtual representation and its physical counterpart. The chapter discusses the critical role of verification, validation, and uncertainty quantification (VVUQ) in digital twins, as well as the importance of ethics, privacy, data governance, and security. The chapter concludes with a brief assessment of the state of the art and articulates the importance of an integrated research agenda to realize the potential of digital twins across science, technology, and society.
DEFINITIONS
Noting that the scope of this study is on identifying foundational research gaps and opportunities for digital twins, it is important to have a shared understanding of the definition of a digital twin. For the purposes of this report, the committee uses the following definition of a digital twin:
A digital twin is a set of virtual information constructs that mimics the structure, context, and behavior of a natural, engineered, or social system (or system-of-systems), is dynamically updated with data from its physical twin, has a predictive capability, and informs decisions that realize value. The bidirectional interaction between the virtual and the physical is central to the digital twin.
This definition is based heavily on a definition published by the American Institute of Aeronautics and Astronautics (AIAA) Digital Engineering Integration Committee (2020). The study committee's definition modifies the AIAA definition to better align with domains beyond aerospace engineering. In place of the term "asset," the committee refers to "a natural, engineered, or social system (or system-of-systems)" to describe digital twins of physical systems in the broadest sense possible, including the engineered world, natural phenomena, biological entities, and social systems. The term "system-of-systems" acknowledges that many digital twin use cases involve virtual representations of complex systems that are themselves a collection of multiple coupled systems. This definition also introduces the phrase "has a predictive capability" to emphasize the important point that a digital twin must be able to issue predictions beyond the available data in order to drive decisions that realize value. Finally, the committee's definition adds the sentence "The bidirectional interaction between the virtual and the physical is central to the digital twin." As described below, the bidirectional interaction comprises feedback flows of information from the physical system to the virtual representation and from the virtual back to the physical system to enable decision-making, either automatic or with one or more humans in the loop. Although the importance of the bidirectional interaction is implicit in the earlier part of the definition, our committee's information gathering revealed the importance of explicitly emphasizing this aspect (Ghattas 2023; Girolami 2022; Wells 2022).
While it is important to have a shared understanding of the definition of a digital twin, it is also important to recognize that the broad nature of the digital twin concept will lead to differences in digital twin elements across different domains, and even in different use cases within a particular domain. Thus, while the committee adopts this definition for the purposes of this report, it recognizes the value in alternate definitions in other settings.
Digital Twin Origins
While the concept itself is older, the term "digital twin" emerged around 2010 during technical roadmapping efforts at NASA co-led by John Vickers. The term "digital twin" was defined in published NASA reports by Piascik et al. (2012) and Shafto et al. (2012), and in a follow-on paper by Glaessgen and Stargel (2012):
A digital twin is an integrated multiphysics, multi-scale, probabilistic simulation of a vehicle or system that uses the best available physical models, sensor updates, fleet history, etc., to mirror the life of its flying twin. The digital twin is ultra-realistic and may consider one or more important and interdependent vehicle systems, including propulsion and energy storage, life support, avionics, thermal protection, etc. (Shafto et al. 2012)
This definition and notion are built on earlier work by Grieves (2005a,b) in product life-cycle management. A closely related concept is that of Dynamic Data Driven Application Systems (DDDAS) (Darema 2004). Some of the early published DDDAS work has all the elements of a digital twin, including the physical, the virtual, and the two-way interaction via a feedback loop. Many of the notions underlying digital twins also have a long history in other fields, such as model predictive control, which similarly combines models and data in a bidirectional feedback loop (Rawlings et al. 2017), and data assimilation, which has long been used in the field of weather forecasting to combine multiple sources of data with numerical models (Reichle 2008).
Much of the early work and development of digital twins was carried out in the field of aerospace engineering, particularly in the use of digital twins for structural health monitoring and predictive maintenance of airframes and aircraft engines (Tuegel et al. 2011). Today, interest in and development of digital twins has expanded well beyond aerospace engineering to include many different application areas across science, technology, and society. With that expansion has come a broadening in the views of what constitutes a digital twin along with differing specific digital twin definitions within different application contexts. During information-gathering sessions, the committee heard multiple different definitions of digital twins. The various definitions have some common elements, but even these common elements are not necessarily aligned across communities, reflecting the different nature of digital twins in different application settings. The committee also heard from multiple briefers that the "Digital Twin has no common agreed definition" (Girolami 2022; NASEM 2023a,b,c).
ELEMENTS OF THE DIGITAL TWIN ECOSYSTEM
A Digital Twin Is More Than Just Simulation and Modeling
The notion of a digital twin builds on a long history of modeling and simulation of complex systems but goes beyond simulation to include tighter integration between models, observational data, and decisions. The dynamic, bidirectional interaction between the physical and the virtual enables the digital twin to be tailored to a particular physical counterpart and to evolve as the physical counterpart evolves. This, in turn, enables dynamic data-driven decision-making.
Finding 2-1: A digital twin is more than just simulation and modeling.
Conclusion 2-1: The key elements that comprise a digital twin include (1) modeling and simulation to create a virtual representation of a physical counterpart, and (2) a bidirectional interaction between the virtual and the physical. This bidirectional interaction forms a feedback loop that comprises dynamic data-driven model updating (e.g., sensor fusion, inversion, data assimilation) and optimal decision-making (e.g., control, sensor steering).
These elements are depicted abstractly in Figure 2-1 and with examples in Box 2-1. More details are provided in the following subsections.
The Physical Counterpart and Its Virtual Representation
There are numerous and diverse examples of physical counterparts for which digital twins are recognized as bringing high potential value, including aircraft, body organs, cancer tumors, cities, civil infrastructure, coastal areas, farms, forests, global atmosphere, hospital operations, ice sheets, nuclear reactors, patients, and many more. These examples illustrate the broad potential scope of a digital twin, which may bring value at multiple levels of subsystem and system modeling. For example, digital twins at the levels of a cancer tumor, a body organ, and a patient all have utility and highlight the potential trade-offs in digital twin scope versus complexity. Essential to being able to create digital twins is the ability to acquire data from the physical counterpart. These data may be acquired from onboard or in situ sensors, remote sensing, automated and visual inspections, operational logs, imaging, and more. The committee considers these sensing and observational systems to be a part of the physical counterpart in its representation of the digital twin ecosystem.
FIGURE 2-1 Elements of the digital twin ecosystem.
NOTES: Information flows bidirectionally between the virtual representation and physical counterpart. These information flows may be through automated processes, human-driven processes, or a combination of the two.
BOX 2-1
Digital Twin Examples
Digital Twin of a Cancer Patient (Figure 2-1-1)
The virtual representation of a cancer patient might comprise mechanistic models in the form of nonlinear partial differential equations describing temporal and spatial characteristics of tumor growth, with a state variable that represents spatiotemporal tumor cell density and/or heterogeneity. These models are characterized by parameters that represent the specific patient's anatomy, morphology, and constitutive properties such as the tumor cell proliferation rate and tissue carrying capacity; parameters that describe the initial tumor location, geometry, and burden; and parameters that describe the specific patient's response to treatments such as radiotherapy, chemotherapy, and immunotherapy. Quantities of interest might include computational estimates of patient characteristics, such as tumor cell count, time to progression, and toxicity. Decision tasks might include personalized therapy control decisions, such as the dose and schedule of delivery of therapeutics over time, and data collection decisions, such as the frequency of serial imaging studies, blood tests, and other clinical assessments. These decisions can be automated as part of the digital twin or made by a human informed by the digital twin's output.
Digital Twin of an Aircraft Engine
The virtual representation of an aircraft engine might comprise machine learning (ML) models trained on a large database of sensor data and flight logs collected across a fleet of engines. These models are characterized by parameters that represent the operating conditions seen by this particular engine and numerical model parameters that represent the hyperparameters of the ML models. Quantities of interest might include computational estimates of possible blade material degradation. Decision tasks might include actions related to what maintenance to perform and when, as well as decisions related to performing additional inspections; these actions can be taken by a human informed by the digital twin's output, or they can be taken automatically by the digital twin. For instance, the digital twin could be leveraged for optimizing fuel efficiency in real time, simulating emergency response scenarios for enhanced pilot training, predicting parts that may soon need replacement for efficient inventory management, ensuring regulatory compliance on environmental and safety fronts, conducting cost-benefit analyses of various maintenance strategies, controlling noise pollution levels, and even assessing and planning for carbon emission reduction. By incorporating these additional decision-making tasks, the digital twin can contribute more comprehensively to the aircraft engine's operational efficiency, safety protocols, and compliance with environmental standards, thus amplifying its utility beyond merely informing maintenance schedules.
FIGURE 2-1-1 Example of a digital twin of a cancer patient and tumor.
Digital Twin of an Earth System
The virtual representation of an Earth system might comprise a collection of high-fidelity, high-resolution physics models and associated surrogate models, collectively representing coupled atmospheric, oceanic, terrestrial, and cryospheric physics. These models solve for state variables such as pressure, temperature, density, and salinity. These models are characterized by parameters that represent physical properties such as terrain geometry, fluid constitutive properties, boundary conditions, initial conditions, and anthropogenic source terms, as well as numerical model parameters that represent, for example, turbulence model closures and ML model hyperparameters. Quantities of interest might include projections of future global mean temperature or statistics of extreme precipitation events. Decision tasks might include actions related to policy-making, energy system design, deployment of new observing systems, and emergency preparedness for extreme weather events. These decisions may be made automatically as part of the digital twin or made by a human informed by the digital twin's output.
Digital Twin of a Manufacturing Process
Manufacturing environments afford many opportunities for digital twins. Consider a manufacturing system potentially comprising equipment, human workers, various stations and assembly lines, processes, and the materials that flow through the system. The virtual representation of a manufacturing process might include visually-, principles-, data-, and/or geometry-driven models that are parameterized by data such as process monitoring data (both real-time/near real-time and historical), production data, system layout, and equipment status and maintenance records. These data may span much of the process life cycle. Of course, these components will be tailored to the specific process and should be fit-for-purpose. Decision tasks might include operational decisions and process control, for instance. These decisions may be made automatically as part of the digital twin or made by a human informed by the digital twin's output.
The virtual representation of the physical counterpart comprises a computational model or set of coupled models. These models are typically computational representations of first-principles, mechanistic, and/or empirical models, which take on a range of mathematical forms, including dynamical systems, differential equations, and statistical models (including machine learning [ML] models). The set of models comprising the virtual representation of a digital twin of a complex system will span multiple disciplines and multiple temporal and spatial scales. Digital twin examples in the literature employ models that span a range of fidelities and resolutions, from high-resolution, high-fidelity replicas to simplified surrogate models.
Another part of the digital twin virtual representation is the definition of parameters, states, and quantities of interest. The computational models are characterized by parameters that are the virtual representation of attributes such as geometry and constitutive properties of the physical counterpart, boundary conditions, initial conditions, external factors that influence the physical counterpart, and transfer coefficients between resolved processes and parameterized unresolved processes. Sometimes these parameters will be known, while in other cases they must be estimated from data. Some types of models may also be characterized by parameters and hyperparameters that represent numerical approximations within a model, such as Gaussian process correlation lengths, regularization hyperparameters, and neural network training weights. The committee refers to this latter class of parameters as numerical model parameters to distinguish them from the parameters that represent attributes of the physical system. The committee uses the term state to denote the solved-for quantities in a model that takes the form of a dynamical system or system of differential equations. However, the committee notes that in many cases, the distinction between parameter and state can become blurred: when a digital twin couples multiple models across different disciplines, the state of one model may be a parameter in another model. Furthermore, the committee notes that many digital twin use cases explicitly target situations where parameters are dynamically changing, requiring dynamic estimation and updating of parameters, akin to state estimation in classical settings. Lastly, the committee denotes quantities of interest as the metrics that are of particular relevance to digital twin predictions and decisions. These quantities of interest are typically functions of parameters and states. The quantities of interest may themselves vary in definition as a particular digital twin is used in different decision-making scenarios over time.
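The distinction among parameters, numerical model parameters, states, and quantities of interest can be made concrete with a small sketch. The example below is illustrative only: it assumes a logistic growth model, and every name in it (PhysicalParameters, NumericalModelParameters, advance_state, quantity_of_interest) is hypothetical rather than taken from the report.

```python
from dataclasses import dataclass


@dataclass
class PhysicalParameters:
    """Attributes of the physical counterpart (hypothetical example values)."""
    growth_rate: float        # e.g., a proliferation rate, per day
    carrying_capacity: float  # largest population the system can sustain


@dataclass
class NumericalModelParameters:
    """Settings of the numerical approximation, not of the physical system."""
    time_step: float  # integration step, in days


def advance_state(state, params, numerics, n_steps):
    """Advance the state of a logistic growth model with forward Euler."""
    for _ in range(n_steps):
        rate = params.growth_rate * state * (1.0 - state / params.carrying_capacity)
        state = state + numerics.time_step * rate
    return state


def quantity_of_interest(state, params):
    """A quantity of interest is a function of states and parameters:
    here, the fraction of the carrying capacity currently occupied."""
    return state / params.carrying_capacity


params = PhysicalParameters(growth_rate=0.1, carrying_capacity=1000.0)
numerics = NumericalModelParameters(time_step=0.5)
final_state = advance_state(100.0, params, numerics, n_steps=60)
print(quantity_of_interest(final_state, params))
```

Keeping the two parameter classes separate mirrors the committee's terminology: changing time_step alters the approximation, whereas changing growth_rate alters the system being represented.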
An important theme that runs throughout this report is the notion that the virtual representation be fit for purpose, meaning that the virtual representation (model types, fidelity, resolution, parameterization, and quantities of interest) be chosen, and in many cases dynamically adapted, to fit the particular decision task and computational constraints at hand. Another important theme that runs throughout this report is the critical need for uncertainty quantification to be an integral part of digital twin formulations. If this need is addressed by, for example, the use of Bayesian formulations, then the formulation of the virtual representation must also define prior information for parameters, numerical model parameters, and states.
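As a reminder of why a Bayesian formulation requires prior information, Bayes' rule can be written as follows. The notation is an assumption made here for illustration and is not used in the report: \theta collects the parameters, numerical model parameters, and states to be inferred, and d denotes the observed data.

```latex
\[
p(\theta \mid d) \;=\;
\frac{p(d \mid \theta)\, p(\theta)}
     {\int p(d \mid \theta')\, p(\theta')\, \mathrm{d}\theta'}
\]
```

The posterior p(\theta \mid d) cannot be evaluated without the prior p(\theta), which is why the prior becomes part of the definition of the virtual representation.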
Bidirectional Feedback Flow Between Physical and Virtual
The bidirectional interaction between the virtual representation and the physical counterpart forms an integral part of the digital twin. This interaction is sometimes characterized as a feedback loop, where data from the physical counterpart are used to update the virtual models, and, in turn, the virtual models are used to drive changes in the physical system. This feedback loop may occur in real time, such as for dynamic control of an autonomous vehicle or a wind farm, or it may occur on slower time scales, such as post-flight updating of a digital twin for aircraft engine predictive maintenance or post-imaging updating of a digital twin and subsequent treatment planning for a cancer patient.
On the physical-to-virtual flowpath, digital twin tasks include sensor data fusion, model calibration, dynamic model updating, and estimation of parameters and states that are not directly observable. These calibration, updating, and estimation tasks are typically posed mathematically as data assimilation and inverse problems, which can take the form of parameter estimation (both static and dynamic), state estimation, regression, classification, and detection.
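One of the simplest instances of the estimation tasks described above is a scalar Kalman filter. The sketch below assumes a slowly drifting quantity modeled as a random walk and observed with Gaussian noise; the function names, noise levels, and synthetic data are all illustrative assumptions, and a practical digital twin would involve far richer models.

```python
import random


def kalman_update(mean, var, obs, obs_var):
    """Assimilate one noisy observation into a Gaussian estimate."""
    gain = var / (var + obs_var)
    new_mean = mean + gain * (obs - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


def assimilate(observations, prior_mean, prior_var, process_var, obs_var):
    """Track a slowly drifting quantity modeled as a random walk."""
    mean, var = prior_mean, prior_var
    history = []
    for obs in observations:
        var = var + process_var  # forecast step: uncertainty grows
        mean, var = kalman_update(mean, var, obs, obs_var)  # analysis step
        history.append((mean, var))
    return history


# Synthetic data: a fixed true value observed with unit-variance noise.
rng = random.Random(0)
true_value = 5.0
observations = [true_value + rng.gauss(0.0, 1.0) for _ in range(200)]

estimates = assimilate(observations, prior_mean=0.0, prior_var=100.0,
                       process_var=1e-4, obs_var=1.0)
final_mean, final_var = estimates[-1]
print(final_mean, final_var)
```

A nonzero process_var keeps the estimate responsive to change, which corresponds to the dynamic parameter estimation case; setting it to zero recovers static estimation.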
On the virtual-to-physical flowpath, the digital twin is used to drive changes in the physical counterpart itself or in the sensor and observing systems associated with the physical counterpart. This flowpath may be fully automated, where the digital twin interacts directly with the physical system. Examples of automated decision-making tasks include automated control, scheduling, recommendation, and sensor steering. In many cases, these tasks relate to automatic feedback control, which is already in widespread use across many engineering systems. Concrete examples of potential digital twin automated decision-making tasks are given in the illustrative examples in Box 2-1. The virtual-to-physical flowpath may also include a human in the digital twin feedback loop. A human may play the key decision-making role, in which case the digital twin provides decision support, or decision-making may be shared jointly between the digital twin and a human as a human-agent team. Human-digital twin interaction may also take the form of the human playing a crucial role in designing, managing, and operating elements of the digital twin, including selecting sensors and data sources, managing the models underlying the virtual representation, and implementing algorithms and analytics tools. User-centered design is central to extracting value from the digital twin.
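The difference between a fully automated flowpath and one with a human in the loop can be sketched with a toy maintenance decision. The example assumes the digital twin supplies a predicted failure probability; the cost figures, action names, and functions below are hypothetical and serve only to show where the human enters the loop.

```python
def expected_cost(action, failure_prob, costs):
    """Expected cost of an action given the predicted failure probability."""
    if action == "maintain":
        return costs["maintenance"]
    return failure_prob * costs["failure"]  # cost of deferring maintenance


def recommend(failure_prob, costs):
    """Return the action with the lower expected cost."""
    return min(("maintain", "defer"),
               key=lambda action: expected_cost(action, failure_prob, costs))


def decide(failure_prob, costs, human=None):
    """Automated decision when no human is given; decision support otherwise."""
    recommendation = recommend(failure_prob, costs)
    if human is None:
        return recommendation  # fully automated flowpath
    return human(recommendation, failure_prob)  # the human makes the final call


costs = {"maintenance": 10.0, "failure": 500.0}
print(decide(0.05, costs))  # automated
print(decide(0.05, costs, human=lambda rec, prob: "defer"))  # human overrides
```

Passing the human as a callback keeps the recommendation logic identical in both modes, so the same digital twin output can serve automated control or decision support.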
Verification, Validation, and Uncertainty Quantification
VVUQ is essential for the responsible development, implementation, monitoring, and sustainability of digital twins. Since the precise definitions can differ among subject-matter areas, the committee adopts the definition of VVUQ used in the National Research Council report Assessing the Reliability of Complex Models (NRC 2012) for this report:
Verification is "the process of determining whether a computer program ('code') correctly solves the equations of the mathematical model. This includes code verification (determining whether the code correctly implements the intended algorithms) and solution verification (determining the accuracy with which the algorithms solve the mathematical model's equations for specified quantities of interest)."
Validation is "the process of determining the degree to which a model is an accurate representation of the real world from the perspective of the intended uses of the model (taken from AIAA [Computational Fluid Dynamics Committee], 1998)."
Uncertainty quantification is "the process of quantifying uncertainties associated with model calculations of true, physical quantities of interest, with the goals of accounting for all sources of uncertainty and quantifying the contributions of specific sources to the overall uncertainty."
Each of the VVUQ tasks plays an important role for digital twins. There are, however, key differences from VVUQ for traditional modeling and simulation. The challenges lie in the features that set digital twins apart, with the most important difference being the bidirectional feedback loop between the virtual and the physical. Evolution of the physical counterpart in real-world use conditions, changes in data collection hardware and software, noisiness of data, addition and deletion of data sources, changes in the distribution of the data shared with the virtual twin, changes in the prediction and/or decision tasks posed to the digital twin, and evolution of the digital twin virtual models all have consequences for VVUQ. Significant challenges remain for VVUQ of stochastic and adaptive systems; due to their dynamic nature, digital twins inherit these challenges.
Traditionally, a computational model may be verified for sets of inputs at the code verification stage and for scenarios at the solution verification stage. While many of the elements are shared with VVUQ for computational models (NRC 2012), for digital twins, one anticipates, over time, upgrades to data collection technology (e.g., sensors). This may mean changes in the quality of data being collected, more and cheaper data capture hardware with potentially lower quality of information, different data sources, or changes in data structures. Additionally, the physical counterpart's state will undergo continual evolution. With such changes comes the need to revisit some or all aspects of verification. Furthermore, as the physical twin evolves over its lifetime, it is possible to enter system states that are far from the solution scenarios that were envisioned at initial verification. Indeed, major changes made to the physical twin may require that the virtual representation be substantially redefined and re-implemented.
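A common ingredient of solution verification is checking that the numerical error shrinks at the rate the algorithm is supposed to deliver. The sketch below is a minimal illustration, not a procedure from the report: it assumes the test problem dy/dt = -y with a known exact solution and a forward Euler solver, for which the expected order of convergence is 1. The function names are hypothetical.

```python
import math


def euler_solve(rate, y0, t_end, n_steps):
    """Forward Euler for dy/dt = -rate * y on the interval [0, t_end]."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = y + h * (-rate * y)
    return y


def observed_order(n_steps):
    """Estimate the convergence order by halving the step size once."""
    exact = math.exp(-1.0)  # exact solution of dy/dt = -y, y(0) = 1, at t = 1
    err_coarse = abs(euler_solve(1.0, 1.0, 1.0, n_steps) - exact)
    err_fine = abs(euler_solve(1.0, 1.0, 1.0, 2 * n_steps) - exact)
    return math.log2(err_coarse / err_fine)


print(observed_order(100))  # close to 1 for a first-order method
```

For a digital twin, a check of this kind would need to be rerun whenever the models or their operating regime change, which is the sense in which verification becomes continual.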
As with verification, validation is more complicated in the context of a digital twin. The output of a digital twin needs to include the confidence level in its prediction. Changes in the state of the physical counterpart, data collection and structures, and the computational models can each impact the validation assessment and may require continual validation. The bidirectional interplay between the physical and the virtual means the predictive model is periodically, or even continuously, updated. These updates must be factored into digital twin validation processes. For continual VVUQ, automated VVUQ methods may yield operational efficiencies.
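One way to picture automated, continual validation is a monitor that compares each new observation with the prediction interval the digital twin issued for it. The sketch below is an assumption-laden illustration: the ValidationMonitor class, the rolling window, and the coverage threshold are hypothetical choices, not a method prescribed by the report.

```python
from collections import deque


class ValidationMonitor:
    """Track how often observations fall inside the issued prediction intervals."""

    def __init__(self, window, min_coverage):
        self.hits = deque(maxlen=window)  # rolling record of hits and misses
        self.min_coverage = min_coverage

    def record(self, lower, upper, observed):
        """Store whether the observation fell inside [lower, upper]."""
        self.hits.append(lower <= observed <= upper)

    def coverage(self):
        """Fraction of recent observations inside their intervals."""
        return sum(self.hits) / len(self.hits) if self.hits else None

    def needs_revalidation(self):
        """Flag the model once a full window shows too little coverage."""
        window_full = len(self.hits) == self.hits.maxlen
        return window_full and self.coverage() < self.min_coverage


monitor = ValidationMonitor(window=5, min_coverage=0.8)
for observed in (1.0, 1.1, 0.9, 1.0, 1.2):
    monitor.record(lower=0.5, upper=1.5, observed=observed)
print(monitor.needs_revalidation())  # all five observations were covered

for observed in (2.0, 2.3):  # the physical counterpart has drifted
    monitor.record(lower=0.5, upper=1.5, observed=observed)
print(monitor.coverage(), monitor.needs_revalidation())
```

A falling coverage rate does not say what has changed, only that the validation assessment made earlier may no longer hold.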
Uncertainty quantification is essential to making informed decisions and to promoting the necessary transparency for a digital twin to build trust with decision support. Uncertainty quantification is also essential for fitness-for-purpose considerations. There are many potential sources of uncertainty in a digital twin. These include those arising from modeling uncertainties (Chapter 3), measurement and other data uncertainties (Chapter 4), the processes of data assimilation and model calibration (Chapter 5), and decision-making (Chapter 6). Particularly unique to digital twins is inclusion of uncertainties due to integration of multiple modalities of data and models, and bidirectional and sometimes real-time interaction between the virtual representation, the physical counterpart, and the possible human-in-the-loop interactions. These interactions and integration may even lead to new instabilities that emerge due to the nonlinear coupling among different elements of the digital twin.
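The idea of quantifying the contributions of specific sources to the overall uncertainty can be illustrated with simple Monte Carlo sampling. The sketch below assumes a hypothetical model (displacement equals load divided by stiffness) with two independent Gaussian inputs, and it estimates each input's contribution by holding the other at its nominal value. The model, distributions, and function names are illustrative assumptions.

```python
import random
import statistics


def model(stiffness, load):
    """Hypothetical model: displacement = load / stiffness."""
    return load / stiffness


def propagate(n_samples, seed, fix_stiffness=False, fix_load=False):
    """Sample the uncertain inputs and return the output mean and variance."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        stiffness = 100.0 if fix_stiffness else rng.gauss(100.0, 5.0)
        load = 50.0 if fix_load else rng.gauss(50.0, 1.0)
        outputs.append(model(stiffness, load))
    return statistics.mean(outputs), statistics.variance(outputs)


mean_all, var_all = propagate(20000, seed=1)
_, var_load_only = propagate(20000, seed=1, fix_stiffness=True)
_, var_stiffness_only = propagate(20000, seed=1, fix_load=True)

print("overall variance:", var_all)
print("variance from load alone:", var_load_only)
print("variance from stiffness alone:", var_stiffness_only)
```

In this toy setting the stiffness uncertainty dominates, which would suggest where better data could reduce uncertainty most. Freezing one input at a time ignores interaction effects, so it is only a rough attribution.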
Given the interconnectedness of different systems and stakeholders across the digital twin ecosystem, it is imperative to outline the VVUQ pipeline and highlight potential sources of information breakdown and model collapse. It is important to recognize that VVUQ must play a role in all elements of the digital twin ecosystem. In the digital twin virtual representation, verification plays a key role in building trust that the mathematical models used for simulation of the physical counterpart have been sufficiently implemented. In cases that employ surrogate models, uncertainty quantification gives measures of the quality of prediction that the surrogate model provides. Field observations, for example, can be used to estimate uncertainties and parameters that govern the virtual representation (a type of inverse problem) as a step toward model validation, followed by the assessment of predictions. As information is passed from the physical counterpart to the virtual representation, new data can be used to update estimates and predictions with uncertainty quantification that can be used for decisions. VVUQ challenges for the virtual representation include those arising from model discrepancy, unresolved scales, surrogate modeling, and the need to issue predictions in extrapolatory regimes (Chapter 3).
When constructing digital twins, there are often many sources of data (e.g., data arising from sensors or simulations), and consequently, there can be many sources of uncertainty. Despite the abundance of data, there are nonetheless limitations to the ability to reduce uncertainty. Computational models may inherently contain unresolvable model form errors or discrepancies. Additionally, measurement errors in sensors are typically unavoidable. Whether adopting a data-centric or model-centric view, it is important to assess carefully which parts of the digital twin model can be informed by data and simulations and which cannot in order to prevent overfitting and to provide a full accounting of uncertainty.
The VVUQ contribution does not stop with the virtual representation. Monitoring the uncertainties associated with the physical counterpart and incorporating changes to, for example, sensors or data collection equipment are part of ensuring data quality passed to the virtual counterpart. Data quality improvements may be prioritized based on the relative impacts of parameter uncertainties on the resulting model uncertainties. Data quality challenges arise from measurement, undersampling, and other data uncertainties (Chapter 4). Data quality is especially pertinent when ML models are used. Research into methods for identifying and mitigating the impact of noisy or incomplete data is needed. VVUQ can also play a role in understanding the impact of mechanisms used to pass information between the physical and virtual, and vice versa. Challenges here include those arising from parameter uncertainty and ill-posed or indeterminate inverse problems (Chapter 5). Additionally, the uncertainty introduced by the inclusion of the human-in-the-loop should be measured and quantified in some settings. The human-in-the-loop as part of the VVUQ pipeline can be a critical source of variability that also has to be taken into consideration (Chapter 6). This can be particularly important in making predictions where different decision makers are involved.
A digital twin without serious consideration of VVUQ is not trustworthy. However, a rigorous VVUQ approach across all elements of the digital twin may be difficult to achieve. Digital twins may represent systems-of-systems with multiscale, multiphysics, and multi-code components. VVUQ methods, and methods supporting digital twins broadly, will need to be adaptable and scalable as digital twins increase in complexity. Finally, the choice of performance metrics for VVUQ will depend on the use case. Such metrics might include average-case prediction error (e.g., mean square prediction error), predictive variance, worst-case prediction error, or risk-based assessments.
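Three of the metrics named above have simple textbook forms, sketched below on made-up numbers. The function names and data are illustrative assumptions; risk-based assessments are omitted because they depend on application-specific consequences.

```python
def mean_squared_error(predictions, observations):
    """Average-case error: mean of the squared prediction errors."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return sum(squared) / len(squared)


def worst_case_error(predictions, observations):
    """Largest absolute prediction error across all cases."""
    return max(abs(p - o) for p, o in zip(predictions, observations))


def predictive_variance(ensemble):
    """Sample variance across an ensemble of predictions of one quantity."""
    mean = sum(ensemble) / len(ensemble)
    return sum((x - mean) ** 2 for x in ensemble) / (len(ensemble) - 1)


predictions = [1.0, 2.0, 3.0]
observations = [1.0, 2.5, 5.0]
print(mean_squared_error(predictions, observations))
print(worst_case_error(predictions, observations))
print(predictive_variance([2.0, 4.0, 6.0]))
```

The average-case and worst-case metrics can rank the same models differently, which is one reason the choice of metric depends on the use case.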
While this section has not provided an exhaustive list of VVUQ contributions to the digital twin ecosystem, it does serve to highlight that VVUQ plays a critical role in all aspects. Box 2-2 highlights the Department of Energy Predictive Science Academic Alliance Program as an exemplar model of interdisciplinary research that promotes VVUQ.
Conclusion 2-2: Digital twins require VVUQ to be a continual process that must adapt to changes in the physical counterpart, digital twin virtual models, data, and the prediction/decision task at hand. A gap exists between the class of problems that has been considered in traditional modeling and simulation settings and the VVUQ problems that will arise for digital twins.
The importance of a rigorous VVUQ process for a potentially powerful tool such as a digital twin cannot be overstated. Consider the growing concern over the dangers of artificial intelligence (AI), with warnings even extending to the "risk of human extinction" (Center for A.I. Safety 2023; Roose 2023). Generative AI models such as ChatGPT are being widely deployed, despite open questions about their reliability, robustness, and accuracy. There has long been a healthy skepticism about the use of predictive simulations in critical decision-making. Over time, use-driven research and development in VVUQ has provided a robust framework to foster confidence and establish boundaries for use of simulations that draw from new and ongoing computational science research (Hendrickson et al. 2020). As a result of continued advances in VVUQ, many of the ingredients 对潜在强大工具如数字双胞胎而言,严格的 VVUQ 过程的重要性不容小觑。考虑到对人工智能(AI)危险的日益关注, 警告甚至扩展到“人类灭绝的风险”(人工智能安全中心 2023;Roose 2023)。尽管对其可靠性、稳健性和准确性存在开放性问题,生成性 AI 模型如 ChatGPT 仍被广泛部署。长期以来,对在关键决策中使用预测模拟的健康怀疑一直存在。随着时间的推移,基于使用驱动的 VVUQ 研究与开发提供了一个强大的框架,以增强信心并为利用来自新兴和持续计算科学研究的模拟建立界限(Hendrickson 等,2020)。由于 VVUQ 的持续进展,许多要素
BOX 2-2
Department of Energy Predictive Science Academic Alliance Program: Interdisciplinary Research Promoting Verification, Validation, and Uncertainty Quantification 盒子 2-2
能源部预测科学学术联盟计划:促进验证、确认和不确定性量化的跨学科研究
For more than two decades, the Department of Energy (DOE) National Nuclear Security Administration's (NNSA's) Advanced Simulation and Computing Program (ASC) has proven an exemplary model for promoting interdisciplinary research in computational science in U.S. research universities, which deserves emulation by other federal agencies. 在过去的二十多年里,能源部(DOE)国家核安全局(NNSA)的先进模拟与计算计划(ASC)已成为促进美国研究大学计算科学跨学科研究的典范,值得其他联邦机构效仿。
ASC has established a strong portfolio of strategic alliances with leading U.S. academic institutions. The program was established in 1997 to engage the U.S. academic community in advancing science-based modeling and simulation technologies. At the core of each university center is an overarching complex multiphysics problem that requires innovations in programming and runtime environments, physical models and algorithms, data analysis at scale, and uncertainty analysis. This overarching problem (proposed independently by each center) has served as a most effective catalyst to promote interdisciplinary cooperation among multiple departments (e.g., mathematics, computer science, and engineering). In 2008, a new phase of the ASC alliance program, the Predictive Science Academic Alliance Program (PSAAP), added an emphasis on verification, validation, and uncertainty quantification (VVUQ). PSAAP has profoundly affected university cultures and curricula in computational science by infusing VVUQ; scalable computing; programming paradigms on heterogeneous computer systems; multiscale, multiphysics, and multi-code integration science; etc. To facilitate the research agendas of the centers, DOE/NNSA provides significant cycles on the most powerful unclassified computing systems. ASC 与美国领先的学术机构建立了强大的战略联盟组合。该项目于 1997 年成立,旨在让美国学术界参与推动基于科学的建模和仿真技术。 每个大学中心的核心是一个复杂的多物理场问题,需要在编程和运行环境、物理模型和算法、大规模数据分析以及不确定性分析方面进行创新。这个总体问题(由每个中心独立提出)作为促进多个部门(例如数学、计算机科学和工程)之间跨学科合作的最有效催化剂。2008 年,ASC 联盟项目的新阶段——预测科学学术联盟计划(PSAAP) ,增加了对验证、确认和不确定性量化(VVUQ)的重视。PSAAP 通过注入 VVUQ、可扩展计算、异构计算系统上的编程范式、多尺度、多物理场和多代码集成科学等,深刻影响了大学在计算科学方面的文化和课程。 为了促进各中心的研究议程,能源部/国家核安全局在最强大的非机密计算系统上提供了大量计算资源。
An important aspect of the management of PSAAP involves active interactions with the scientists at the NNSA laboratories through biannual rigorous technical reviews that focus on the technical progress of the centers and provide recommendations to help them meet their goals and milestones. Another important aspect is required graduate student internships at the NNSA laboratories.
of AI methods (statistical modeling, surrogate modeling, inverse problems, data assimilation, optimal control) have long been used in engineering and scientific applications with acceptable levels of risk. One wonders: Is it the methods themselves that pose a risk to the human enterprise, or is it the way in which they are deployed without due attention to VVUQ and certification? When it comes to digital twins and their deployment in critical engineering and scientific applications, humanity cannot afford the cavalier attitude that pervades other applications of AI. It is critical that VVUQ be deeply embedded in the design, creation, and deployment of digital twins, while recognizing that doing so will almost certainly slow progress.
Conclusion 2-3: Despite the growing use of artificial intelligence, machine learning, and empirical modeling in engineering and scientific applications, there is a lack of standards in reporting VVUQ as well as a lack of consideration of confidence in modeling outputs.
Conclusion 2-4: Methods for ensuring continual VVUQ and monitoring of digital twins are required to establish trustworthiness. It is critical that VVUQ be deeply embedded in the design, creation, and deployment of digital twins. In future digital twin research developments, VVUQ should play a core role and tight integration should be emphasized. Particular areas of research need include continual verification, continual validation, VVUQ in extrapolatory conditions, and scalable algorithms for complex multiscale, multiphysics, and multi-code digital twin software efforts.
Finding 2-2: The Department of Energy Predictive Science Academic Alliance Program has proven an exemplary model for promoting interdisciplinary research in computational science in U.S. research universities and has profoundly affected university cultures and curricula in computational science in the way that VVUQ is infused with scalable computing, programming paradigms on heterogeneous computer systems, and multiphysics and multi-code integration science.
Ethics, Privacy, and Data Governance
Protecting individual privacy requires proactive ethical consideration at every phase of development and within each element of the digital twin ecosystem. When data are collected, used, or traded, the protection of the individual's identity is paramount. Despite the rampant collection of data in today's information landscape, questions remain around preserving individual privacy. Current privacy-preserving methods, such as differential privacy or the use of synthetic data, are gaining traction but have limitations in many settings (e.g., reduced accuracy in data-scarce settings). Additionally, user data are frequently repurposed or sold. During the atmospheric and climate sciences digital twin workshop, for instance, speakers pointed out that the buying and selling of individual location data is a particularly significant challenge that deserves greater attention (NASEM 2023a).
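The accuracy limitation noted above for differential privacy in data-scarce settings can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism applied to a bounded mean; the function names, the clipping bounds, and the privacy budget `epsilon` are illustrative choices rather than anything prescribed by the report. The point is that the noise scale is inversely proportional to the number of records, so a small cohort pays a much larger accuracy penalty for the same privacy guarantee.

```python
import random


def laplace_scale(n, lower, upper, epsilon):
    """Laplace noise scale for an epsilon-DP mean of n values clipped to [lower, upper]."""
    # Changing one record can move the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return sensitivity / epsilon


def private_mean(values, lower, upper, epsilon, rng):
    """Return the clipped mean plus Laplace noise calibrated to epsilon."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = laplace_scale(len(clipped), lower, upper, epsilon)
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_mean + noise


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 1000):
        cohort = [rng.uniform(60.0, 100.0) for _ in range(n)]  # synthetic values
        print(n, laplace_scale(n, 40.0, 200.0, 0.5), private_mean(cohort, 40.0, 200.0, 0.5, rng))
```

With the same bounds and budget, the noise scale for 10 records is 100 times the scale for 1,000 records, which is the data-scarcity penalty in its simplest form.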
Moreover, digital twins are enabled through the development and deployment of myriad complex algorithms. In both the biomedical workshop and atmospheric and climate sciences workshop on digital twins, speakers warned of the bias inherent in algorithms due to missing data as a result of historical and systemic biases (NASEM 2023a,b).
Collecting and using data in a way that is socially responsible, maintaining the privacy of individuals, and reducing bias in algorithms through inclusive and representative data gathering are all critical to the development of digital twins. However, these priorities are challenges for the research and professional communities at large and are not unique to digital twins. Below, the committee identifies some novel challenges that arise in the context of digital twins.
By virtue of the personalized nature of a digital twin (i.e., the digital twin's specificity to a unique asset, human, or system), the virtual construct aggregates sensitive data, potentially identifiable or re-identifiable, and models that offer tailored insights about the physical counterpart. Speakers in the biomedical digital twin workshop remarked that a digital twin in a medical setting might include a patient's entire health history and that a digital twin "will never be completely de-identifiable" (NASEM 2023b). As a repository of sensitive information, digital twins are vulnerable to data breaches, both accidental and malicious.
Speakers in both the biomedical workshop and the atmospheric and climate sciences workshop urged digital twin users and developers to enforce fitness for purpose and consider how the data are used. In a briefing to the committee, Dr. Lea Shanley repeated these concerns and stressed that the term "open data" does not mean unconditional use (Shanley 2023). During the atmospheric and climate sciences workshop, Dr. Michael Goodchild warned that "repurposing" data is a serious challenge that must be addressed (Goodchild 2023). Moreover, speakers highlighted the need for transparency surrounding individual data. As part of the final panel discussion in the biomedical workshop, Dr. Mangravite noted that once guidelines around data control are established, further work is needed to determine acceptable data access (Mangravite 2023).
The real-time data collection that may occur as part of some digital twins raises important questions around governance (NASEM 2023b). Dr. Shanley pointed out that using complex data sets that combine personal, public, and commercial data is fraught with legal and governance questions around ownership and responsibility (Shanley 2023). Understanding who is accountable for data accuracy is nontrivial and will require new legal frameworks.
Privacy, ownership, and responsibility for data accuracy in complex, heterogeneous digital twin environments are all areas with important open questions that require attention. The committee deemed governance to fall outside this study's focus on foundational mathematical, statistical, and computational gaps. However, the committee would be remiss if it did not point out the dangers of scaling (or even developing) digital twins without clear and actionable standards for appropriate use and guidelines for identifying liability in the case of accidental or intentional misuse of a digital twin or its elements, as well as mechanisms for enforcing appropriate use.
Finding 2-3: Protecting privacy and determining data ownership and liability in complex, heterogeneous digital twin environments are unresolved challenges that pose critical barriers to the responsible development and scaling of digital twins.
Finally, making decisions based on information obtained from a digital twin raises additional ethical concerns. These challenges are discussed further in the context of automated and human-in-the-loop decision-making as part of Chapter 6.
Security
Characteristic of digital twins is the tight integration between the physical system and its virtual representation. This integration has several cybersecurity implications that must be considered, beyond what has historically been needed, in order to effectively safeguard and scale digital twins.
To maximize efficacy and utility of the digital twin, the physical counterpart must share as much of its data on a meaningful time scale as possible. The need to capture and transmit detailed and time-critical information exposes the physical system to considerably more risks. Examples include physical manipulation while feeding the digital twin fake data, misleading the operator of the physical counterpart, and intercepting data traffic to capture detailed data on the physical system.
As shown in Figure 2-1, feedback is integral to the digital twin paradigm. The close integration of physical and digital systems exposes an additional attack surface for the physical system. A malicious actor can inject an attack into the feedback loop (e.g., spoofing as the digital twin) and influence the physical system in a harmful manner.
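One common mitigation for the spoofing scenario described above is to authenticate every message that travels from the digital twin back to the physical system. The following is a minimal sketch, assuming a secret key shared between the twin and the physical controller and a monotonically increasing message counter to reject replays. The function names and message layout are hypothetical, and a real deployment would also need key management and transport encryption.

```python
import hashlib
import hmac
import json


def sign_command(key: bytes, command: dict, counter: int) -> dict:
    """Twin side: serialize the command with a counter and attach an HMAC-SHA256 tag."""
    payload = json.dumps({"counter": counter, "command": command}, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_command(key: bytes, message: dict, last_counter: int):
    """Controller side: return the command if authentic and fresh, otherwise None."""
    expected = hmac.new(key, message["payload"], hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters of the tag matched.
    if not hmac.compare_digest(expected, message["tag"]):
        return None
    body = json.loads(message["payload"])
    if body["counter"] <= last_counter:
        return None  # stale or replayed message
    return body["command"]


if __name__ == "__main__":
    key = b"demo-key-shared-out-of-band-0001"  # placeholder key for illustration only
    msg = sign_command(key, {"set_throttle": 0.4}, counter=7)
    print(verify_command(key, msg, last_counter=6))
    forged = {"payload": msg["payload"].replace(b"0.4", b"1.0"), "tag": msg["tag"]}
    print(verify_command(key, forged, last_counter=6))
```

Authentication of this kind addresses injected or altered feedback only; it does not protect against a compromised twin that holds the legitimate key.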
An additional novel area of security consideration for digital twins arises from the vision of an ideal future where digital twins scale easily and effortlessly. Imagine the scenario where the digital twin is exposed to the broader community (either by design or inadvertently). Since the digital twin represents true physical traits and behaviors of its counterpart, malicious interactions with the digital twin could lead to security risks for the physical system. For example, consider the digital twin of an aircraft system; a malicious actor could manipulate the digital twin to observe vulnerable traits or behaviors of the physical system (e.g., because such traits or behaviors can be inferred from certain simulations, or changes in simulation parameters). These vulnerabilities may be unknown to the system operator. A malicious actor could also interrogate the digital twin to glean intellectual property data such as designs and system parameters. Therefore, scaling digital twins must take into consideration a balance of scalability and information sharing.
DIGITAL TWIN STATE OF THE ART AND DOMAIN-SPECIFIC CHALLENGES
During information-gathering sessions, the committee heard multiple examples of potential use cases for digital twins and some practical examples of digital twins being deployed. Use cases and practical examples arising in the domains of engineering, biomedical sciences, and atmospheric and climate sciences are summarized in the three Proceedings of a Workshop-in Brief (NASEM 2023a,b,c). Practical examples of digital twins for single assets and systems of assets are also given in a recent white paper from The Alan Turing Institute (Bennett et al. 2023). Digital twins can be seen as "innovation enablers" that are redefining engineering processes and multiplying capabilities to drive innovation across industries, businesses, and governments. This level of innovation is facilitated by a digital twin's ability to integrate a product's entire life cycle with performance data and to employ a continuous loop of optimization. Ultimately, digital twins could reduce risk, accelerate time from design to production, and improve decision-making as well as connect real-time data with virtual representations for remote monitoring, predictive capabilities, collaboration among stakeholders, and multiple training opportunities (Bochenek 2023).
While the exploration and use of digital twins is growing across domains, many state-of-the-art digital twins are largely the result of custom implementations that require considerable deployment resources and a high level of expertise (Niederer et al. 2021). Many of the exemplar use cases are limited to specific applications, using bespoke methods and technologies that are not widely applicable across other problem spaces. In part as a result of the bespoke nature of many digital twin implementations, the relative maturity of digital twins varies significantly across problem spaces. This section explores some current efforts under way in addition to domain-specific needs and opportunities within aerospace and defense applications; atmospheric, climate, and sustainability sciences; and biomedical applications.
Digital Twin Examples, Needs, and Opportunities for Aerospace and Defense Applications
There are many exciting and promising directions for digital twins in aerospace and defense applications. These directions are discussed in greater detail in Opportunities and Challenges for Digital Twins in Engineering: Proceedings of a Workshop-in Brief in Appendix E (NASEM 2023c); the following section outlines overarching themes from the workshop. The U.S. Air Force Research Laboratory Airframe Digital Twin program focuses on better maintaining the structural integrity of military aircraft. The initial goal of the program was to use digital twins to balance the need to avoid the unacceptable risk of catastrophic failure with the need to reduce the amount of downtime for maintenance and prevent complicated and expensive maintenance. The use of data-informed simulations provides timely and actionable information to operators about what maintenance to perform and when. Operators can then plan for downtime, and maintainers can prepare to execute maintenance packages tailored for each physical twin and the corresponding asset (Kobryn 2023). The Department of Defense (DoD) could benefit from the broader use of digital twins in asset management, incorporating the processes and practices employed in the commercial aviation industry for maintenance analysis (Gahn 2023). Opportunities for digital twins include enhanced asset reliability, planned maintenance, reduced maintenance and inspection burden, and improved efficiency (Deshmukh 2023).
Significant gaps remain before the Airframe Digital Twin can be adopted by DoD. Connecting the simulations across length scales and physical phenomena is key, as is integrating probabilistic analysis. There is value in advancing optimal experimental design, active learning, optimal sensor placement, and dynamic sensor scheduling. These are significant areas of opportunity for development of digital twins across DoD applications. For example, by using simulations to determine which test conditions to run and where to place sensors, physical test programs could be reduced and digital twins better calibrated for operation (Kobryn 2023).
When building a representation of a fleet asset in a digital twin for maintenance and life-cycle predictions, it is important to capture the sources of manufacturing, operational, and environmental variation to understand how a particular component is operating in the field. This understanding enables the digital twin to have an appropriate fidelity to be useful in accurately predicting asset maintenance needs (Deshmukh 2023).
For DoD to move from digital twin "models to action," it is important to consider the following enablers: uncertainty propagation, fast inference, model error quantification, identifiability, causality, optimization and control, surrogates and reduced-order models, and multifidelity information. Integrating data science and domain knowledge is critical to enable decision-making based on analytics to drive process change. Managing massive amounts of data and applying advanced analytics with a new level of intelligent decision-making will be needed to fully take advantage of digital twins in the future. There is also a need for further research in ontologies and harmonization among the digital twin user community; interoperability (from cells, to units, to systems, to systems-of-systems); causality, correlation, and uncertainty quantification; data-physics fusion; and strategies to change the testing and organizational culture (Deshmukh 2023; Duraisamy 2023; Grieves 2023).
Opportunities exist in the national security arena to test, design, and prototype processes and exercise virtual prototypes in military campaigns or with geopolitical analysis to improve mission readiness (Bochenek 2023).
Digital Twin Examples, Needs, and Opportunities for Atmospheric and Climate Sciences
Digital twins are being explored and implemented in a variety of contexts within the atmospheric, climate, and sustainability sciences. Specific use cases and opportunities are presented in Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences: Proceedings of a Workshop-in Brief in Appendix C (NASEM 2023a). Key messages from the workshop panelists and speakers are summarized here. Destination Earth, or DestinE, is a collaborative European effort to model the planet and capture both natural and human activities. Plans for DestinE include interactive simulations of Earth systems, improved prediction capabilities, support for policy decisions, and mechanisms for members of the broader community to engage with its data (European Commission 2023). The models enabling DestinE are intended to be more realistic and of higher resolution, and the digital twin will incorporate both real and synthetic data (Modigliani 2023). The infrastructure required to support such robust and large-scale atmospheric, climate, and sustainability digital twins, however, necessitates increased observational abilities, computational capacity, mechanisms for large-scale data handling, and federated resource management. Such large-scale digital twins necessitate increased computational capacity, given that significant capacity is required to resolve multiple models of varying scale. Moreover, increasing computational abilities is not sufficient; computational capacity must also be used efficiently.
It is important to note that climate predictions do not necessarily require real-time updates, but some climate-related issues, such as wildfire response planning, might (Ghattas 2023). Three specific thrusts could help to advance the sort of climate modeling needed to realize digital twins: research on parametric sparsity and generalizing observational data, generation of training data and computation for highest possible resolution, and uncertainty quantification and calibration based on both observational and synthetic data (Schneider 2023). Machine learning (ML) could be used to expedite the data assimilation process of such diverse data.
There are many sources of unpredictability that limit the applicability of digital twins to atmospheric prediction or climate change projection. The atmosphere, for example, exhibits nonlinear behavior on many time scales. As a chaotic fluid that is sensitively dependent on initial conditions, the predictability of the atmosphere at instantaneous states is inherently limited. Similarly, the physics of the water cycle introduce another source of unpredictability. The water phase changes are associated with exchanges of energy, and they introduce irreversible conditions as water changes phase from vapor to liquid or solid in the atmosphere and precipitates out to the Earth's surface or the oceans.
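Sensitive dependence on initial conditions can be demonstrated with a toy system. The sketch below uses the logistic map with parameter r = 4, a standard textbook example of chaos; it is a stand-in chosen for brevity and is not an atmospheric model. Two trajectories that start a tiny distance apart stay close for a few steps and then diverge until the gap is comparable to the size of the state itself, which is why state estimates need continual correction from observations.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), returning all visited states."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def separation(x0, perturbation, steps):
    """Absolute gap at each step between a reference and a perturbed trajectory."""
    reference = logistic_trajectory(x0, steps)
    perturbed = logistic_trajectory(x0 + perturbation, steps)
    return [abs(a - b) for a, b in zip(reference, perturbed)]


if __name__ == "__main__":
    gaps = separation(0.2, 1e-10, 100)
    for step in (0, 5, 20, 40, 60):
        print(step, gaps[step])
    print("largest gap over 100 steps:", max(gaps))
```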
The importance of, and challenges around, incorporating uncertainty into digital twins cannot be overstated. Approaches that rely on a Bayesian framework could help, as could utilizing reduced-order and surrogate models for tractability (Ghattas 2023) or utilizing fast sampling to better incorporate uncertainty (Balaji 2023). Giving users increased access to a digital twin's supporting data may foster understanding of the digital twin's uncertainty (McGovern 2023).
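The Bayesian framing mentioned above can be sketched in its simplest form: a single uncertain model parameter with a Gaussian prior, updated by a noisy measurement with Gaussian error. This conjugate case is an illustrative assumption; realistic digital twins involve high-dimensional, non-Gaussian problems where reduced-order models and sampling become necessary. The function and argument names are hypothetical.

```python
def gaussian_update(prior_mean, prior_var, observation, obs_var):
    """Posterior mean and variance for a Gaussian prior and a Gaussian measurement."""
    # The gain weights the observation by how uncertain the prior is relative to the data.
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (observation - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


if __name__ == "__main__":
    mean, var = 0.0, 1.0  # prior belief about the parameter
    for measurement in (2.0, 1.6, 1.9):  # made-up measurements, each with variance 1.0
        mean, var = gaussian_update(mean, var, measurement, 1.0)
        print(mean, var)
```

Each update shrinks the posterior variance, so the twin reports its remaining uncertainty alongside the estimate instead of a single number.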
Establishing and maintaining confidence in and reliability of digital twins is critical for their use. One area for further development is tools that will assess the quality of a digital twin's outputs, thus bolstering confidence in the system (NASEM 2023a). Predicting extreme events also poses challenges for widespread digital twin development and adoption. Because extreme events are, by definition, in the tail end of a distribution, methods for validating extreme events and long-term climate predictions are needed.
It is important to note that digital twins are often designed to meet the needs of many stakeholders, often beyond the scientific community. Using physics-based models in conjunction with data-driven models can help to incorporate social justice factors into community-centric metrics (Di Lorenzo 2023). It is necessary to include diverse thinking in a digital twin and to consider the obstacles current funding mechanisms pose toward the cross-disciplinary work that would foster such inclusion (Asch 2023).
Digital Twin Examples, Needs, and Opportunities for Biomedical Applications
Many researchers hold that digital twins are not yet in practical use for decision-making in the biomedical space, but extensive work to advance their development is ongoing. Many of these efforts are described in Opportunities and Challenges for Digital Twins in Biomedical Research: Proceedings of a Workshop-in Brief in Appendix D (NASEM 2023b). The European Union has funded various projects for digital twins in the biomedical space. The European Virtual Human Twin (EDITH) has the mission of creating a roadmap toward fully integrated multiscale and multiorgan whole-body digital twins. The goal of the project is to develop a cloud-based repository of digital twins for health care including data, models, algorithms, and good practices, providing a virtual collaboration environment. The team is also designing a simulation platform to support the transition toward an integrated twin. To prototype the platform, they have selected use cases in applications including cancer, cardiovascular disease, and osteoporosis. While questions for EDITH remain, including in the areas of technology (e.g., data, models, resource integration, infrastructure); users (e.g., access and workflows); ethics and regulations (e.g., privacy and policy); and sustainability (e.g., clinical uptake and business modeling) (Al-Lazikani et al. 2023), the work in this space is notable. DIGIPREDICT and the Swedish Digital Twin Consortium are two other examples of emerging European Union-funded projects working toward biomedical digital twins.
Technical challenges in modeling, computation, and data all pose current barriers to implementing digital twins for biomedical use. Because medical data are often sparse and collecting data can be invasive to patients, researchers need strategies to create working models despite missing data. A combination of data-driven and mechanistic models can be useful to this end (Glazier 2023; Kalpathy-Cramer 2023), but these approaches can remain limited due to the complexities and lack of understanding of the full biological processes even when sufficient data are available. In addition, data heterogeneity and the difficulty of integrating disparate multimodal data, collected across different time and size scales, also engender significant research questions. New techniques are necessary to harmonize, aggregate, and assimilate heterogeneous data for biomedical digital twins (Koumoutsakos 2023; Sachs 2023). Furthermore, achieving interoperability and composability of models will be essential (Glazier 2023).
Accounting for uncertainty in biomedical digital twins as well as communicating and making appropriate decisions based on uncertainty will be vital to their practical application. As discussed more in Chapter 6, trust is paramount in the use of digital twins, and this is particularly critical for the use of these models in health care. Widespread adoption of digital twins will likely not be possible until patients, biologists, and clinicians trust them, which will first require education and transparency within the biomedical community (Enderling 2023; Miller 2023). Clear mechanisms for communicating uncertainty to digital twin users are a necessity. Though many challenges remain, opportunity also arises in that predictions from digital twins can open a line of communication between clinician and patient (Enderling 2023).
Ethical concerns are also important to consider throughout the process of developing digital twins for biomedical applications; these concerns cannot merely be an afterthought (NASEM 2023b). Bias inherent in data, models, and clinical processes needs to be evaluated and considered throughout the life cycle of a digital twin. Particularly considering the sensitive nature of medical data, it is important to prioritize privacy and security issues. Data-sharing mechanisms will also need to be developed, especially considering that some kinds of aggregate health data will never be entirely de-identifiable (Price 2023).
ADVANCING DIGITAL TWIN STATE OF THE ART REQUIRES AN INTEGRATED RESEARCH AGENDA
Despite the existence of examples of digital twins providing practical impact and value, the sentiment expressed across multiple committee information-gathering sessions is that the publicity around digital twins and digital twin solutions currently outweighs the evidence base of success. For example, in a briefing to the committee, Mark Girolami, chief scientist of The Alan Turing Institute, stated that the "Digital Twin evidence base of success and added value is seriously lacking" (Girolami 2022).
Conclusion 2-5: Digital twins have been the subject of widespread interest and enthusiasm; it is challenging to separate what is true from what is merely aspirational, due to a lack of agreement across domains and sectors as well as misinformation. It is important to separate the aspirational from the actual to strengthen the credibility of the research in digital twins and to recognize that serious research questions remain in order to achieve the aspirational.
Conclusion 2-6: Realizing the potential of digital twins requires an integrated research agenda that advances each one of the key digital twin elements and, importantly, a holistic perspective of their interdependencies and interactions. This integrated research agenda includes foundational needs that span multiple domains as well as domain-specific needs. 结论 2-6:实现数字双胞胎的潜力需要一个综合研究议程,推动每一个关键数字双胞胎元素的发展,并且重要的是,要有一个整体视角来理解它们的相互依赖和互动。这个综合研究议程包括跨多个领域的基础需求以及特定领域的需求。
Recommendation 1: Federal agencies should launch new crosscutting programs, such as those listed below, to advance mathematical, statistical, and computational foundations for digital twins. As these new digital twin-focused efforts are created and launched, federal agencies should identify opportunities for cross-agency interactions and facilitate cross-community collaborations where fruitful. An interagency working group may be helpful to ensure coordination.
National Science Foundation (NSF). NSF should launch a new program focused on mathematical, statistical, and computational foundations for digital twins that cuts across multiple application domains of science and engineering.
The scale and scope of this program should be in line with other multidisciplinary NSF programs (e.g., the NSF Artificial Intelligence Institutes) to highlight the technical challenge being solved as well as the emphasis on theoretical foundations being grounded in practical use cases.
Ambitious new programs launched by NSF for digital twins should ensure that sufficient resources are allocated to the solicitation so that the technical advancements are evaluated using real-world use cases and testbeds.
NSF should encourage collaborations across industry and academia and develop mechanisms to ensure that small and medium-sized industrial and academic institutions can also compete and be successful leading such initiatives.
Ideally, this program should be administered and funded by multiple directorates at NSF, ensuring that from inception to sunset, real-world applications in multiple domains guide the theoretical components of the program.
Department of Energy (DOE). DOE should draw on its unique computational facilities and large instruments coupled with the breadth of its mission as it considers new crosscutting programs in support of digital twin research and development. It is well positioned and experienced in large, interdisciplinary, multi-institutional mathematical, statistical, and computational programs. Moreover, it has demonstrated the ability to advance common foundational capabilities while also maintaining a focus on specific use-driven requirements (e.g., predictive high-fidelity models for high-consequence decision support). This collective ability should be reflected in a digital twin grand challenge research and development vision for DOE that goes beyond the current investments in large-scale simulation to advance and integrate the other digital twin elements, including the physical/virtual bidirectional interaction and high-consequence decision support. This vision, in turn, should guide DOE's approach in establishing new crosscutting programs in mathematical, statistical, and computational foundations for digital twins.
National Institutes of Health (NIH). NIH should invest in filling the gaps in digital twin technology in areas that are particularly critical to biomedical sciences and medical systems. These include bioethics, handling of measurement errors and temporal variations in clinical measurements, capture of adequate metadata to enable effective data harmonization, complexities of clinical decision-making with digital twin interactions, safety of closed-loop systems, privacy, and many others. This could be done via new cross-institute programs and expansion of current programs such as the Interagency Modeling and Analysis Group.
Department of Defense (DoD). DoD's Office of the Under Secretary of Defense for Research and Engineering should advance the application of digital twins as an integral part of the digital engineering performed to support system design, performance analysis, developmental and operational testing, operator and force training, and operational maintenance prediction. DoD should also consider using mechanisms such as the Multidisciplinary University Research Initiative and Defense Acquisition University to support research efforts to develop and mature the tools and techniques for the application of digital twins as part of system digital engineering and model-based system engineering processes.
Other federal agencies. Many federal agencies and organizations beyond those listed above can play important roles in the advancement of digital twin research. For example, the National Oceanic and Atmospheric Administration, the National Institute of Standards and Technology, and the National Aeronautics and Space Administration should be included in the discussion of digital twin research and development, drawing on their unique missions and extensive capabilities in the areas of data assimilation and real-time decision support.
As described earlier in this chapter, VVUQ is a key element of digital twins that necessitates collaborative and interdisciplinary investment.
Recommendation 2: Federal agencies should ensure that verification, validation, and uncertainty quantification (VVUQ) is an integral part of new digital twin programs. In crafting programs to advance the digital twin VVUQ research agenda, federal agencies should pay attention to the importance of (1) overarching complex multiscale, multiphysics problems as catalysts to promote interdisciplinary cooperation; (2) the availability and effective use of data and computational resources; (3) collaborations between academia and mission-driven government laboratories and agencies; and (4) opportunities to include digital twin VVUQ in educational programs. Federal agencies should consider the Department of Energy Predictive Science Academic Alliance Program as a possible model to emulate.
KEY GAPS, NEEDS, AND OPPORTUNITIES
In Table 2-1, the committee highlights key gaps, needs, and opportunities across the digital twin landscape. This is not meant to be an exhaustive list of all opportunities presented in the chapter. For the purposes of this report, prioritization of a gap is indicated by 1 or 2. While the committee believes all of the gaps listed are of high priority, gaps marked 1 may benefit from initial investment before moving on to gaps marked with a priority of 2.
TABLE 2-1 Key Gaps, Needs, and Opportunities Across the Digital Twin Landscape

| Maturity | Priority |
|---|---|
| Early and Preliminary Stages | |
| Development and deployment of digital twins that enable decision-makers to anticipate and adapt to evolving threats, plan and execute emergency response, and assess impact are needed. | |
| Due to the heterogeneity, complexity, multimodality, and breadth of biomedical data, the harmonization, aggregation, and assimilation of data and models to effectively combine these data into biomedical digital twins require significant technical research. | 1 |
| Research Base Exists with Opportunities to Advance Digital Twins | |
| Uncertainty quantification is critical to digital twins for atmospheric, climate, and sustainability sciences and will generally require surrogate models and/or improved sampling techniques. | 2 |
REFERENCES
AIAA (American Institute of Aeronautics and Astronautics) Computational Fluid Dynamics Committee. 1998. Guide for the Verification and Validation of Computational Fluid Dynamics Simulations. Reston, VA.
AIAA Digital Engineering Integration Committee. 2020. "Digital Twin: Definition and Value." AIAA and AIA Position Paper.
Al-Lazikani, B., G. An, and L. Geris. 2023. "Connecting Across Scales." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Asch, M. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 2. Washington, DC.
Balaji, V. 2023. "Towards Traceable Model Hierarchies." Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 1. Washington, DC.
Bennett, H., M. Birkin, J. Ding, A. Duncan, and Z. Engin. 2023. "Towards Ecosystems of Connected Digital Twins to Address Global Challenges." White paper. London, England: The Alan Turing Institute.
Bochenek, G. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 9. Washington, DC.
Center for A.I. Safety. 2023. "Statement on AI Risk [open letter]."
Darema, F. 2004. "Dynamic Data-Driven Applications Systems: A New Paradigm for Application Simulations and Measurements." Lecture Notes in Computer Science 3038:662-669.
Deshmukh, D. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 7. Washington, DC.
Di Lorenzo, E. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 2. Washington, DC.
Duraisamy, K. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 7. Washington, DC.
Enderling, H. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
European Commission. 2023. "Destination Earth." https://digital-strategy.ec.europa.eu/en/policies/destination-earth. Last modified April 20.
Gahn, M.S. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 7. Washington, DC.
Ghattas, O. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 1. Washington, DC.
Girolami, M. 2022. "Digital Twins: Essential, Mathematical, Statistical and Computing Research Foundations." Presentation to the Committee on Foundational Research Gaps and Future Directions for Digital Twins. November 21. Washington, DC.
Glaessgen, E., and D. Stargel. 2012. "The Digital Twin Paradigm for Future NASA and US Air Force Vehicles." AIAA Paper 2012-1818 in Proceedings of the 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference. April. Honolulu, Hawaii.
Glazier, J.A. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Goodchild, M. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 2. Washington, DC.
Grieves, M. 2005a. Product Lifecycle Management: Driving the Next Generation of Lean Thinking. New York: McGraw-Hill.
Grieves, M. 2005b. "Product Lifecycle Management: The New Paradigm for Enterprises." International Journal of Product Development 2(1/2):71-84.
Grieves, M. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 7. Washington, DC.
Hendrickson, B., B. Bland, J. Chen, P. Colella, E. Dart, J. Dongarra, T. Dunning, et al. 2020. ASCR@40: Highlights and Impacts of ASCR's Programs. Department of Energy Office of Science.
Kalpathy-Cramer, J. 2023. "Digital Twins at the Organ, Tumor, and Microenvironment Scale." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Kobryn, P. 2023. "AFRL Airframe Digital Twin." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Engineering. February 9. Washington, DC.
Koumoutsakos, P. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Mangravite, L. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
McGovern, A. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 2. Washington, DC.
Miller, D. 2023. "Prognostic Digital Twins in Practice." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Modigliani, U. 2023. "Earth System Digital Twins and the European Destination Earth Initiative." Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 1. Washington, DC.
NASEM (National Academies of Sciences, Engineering, and Medicine). 2023a. Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
NASEM. 2023b. Opportunities and Challenges for Digital Twins in Biomedical Research: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
NASEM. 2023c. Opportunities and Challenges for Digital Twins in Engineering: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
Niederer, S.A., M.S. Sacks, M. Girolami, and K. Willcox. 2021. "Scaling Digital Twins from the Artisanal to the Industrial." Nature Computational Science 1(5):313-320.
NIST (National Institute of Standards and Technology). 2009. "The System Development Life Cycle (SDLC)." ITL Bulletin. https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=902622.
NRC (National Research Council). 2012. Assessing the Reliability of Complex Models: Mathematical and Statistical Foundations of Verification, Validation, and Uncertainty Quantification. Washington, DC: The National Academies Press.
Piascik, B., J. Vickers, D. Lowry, S. Scotti, J. Stewart, and A. Calomino. 2012. "Materials, Structures, Mechanical Systems, and Manufacturing Roadmap: Technology Area 12." Washington, DC: NASA.
Price, N. 2023. Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Rawlings, J.B., D.Q. Mayne, and M. Diehl. 2017. Model Predictive Control: Theory, Computation, and Design. Vol. 2. Madison, WI: Nob Hill Publishing.
Reichle, R.H. 2008. "Data Assimilation Methods in the Earth Sciences." Advances in Water Resources 31(11):1411-1418.
Roose, K. 2023. "A.I. Poses 'Risk of Extinction,' Industry Leaders Warn." New York Times. May 30. https://www.nytimes.com/2023/05/30/technology/ai-threat-warning.html.
Sachs, J.R. 2023. "Digital Twins: Pairing Science with Simulation for Life Sciences." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
Schneider, T. 2023. Presentation to the Workshop on Digital Twins in Atmospheric, Climate, and Sustainability Science. February 1. Washington, DC.
Shafto, M., M. Conroy, R. Doyle, E. Glaessgen, C. Kemp, J. LeMoigne, and L. Wang. 2012. "Modeling, Simulation, Information Technology and Processing Roadmap." National Aeronautics and Space Administration 32(2012):1-38.
Shanley, L. 2023. "Discussion of Ethical Considerations of Digital Twins." Presentation to the Committee on Foundational Research Gaps and Future Directions for Digital Twins. April 27. Washington, DC.
Tuegel, E.J., A.R. Ingraffea, T.G. Eason, and S.M. Spottswood. 2011. "Reengineering Aircraft Structural Life Prediction Using a Digital Twin." International Journal of Aerospace Engineering 2011:1-14.
Wells, J. 2022. "Digital Twins and NVIDIA Omniverse." Presentation to the Committee on Foundational Research Gaps and Future Directions for Digital Twins. November 21. Washington, DC.
3
Virtual Representation: Foundational Research Needs and Opportunities
The digital twin virtual representation comprises a computational model or set of coupled models. This chapter identifies research needs and opportunities associated with creating, scaling, validating, and deploying models in the context of a digital twin. The chapter emphasizes the importance of the virtual representation being fit for purpose and the associated needs for data-centric and model-centric formulations. The chapter discusses multiscale modeling needs and opportunities, including the importance of hybrid modeling that combines mechanistic models and machine learning (ML). This chapter also discusses the challenges of integrating component and subsystem digital twins into the virtual representation. Surrogate modeling needs and opportunities for digital twins are also discussed, including surrogate modeling for high-dimensional, complex multidisciplinary systems and the essential data assimilation, dynamic updating, and adaptation of surrogate models.
FIT-FOR-PURPOSE VIRTUAL REPRESENTATIONS FOR DIGITAL TWINS
As discussed in Chapter 2, the computational models underlying the digital twin virtual representation can take many mathematical forms (including dynamical systems, differential equations, and statistical models) and need to be "fit for purpose" (meaning that model types, fidelity, resolution, parameterization, and quantities of interest must be chosen and potentially dynamically adapted to fit the particular decision task and computational constraints). The success of a digital twin hinges critically on the availability of models that can represent the physical counterpart with fidelity that is fit for purpose, and that can be used to issue predictions with known confidence, possibly in extrapolatory regimes, all while satisfying computational resource constraints.
As the foundational research needs and opportunities for modeling in support of digital twins are outlined, it is important to emphasize that there is no one-size-fits-all approach. The vast range of domain applications and use cases that are envisioned for digital twins requires a similarly vast range of models: first-principles, mechanistic, and empirical models all have a role to play.
There are several areas in which the state of the art in modeling is currently a barrier to achieving the impact of digital twins, due to the challenges of modeling complex multiphysics systems across multiple scales. In some cases, the mathematical models are well understood, and these barriers relate to our inability to bridge scales in a computationally tractable way. In other cases, the mathematical models are lacking, and discovery of new models that explain observed phenomena is needed. In yet other cases, mathematical models may be well understood and computationally tractable to solve at the component level, but foundational questions remain around stability and accuracy when multiple models are coupled at a full system or system-of-systems level. There are other areas in which the state of the art in modeling provides potential enablers for digital twins. The fields of statistics, ML, and surrogate modeling have advanced considerably in recent years, but a gap remains between the class of problems that has been addressed and the modeling needs for digital twins.
Some communities focus on high-fidelity models in the development of digital twins while others define digital twins using simplified and/or surrogate models. Some literature states that a digital twin must be a high-resolution, high-fidelity replica of the physical system (Bauer et al. 2021; NASEM 2023a). An early definition of a digital twin proposed "a set of virtual information constructs that fully describes a potential or actual physical manufactured product from the micro atomic level to the macro geometrical level. At its optimum, any information that could be obtained from inspecting a physical manufactured product can be obtained from its Digital Twin" (Grieves 2014). Other literature proposes surrogate modeling as a key enabler for digital twins (Hartmann et al. 2018; NASEM 2023c), particularly recognizing the dynamic (possibly real-time) nature of many digital twin calculations.
Conclusion 3-1: A digital twin should be defined at a level of fidelity and resolution that makes it fit for purpose. Important considerations are the required level of fidelity for prediction of the quantities of interest, the available computational resources, and the acceptable cost. This may lead to the digital twin including high-fidelity, simplified, or surrogate models, as well as a mixture thereof. Furthermore, a digital twin may include the ability to represent and query the virtual models at variable levels of resolution and fidelity depending on the particular task at hand and the available resources (e.g., time, computing, bandwidth, data).
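As one illustration of the trade-off described in Conclusion 3-1, the following minimal Python sketch picks the cheapest model that meets both an error tolerance on the quantity of interest and a time budget. The model names, error estimates, and costs are hypothetical placeholders, and the selection rule is an assumption made for illustration rather than a method proposed by the committee.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    """One candidate model in a digital twin's library (all values hypothetical)."""
    name: str
    expected_error: float  # estimated error in the quantity of interest
    cost_seconds: float    # estimated wall-clock cost of one evaluation


def select_fit_for_purpose(
    options: Sequence[ModelOption],
    error_tolerance: float,
    time_budget_seconds: float,
) -> Optional[ModelOption]:
    """Return the cheapest option meeting both the tolerance and the budget, or None."""
    feasible = [
        m for m in options
        if m.expected_error <= error_tolerance and m.cost_seconds <= time_budget_seconds
    ]
    return min(feasible, key=lambda m: m.cost_seconds) if feasible else None


# Hypothetical library spanning surrogate, simplified, and high-fidelity models.
LIBRARY = [
    ModelOption("surrogate", expected_error=0.10, cost_seconds=0.01),
    ModelOption("simplified", expected_error=0.03, cost_seconds=2.0),
    ModelOption("high_fidelity", expected_error=0.005, cost_seconds=3600.0),
]

if __name__ == "__main__":
    choice = select_fit_for_purpose(LIBRARY, error_tolerance=0.05, time_budget_seconds=10.0)
    print(choice.name if choice else "no model is fit for this purpose")
```

A result of None is informative in its own right: it signals that no available model is fit for the stated task under the stated resources, so either the tolerance, the budget, or the model library has to change.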
Determining whether a virtual representation is fit for purpose is itself a mathematical gap when it comes to the complexity of situations that arise with digital twins. For a model to be fit for purpose, it must balance the fidelity of predictions of quantities of interest with computational constraints, factoring in acceptable levels of uncertainty to drive decisions. If there is a human in the digital twin loop, fitness for purpose must also account for human-digital twin interaction needs such as visualization and communication of uncertainty. Furthermore, since a digital twin's purpose may change over time, the requirements for it to be fit for purpose may also evolve. Historically, computational mathematics has addressed accuracy requirements for numerical solution of partial differential equations using rigorous approaches such as a posteriori error estimation combined with numerical adaptivity (Ainsworth and Oden 1997). These kinds of analyses are an important ingredient of assessing fitness for purpose; however, the needs for digital twins go far beyond this, particularly given the range of model types that digital twins will employ and the likelihood that a digital twin will couple multiple models of differing fidelity. A key feature for determining fitness for purpose is assessing whether the fusion of a mathematical model, potentially corrected via a discrepancy function, and observational data provides relevant information for decision-making. Another key aspect of determining digital twin fitness for purpose is assessment of the integrity of the physical system's observational data, as discussed in Chapter 4.
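The idea of a posteriori error estimation combined with numerical adaptivity can be sketched on a much simpler problem than a partial differential equation. The Python example below uses adaptive Simpson quadrature, a technique substituted here purely for illustration: the difference between a coarse and a refined approximation serves as the computed error estimate, and the interval is subdivided only where that estimate exceeds the tolerance. The function names and the depth limit are assumptions of this sketch.

```python
import math
from typing import Callable


def _simpson(f: Callable[[float], float], a: float, b: float) -> float:
    """Simpson's rule on the single interval [a, b]."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b))


def adaptive_simpson(
    f: Callable[[float], float],
    a: float,
    b: float,
    tol: float,
    max_depth: int = 40,
) -> float:
    """Integrate f over [a, b], refining only where the error estimate exceeds tol."""
    mid = 0.5 * (a + b)
    coarse = _simpson(f, a, b)
    fine = _simpson(f, a, mid) + _simpson(f, mid, b)
    # A posteriori estimate: coarse/fine difference scaled by 1/15 for Simpson's rule.
    error_estimate = abs(fine - coarse) / 15.0
    if error_estimate <= tol or max_depth == 0:
        # Accept the refined value, with the standard extrapolation correction.
        return fine + (fine - coarse) / 15.0
    # Adaptivity: split the interval and share the tolerance between the halves.
    return (
        adaptive_simpson(f, a, mid, tol / 2.0, max_depth - 1)
        + adaptive_simpson(f, mid, b, tol / 2.0, max_depth - 1)
    )


if __name__ == "__main__":
    print(adaptive_simpson(math.sin, 0.0, math.pi, tol=1e-8))
```

The same pattern (estimate the error from computed quantities, then spend effort only where the estimate is large) is what the adaptive methods cited above do for partial differential equations; the text's point is that comparable tools are far less developed for empirical and coupled models.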
Finding 3-1: Approaches to assess modeling fidelity are mathematically mature for some classes of models, such as partial differential equations that represent one discipline or one component of a complex system; however, theory and methods are less mature for assessing the fidelity of other classes of models (particularly empirical models) and coupled multiphysics, multicomponent systems.
An additional consideration in determining model fitness for purpose is the complementary role of models and data: a digital twin is distinguished from traditional modeling and simulation in the way that models and data work together to drive decision-making. Thus, it is important to analyze the entire digital twin ecosystem when assessing modeling needs and the trade-offs between data-driven and model-driven approaches (Ferrari 2023).
In some cases, there is an abundance of data, and the decisions to be made fall largely within the realm of conditions represented by the data. In these cases, a data-centric view of a digital twin (Figure 3-1) is appropriate: the data form the core of the digital twin, the numerical model is likely heavily empirical (e.g., obtained via statistical or ML methods), and analytics and decision-making wrap around this numerical model. An example of such a setting is the digital twin of an aircraft engine, trained on a large database of sensor data and flight logs collected across a fleet of engines (Aviation Week Network 2019; Sieger 2019). Other cases are data-poor, and the digital twin will be called on to issue predictions in extrapolatory regimes that go well beyond the available data. In these cases, a model-centric view of a digital twin (Figure 3-1) is appropriate: a mathematical model and its associated numerical model form the core of the digital twin, and data are assimilated through the lens of these models. Examples include climate digital twins, where observations are typically spatially sparse and predictions may extend decades into the future (NASEM 2023a), and cancer patient digital twins, where observations are typically temporally sparse and the increasingly patient-specific and complex nature of diseases and therapies requires predictions of patient responses that go beyond available data (Yankeelov 2023). In these data-poor situations, the models play a greater role in determining digital twin fidelity. As discussed in the next section, an important need is to advance hybrid modeling approaches that leverage the synergistic strengths of data-driven and model-driven digital twin formulations.
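The phrase "data are assimilated through the lens of these models" can be made concrete with the simplest possible case: a single scalar state, a model forecast with a known variance, and one noisy observation. The Python sketch below applies the standard scalar Kalman (Gaussian) update; the function name and the numerical values are illustrative assumptions and are not drawn from any of the applications cited above.

```python
from typing import Tuple


def assimilate(
    forecast: float,
    forecast_var: float,
    observation: float,
    obs_var: float,
) -> Tuple[float, float]:
    """Blend a model forecast with one observation; return updated mean and variance."""
    if forecast_var <= 0.0 or obs_var <= 0.0:
        raise ValueError("variances must be positive")
    # The gain is the weight on the observation: near 1 when the data are trusted
    # more than the model, near 0 when the model is trusted more than the data.
    gain = forecast_var / (forecast_var + obs_var)
    mean = forecast + gain * (observation - forecast)
    var = (1.0 - gain) * forecast_var
    return mean, var


if __name__ == "__main__":
    # Equal confidence in model and data: the update lands halfway between them.
    print(assimilate(forecast=10.0, forecast_var=4.0, observation=12.0, obs_var=4.0))
    # Very noisy observation: the update stays close to the model forecast.
    print(assimilate(forecast=10.0, forecast_var=1.0, observation=12.0, obs_var=1e9))
```

In this toy setting the two views in the text correspond to the two limits of the gain: abundant, accurate data pull the estimate toward the observations, while sparse or noisy data leave the model forecast largely in control.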
MULTISCALE MODELING NEEDS AND OPPORTUNITIES FOR DIGITAL TWINS
A fundamental challenge for digital twins is the vast range of spatial and temporal scales that the virtual representation may need to address. The following section describes research opportunities for modeling across scales in support of digital twins and the need to integrate empirical and mechanistic methods for hybrid approaches to leverage the best of both data-driven and model-driven digital twin formulations.
FIGURE 3-1 Conceptualizing a digital twin: data-centric and model-centric views. In data-rich settings, the data form the core of the digital twin, while in data-poor settings, mathematical models play a more important role.
SOURCE: Courtesy of Karen Willcox.
The Predictive Power of Digital Twins Requires Modeling Across Scales
For many applications, the models that underlie the digital twin virtual representation must represent the behavior of the system across a wide range of spatial and temporal scales. For systems with a wide range of scales on which there are significant nonlinear scale interactions, it may be impossible to represent explicitly in a digital model the full richness of behavior at all scales and including all interactions. For example, the Earth's atmosphere and oceans are components of the Earth system, and their instantaneous and statistical behaviors are described respectively as weather and climate. These behaviors exhibit a wide range of variability on both spatial scales (from millimeters to tens of thousands of kilometers) and temporal scales (from seconds to centuries). Similarly, relevant dynamics in biological systems range from nanometers to meters in spatial scales and from milliseconds to years in temporal scales. In biomedical systems, modeling requirements range across scales from the molecular to the whole-body physiology and pathophysiology to populations. Temporal ranges in nanoseconds represent biochemical reactions, signaling pathways, gene expression, and cellular processes such as redox reactions or transient protein modifications. These events underpin the larger-scale interactions between cells, tissues, and organs; multiple organs and systems converge to address disease and non-disease states.
Numerical models of many engineering systems in the energy, transportation, and aerospace sectors also span a range of temporal and spatial resolutions and levels of complexity, owing to multiphysics phenomena (e.g., chemical reactions, heat transfer, phase change, unsteady flow/structure interactions) and the resolution of intricate geometrical features. In weather and climate simulations, as well as in many engineered and biomedical systems, system behavior is explicitly modeled across a limited range of scales (typically, from the largest scale to an arbitrary cutoff scale determined by available modeling resources), and the remaining (small) scales are represented in a parameterized form. Fortunately, in many applications, the smaller unresolved scales are known to be more universal than the large-scale features and thus more amenable to phenomenological parameterization. Even so, a gap remains between the scales that can be simulated and actionable scales.
An additional challenge is that as finer scales are resolved and a given model achieves greater fidelity to the physical counterpart it simulates, the computational and data storage/analysis requirements increase. This limits the applicability of the model for some purposes, such as uncertainty quantification, probabilistic prediction, scenario testing, and visualization. As a result, the demarcation between resolved and unresolved scales is often determined by computational constraints rather than a priori scientific considerations. Another challenge to increasing resolution is that the scale interactions may enter a different regime as scales change. For example, in atmospheric models, turbulence is largely two-dimensional at scales larger than 10 km and largely three-dimensional at scales smaller than 10 km; the behavior of fluid-scale interactions fundamentally changes as the model grid is refined.
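To make the cost of resolving finer scales concrete, the following minimal sketch estimates how the computational work might grow under grid refinement. It assumes an explicit time-stepping scheme whose time step shrinks in proportion to the grid spacing (a CFL-type limit) and a cost proportional to the number of cells times the number of time steps. These assumptions, and the function name `relative_cost`, are illustrative and are not taken from the report.

```python
def relative_cost(refinement, spatial_dims=3, cfl_limited=True):
    """Return the cost multiplier when the grid spacing is divided by `refinement`.

    Illustrative assumption: cost ~ (number of cells) x (number of time steps).
    """
    cells = refinement ** spatial_dims          # cell count grows as refinement^d
    steps = refinement if cfl_limited else 1    # CFL-type limit: dt shrinks with dx
    return cells * steps


for r in (2, 4, 10):
    print(f"grid spacing divided by {r}: cost multiplied by {relative_cost(r)}")
```

Under these assumptions, halving the grid spacing of a three-dimensional model multiplies the cost by 16, and a tenfold refinement multiplies it by 10,000. Growth of this kind is one reason the cutoff scale tends to be set by available computing rather than by scientific considerations.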
Thus, there are incentives to drive modeling for digital twins in two directions: toward resolution of finer scales to achieve greater realism and fidelity on the one hand, and toward simplifications to achieve computational tractability on the other. There is also motivation to do both: increase model resolution to generate data from the most realistic possible model, and then mine those data to extract a more tractable model that can be used as appropriate.
Finding 3-2: Different applications of digital twins drive different requirements for modeling fidelity, data, precision, accuracy, visualization, and time-to-solution, yet many of the potential uses of digital twins are currently intractable to realize with existing computational resources.
Finding 3-3: Often, there is a gap between the scales that can be simulated and actionable scales. It is necessary to identify the intersection of simulated and actionable scales in order to support optimizing decisions. The demarcation between resolved and unresolved scales is often determined by available computing resources, not by a priori scientific considerations.
Recommendation 3: In crafting research programs to advance the foundations and applications of digital twins, federal agencies should create mechanisms to provide digital twin researchers with computational resources, recognizing the large existing gap between simulated and actionable scales and the differing levels of maturity of high-performance computing across communities.
Finding 3-4: Advancing mathematical theory and algorithms in both data-driven and multiscale physics-based modeling to reduce computational needs for digital twins is an important complement to increased computing resources.
Hybrid Modeling Combining Mechanistic Models and Machine Learning
Hybrid modeling approaches, which are synergistic combinations of empirical and mechanistic modeling approaches that leverage the best of both data-driven and model-driven formulations, were repeatedly emphasized during this study's information gathering (NASEM 2023a,b,c). This section provides some examples of how hybrid modeling approaches can address digital twin modeling challenges.
In biology, modeling organic living matter requires the integration of biological, chemical, and even electrical influences that stimulate or inhibit the living material response. For many biomedical applications, this requires the incorporation of smaller-scale biological phenomena that influence the dynamics of the larger-scale system and results in the need for multiphysics, multiscale modeling. Incorporating multiple smaller-scale phenomena allows modelers to observe the impact of these underlying mechanisms at a larger scale, but resolving the substantial number of unknown parameters to support such an approach is challenging. Data-driven modeling presents the ability to utilize the growing volume of biological and biomedical data to identify correlations and generate inferences about the behavior of these biological systems that can be tested experimentally. This synergistic use of data-driven and multiscale modeling approaches in biomedical and related fields is illustrated in Figure 3-2.
Advances in hybrid modeling in the Earth sciences are following similar lines. Models for weather prediction or climate simulation must solve multiscale and multiphysics problems that are computationally intractable at the necessary level of fidelity, as described above. Over the past several decades of work in developing atmospheric, oceanic, and Earth system models, the unresolved scales have been represented by parameterizations that are based on conceptual models of the relevant unresolved processes. With the explosion of Earth system observations from remote sensing platforms in recent years, this approach has been modified to incorporate ML methods to relate the behavior of unresolved processes to that of resolved processes. There are also experiments in replacing entire Earth system components with empirical artificial intelligence (AI) components. Furthermore, the use of ensemble modeling to approximate probability distributions invites the use of ML techniques, often in a Bayesian framework, to cull ensemble members that are less accurate or to define clusters of solutions that simplify the application to decision-making.
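The idea of relating unresolved processes to resolved ones can be sketched with a deliberately small toy problem. In the example below, a mechanistic model captures only part of a tendency, and a one-parameter empirical correction is fitted by least squares to the residual between a reference and the mechanistic model. The reference function, the quadratic feature, and all names are hypothetical choices made for illustration. A real parameterization would use a far richer ML model and real observations or high-resolution simulations.

```python
def mechanistic_tendency(x, k=2.0):
    """Resolved physics only (hypothetical): omits a quadratic unresolved effect."""
    return k * x


def reference_tendency(x):
    """Stand-in for observations or a high-resolution run (hypothetical toy reference)."""
    return 2.0 * x + 0.5 * x ** 2


def fit_residual_coefficient(xs, feature):
    """Least-squares fit of residual ~ a * feature(x): a one-parameter empirical closure."""
    num = sum(feature(x) * (reference_tendency(x) - mechanistic_tendency(x)) for x in xs)
    den = sum(feature(x) ** 2 for x in xs)
    return num / den


def hybrid_tendency(x, a, feature):
    """Mechanistic core plus the learned correction for the unresolved process."""
    return mechanistic_tendency(x) + a * feature(x)


def quadratic(x):
    """Assumed feature of the resolved state used by the empirical closure."""
    return x ** 2


training_states = [0.1 * i for i in range(1, 21)]   # resolved states 0.1 .. 2.0
a_hat = fit_residual_coefficient(training_states, quadratic)

x_test = 1.5
err_mech = abs(reference_tendency(x_test) - mechanistic_tendency(x_test))
err_hybrid = abs(reference_tendency(x_test) - hybrid_tendency(x_test, a_hat, quadratic))
print(f"fitted coefficient: {a_hat:.3f}")
print(f"error without correction: {err_mech:.3f}; with correction: {err_hybrid:.2e}")
```

The correction is exact here only because the assumed feature happens to match the missing term. In practice the functional form of the unresolved process is unknown, which is part of what motivates more flexible ML closures and the validation concerns discussed later in this section.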
In climate and engineering applications, the potential for hybrid modeling to underpin digital twins is significant. In addition to modeling across scales as described above, hybrid models can help provide understandability and explainability. Often, a purely data-driven model can identify a problem or potential opportunity without offering an understanding of the root cause. Without this understanding, decisions related to the outcome may be less useful. The combination of data and mechanistic models comprising a hybrid model can help mitigate this problem. The aerospace industry has developed hybrid digital twin solutions that can analyze large, diverse data sets associated with part failures in aircraft engines using the data-driven capabilities of the hybrid model (Deshmukh 2022). Additionally, these digital twin solutions can provide root cause analysis indicators using the mechanistic-driven capabilities of the hybrid model.
However, there are several gaps in hybrid modeling approaches that need to be addressed to realize the full potential value of these digital twin solutions. These gaps exist in five major areas: (1) data quality, availability, and affordability; (2) model coupling and integration; (3) model validation and calibration; (4) uncertainty quantification and model interpretability; and (5) model scalability and management.
Data quality, availability, and affordability can be challenging in biomedical, climate, and engineering applications, as obtaining accurate and representative data for model training and validation at an affordable price is difficult. Prior data collected may have been specific to certain tasks, limited by the cost of capture and storage, or deemed unsuitable for current use due to evolving environments and new knowledge. Addressing data gaps based on the fit-for-purpose requirements of the digital twin and an analysis of currently available data is crucial. Minimizing the need for large sample sizes and designing methodologies to learn robustly from data sets with few samples would also help overcome these barriers. AI methods might be developed to predict a priori what amount and type of data are needed to support the virtual counterpart.
Combining data-driven models with mechanistic models requires effective coupling techniques to facilitate the flow of information (data, variables, etc.) between the models while understanding the inherent constraints and assumptions of each model. Coupling is complex in many cases, and model integration is even more so as it involves creating a single comprehensive model that represents the features and behaviors of both the data-driven and the mechanistic-driven model within a coherent framework. Both integration and coupling techniques require harmonizing different scales, assumptions, constraints, and equations, and understanding their implications on the uncertainty associated with the outcome. Matching well-known, model-driven digital twin representations with uncharacterized data-driven models requires attention to how the various levels of fidelity comprised in these models interact with each other in ways that may result in unanticipated overall digital twin behavior and inaccurate representation at the macro level. Another gap lies in the challenge of choosing the specific data collection points to adequately represent the effects of the less-characterized elements and augment the model-driven elements without oversampling the behavior already represented in the model-driven representations. Finally, one can have simulations that produce a large data set (e.g., a space-time field where each solution field is of high dimension) but only relatively few ensembles. In such cases, a more structured statistical model may be required to combine simulations and observations. 
Model validation is another evident gap that needs to be overcome given the diverse nature of the involved data-driven and mechanistic models and their underlying assumptions. Validating data-driven models relies heavily on having sufficient and representative data, both for training and for evaluating the accuracy of the outcome and the model's generalizability to new data. On the other hand, mechanistic-driven models rely heavily on calibration and parameter estimation to accurately reproduce experimental and independent data. The validation and calibration processes for these hybrid models must be harmonized to ensure the accuracy and reliability required in these solutions.
Uncertainty quantification and model explainability and interpretability are significant gaps associated with hybrid systems. These systems must accurately account for uncertainties arising from both the data-driven and mechanistic-driven components of the model. Uncertainties can arise from various factors related to both components, including data limitations and quality, model assumptions, and parameter estimation. Addressing how these uncertainties are quantified and propagated through the hybrid model is another gap that must be tackled for robust predictions. Furthermore, interpreting and explaining the outcomes may pose a significant challenge, particularly in complex systems.
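One simple way to picture uncertainty propagating through both components of a hybrid model is plain Monte Carlo sampling. The sketch below assumes a toy hybrid model with one mechanistic parameter and one data-driven coefficient, treats both as independent Gaussian random variables, and compares the spread of predictions inside and outside a hypothetical training range. The model form, the distributions, and all names are assumptions made for illustration, not a method prescribed by the report.

```python
import random
import statistics


def hybrid_prediction(x, k, a):
    """Toy hybrid model: mechanistic term k*x plus data-driven correction a*x**2."""
    return k * x + a * x ** 2


def propagate(x, n_samples=20000, seed=0):
    """Monte Carlo propagation of parameter uncertainty to the prediction at x."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        k = rng.gauss(2.0, 0.10)   # assumed calibration uncertainty (mechanistic parameter)
        a = rng.gauss(0.5, 0.05)   # assumed fitting uncertainty (data-driven coefficient)
        outputs.append(hybrid_prediction(x, k, a))
    return statistics.fmean(outputs), statistics.stdev(outputs)


mean_in, std_in = propagate(x=1.0)     # inside the hypothetical training range
mean_out, std_out = propagate(x=4.0)   # extrapolation beyond it
print(f"x = 1.0: mean {mean_in:.2f}, standard deviation {std_in:.2f}")
print(f"x = 4.0: mean {mean_out:.2f}, standard deviation {std_out:.2f}")
```

In this toy case the predictive spread grows quickly under extrapolation because the contribution of the data-driven term scales with the square of the input. Sampling of this kind captures only parameter uncertainty. It says nothing about errors in the model form itself, which is one reason uncertainty quantification for hybrid models remains an open gap.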
Finally, many hybrid models associated with biomedical, climate, and engineering problems can be computationally demanding and require unique skill sets. Striking a balance between techniques that manage the computational complexity of mechanistic models (e.g., parallelization and model simplification) and techniques used in data-driven models (e.g., graphics processing unit coding, pruning, and model compression) is essential. Furthermore, hybrid approaches require that domain scientists either learn details of computational complexity and data-driven techniques or partner with additional researchers to experiment with hybrid digital twins. Resolving how to achieve this combination and balance at a feasible and affordable level is a gap that needs to be addressed. Additionally, the model will need to be monitored and updated as time and conditions change and errors in the system arise, requiring the development of model management capabilities.
While hybrid modeling provides an attractive path forward to address digital twin modeling needs, simply crafting new hybrid models that better match available data is insufficient. The development of hybrid modeling approaches for digital twins requires rigorous verification, validation, and uncertainty quantification (VVUQ), including the quantification of uncertainty in extrapolatory conditions. If the hybrid modeling is done in a way that the data-driven components of the model are continually updated, then these updating methods also require associated VVUQ. Another challenge is that in many high-value contexts, digital twins need to represent both typical operating conditions and anomalous operating conditions, where the latter may entail rare or extreme events. As noted in Conclusion 2-2, a gap exists between the class of problems that has been considered in VVUQ for traditional modeling and simulation settings and the VVUQ problems that will arise for digital twins. Hybrid models, in particular those that infuse some form of black-box deep learning, represent a particular gap in this regard.
Finding 3-5: Hybrid modeling approaches that combine data-driven and mechanistic modeling approaches are a productive path forward for meeting the modeling needs of digital twins, but their effectiveness and practical use are limited by key gaps in theory and methods.
INTEGRATING COMPONENT AND SUBSYSTEM DIGITAL TWINS
The extent to which the virtual representation will integrate component and subsystem models is an important consideration in modeling digital twins. A digital twin of a system of systems will likely couple multiple constituent digital twins. Integration of models and data to this extent goes beyond what is done routinely and entails a number of foundational mathematical and computational challenges. In addition to the software challenge of coupling models and solvers, VVUQ tasks and the determination of fitness for purpose become much more challenging in the coupled setting.
Modeling of a complex system often requires coupling models of different components/subsystems of the system, which presents additional challenges beyond modeling of the individual components/subsystems. For example, Earth system models couple models of atmosphere, land surface, river, ocean, sea ice, and land ice to represent interactions among these subsystems that determine the internal variability of the system and its response to external forcing. Component models that are calibrated individually to be fit for purpose when provided with observed boundary conditions of the other components may behave differently when the component models are coupled together, due to error propagation and nonlinear feedback between the subsystems. This is particularly the case when models representing the different components/subsystems have different fidelity or mathematical forms, necessitating additional mathematical operations such as spatiotemporal filtering, which add uncertainty to the coupled model.
Another example is the coupling of human system models with Earth system models, which often differ in model fidelity as well as in mathematical forms. Furthermore, in the context of digital twins, some technical challenges remain in coupled model data assimilation, such as properly initializing each component model. Additional examples of the integration of components are shown in Box 3-1.
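The observation that individually calibrated components can behave differently once coupled can be illustrated with a very small steady-state example. The sketch below assumes two components whose states depend linearly on each other, gives each component model a 2.5 percent error in its coupling coefficient, and compares the error of one component when driven by the observed state of the other with its error when both imperfect components feed each other. The linear form, the coefficients, and the names are hypothetical. Real subsystem feedbacks are generally nonlinear and can change behavior qualitatively rather than only amplify errors.

```python
def solve_coupled(a, b, forcing=1.0, n_iter=500):
    """Fixed-point iteration for the coupled balance x = a*y + f, y = b*x + f."""
    x = y = 0.0
    for _ in range(n_iter):
        x, y = a * y + forcing, b * x + forcing
    return x, y


TRUE_A = TRUE_B = 0.80       # hypothetical "true" coupling strengths
MODEL_A = MODEL_B = 0.82     # each component model is off by 2.5 percent

x_true, y_true = solve_coupled(TRUE_A, TRUE_B)

# Stand-alone check: component X driven by the observed (true) state of Y.
x_standalone = MODEL_A * y_true + 1.0
err_standalone = abs(x_standalone - x_true)

# Coupled check: both imperfect components exchange their own states.
x_coupled, _ = solve_coupled(MODEL_A, MODEL_B)
err_coupled = abs(x_coupled - x_true)

print(f"stand-alone error: {err_standalone / x_true:.1%}")
print(f"coupled error:     {err_coupled / x_true:.1%}")
```

In this toy setting the stand-alone error is about 2 percent, while the coupled error is about 11 percent, because each component's error feeds back through the other. A component that appears fit for purpose in isolation may therefore need to be reassessed in the coupled configuration.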
Interoperability of software and data is a challenge across domains and poses a particular challenge when integrating component and subsystem digital twins. Semantic and syntactic interoperability, in which data are exchanged between and understood by the different systems, can be challenging given the possible differences among the systems. Furthermore, assumptions made in one model can be distinct from the assumptions made in other models. Some communities have established approaches to reducing interoperability barriers (for example, through the use of shared standards for data, software, and models, or through the use of software templates), and this is a critical aspect of integrating complex digital twin models.
Finding 3-6: Integration of component/subsystem digital twins is a pacing item for the digital twin representation of a complex system, especially if different fidelity models are used in the digital twin representation of its components/subsystems.
BOX 3-1 Examples of the Integration of Components
Gas Turbine Engine
Gas turbines are used as propulsion devices in aviation and for electric power generation. Numerical simulation of the aerothermal flow through the entire gas turbine engine involves many different physical processes, which are described using different models and even different computer codes. Simulation of different modules (e.g., compressors, combustor, and turbines) separately requires imposition of (artificial) boundary conditions at the component interfaces, which are not known a priori in detail and can result in missing crucial interactions between components such as thermoacoustic instabilities. In an integrated simulation, the interaction between modules requires exchange of information between the participating solvers. Automation of this exchange requires coupler software that manages the required exchange of information between the solvers in a seamless and efficient manner.
Human Cardiac System
Integrated simulation of blood flow through the cardiac system involves a range of parameters. Models can capture genetic base characteristics, genetic variations, gene expression, and molecular interactions at the cellular and tissue levels to understand how specific genetic factors influence physiological processes and disease susceptibility. Structural information collected by imaging technology (e.g., magnetic resonance imaging, computed tomography scans) provides anatomical orientation of chambers, valves, and major blood vessels. Electrical activity of the heart captures the generation and propagation of electrical signals that coordinate the contraction of cardiac muscle cells. Models based on the Hodgkin-Huxley equations or other electrophysiological models are utilized to replicate the cardiac action potential and activation patterns. Mechanical aspects involve modeling the contraction and relaxation of cardiac muscle cells using parameters such as ventricular pressure, myocardial deformation, and valve dynamics. Hemodynamic models use computational fluid dynamics to simulate blood flow within the cardiac system (blood pressure, flow rates, and resistance), accounting for the interaction between the heart and the vasculature. Techniques can be employed to simulate blood flow patterns. Modeling the interaction between blood flow and the heart tissue captures the effects of fluid-structure interaction. The digital twin can incorporate regulatory mechanisms that control heart rate, blood pressure, and other physiological variables that maintain homeostasis and response mechanisms. However, each of these parameters is subject to multiple uncertainties: physiological or genetic parameters may vary between individuals; input data may be unreliable (e.g., imaging resolution); and experimental validation may contain measurement noise or other capture limitations. 
SURROGATE MODELING NEEDS AND OPPORTUNITIES FOR DIGITAL TWINS
Surrogate models play a key role in addressing the computational challenges of digital twins. Surrogate models can be categorized into three types: statistical data-fit models, reduced-order models, and simplified models.
Statistical data-fit models use statistical methods to fit approximate input-output maps to training data, with the surrogate model employing a generic functional form that does not explicitly reflect the structure of the physical governing equations underlying the numerical simulations.
Reduced-order models incorporate low-dimensional structure learned from training data into a structured form of the surrogate model that reflects the underlying physical governing equations.
Simplified models are obtained in a variety of ways, such as coarser grids, simplified physical assumptions, and loosened residual tolerances.
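As a small illustration of the first category, the sketch below fits a statistical data-fit surrogate (an ordinary least-squares line) to a handful of evaluations of a stand-in "expensive" model and then compares its error inside and outside the training range. The exponential stand-in, the training interval, and all names are hypothetical. A practical surrogate would more likely be a Gaussian process or a neural network, but the behavior under extrapolation is the point of the example.

```python
import math


def expensive_model(x):
    """Stand-in for a costly numerical simulation (hypothetical)."""
    return math.exp(x)


def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ c0 + c1*x, using the closed-form solution."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    c1 = sxy / sxx
    return y_bar - c1 * x_bar, c1


train_x = [0.1 * i for i in range(11)]            # training inputs in [0, 1]
train_y = [expensive_model(x) for x in train_x]   # training outputs
c0, c1 = fit_line(train_x, train_y)


def surrogate(x):
    """Generic functional form with no knowledge of the governing equations."""
    return c0 + c1 * x


interp_err = abs(surrogate(0.5) - expensive_model(0.5))   # inside the training range
extrap_err = abs(surrogate(3.0) - expensive_model(3.0))   # far outside it
print(f"error inside training range:  {interp_err:.3f}")
print(f"error outside training range: {extrap_err:.3f}")
```

The surrogate is cheap and reasonably accurate where it was trained, but its error grows by orders of magnitude outside that range because nothing in its functional form reflects the underlying dynamics. Reduced-order models attempt to mitigate this by building the structure of the governing equations into the surrogate.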
Surrogate modeling is a broad topic, with many applications beyond digital twins. This section focuses on unique challenges that digital twins pose to surrogate modeling and the associated foundational gaps in surrogate modeling methods. A first challenge is the scale at which surrogate modeling will be needed. Digital twins by their nature may require modeling at the full system scale, with models involving multiple disciplines, covering multiple system components, and described by parameter spaces of high dimensions. A second challenge is the critical need for VVUQ of surrogate models, recognizing the uncertain conditions under which digital twins will be called on to make predictions, often in extrapolatory regimes. A third challenge relates to the dynamic updating and adaptation that is key to the digital twin concept. Each one of these challenges highlights gaps in the current state of the art in surrogate modeling, as the committee discusses in more detail in the following.
Surrogate modeling is an enabler for computationally efficient digital twins, but there is a limited understanding of trade-offs associated with collections of surrogate models operating in tandem in digital twins, the effects of multiphysics coupling on surrogate model accuracy, performance in high-dimensional settings, surrogate model VVUQ (especially in extrapolatory regimes), and, for data-driven surrogates, the costs of generating training data and learning.
Surrogate Modeling for High-Dimensional, Complex Multidisciplinary Systems
State-of-the-art surrogate modeling has made considerable progress for simpler systems but remains an open challenge at the level of complexity needed for digital twins. Multiple interacting disciplines and nonlinear coupling among disciplines, as needed in a digital twin, pose a particular challenge for surrogate modeling. The availability of accurate and computationally efficient surrogate models depends on the ability to identify and exploit structure that is amenable to approximation. For example, reduced-order modeling may exploit low-rank structure in a way that permits dynamics to be evolved in a low-dimensional manifold or coarse-graining of only a subset of features, while statistical data-fit methods exploit the computational efficiencies of representing complex dynamics with a surrogate input-output map, such as a Gaussian process model or deep neural network. A challenge with coupled multidisciplinary systems is that coupling is often a key driver of dynamics; that is, the essential system dynamics can change dramatically due to coupling effects.
One example is Earth system models, which must represent the dynamics of the atmosphere, ocean, sea ice, land surface, and cryosphere, all of which interact in complex, nonlinear ways across a wide range of spatial and temporal scales. The interactions involve fluxes of mass, energy (both heat and radiation), and momentum that depend on the states of the various system components. Yet in many cases, surrogate models are derived for the individual model components separately and then coupled.
Surrogate models for coupled systems, whether data-fit or reduced-order models, remain a challenge because even if the individual model components are highly accurate representations of the dynamics and processes in those components, they may lose much of their fidelity when additional degrees of freedom due to coupling with other system components are added. Another set of challenges encompasses important mathematical questions around the consistency, stability, and property-preservation attributes of coupled surrogates. A further challenge is ensuring model fidelity and fitness for purpose when multiple physical processes interact.
Finding 3-7: State-of-the-art literature and practice show advances and successes in surrogate modeling for models that form one discipline or one component of a complex system, but theory and methods for surrogates of coupled multiphysics systems are less mature.
A further challenge in dealing with surrogate models for digital twins of complex multidisciplinary systems is that the dimensionality of the parameter spaces underlying the surrogates can become high. For example, a surrogate model of the structural health of an engineering structure (e.g., building, bridge, airplane wing) would need to be representative over many thousands of material and structural properties that capture variation over space and time. Similarly, a surrogate model of tumor evolution in a cancer patient digital twin would potentially have thousands of parameters representing patient anatomy, physiology, and mechanical properties, again capturing variation over space and time. Deep neural networks have shown promise in representing input-output maps even when the input parameter dimension is large, yet generating sufficient training data for these complex problems remains a challenge. As discussed below, it is notable that many apparent successes in surrogate modeling reported in the literature fail to state the cost of training, whether for determining parameters in a neural network or for tuning the parameters in a reduced-order model.
It also remains a challenge to quantify the degree to which surrogate predictions may generalize in a high-dimensional setting. While mathematical advances are revealing rigorous insights into high-dimensional approximation (Cohen and DeVore 2015), this work is largely for a class of problems that exhibit smooth dynamics. Work is needed to bridge the gap between rigorous theory in high-dimensional approximation and the complex models that will underlie digital twins. Another promising set of approaches uses mathematical decompositions to break a high-dimensional problem into a set of coupled lower-dimensional problems. Again, recent advances have demonstrated significant benefits, including in the digital twin setting (Sharma et al. 2018), but these approaches have largely been limited to problems within structural modeling.
Finding 3-8: Digital twins will typically entail high-dimensional parameter spaces. This poses a significant challenge to state-of-the-art surrogate modeling methods.
Another challenge associated with surrogate models in digital twins is accounting for the data and computational resources needed to develop data-driven surrogates. While the surrogate modeling community has developed several compelling approaches in recent years, analyses of the speedups associated with these approaches in many cases do not account for the time and expense associated with generating training data or using complex numerical solvers at each iteration of the training process. A careful accounting of these elements is essential to understanding the cost-benefit trade-offs associated with surrogate models in digital twins. In tandem, advances in surrogate modeling methods for handling limited training data are needed.
Finding 3-9: One of the challenges of creating surrogate models for high-dimensional parameter spaces is the cost of generating sufficient training data. Many papers in the literature fail to properly acknowledge and report the excessively high costs (in terms of data, hardware, time, and energy consumption) of training.
Conclusion 3-2: In order for surrogate modeling methods to be viable and scalable for the complex modeling situations arising in digital twins, the cost of surrogate model training, including the cost of generating the training data, must be analyzed and reported when new methods are proposed.
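One way to make this accounting explicit is a break-even calculation: how many surrogate evaluations are needed before the offline cost of generating training data and fitting the surrogate is recovered. The sketch below is illustrative only. The function name, the cost model (a fixed cost per full-model run, a one-time fitting cost, a fixed cost per surrogate query), and the example numbers are assumptions, not figures from the report.

```python
import math

def break_even_queries(n_train, cost_full, cost_fit, cost_surrogate):
    """Smallest number of queries at which the surrogate is cheaper overall.

    Assumed cost model (all costs in the same unit, e.g. core-hours):
      offline cost        = n_train * cost_full + cost_fit
      surrogate total(q)  = offline cost + q * cost_surrogate
      full-model total(q) = q * cost_full
    Returns None when the surrogate can never pay off.
    """
    if cost_surrogate >= cost_full:
        return None
    offline = n_train * cost_full + cost_fit
    # Need offline + q * cost_surrogate < q * cost_full.
    return math.floor(offline / (cost_full - cost_surrogate)) + 1

# Hypothetical numbers: 500 training runs at 2 core-hours each,
# 50 core-hours of fitting, and 0.001 core-hours per surrogate query.
queries_needed = break_even_queries(500, 2.0, 50.0, 0.001)
```

Reporting only the per-query speedup (here about 2,000 times) would hide the fact that, under these assumed numbers, the surrogate does not pay for itself until it has been queried several hundred times.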
Finally, the committee again emphasizes the importance of VVUQ. As noted above for hybrid modeling, development of new surrogate modeling methods must incorporate VVUQ as an integral component. While data-driven surrogate modeling methods are attractive because they reduce the computational intractability of complex modeling and require limited effort to implement, important questions remain about how well they generalize or extrapolate in realms beyond the experience of their training data. This is particularly relevant in the context of digital twins, where ideally the digital twin would explore "what if" scenarios, potentially far from the domain of the available training data; that is, where the digital twin must extrapolate to previously unseen settings. While incorporating physical models, constraints, and symmetries into data-driven surrogate models may facilitate better extrapolation performance than a generic data-driven approach, there is a lack of fundamental understanding of how to select a surrogate model approach to maximize extrapolation performance beyond empirical testing. Reduced-order models are supported by literature establishing their theoretical properties and developing error estimators for some classes of systems. Extending this kind of rigorous work may enable surrogates to be used for extrapolation with guarantees of confidence.
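A very basic safeguard against silent extrapolation is to check whether a query lies outside the range of inputs seen during training. The sketch below uses a per-dimension bounding box. This is a crude, assumed heuristic rather than a method proposed in the report: a point inside the box can still be far from any training sample, so a negative result offers no guarantee. All names are hypothetical.

```python
def training_bounds(inputs):
    """Per-dimension minimum and maximum over the training inputs."""
    dims = range(len(inputs[0]))
    lows = [min(x[d] for x in inputs) for d in dims]
    highs = [max(x[d] for x in inputs) for d in dims]
    return lows, highs

def is_extrapolating(query, bounds, margin=0.0):
    """True if any coordinate of the query falls outside the training box."""
    lows, highs = bounds
    return any(q < lo - margin or q > hi + margin
               for q, lo, hi in zip(query, lows, highs))

# Hypothetical training inputs: (load factor, temperature)
train_inputs = [(0.0, 10.0), (1.0, 20.0), (0.5, 15.0)]
bounds = training_bounds(train_inputs)

inside = is_extrapolating((0.5, 12.0), bounds)   # within the box
outside = is_extrapolating((1.5, 12.0), bounds)  # load factor never seen
```

A digital twin could attach such a flag to every surrogate prediction so that "what if" scenarios outside the training range are at least labeled as such.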
Data Assimilation, Dynamic Updating, and Adaptation of Surrogate Models
Dynamic updating and model adaptation are central to the digital twin concept. In many cases, this updating must be done on the fly under computational and time constraints. Surrogates play a role in making this updating computationally feasible. At the same time, the surrogate models themselves must be updated, and correspondingly validated, as the digital twin virtual representation evolves.
One set of research gaps is around the role of a surrogate model in accelerating digital twin state estimation (data assimilation) and parameter estimation (inverse problem). Challenges surrounding data assimilation and model updating in general are discussed further in Chapter 5. While data assimilation with surrogate models has been considered in some settings, it has not been extended to the scale and complexity required for the digital twin setting. Research at the intersection of data assimilation and surrogate models is an important gap. For example, data assimilation attempts to produce a state estimate by optimally combining observations and model simulation in a probabilistic framework. For data assimilation with a surrogate model to be effective, the surrogate model needs to simulate the state of the physical system accurately enough that the difference between simulated and observed states is small. Often the parameters in the surrogate model itself are informed by data assimilation, which can introduce circular error propagation.
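The idea of combining an observation with a model simulation in a probabilistic framework can be illustrated with the scalar Kalman analysis step, shown here as a minimal sketch. It assumes Gaussian, unbiased, independent errors for both the forecast and the observation, and a state that is observed directly. Real digital twin settings are far higher dimensional, and the variable names are hypothetical.

```python
def assimilate(forecast, forecast_var, obs, obs_var):
    """Scalar Kalman analysis step.

    Combines a model forecast and an observation of the same quantity,
    weighting each by the inverse of its error variance.
    """
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (obs - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var

# Hypothetical values: the surrogate forecasts 10.0 with error variance 4.0,
# and a sensor reports 12.0 with error variance 1.0.
state, state_var = assimilate(10.0, 4.0, 12.0, 1.0)
```

The update moves the estimate toward whichever source has the smaller variance. If the surrogate's stated variance understates its true error, the observation is under-weighted, which is one reason surrogate accuracy and uncertainty quantification matter for assimilation.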
A second set of gaps is around adaptation of the surrogate models themselves. Data-fit surrogate models and reduced-order models can be updated as more data become available, an essential feature for digital twins. Because it entails multiphysics coupling and high-dimensional parameter spaces, as discussed above, the digital twin setting poses a particular challenge to achieving adaptation under computational constraints. Furthermore, the adaptation of a surrogate model will require an associated continual VVUQ workflow, which again must be conducted under computational constraints, so that the adapted surrogate may be used with confidence in the virtual-to-physical digital twin decision-making tasks.
KEY GAPS, NEEDS, AND OPPORTUNITIES
In Table 3-1, the committee highlights key gaps, needs, and opportunities for realizing the virtual representation of a digital twin. This is not meant to be an exhaustive list of all opportunities presented in the chapter. For the purposes of this report, prioritization of a gap is indicated by 1 or 2. While the committee believes all of the gaps listed are of high priority, gaps marked 1 may benefit from initial investment before moving on to gaps marked with a priority of 2.
TABLE 3-1 Key Gaps, Needs, and Opportunities for Realizing the Virtual Representation of a Digital Twin
Maturity: Early and Preliminary Stages
Increasing the available computing resources for digital twin development and use is a necessary element for closing the gap between simulated and actionable scales and for engaging a broader academic community in digital twin research. Certain domains and sectors have had more success, such as engineering physics and sciences, as well as national labs.
Model validation and calibration for hybrid modeling are difficult given the diverse nature of the involved data and mechanistic models and their underlying assumptions. Validating data-driven models relies on sufficient and representative validation data for training, evaluation of model accuracy, and evaluation of model generalizability to new data. On the other hand, mechanistic-driven models rely on calibration and parameter estimation to accurately reproduce against experimental and independent data. Harmonizing the validation and calibration processes for these hybrid models is a gap that must be overcome to ensure the required accuracy and reliability.
Uncertainty quantification, explainability, and interpretability are often difficult for hybrid modeling as these systems must account for uncertainties arising from both the data-driven and mechanistic-driven components of the model as well as their interplay. Particular areas of need include uncertainty quantification for dynamically updated hybrid models, for hybrid models in extrapolative regimes, and for rare or extreme events. Warnings for extrapolations are particularly important in digital twins.
The variety and coupled nature of models employed in digital twins pose particular challenges to assessing model fitness for purpose. There is a gap between the complexity of problems for which mathematical theory and scalable algorithms for error estimation exist and the class of problems that underlies high-impact applications of digital twins. (Priority: 2)
REFERENCES
Ainsworth, M., and J.T. Oden. 1997. "A Posteriori Error Estimation in Finite Element Analysis." Computer Methods in Applied Mechanics and Engineering 142(1-2):1-88.
Alber, M., A. Buganza Tepole, W.R. Cannon, S. De, S. Dura-Bernal, K. Garikipati, G. Karniadakis, et al. 2019. "Integrating Machine Learning and Multiscale Modeling-Perspectives, Challenges, and Opportunities in the Biological, Biomedical, and Behavioral Sciences." npj Digital Medicine 2(1):115.
Alonso, J., S. Hahn, F. Ham, M. Herrmann, G. Iaccarino, G. Kalitzin, P. LeGresley, et al. 2006. "CHIMPS: A High-Performance Scalable Module for Multiphysics Simulations." Aerospace Research Council, pp. 1-28.
Aviation Week Network. 2019. "Emirates Cuts Unscheduled Engine Removals by One-Third." Aviation Week Network, May 16. https://aviationweek.com/special-topics/optimizing-engines-throughlifecycle/emirates-cuts-unscheduled-engine-removals-one.
Bauer, P., B. Stevens, and W. Hazeleger. 2021. "A Digital Twin of Earth for the Green Transition." Nature Climate Change 11(2):80-83.
Cohen, A., and R. DeVore. 2015. "Approximation of High-Dimensional Parametric PDEs." Acta Numerica 24:1-159.
Deshmukh, D. 2022. "Aviation Fleet Management: Transformation Through AI/Machine Learning." Global Gas Turbine News 62(2):56-57.
Ferrari, A. 2023. "Building Robust Digital Twins." Presentation to the Committee on Foundational Research Gaps and Future Directions for Digital Twins. April 24. Washington, DC.
Grieves, M. 2014. "Digital Twin: Manufacturing Excellence Through Virtual Factory Replication." White paper. Michael W. Grieves LLC.
Hartmann, D., M. Herz, and U. Wever. 2018. "Model Order Reduction a Key Technology for Digital Twins." Pp. 167-179 in Reduced-Order Modeling (ROM) for Simulation and Optimization: Powerful Algorithms as Key Enablers for Scientific Computing. Cham, Germany: Springer International Publishing.
NASEM (National Academies of Sciences, Engineering, and Medicine). 2023a. Opportunities and Challenges for Digital Twins in Atmospheric and Climate Sciences: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
NASEM. 2023b. Opportunities and Challenges for Digital Twins in Biomedical Research: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
NASEM. 2023c. Opportunities and Challenges for Digital Twins in Engineering: Proceedings of a Workshop-in Brief. Washington, DC: The National Academies Press.
Sharma, P., D. Knezevic, P. Huynh, and G. Malinowski. 2018. "RB-FEA Based Digital Twin for Structural Integrity Assessment of Offshore Structures." Offshore Technology Conference. April 30-May 3. Houston, TX.
Sieger, M. 2019. "Getting More Air Time: This Software Helps Emirates Keep Its Planes Up and Running." General Electric, February 20. https://www.ge.com/news/reports/getting-air-timesoftware-helps-emirates-keep-planes-running.
Yankeelov, T. 2023. "Digital Twins in Oncology." Presentation to the Workshop on Opportunities and Challenges for Digital Twins in Biomedical Sciences. January 30. Washington, DC.
4
The Physical Counterpart: Foundational Research Needs and Opportunities
Abstract
Digital twins rely on observation of the physical counterpart in conjunction with modeling to inform the virtual representation (as discussed in Chapter 3). In many applications, these data will be multimodal, coming from disparate sources, and of varying quality. Only when high-quality, integrated data are combined with advanced modeling approaches can the synergistic strengths of data- and model-driven digital twins be realized. This chapter addresses data acquisition and data integration for digital twins. While significant literature has been devoted to the science and best practices around gathering and preparing data for use, this chapter focuses on the most important gaps and opportunities that are crucial for robust digital twins.
DATA ACQUISITION FOR DIGITAL TWINS
Data collection for digital twins is a continual process that plays a critical role in the development, refinement, and validation of the models that comprise the virtual representation.
The Challenges Surrounding Data Acquisition for Digital Twins
Undersampling in complex systems with large spatiotemporal variability is a significant challenge for acquiring the data needed to characterize and quantify the dynamic physical and biological systems for digital twin development.
The complex systems that may make up the physical counterpart of a digital twin often exhibit intricate patterns, nonlinear behaviors, feedback, and emergent phenomena that require comprehensive sampling in order to develop an understanding of system behaviors. Systems with significant spatiotemporal variability may also exhibit heterogeneity because of external conditions, system dynamics, and component interactions. However, constraints in resources, time, or accessibility may hinder the gathering of data at an adequate frequency or resolution to capture the complete system dynamics. This undersampling could result in an incomplete characterization of the system and lead to overlooking critical events or significant features, thus risking the accuracy and predictive capabilities of digital twins. Moreover, undersampling introduces a level of uncertainty that could propagate through a digital twin's predictive models, potentially leading to inaccurate or misleading outcomes. Understanding and quantifying this uncertainty is vital for assessing the reliability and limitations of the digital twin, especially in safety-critical or high-stakes applications. To minimize the risk and effects of undersampling, innovative sampling approaches can be used to optimize data collection. Additionally, statistical methods and undersampling techniques may be leveraged to mitigate the effects of limited data.
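A textbook illustration of how undersampling can mislead is temporal aliasing: a fast oscillation sampled too slowly is indistinguishable from a slow one. The sketch below is an assumed example, not one from the report. It samples a 9 Hz sine at 10 Hz (well below the 18 Hz that the sampling theorem requires) and shows that the samples coincide with those of a sign-flipped 1 Hz sine.

```python
import math

def sample(freq_hz, rate_hz, n):
    """Samples of sin(2*pi*f*t) taken at t = k / rate_hz for k = 0..n-1."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

rate = 10.0                      # sampling rate in Hz (hypothetical sensor)
fast = sample(9.0, rate, 50)     # true signal: 9 Hz
slow = sample(1.0, rate, 50)     # alias: 1 Hz

# The 9 Hz samples equal the negated 1 Hz samples up to rounding error,
# so the data alone cannot tell the two signals apart.
max_gap = max(abs(f + s) for f, s in zip(fast, slow))
```

A model calibrated on these samples could confidently represent slow dynamics that do not exist in the physical counterpart, which is one way undersampling uncertainty can propagate into predictions.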
Finally, data acquisition efforts are often enhanced by a collaborative and multidisciplinary approach, combining expertise in data acquisition, modeling, and system analysis, to address the task holistically and with an understanding of how the data will move through the digital twin.
Data Accuracy and Reliability
Digital twin technology relies on the accuracy and reliability of data, which requires tools and methods to ensure data quality, efficient data storage, management, and accessibility. Standards and governance policies are critical for data quality, accuracy, and integrity, and frameworks play an important role in providing standards and guidelines for data collection, management, and sharing while maintaining data security and privacy (see Box 4-1). Efficient and secure data flow is essential for the success of digital twin technology, and research is needed to develop cybersecurity measures; methods for verifying trustworthiness, reliability, and accuracy; and standard methods for data flow to ensure compatibility between systems. Maintaining confidentiality and privacy is also vital.
BOX 4-1 Ethics and Privacy
When data from human subjects or sensitive systems are involved, privacy requirements may limit data type and volume as well as the types of computation that can be performed on them. For example, the Health Insurance Portability and Accountability Act and the Common Rule have specific guidance on what types of data de-identification processes must be followed when using data from electronic health records. They describe how this type of data can be stored, transmitted, and used for secondary purposes. The General Data Protection Regulation contains similar guidance, but it extends well beyond electronic health records to most data containing personally identifiable data. In addition to legal requirements, ethical and institutional considerations are involved when utilizing real-world data. Calls for transparency in the use of personal data, and for personal control of data, are increasing, with several examples related to electronic health records. The regulations that call for de-identification of the data do not address the elevated risks to privacy in the context of digital twins, and updates to regulations and data protection practices will need to address specific risks associated with digital twins.
Modelers need to keep this in mind when designing systems that will typically require periodic collection of data, as study protocols submitted to human subjects' protections programs (Institutional Review Boards) should explain the need for continuous updates (as in registries) as well as the potential for harm (as in interventional studies). Of note, model outputs may also be subject to privacy protections, since they may reveal patient information that could be used to harm patients directly or indirectly (e.g., by revealing a high probability of developing a specific health condition or by lowering their ranking in an organ transplantation queue).
Data quality assurance is a subtle problem that will need to be addressed differently in different contexts. For instance, a key question is how a digital twin should handle outlier or anomalous data. In some settings, such data may be the result of sensor malfunctions and should be detected and ignored, while in other settings, outliers may correspond to rare events that are essential to create an accurate virtual representation of the physical counterpart. A key research challenge for digital twins is the development of methods for data quality assessment that ensure digital twins are robust to spurious outliers while accurately representing salient rare events. Several technical challenges must be addressed here. Anomaly detection is central to identifying potential issues with data quality. While anomaly detection has been studied by the statistics and signal processing communities, unique challenges arise in the bidirectional feedback loop between virtual and physical systems that is inherent to digital twins, including the introduction of statistical dependencies among samples; the need for real-time processing; and heterogeneous, large-scale, multiresolution data. Another core challenge is that many machine learning (ML) and artificial intelligence (AI) methods that might be used to update virtual models from new physical data focus on maximizing average-case performance; that is, they may yield large errors on rare events. Developing digital twins that do not ignore salient rare events requires rethinking loss functions and performance metrics used in data-driven contexts.
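The gap between average-case and rare-event performance can be seen by reporting a tail metric alongside the usual mean error. The sketch below is illustrative only: the data are synthetic, and "mean error over the worst 5 percent of samples" is one assumed choice of tail metric among many, not a metric recommended in the report.

```python
def mean_abs_error(pred, truth):
    """Average absolute error over all samples (average-case metric)."""
    errs = [abs(p - t) for p, t in zip(pred, truth)]
    return sum(errs) / len(errs)

def worst_fraction_error(pred, truth, fraction=0.05):
    """Average absolute error over the worst `fraction` of samples."""
    errs = sorted((abs(p - t) for p, t in zip(pred, truth)), reverse=True)
    k = max(1, int(len(errs) * fraction))
    return sum(errs[:k]) / k

# Synthetic case: 99 routine samples predicted almost perfectly,
# and one rare event (true value 10.0) that the model misses entirely.
truth = [0.0] * 99 + [10.0]
pred = [0.1] * 99 + [0.0]

average_case = mean_abs_error(pred, truth)
tail_case = worst_fraction_error(pred, truth, fraction=0.05)
```

In this synthetic case the average error stays small while the tail metric is roughly ten times larger, which is the kind of discrepancy that a model selected purely on average-case performance would conceal.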
A fundamental challenge in decision-making may arise from discrepancies between the data streamed from the physical counterpart and the values predicted by the digital twin. In the case of an erroneous sensor on the physical counterpart, how can a human operator trust the output of the virtual representation, given that the supporting data were, at some point, obtained from the physical counterpart? While sensors and other data collection devices have reliability ratings, additional measures such as how reliability degrades over time may need to be taken into consideration. For example, a relatively new physical sensor showing different output compared to its digital twin may point to errors in the virtual representation instead of the physical sensor. One potential cause may be that the digital twin models did not have enough training data under diverse operating conditions to capture the changing environment of the physical counterpart.
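One simple way to frame such a discrepancy is a normalized residual: the gap between the sensor reading and the twin's prediction, divided by the combined uncertainty of the two. The sketch below is an assumption-laden illustration. The linear growth of sensor noise with age is a hypothetical reliability model, the threshold of 3 is a conventional but arbitrary choice, and the check only says that the two sources disagree more than expected, not which one is at fault.

```python
import math

def sensor_std(base_std, age_years, drift_per_year):
    """Hypothetical reliability model: noise grows linearly with sensor age."""
    return base_std * (1.0 + drift_per_year * age_years)

def normalized_residual(observed, predicted, sensor_sd, model_sd):
    """Discrepancy in units of the combined standard deviation."""
    return abs(observed - predicted) / math.sqrt(sensor_sd ** 2 + model_sd ** 2)

def needs_review(observed, predicted, sensor_sd, model_sd, threshold=3.0):
    """Flag discrepancies larger than the stated uncertainties can explain."""
    return normalized_residual(observed, predicted, sensor_sd, model_sd) > threshold

# Same 5-unit discrepancy, two hypothetical sensors.
new_sensor_sd = sensor_std(0.5, age_years=0, drift_per_year=0.5)
old_sensor_sd = sensor_std(0.5, age_years=10, drift_per_year=0.5)

flag_new = needs_review(105.0, 100.0, new_sensor_sd, model_sd=0.5)
flag_old = needs_review(105.0, 100.0, old_sensor_sd, model_sd=0.5)
```

With a new, low-noise sensor the discrepancy is flagged, which under these assumptions points attention toward the virtual representation. With an aged sensor the same gap falls within the expected noise.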
Data quality (e.g., ensuring that the data set is accurate, complete, valid, and consistent) is another major concern for digital twins. Consider data assimilation for the artificial pancreas or closed-loop pump (insulin and glucagon). The continuous glucose monitor has an error range, as does the glucometer check, which itself is dependent on compliance from the human user (e.g., washing hands before the glucose check). Data assimilation techniques for digital twins must be able to handle challenges with multiple inputs from the glucose monitor and the glucometer, especially if they provide very different glucose levels, differ in units from different countries (e.g., mg/dL or mmol/L), or lack regular calibration of the glucometer. Assessing and documenting data quality, including completeness and measures taken to curate the data, tools used at each step of the way, and benchmarks against which any model is evaluated, are integral parts of developing and maintaining a library of reproducible models that can be embedded in a digital twin system.
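The unit and error-range issues in the glucose example can be sketched as follows, assuming the two units in question are mg/dL and mmol/L (the two units in common clinical use for blood glucose). The conversion factor follows from the molar mass of glucose (about 180.16 g/mol). The function names, the example readings, and the error levels are hypothetical, and inverse-variance weighting assumes independent, unbiased errors, which an uncalibrated glucometer would violate.

```python
MGDL_PER_MMOLL = 18.016  # 1 mmol/L of glucose is about 18.016 mg/dL

def to_mgdl(value, unit):
    """Convert a glucose reading to mg/dL, rejecting unknown units."""
    if unit == "mg/dL":
        return value
    if unit == "mmol/L":
        return value * MGDL_PER_MMOLL
    raise ValueError(f"unknown glucose unit: {unit}")

def fuse(readings):
    """Inverse-variance weighted mean of (value_mgdl, std_mgdl) pairs.

    Returns the fused value and its standard deviation.
    """
    weights = [1.0 / (std * std) for _, std in readings]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, readings)) / total
    return mean, (1.0 / total) ** 0.5

# Hypothetical readings: a continuous monitor reporting in mmol/L with a
# wide error range, and a fingerstick glucometer reporting in mg/dL.
cgm = (to_mgdl(6.0, "mmol/L"), 15.0)
meter = (to_mgdl(120.0, "mg/dL"), 5.0)
fused_value, fused_std = fuse([cgm, meter])
```

Converting to a common unit before fusing is the essential step; combining 6.0 and 120.0 directly would be meaningless. Rejecting unrecognized units, rather than guessing, keeps a mislabeled record from silently corrupting the state estimate.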
Finding 4-1: Documenting data quality and the metadata that reflect the data provenance is critical.
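As a rough illustration of what documenting quality and provenance alongside the data could look like, the sketch below attaches a small metadata record to a data stream. The field names are hypothetical and do not follow any particular metadata standard; a real deployment would adopt whatever schema its domain and governance policies require.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ProvenanceRecord:
    """Minimal, hypothetical provenance and quality metadata for one data stream."""
    source_id: str                 # which sensor or system produced the data
    acquired_at: str               # ISO 8601 timestamp of acquisition
    unit: str                      # physical unit of the values
    calibration_date: str          # last known calibration of the source
    completeness: float = 1.0      # fraction of expected samples present
    processing_steps: list = field(default_factory=list)

    def add_step(self, tool, description):
        """Record a curation step and the tool that performed it."""
        self.processing_steps.append({"tool": tool, "description": description})

record = ProvenanceRecord(
    source_id="strain-gauge-07",
    acquired_at="2023-04-24T10:15:00Z",
    unit="microstrain",
    calibration_date="2023-01-10",
    completeness=0.97,
)
record.add_step("despike-v1", "removed single-sample spikes above 5 sigma")
serialized = asdict(record)  # plain dict, ready to store next to the data
```

Keeping the curation history with the data means that a later disagreement between the twin and the sensor can be traced to a specific source, calibration, or processing step.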
Without clear guidelines for defining the objectives and use cases of digital twin technology, it can be challenging to identify critical components that significantly impact the physical system's performance (VanDerHorn and Mahadevan 2021). The absence of standardized quality assurance frameworks makes it difficult to compare and validate results across different organizations and systems.
Finding 4-2: The absence of standardized quality assurance frameworks makes it difficult to compare and validate results across different organizations and systems. This is important for cybersecurity and information and decision sciences. Integrating data from various sources, including Internet of Things devices, sensors, and historical data, can be challenging due to differences in data format, quality, and structure.
Considerations for Sensors
Sensors provide timely data on the condition of the physical counterpart. Improvements in sensor integrity, performance, and reliability will all play a crucial role in advancing the reliability of digital twin technology; this requires research into sensor calibration, performance, maintenance, and fusion methods. Detecting and mitigating adversarial attacks on sensors, such as tampering or false data injection, is essential for preserving system integrity and prediction fidelity. Finally, multimodal sensors that combine multiple sensing technologies may enhance the accuracy and reliability of data collection. Data integration is explored further in the next section. A related set of research questions around optimal sensor placement, sensor steering, and sensor dynamic scheduling is discussed in Chapter 6.
DATA INTEGRATION FOR DIGITAL TWINS
Increased access to diverse and dynamic streams of data from sensors and instruments can inform decision-making and improve model reliability and robustness. The digital twin of a complex physical system often gets data in different formats from multiple sources with different levels of verification and validation (e.g., visual inspection, record of repairs and overhauls, and quantitative sensor data from a limited number of locations). Integrating data from various sources (including Internet of Things devices, sensors, and historical data) can be challenging due to differences in data format, quality, and structure. Data interoperability (i.e., the ability for two or more systems to exchange and use information from other systems) and integration are important considerations for digital twins, but current efforts toward semantic integration are not scalable. Adequate metadata are critical to enabling data interoperability, harmonization, and integration, as well as informing appropriate use (Chung and Jaffray 2021). The transmission and level of key information needed and how to incorporate it in the digital twin are not well understood, and efforts to standardize metadata exist but are not yet sufficient for the needs of digital twins. Developers and end users would benefit from collaboratively addressing the needed type and format of data prior to deployment.
Handling Large Amounts of Data
In some applications, data may be streaming at full four-dimensional resolution and coupled with applications on the fly. This produces very large amounts of data for processing. Due to the large and streaming nature of some data sets, all operations must be running in continuous or on-demand modes (e.g., ML models need to be trained and applied on the fly, applications must operate in fully immersive data spaces, and data assimilation and data handling architecture must be scalable). Specific challenges around data assimilation and the associated verification, validation, and uncertainty quantification efforts are discussed further in Chapter 5. Historically, data assimilation methods have been model-based and developed independently from data-driven ML models. In the context of digital twins, however, these two paradigms will require integration. For instance, ML methods used within digital twins need to be optimized to facilitate data assimilation with large-scale streaming data, and data assimilation methods that leverage ML models, architectures, and computational frameworks need to be developed.
The scalability of data storage, movement, and management solutions becomes an issue as the amount of data collected from digital twin systems increases. In some settings, the digital twin will face computational resource constraints (e.g., as a result of power constraints); in such cases, low-power ML and data assimilation methods are required. Approaches based on subsampling data (i.e., only using a subset of the available data to update the digital twin's virtual models) necessitate statistical and ML methods that operate reliably and robustly with limited data. Foundational research on the sample complexity of ML methods as well as pretrained and foundational models that only require limited data for fine-tuning are essential to this endeavor. Additional approaches requiring further research and development include model compression, which facilitates the efficient evaluation of deployed models; dimensionality reduction (particularly in dynamic environments); and low-power hardware or firmware deployments of ML and data assimilation tools.
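As one illustration of subsampling a stream under memory constraints, the sketch below uses reservoir sampling. This is a standard technique chosen here for illustration; the report does not prescribe it. The function name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Memory use stays at k items no matter how long the stream runs.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical usage: retain 10 readings out of 1,000 streamed values.
subset = reservoir_sample(range(1000), 10)
```

A uniform subsample like this treats all readings as equally informative. A digital twin that must preserve rare events would likely need a weighted or stratified scheme instead, which this sketch does not attempt.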
In addition, when streaming data are being collected and assimilated continuously, models must be updated incrementally. Online and incremental learning methods play an important role here. A core challenge is setting the learning rate in these models. The learning rate controls to what extent the model retains its memory of past system states as opposed to adapting to new data. This rate as well as other model hyperparameters must be set and tuned on the fly, in contrast to the standard paradigm of offline tuning using holdout data from the same distribution as training data. Methods for adaptively setting a learning rate, so that it is low enough to provide robustness to noise and other data errors when the underlying state is slowly varying yet can be increased when the state changes sharply (e.g., in hybrid or switched dynamical systems), are a critical research challenge for digital twins. Finally, note that the data quality challenges outlined above are present in the large-scale streaming data setting as well, which makes adaptive model training particularly challenging in the presence of anomalies and outliers that may correspond to either sensor failures or salient rare events.
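The trade-off described above can be made concrete with a toy tracker, offered as a minimal sketch rather than a recommended method. It assumes a scalar state, a known noise scale, and a simple rule: an isolated large residual is treated as a possible outlier and clipped, while several consecutive large residuals are treated as a real state change and trigger a high learning rate. All names and default values are hypothetical.

```python
def adaptive_tracker(observations, low_rate=0.05, high_rate=0.8,
                     threshold=3.0, patience=3, noise_scale=1.0):
    """Track a scalar state with a learning rate that adapts on the fly."""
    estimate = observations[0]
    streak = 0  # number of consecutive large residuals seen so far
    estimates = []
    limit = threshold * noise_scale
    for y in observations:
        residual = y - estimate
        streak = streak + 1 if abs(residual) > limit else 0
        if streak >= patience:
            # Persistent mismatch: assume the state changed, adapt quickly.
            estimate += high_rate * residual
        else:
            # Slowly varying regime: small step, with the residual clipped
            # so that a single outlier cannot move the estimate far.
            clipped = max(-limit, min(limit, residual))
            estimate += low_rate * clipped
        estimates.append(estimate)
    return estimates


# Hypothetical stream: steady at 0, one spurious spike, then a jump to 10.
stream = [0.0] * 20 + [100.0] + [0.0] * 29 + [10.0] * 50
track = adaptive_tracker(stream)
```

In this toy setting the spike barely moves the estimate, while the jump is followed within a few samples. The rule cannot tell a short burst of sensor failures from a genuine rare event, which is exactly the difficulty the paragraph above identifies.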
Data Fusion and Synchronization
Digital twins can integrate data from different data streams, which provides a means to address missing data or data sparsity, but there are specific concerns regarding data synchronization (e.g., across scales) and data interoperability. For example, the heterogeneity of data sources (e.g., data from diverse sensor systems) can present challenges for data assimilation in digital twins. Specific challenges include the need to estimate the impact of missing data as well as the need to integrate data uncertainties and errors in future workflows. The integration of heterogeneous data requires macro to micro levels of statistical synthesis that span multiple levels, scales, and fidelities. Moreover, approaches must be able to handle mismatched digital representations. Recent efforts in the ML community on multiview learning and joint representation learning of data from disparate sources (e.g., learning a joint representation space for images and their text captions, facilitating the automatic captioning of new images) provide a collection of tools for building models based on disparate data sources.
For example, in tumor detection using magnetic resonance imaging (MRI), results depend on the radiologist identifying the tumor and measuring the linear diameter manually (which is susceptible to inter- and intra-observer variability). There are efforts to automate the detection, segmentation, and/or measurement of tumors (e.g., using AI and ML approaches), but these are still vulnerable to upstream variability in image acquisition (e.g., a very small 2 mm tumor may be detected on a high-quality MRI but may not be visible on a poorer quality machine). Assimilating serial tumor measurement data is a complex challenge due to patients being scanned in different scanners with different protocols over time.
The challenges of data fusion and synchronization are further exacerbated by disparate sampling rates, complete or partial duplication of records, and different data collection contexts, which may result in seemingly contradictory data. The degree to which data collection is done in real time (or near real time) is dependent on the intended purpose of the digital twin system as well as available resources. For example, an ambulatory care system has sporadic electronic health record data, while intensive care unit sensor data are acquired at a much faster sampling rate. Additionally, in some systems, data imputation to mitigate effects of missing data will also require the development of imputation models learned from data.
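To illustrate one way of synchronizing streams with disparate sampling rates, the sketch below aligns a sparse stream to the timestamps of a dense one using the most recent value, and marks the value as missing when it is too old. It assumes sorted timestamps in common units; the function name `align_asof` and the `max_age` parameter are hypothetical.

```python
from bisect import bisect_right


def align_asof(target_times, source_times, source_values, max_age):
    """For each target time, return the latest source value at or before it.

    Returns None where no source value exists within max_age, so that
    downstream steps can handle the gap explicitly (e.g., by imputation).
    """
    aligned = []
    for t in target_times:
        i = bisect_right(source_times, t) - 1  # index of latest source <= t
        if i >= 0 and t - source_times[i] <= max_age:
            aligned.append(source_values[i])
        else:
            aligned.append(None)
    return aligned


# Hypothetical usage: dense sensor clock versus two sporadic record entries.
dense_clock = [0, 1, 2, 3, 10]
record_times = [0.5, 2.5]
record_values = ["a", "b"]
aligned = align_asof(dense_clock, record_times, record_values, max_age=1.0)
# aligned == [None, "a", None, "b", None]
```

Leaving gaps as None rather than carrying stale values forward keeps the missing-data problem visible, so a learned imputation model can be applied as a separate, auditable step.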
Lack of standardization creates interoperability issues while integrating data from different sources.
Conclusion 4-1: The lack of adopted standards in data generation hinders the interoperability of data required for digital twins. Fundamental challenges include aggregating uncertainty across different data modalities and scales as well as addressing missing data. Strategies for data sharing and collaboration must address challenges such as data ownership and intellectual property issues while maintaining data security and privacy.
Challenges with Data Access and Collaboration
Digital twins are an inherently multidisciplinary and collaborative effort. Data from multiple stakeholders may be integrated and/or shared across communities. Strategies for data collaboration must address challenges such as data ownership, responsibility, and intellectual property issues prior to data usage and digital twin deployment.
Some of these challenges can be seen in Earth science research, which has been integrating data from multiple sources for decades. Since the late 1970s, Earth observing satellites have been taking measurements that provide a nearly simultaneous global estimate of the state of the Earth system. When combined through data assimilation with in situ measurements from a variety of platforms (e.g., surface stations, ships, aircraft, and balloons), they provide global initial conditions for a numerical model to produce forecasts and also provide a basis for development and improvement of models (Ackerman et al. 2019; Balsamo et al. 2018; Fu et al. 2019; Ghil et al. 1979). The combination of general circulation models of the atmosphere, coupled models of the ocean-atmosphere system, and Earth system models that include biogeochemical models of the carbon cycle together with global, synoptic observations and a data assimilation method represent a digital twin of the Earth system that can be used to make weather forecasts and simulate climate variability and change. Numerical weather prediction systems are also used to assess the relative value of different observing systems and individual observing stations (Gelaro and Zhu 2009).
KEY GAPS, NEEDS, AND OPPORTUNITIES
In Table 4-1, the committee highlights key gaps, needs, and opportunities for managing the physical counterpart of a digital twin. There are many gaps, needs, and opportunities associated with data management more broadly; here the committee focuses on those for which digital twins bring unique challenges. This is not meant to be an exhaustive list of all opportunities presented in the chapter. For the purposes of this report, prioritization of a gap is indicated by 1 or 2. While the committee believes all of the gaps listed are of high priority, gaps marked 1 may benefit from initial investment before moving on to gaps marked with a priority of 2.
TABLE 4-1 Key Gaps, Needs, and Opportunities for Managing the Physical Counterpart of a Digital Twin
Maturity | Priority
Early and Preliminary Stages |
Standards to facilitate interoperability of data and models for digital twins (e.g., by regulatory bodies) are lacking. |
There is a gap in the mathematical tools available for assessing data quality, determining appropriate utilization of all available information, understanding how data quality affects the performance of digital twin systems, and guiding the choice of an appropriate algorithm. | 2
REFERENCES
Ackerman, S.A., S. Platnick, P.K. Bhartia, B. Duncan, T. L'Ecuyer, A. Heidinger, G.J. Skofronick, N. Loeb, T. Schmit, and N. Smith. 2019. "Satellites See the World's Atmosphere." Meteorological Monographs 59(1):1-53.
Balsamo, G., A.A. Parareda, C. Albergel, C. Arduini, A. Beljaars, J. Bidlot, E. Blyth, et al. 2018. "Satellite and In Situ Observations for Advancing Global Earth Surface Modelling: A Review." Remote Sensing 10(12):2038.
Chung, C., and D. Jaffray. 2021. "Cancer Needs a Robust 'Metadata Supply Chain' to Realize the Promise of Artificial Intelligence." American Association for Cancer Research 81(23):5810-5812.
Fu, L.L., T. Lee, W.T. Liu, and R. Kwok. 2019. "50 Years of Satellite Remote Sensing of the Ocean." Meteorological Monographs 59(1):1-46.
Gelaro, R., and Y. Zhu. 2009. "Examination of Observation Impacts Derived from Observing System Experiments (OSEs) and Adjoint Models." Tellus A: Dynamic Meteorology and Oceanography 61(2):179-193.
Ghil, M., M. Halem, and R. Atlas. 1979. "Time-Continuous Assimilation of Remote-Sounding Data and Its Effect on Weather Forecasting." Monthly Weather Review 107(2):140-171.
VanDerHorn, E., and S. Mahadevan. 2021. "Digital Twin: Generalization, Characterization and Implementation." Decision Support Systems 145:113524.
5
Feedback Flow from Physical to Virtual: Foundational Research Needs and Opportunities
In the digital twin feedback flow from physical to virtual, inverse problem methodologies and data assimilation are required for combining physical observations and virtual models in a rigorous, systematic, and scalable way. This chapter addresses specific challenges for digital twins including calibration and updating on actionable time scales. These challenges represent foundational gaps in inverse problem and data assimilation theory, methodology, and computational approaches.
INVERSE PROBLEMS AND DIGITAL TWIN CALIBRATION
Digital twin calibration is the process of estimating numerical model parameters for individualized digital twin virtual representations. This task of estimating numerical model parameters and states that are not directly observable can be posed mathematically as an inverse problem, but the problem may be ill posed. Bayesian approaches can be used to incorporate expert knowledge that constrains solutions and predictions. It must be noted, however, that for some settings, specification of prior distributions can greatly impact the inferences that a digital twin is meant to provide, for better or for worse. Digital twins present specific challenges to Bayesian approaches, including the need for good priors that capture tails of distributions, the need to incorporate model errors and updates, and the need for robust and scalable methods under uncertainty and for high-consequence decisions. This presents a new class of open problems in the realm of inverse problems for large-scale complex systems.
Parameter Estimation and Regularization for Digital Twin Calibration
The process of estimating numerical model parameters from data is an ill-posed problem, whereby the solution may not exist, may not be unique, or may not depend continuously on the data. The first two conditions are related to identifiability of solutions. The third condition is related to the stability of the problem; in some cases, small errors in the data may result in large errors in the reconstructed parameters. Bayesian regularization, in which priors are encoded using probability distribution functions, can be used to handle missing information, ill-posedness, and uncertainty. A specific challenge for digital twins is that standard priors, such as those based on simple Gaussian assumptions, may not be informative and representative for making high-stakes decisions. Also, due to the continuous feedback loop, updated models need to be included on the fly (without restarting from scratch). Moreover, the prior for one problem may be taken as the posterior from a previous problem, so it is important to assign probabilities to data and priors in a rigorous way such that the posterior probability is consistent when using a Bayesian framework.
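The instability described above, and the stabilizing effect of a prior, can be seen in a deliberately tiny example. The sketch below assumes a two-parameter linear model with nearly redundant observations and a zero-mean Gaussian prior; the matrix, data values, prior weight, and function names are all hypothetical choices for illustration.

```python
def solve_2x2(m, b):
    """Solve the 2x2 linear system m x = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (b[0] * m[1][1] - m[0][1] * b[1]) / det
    x1 = (m[0][0] * b[1] - b[0] * m[1][0]) / det
    return [x0, x1]


def map_estimate(a, y, prior_weight):
    """Minimize |a x - y|^2 + prior_weight * |x|^2.

    With Gaussian noise and a zero-mean Gaussian prior this is the MAP
    estimate, where prior_weight is the ratio of noise to prior variance.
    """
    ata = [[sum(a[k][i] * a[k][j] for k in range(2))
            + (prior_weight if i == j else 0.0)
            for j in range(2)] for i in range(2)]
    aty = [sum(a[k][i] * y[k] for k in range(2)) for i in range(2)]
    return solve_2x2(ata, aty)


# Two nearly identical measurements of the parameter pair (1, 1).
A = [[1.0, 1.0], [1.0, 1.0001]]
y_clean = [2.0, 2.0001]
y_noisy = [2.0, 2.0002]  # second datum perturbed by 1e-4

naive_clean = solve_2x2(A, y_clean)           # close to [1, 1]
naive_noisy = solve_2x2(A, y_noisy)           # close to [0, 2]
reg_clean = map_estimate(A, y_clean, 1e-3)    # close to [1, 1]
reg_noisy = map_estimate(A, y_noisy, 1e-3)    # still close to [1, 1]
```

The regularized answer is stable only because the zero-mean Gaussian prior happens to suit this problem. As the text notes, such simple priors may be uninformative or misleading for high-stakes decisions, and choosing the prior weight is itself a tuning problem.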
Approaches to learn priors through existing data (e.g., machine learning-informed bias correction) can work well in data-rich environments but may not accurately represent or predict extreme events because of limited relevant training data. Bayesian formulations require priors for the unknown parameters, which may depend on expensive-to-tune hyperparameters. Data-driven regularization approaches that incorporate more realistic priors are necessary for digital twins.
Optimization of Numerical Model Parameters Under Uncertainty
Another key challenge is to perform optimization of numerical model parameters (and any additional hyperparameters) under uncertainty; any computational model must be calibrated to meet its requirements and be fit for purpose. In general, optimization under uncertainty is challenging because the cost functions are stochastic and must be able to incorporate different types of uncertainty and missing information. Bayesian optimization and stochastic optimization approaches (e.g., online learning) can be used, and some fundamental challenges, such as obtaining sensitivity information from legacy code with missing adjoints, are discussed in Chapter 6.
These challenges are compounded for digital twin model calibration, especially when models are needed at multiple resolutions. Methods are needed for fast sampling of parametric and structural uncertainty. For digital twins to support high-consequence decisions, methods may need to be tuned to risk and extreme events, accounting for worst-case scenarios. Risk-adaptive loss functions and data-informed prior distribution functions for capturing extreme events and for incorporating risk during inversion merit further exploration. Non-differentiability also becomes a significant concern, as mathematical models may demonstrate discontinuous behavior or numerical artifacts may result in models that appear non-differentiable. Moreover, models may even be chaotic, which can be intractable for adjoint and tangent linear models. Standard loss functions, such as the least-squares loss, are not able to model chaotic behavior in the data (Royset 2023) and are not able to represent complex statistical distributions of model errors that arise from issues such as using a reduced or low-fidelity digital-forward model. Robust and stable optimization techniques (beyond gradient-based methods) are needed to handle new loss functions and to address high displacements (e.g., the upper tail of a distribution) that are not captured using only the mean and standard deviation.
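One way to see why the mean alone can hide tail behavior is to compare it with a tail-focused risk measure. The sketch below computes a simplified empirical superquantile (also known as conditional value-at-risk), the average of the worst fraction of losses. It is offered as an example of a risk-aware summary, not as the specific loss function the report has in mind; the function name and data are hypothetical.

```python
import math


def superquantile(losses, alpha):
    """Average of the worst (1 - alpha) fraction of losses.

    A simplified empirical estimator that always averages at least one
    sample; alpha = 0 recovers the ordinary mean.
    """
    ordered = sorted(losses)
    n = len(ordered)
    # Rounding guards against floating-point noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    tail = ordered[-k:]
    return sum(tail) / len(tail)


# Two hypothetical calibrations with the same mean loss of 10.
steady = [10.0] * 10
spiky = [0.0] * 9 + [100.0]

steady_risk = superquantile(steady, 0.9)  # 10.0
spiky_risk = superquantile(spiky, 0.9)    # 100.0
```

A mean-based criterion cannot distinguish these two calibrations, while the tail average separates them clearly. Using such a measure inside an optimizer introduces the non-smoothness issues raised above, which this sketch does not address.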
DATA ASSIMILATION AND DIGITAL TWIN UPDATING
Data assimilation tools have been used heavily in numerical weather forecasting, and they can be critical for digital twins broadly, including to improve model states based on current observations. Still, there is more to be exploited in the bidirectional feedback flow between physical and virtual beyond standard data assimilation (Blair 2021).
First, existing data assimilation methods rely heavily on assumptions of high-fidelity models. However, due to the continual and dynamic nature of digital twins, the validity of a model's assumptions (and thus the model's fidelity) may evolve over time, especially as the physical counterpart undergoes significant shifts in condition and properties. A second challenge is the need to perform uncertainty quantification for high-consequence decisions on actionable time scales. This becomes particularly challenging for large-scale complex systems with high-dimensional parameter and state spaces. Direct simulations and inversions (e.g., in the case of variational methods) needed for data assimilation are no longer feasible. Third, with different digital technologies providing data at unprecedented rates, there are few mechanisms for integrating artificial intelligence, machine learning, and data science tools for updating digital twins.
Digital Twin Demands for Continual Updates
Digital twins require continual feedback from the physical to virtual, often using partial and noisy observations. Updates to the twin should be incorporated in a timely way (oftentimes immediately), so that the updated digital twin may be used for further forecasting, prediction, and guidance on where to obtain new data. These updates may be initiated when something in the physical counterpart evolves or in response to changes in the virtual representation, such as improved model parameters, a higher-fidelity model that incorporates new physical understanding, or improvements in scale/resolution. Due to the continual nature of digital twins as well as the presence of errors and noise in the models, the observations, and the initial conditions, sequential data assimilation approaches (e.g., particle-based approaches and ensemble Kalman filters) are the natural choice for state and parameter estimation. However, these probabilistic approaches have some disadvantages compared to variational approaches, such as sampling errors, rank deficiency, and inconsistent assimilation of asynchronous observations.
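For readers unfamiliar with sequential assimilation, the sketch below shows one forecast and one analysis step of a stochastic (perturbed-observation) ensemble Kalman filter. It is a minimal sketch under strong assumptions: a scalar state that is observed directly, and Gaussian noise. The function names, model, and numerical values are hypothetical.

```python
import random
import statistics


def enkf_forecast(ensemble, step, process_std, rng):
    """Advance each member with the model and add process noise."""
    return [step(x) + rng.gauss(0.0, process_std) for x in ensemble]


def enkf_analysis(ensemble, observation, obs_std, rng):
    """Update each member toward a perturbed copy of the observation."""
    prior_var = statistics.variance(ensemble)  # sample estimate of spread
    gain = prior_var / (prior_var + obs_std ** 2)  # scalar Kalman gain
    return [x + gain * (observation + rng.gauss(0.0, obs_std) - x)
            for x in ensemble]


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
prior = [rng.gauss(0.0, 2.0) for _ in range(200)]

# Assimilate one observation, then forecast with a toy decay model.
posterior = enkf_analysis(prior, observation=3.0, obs_std=0.5, rng=rng)
forecast = enkf_forecast(posterior, lambda x: 0.9 * x, 0.1, rng)
```

Because the gain is computed from a finite ensemble, it carries the sampling error mentioned above. In high-dimensional systems the ensemble covariance is also rank deficient, which this scalar example cannot show.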
Data assimilation techniques need to be adapted for continuous streams of data from different sources and need to interface with numerical models with potentially varying levels of uncertainty. These methods need to be able to infer system state under uncertainty when a system is evolving and be able to integrate model updates efficiently. Moreover, navigating discrepancies between predictions and observed data requires the deve