| RAS Feature | Description | Standard RAS SKU | Advanced RAS SKU |
|---|---|---|---|
| Error Reporting (MCA, AER) - core, uncore, and IIO | Error reporting includes error logging and signaling within the core, uncore, and IIO sub-systems. It covers the following domains of error reporting: (1) Machine Check Architecture (MCA); (2) PCIe Advanced Error Reporting (AER) and additional IIO error reporting through the Integrated Error Handler (IEH); (3) platform-specific Intel UPI error reporting; (4) platform-specific Integrated Voltage Regulator (IVR) error reporting; (5) platform-specific memory error reporting. | Yes | Yes |
| Memory Corrected Error Reporting | Provides per-rank corrected error counters with leaky bucket, which can trigger SMI, NMI, or ERROR_N[0] error signaling for platform use only. Firmware can use it to invoke various memory device sparing and mirroring features. | Yes | Yes |
| Error Reporting via IOMCA | IOMCA extends the 'Legacy IA-32 MCA' error reporting to the IIO sub-system. The processor adds a dedicated Machine Check Bank within the UBOX for reporting IIO uncorrected errors. | Yes | Yes |
| First Corrected Error (FCERR) Mode of Reporting | FCERR latches the error log information specific to the first corrected error and prevents subsequent errors from overwriting the error logging registers. | Yes | Yes |
| Error Reporting through MCA 2.0 (EMCA Gen2) | EMCA Gen2 is an enhancement to the 'Legacy IA-32 MCA' that supports the Firmware First Model (FFM) of error reporting. All detected errors are signaled via System Management Interrupt (SMI) first. The SMM handler is allowed to signal the error to SW/OS as per the platform RAS policies. (Adv. RAS SKU: the SMM handler can set bits to 1 in MCA banks.) | Yes | Yes |
| PCIe Corrected Error Reporting (error counters and leaky-bucket) | PCIe corrected error counter and threshold setting. Also supports leaky-bucket logic to periodically deplete the count. | Yes | Yes |
| Thresholding for Corrected Errors (Uncore MCA banks) | Threshold support for CSMI generation from all uncore MCA banks. It allows signaling corrected error events once a threshold is reached. | Yes | Yes |
| MCA Bank Error Control (aka 'cloaking') | Gives UEFI FW and PECI visibility into corrected and UCNA errors, and masks signaling of corrected and UCNA errors to OS/SW. (OEM-specific application.) | Yes | Yes |
| CSR Error Log Cloaking | DEVHIDE enables UEFI FW to fully manage platform error logs and prevents a non-UEFI FW or PECI agent from accessing configuration CSRs. | Yes | Yes |
| Enhanced SMM (ESMM) | Enhancements to the existing SMM mode used for platform-specific error reporting. Key attributes of this feature are: (1) thread in Long Flow/Blocked indicators; (2) targeted SMI (not supported); (3) execution outside of SMRR region detection; (4) SMM dump state storage into internal MSRs; (5) spurious SMI handling; (6) 32-bit protected mode SMM entry. | Yes | Yes |

NOTES:

  1. RAS features may not be supported on all SKUs of a processor type.
  2. A two-socket workstation SKU supports a standard RAS SKU.
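The feature table mentions corrected error counters with leaky-bucket logic for both memory ranks and PCIe. The following is a minimal software sketch of that idea, not the hardware implementation: the class name, the `threshold` and `leak_interval` parameters, and the one-count-per-interval drain rate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucketCounter:
    """Hypothetical model of a corrected-error counter with a leaky bucket."""
    threshold: int        # assumed: count at which an event would be signaled
    leak_interval: int    # assumed: time units between one-count decrements
    count: int = 0
    elapsed: int = 0

    def record_error(self) -> bool:
        """Count one corrected error; return True once the threshold is reached."""
        self.count += 1
        return self.count >= self.threshold

    def tick(self, units: int = 1) -> None:
        """Advance time and drain one count per full leak interval."""
        self.elapsed += units
        leaked, self.elapsed = divmod(self.elapsed, self.leak_interval)
        self.count = max(0, self.count - leaked)


bucket = LeakyBucketCounter(threshold=3, leak_interval=10)
bucket.record_error()
bucket.record_error()
bucket.tick(10)              # one interval passes, so one count leaks away
print(bucket.count)
```

The periodic drain means that only a sustained error rate reaches the threshold; isolated corrected errors spread over a long time never trigger signaling.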

11.2.1 Error Detection and Correction

The processor implements extensive error detection and correction capability within various internal modules to maintain data integrity and meet the target level of processor reliability.
This feature covers fault detection and correction across the entire processor. It offers data protection and data integrity via error detection within the core and uncore. It includes enhanced cache error reporting, Data Path Parity Protection (DPPP), and Address Path Parity Protection (APPP) within the processor interconnect hierarchy.
Detected errors are the errors that have been detected and reported by the error handling logic within the processor. Detected errors can be corrected through the embedded ECC in the IMC module or through a retry transaction. An error that cannot be corrected is called an Uncorrected Error (UC error).
When the processor is configured in the legacy IA-32 MCA mode, UC errors are reported as fatal and lead to a Machine Check Exception (MCE), resulting in a system reset. When the processor is configured in the corrupt data containment (poison or poison viral) mode, certain types of UC data errors result in a hardware poison of the data and allow the software (OS/VMM) to manage the error condition (recover or crash the system). Such errors are further classified as UCNA, SRAO, or SRAR, as described next and shown in Figure 27 on page 226.
  • Uncorrected No Action (UCNA) - Uncorrectable data is logged in the MCA bank with a unique signature. The error containment bit (also known as the poison bit) is attached to the bad data and forwarded to the requesting agent. Such an error classification can trigger a CSMI in the eMCA Gen2 mode.
  • Software Recoverable Action Optional (SRAO) - There is no SRAO error type support on the platform. SRAO type events are reverted to UCNA. The OS/software action remains the same for error types that used to signal SRAO and now signal UCNA.
  • Software Recoverable Action Required (SRAR) - Bad data or a bad instruction in the core execution path can result in an SRAR error in the MCA bank status with a unique signature. Examples are Data Cache Unit (DCU) load/store and Instruction Fetch Unit (IFU) load operations on poisoned data. Such an error classification can trigger an MSMI in eMCA Gen2.
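The classification above can be sketched as a decode of an IA32_MCi_STATUS value. The signature values did not survive in this section, so this sketch assumes the architectural bit positions and UCR encodings from the SDM (VAL = bit 63, UC = 61, PCC = 57, S = 56, AR = 55); the function name is hypothetical, and the EN and OVER bits are ignored for brevity.

```python
# Bit positions assumed from the SDM definition of IA32_MCi_STATUS.
VAL = 1 << 63   # register contains a valid error
UC = 1 << 61    # error was not corrected
PCC = 1 << 57   # processor context corrupt
S = 1 << 56     # signaled via machine check exception
AR = 1 << 55    # software recovery action required


def classify_mc_status(status: int) -> str:
    """Return the architectural error class for one MCi_STATUS value."""
    if not status & VAL:
        return "INVALID"
    if not status & UC:
        return "CORRECTED"
    if status & PCC:
        return "FATAL"
    signaled, action_required = bool(status & S), bool(status & AR)
    if signaled and action_required:
        return "SRAR"
    if signaled:
        return "SRAO"   # per the text above, this platform reports these as UCNA
    if action_required:
        return "UNDEFINED"   # S=0 with AR=1 is not a defined combination
    return "UCNA"


print(classify_mc_status(VAL | UC | S | AR))
```

Because the platform reverts SRAO events to UCNA, a handler written against this decode would not expect to see the SRAO result on this processor generation.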
Figure 27. Error Classification
  • Machine check architecture: MCA
  • Advanced error reporting (excludes UCR): AER
  • Detected but uncorrectable error: DUE
  • Uncorrected recoverable: UCR
  • Uncorrected no action required (IA32_MCi_STATUS): UCNA
  • Software recoverable action optional (IA32_MCi_STATUS; not supported in this generation of processors): SRAO
  • Software recoverable action required (IA32_MCi_STATUS): SRAR
The following table summarizes error detection and correction coverage within the processor.
Table 74. Error Detection and Correction Coverage

| Module | Definition/Sub-Module | Detection/Correction | Error Reporting (note 1) |
|---|---|---|---|
| EE | Execution Engine (Integer) | Detection: Residue check. Correction: Instruction retry on error | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| IFU | Instruction fetch unit (FLC-cache) | Detection: Parity. Correction: Instruction retry on error | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| DCU | Data cache unit (FLC-cache) | Detection: Parity. Correction: If cache is in non-M state | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| I/DTLB | Instruction/data translation look-aside buffer | Detection: Parity | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| MLC | Mid-Level Cache | Detection + Correction: ECC | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| CHA | LLC cache: data, tag, MESIF state | Detection + Correction: ECC (DECTED) | Logging: MCA. Signaling: MCERR, CSMI/MSMI, CMCI |
| B2CMI | Bridge-to-common memory interface | Detection: Parity/UC-poison | Logging: MCA, CSRs, bank shadow. Signaling: CMCI/CSMI/MSMI/MCERR |
| Punit | Power controller unit | Detection: Parity, stack overflow, timeout schemes | Logging: -. Signaling: IERR, MSMI |
| IVR | Integrated voltage regulators | Detection: Over voltage and over current | Logging: MCA, additional CSRs. Signaling: IERR, FIVR_FAULT (core/uncore), MON_FAIL_N |
| Intel UPI | Ultra Path Interconnect - physical and link layer | Detection + Retry: CRC | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| Intel UPI | RX and TX queues | Detection: Parity | Logging: MCA. Signaling: CMCI/CSMI/MSMI/MCERR |
| IMC | Write data buffer parity | Detection: Parity | Logging: MCA and bank shadow. Signaling: SMI/CSMI/MSMI, ERROR_N[0]/CMCI |
| IMC | Memory read; write data byte enable | Detection + Correction: ECC | Logging: MCA and bank shadow. Signaling: SMI/CSMI/MSMI, ERROR_N[0]/CMCI |
| IMC | DDR link errors | Correctable or fatal | Logging: MCA. Signaling: CMCI/CSMI/MCE |
| IIO/PCI Express* | Integrated I/O: PHY and link layer; write data caches | Detection + Retry: CRC. Detection + Correction: ECC | Logging: IIO AER CSRs/IOMCA. Signaling: NMI, SMI, ERROR_N[n], MSI, and MSMI |
| IIO/PCI Express* | Queues | Detection: Parity | Logging: IIO AER CSRs/IOMCA. Signaling: NMI, SMI, ERROR_N[n], MSI, and MSMI |
| IIO/PCI Express* | IIO, IRP, Intel , DMA errors | Detection + Correction | Logging: IIO AER CSRs/IOMCA. Signaling: NMI, SMI, ERROR_N[n], MSI, and MSMI |
| Internal Mesh | Internal Mesh - Data and Command (DPPP, APPP (note 6)) | Detection: Parity | Logging: MCA. Signaling: MCERR, SMI |

NOTES:

  1. Assumes the legacy IA-32 MCA mode. Error reporting includes both error logging and signaling.
  2. Defined as advanced error detection and correction (AEDC).
  3. Logs and signals memory data read errors. Shadow registers are for UEFI-FW use.
  4. The PCU logs the corrected error for platform debug purposes, but does not signal CMCI.
  5. Logs and signals normal data read and patrol scrub data read errors.
  6. Address Path Parity Protection (APPP).
Error detection, correction, and reporting within the execution engine are described in Advanced Error Detection and Correction - AEDC on page 228.

11.2.2 Advanced Error Detection and Correction - AEDC

AEDC detects faults within the execution engine (arrays and logic) using "residue checking" and parity protection techniques. A fault is corrected by "instruction retry" and reported as a Corrected Error (CE). AEDC relies on the existing error reporting architecture. A CE event is logged in the IFU MCA bank. If a retry does not correct the fault, then a fatal MCERR is reported via the IFU MCA bank.
AEDC is self-contained within the CPU and does not require any additional SW/OS support. OS/UEFI-FW is required to enable logging and signaling during the system initialization phase. Refer to Machine Check Architecture-Based Error Reporting on page 229 for enabling machine-check-architecture-based error logging and signaling.
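Residue checking followed by retry can be illustrated in software. The sketch below is only an analogy for the hardware technique, under stated assumptions: the modulus 3, the function names, and the simulated transient fault are all hypothetical and do not describe the processor's actual check logic.

```python
MODULUS = 3  # hypothetical check modulus chosen for this sketch


def checked_add(a, b, adder, max_retries=1):
    """Add a and b with `adder`, verify the result by residue, retry on mismatch."""
    # The residue of a correct sum equals the sum of the operand residues.
    expected = (a % MODULUS + b % MODULUS) % MODULUS
    for attempt in range(max_retries + 1):
        result = adder(a, b)
        if result % MODULUS == expected:
            return result, ("ok" if attempt == 0 else "corrected")
    raise RuntimeError("residue mismatch persists after retry (uncorrected)")


def make_flaky_adder():
    """Return an adder whose first call flips bit 0 of the sum (a transient fault)."""
    calls = {"count": 0}

    def adder(a, b):
        calls["count"] += 1
        total = a + b
        return total ^ 1 if calls["count"] == 1 else total

    return adder


result, status = checked_add(20, 22, make_flaky_adder())
print(result, status)
```

A mod-3 residue catches every single-bit flip, because a power of two is never a multiple of 3, but it misses any error whose magnitude is a multiple of 3. The retry path mirrors the text: a transient fault becomes a corrected error, and a persistent one is escalated.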

11.2.3 Error Reporting (MCA, AER) - Core, Uncore and IIO

The processor implements error reporting primarily through Machine Check Architecture (MCA) and Advanced Error Reporting (AER). Error reporting includes logging and signaling. This feature covers the following domains of error reporting:
  1. MCA-based Error Reporting: A mechanism to capture and log the first fault in case of an uncorrected error, and the first or last fault in case of a corrected error. This includes all modules within the processor - core, uncore, and IIO (via IOMCA). See Machine Check Architecture-Based Error Reporting on page 229 for further details.
  2. Integrated Error Handler (IEH) 2.0-based Error Reporting: The processor implements a unified and hierarchical error handler called the Integrated Error Handler (IEH). It incorporates the global IEH located in the UBOX and satellite IEHs located in the various IIO sub-modules, for example, PCI Express root ports, traffic switch, IRP, IIO core, Intel VT-d, and DSA. As per the PCI Express Specification, it includes an optional extended capability called PCIe Advanced Error Reporting (PCIe AER) that provides more robust error reporting than the standard PCI Express error reporting mechanism. See PCI Express Error Reporting on page 247 for further details. IEH Rev. 2.0 brings error reporting for the IIO stack's internal functional units into the IEH domain.
  3. Intel UPI Error Reporting: Intel UPI error logging and signaling capability. See Intel UPI link Sub system RAS features.
  4. Integrated Voltage Regulator (IVR) Error Reporting: IVR error logging and signaling capability. See Integrated Voltage Regulator (IVR) Error Reporting on page 239 for further details.
  5. Memory Corrected Error Reporting: Memory corrected error logging and signaling. See Memory Corrected Error Reporting on page 240 for further details.
Error reporting consists of two primary functions:
  1. Error logging
  2. Error signaling
Error logging is implemented within the core and uncore, including the IIO modules, using MCA banks and proprietary CSRs. Proactive signaling of an error event is a function of the error severity and the mode of operation (firmware first, BMC, and/or the OS).
Table 75. Error Severity and Reporting Methodology

| Error Type Classification | Reporting Domain | Logging | Signaling |
|---|---|---|---|
| Corrected | MCA | Various MCA bank registers | CMCI |
| Corrected | AER (Severity 0) | Various PCIe error logging registers. See PCI Express Error Reporting on page 247 for more details. | MSI, SMI, or ERROR_N[0] pin |
| Corrected | Memory ECC corrected error counters | CORRERRCNT (8 x 32 bit, covering eight ranks per channel) | SMI/CMCI/ERROR_N[0] pin |
| Uncorrected, Recoverable or Non-Fatal | MCA | MCA bank registers | CMCI, MCERR |
| Uncorrected, Recoverable or Non-Fatal | AER (Severity 1) | Various IIO error logging registers. See PCI Express Error Reporting on page 247 for more details. | MSI, SMI, NMI, ERROR_N[1] pin |
| Uncorrected, Fatal | MCA | MCA bank registers | MCERR |
| Uncorrected, Fatal | AER (Severity 2) | Various IIO error logging registers. See PCI Express Error Reporting on page 247 for more details. | MSI, SMI, NMI, ERROR_N[2] pin |
| Catastrophic | MCA | MCA bank registers | IERR |

NOTES:

  1. Assumes the legacy IA-32 MCA mode of signaling.
  2. In addition to MCERR signaling, the processor asserts the CATERR_N pin low for 16 BCLKs.
  3. In addition to IERR signaling, the processor asserts the CATERR_N pin persistently low until reset.
Platform level design note: In case of fatal or catastrophic faults, it may not be possible for UEFI-FW/OS/SW to collect all the error logs immediately after the fault is detected and signaled. In such a scenario, a platform reset (warm reset) may be required. Although all the MCA banks are sticky across a warm reset, there may be cases where a warm reset does not complete successfully. See Surprise Reset for further details.
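Notes 2 and 3 distinguish MCERR from IERR by how long CATERR_N is held low. A platform agent observing the pin could apply that rule roughly as sketched below. The function name and its inputs are hypothetical, and the exact-match comparison is a simplification: a real receiver would need some tolerance when measuring the pulse.

```python
from typing import Optional

MCERR_PULSE_BCLKS = 16  # from note 2: CATERR_N is asserted low for 16 BCLKs


def classify_caterr(low_bclks: Optional[int], held_until_reset: bool) -> str:
    """Classify a CATERR_N assertion observed by a hypothetical platform agent."""
    if held_until_reset:
        return "IERR"      # note 3: persistently low until reset
    if low_bclks == MCERR_PULSE_BCLKS:
        return "MCERR"     # note 2: fixed-length pulse
    return "UNKNOWN"


print(classify_caterr(16, held_until_reset=False))
print(classify_caterr(None, held_until_reset=True))
```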

11.2.4 Machine Check Architecture-Based Error Reporting

Machine check architecture is a primary mechanism for reporting errors to the operating system/software. It is described in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2 (referred to as SDM), document number 671427, Chapter 15, Machine Check Architecture.
The following table summarizes all the MCA configuration registers. This table also lists various other UBOX and PCU registers that the UEFI-FW may use to configure the error signaling behavior when a machine-check event is detected.
Table 76. Machine Check Architecture Based Error Reporting Configuration Registers

| Scope | Register | Description |
|---|---|---|
| MCA Configuration Registers | | |
| Core | CR4[6] | Machine Check Enable (MCE). See SDM, Volume 3 for more details (search for CR4.MCE). |
| Bank | IA32_MCi_CTL (MSR 0x400 + BANK_NUMBER*4) | Configures the signaling of all MCA-based errors. The CPU default is to disable all MCA error signaling, and OS/UEFI-FW is required to enable it by programming all 1's. See SDM, Volume 3, Section 15.3.2.1, IA32_MCi_CTL MSRs, for more details. |
| Bank | IA32_MCi_CTL2 (MSR indexed by BANK_NUMBER) | Configures CMCI enable/disable, the corrected error threshold, and morphing of CMCI/MCERR to CSMI/MSMI respectively (applicable only when EMCA Gen2 is used). See SDM, Volume 3, Section 15.3.2.5, IA32_MCi_CTL2 MSRs, for more details. |
| Global | PCU_MC_CTL | Configures the programmable counter value that the CPU uses to interpret an incoming CATERR_N pin signal as an IERR or MCERR. |
| PCU Configuration Register | | |
| Global | VIRAL_CONTRO_CFG | Controls how the PCU responds to viral and EMCA Gen2 signaling. Note: This register must be programmed for EMCA Gen2 mode even if viral is not enabled. |
| Intel UPI Configuration Register | | |
| Intel UPI links | KTIERRDIS0 | Intel UPI Error Disable. Allows UEFI-FW to disable error reporting (logging and signaling) for correctable error cases within the Intel UPI machine check banks. |
| Intel UPI links | KTIERRDIS1 | Intel UPI Error Disable. Allows UEFI-FW to disable error reporting (logging and signaling) for uncorrectable error cases within the Intel UPI machine check banks. |
| Intel UPI links | KTICSMITHRES | Repeated for each Intel UPI link. Used to set the threshold and to enable/disable CSMI once the threshold is reached. |
| Intel UPI links | KTICERRLOGCTRL | Repeated for each Intel UPI link. Used to enable 'MCA Bank Error Control' (aka cloaking) for Intel UPI corrected errors. Refer to Table 88 on page 250 for further details. |
The following table summarizes all the MCA-based error logging registers. This table also lists various other UBOX and PCU error logging registers that the UEFI-FW may use for error reporting and debugging purposes.
Table 77. Machine Check Architecture Based Error Log Registers

| Scope | Register | Description |
|---|---|---|
| MCA Capability and Logging Registers | | |
| Global | IA32_MCG_CAP (MSR 0x179) | Described in Section 15.3.1.1, IA32_MCG_CAP MSR, of the SDM [1a]. Also documented in Volume 3, Table 35-2, Architectural MSRs. Describes the machine check capabilities of the system, including the number of machine check banks available. Only needs to be read once per system. |
| Global | IA32_MCG_STATUS (MSR 0x17A) | Described in Section 15.3.1.2, IA32_MCG_STATUS MSR, of the SDM. Also documented in Volume 3, Table 35-2, Architectural MSRs. Describes a machine check that just occurred, including whether an exception was signaled, whether a machine check is in progress, whether the instruction pointer is valid, and whether program execution can be restarted reliably. |
| Per Thread | IA32_MCi_STATUS (MSR 0x401 + BANK_NUMBER*4) | See SDM, Volume 3, Section 15.3.2.2, IA32_MCi_STATUS MSRs, for more details. |
| Per Core | IA32_MCi_ADDR (MSR 0x402 + BANK_NUMBER*4) | See SDM, Volume 3, Section 15.3.2.3, IA32_MCi_ADDR MSRs, for more details. |
| Per Package | IA32_MCi_MISC (MSR 0x403 + BANK_NUMBER*4) | See SDM, Volume 3, Section 15.3.2.4, IA32_MCi_MISC MSRs, for more details. |
| Error Logging Registers in PCU | | |
| Global | MCA_ERR_SRC_LOG | Indicates internal/external CATERR/RMCA, IERR, MCERR, and MSMI. Error handling FW/SW should check this register first to locate the source socket(s). |
| Global | PCU_FIRST_IERR_TSC_LO | Logs the time stamp at which the first IERR from the PCU is triggered. |
| Global | PCU_FIRST_IERR_TSC_HI | Logs the time stamp at which the first IERR from the PCU is triggered. |
| Global | PCU_FIRST_MCERR_TSC_LO | Logs the time stamp at which the first MCERR from the PCU is triggered. |
| Global | PCU_FIRST_MCERR_TSC_HI | Logs the time stamp at which the first MCERR from the PCU is triggered. |
| Global | CORE_FIVR_ERR_LOG | Reports core FIVR faults. |
| Global | UNCORE_FIVR_ERR_LOG | Reports uncore FIVR faults. |
| Intel UPI Error Logging Registers | | |
| Intel UPI links | BIOS_KTI_ERR_ST | Repeated for each UPI link (for example, link 0 is device 14). Logs the same information as the UPI MCi_STATUS MSRs. The difference is that it logs the error even when the error is disabled using the KTIERRDIS0 and KTIERRDIS1 registers. |
| Intel UPI links | BIOS_KTI_ERR_MISC | Repeated for each UPI link (for example, link 0 is device 14). Logs the same information as the UPI MCi_MISC MSRs. The difference is that it logs the error even when the error is disabled using the KTIERRDIS0 and KTIERRDIS1 registers. |
| Intel UPI links | BIOS_KTI_ERR_AD | Repeated for each UPI link (for example, link 0 is device 14). Logs the same information as the UPI MCi_ADDR MSRs. The difference is that it logs the error even when the error is disabled using the KTIERRDIS0 and KTIERRDIS1 registers. |

NOTE

The machine check bank numbers supported vary based on processor type and processor SKU.
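The per-bank MSR addresses in Tables 76 and 77 follow a fixed stride, which can be captured in a small helper. This is a sketch with hypothetical function names. The 0x280 base used for IA32_MCi_CTL2 and the bit positions decoded from IA32_MCG_CAP are taken from the SDM rather than from the tables here, so treat them as assumptions in this context.

```python
IA32_MCG_CAP = 0x179
IA32_MCG_STATUS = 0x17A


def bank_msrs(bank: int) -> dict:
    """Return the MSR addresses for one machine check bank."""
    base = 0x400 + 4 * bank
    return {
        "CTL": base,            # IA32_MCi_CTL
        "STATUS": base + 1,     # IA32_MCi_STATUS
        "ADDR": base + 2,       # IA32_MCi_ADDR
        "MISC": base + 3,       # IA32_MCi_MISC
        "CTL2": 0x280 + bank,   # IA32_MCi_CTL2 (base assumed from the SDM)
    }


def decode_mcg_cap(value: int) -> dict:
    """Decode the IA32_MCG_CAP fields referenced in this section."""
    return {
        "bank_count": value & 0xFF,        # bits [7:0]
        "cmci_p": bool(value & (1 << 10)),  # CMCI supported
        "tes_p": bool(value & (1 << 11)),   # threshold-based error status supported
    }


print({name: hex(addr) for name, addr in bank_msrs(9).items()})
print(decode_mcg_cap(0x0C14))
```

Since the supported bank numbers vary by processor type and SKU, code of this kind would read the bank count from IA32_MCG_CAP instead of hard-coding it.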

11.2.4.1 Enhanced Cache Error Reporting

The processor supports threshold-based error reporting, and IA32_MCG_CAP[11] (MCG_TES_P) is set to one. The processor contains hardware that tracks the operating status of certain caches and provides an indicator of their health. The hardware reports a "green" status when the number of lines that incur repeated corrections is at or below a pre-defined threshold, and a "yellow" status when the number of affected lines exceeds the threshold. The yellow status means that the cache reporting the event is operating correctly; however, a system service is required to mitigate the issue. The processor implements TES in the LLC and reports it via the IA32_MCi_STATUS[54:53] bits within the CHA MCA banks (MC9, MC10, MC11). The system/platform response to a yellow event should be less severe than its response to an uncorrected error. An uncorrected error means that a serious error has actually occurred, whereas the yellow condition is a warning that the number of affected lines has exceeded the threshold. It is not, in itself, a serious event: the error was corrected and the system state was not compromised.
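Reading the green/yellow indicator from IA32_MCi_STATUS[54:53] can be sketched as follows. The encoding of the two-bit field (00 = not tracked, 01 = green, 10 = yellow, 11 = reserved) and the condition that it applies only to valid, corrected entries are taken from the SDM and are assumptions relative to this section; the function name is hypothetical.

```python
VAL = 1 << 63  # status register holds a valid error
UC = 1 << 61   # error was uncorrected

TES_STATES = {0b00: "not tracked", 0b01: "green", 0b10: "yellow", 0b11: "reserved"}


def cache_health(status: int, tes_supported: bool):
    """Return the threshold-based status of a corrected error, or None if not applicable."""
    if not tes_supported:          # IA32_MCG_CAP[11] (MCG_TES_P) must be set
        return None
    if not status & VAL or status & UC:
        return None                # field applies to valid, corrected errors only
    return TES_STATES[(status >> 53) & 0b11]


print(cache_health(VAL | (0b10 << 53), tes_supported=True))
```

A platform policy could use a yellow result to schedule a service action, in line with the guidance that the response should be less severe than for an uncorrected error.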

11.2.4.2 Machine Check Architecture Based Corrected Error Signaling

Corrected errors are signaled via the Corrected Machine Check Error Interrupt (CMCI). CMCI is an architectural enhancement to the machine check architecture. It provides capabilities beyond those of threshold-based error reporting. With threshold-based error reporting alone, software is limited to periodic polling to query the status of hardware-corrected MC errors. CMCI provides a signaling mechanism to deliver a local interrupt based on threshold values that software can program using the IA32_MCi_CTL2 MSRs.
CMCI is disabled by default. If IA32_MCG_CAP[10] = 1, the system software is required to enable CMCI for each IA32_MCi bank that supports the reporting of hardware-corrected errors. The IA32_MCi_CTL2 MSR is used to enable/disable the CMCI capability for each bank and to program the threshold values. CMCI is not affected by the CR4.MCE bit, and it is not affected by the IA32_MCi_CTL MSRs.
To detect the existence of thresholding for a given bank, the software writes bits [14:0] of the IA32_MCi_CTL2 register with the threshold value. If the bits persist, then thresholding is available (and CMCI is available). If the bits read back as all zeros, then no thresholding exists. To detect that CMCI signaling exists, the software sets bit 30 of the IA32_MCi_CTL2 register to one. Upon a subsequent read, if bit 30 is equal to zero, then no CMCI is available for this bank. If bit 30 is equal to one, then CMCI is available and enabled.
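The write-then-read-back probe described above can be sketched against a simulated register. `FakeCtl2` is a hypothetical stand-in for one IA32_MCi_CTL2 MSR; on real hardware the reads and writes would be privileged MSR accesses, which this sketch does not attempt.

```python
THRESHOLD_MASK = 0x7FFF   # bits [14:0]: corrected error count threshold
CMCI_EN = 1 << 30         # bit 30: CMCI enable


class FakeCtl2:
    """Simulated IA32_MCi_CTL2; only the bits in `writable` persist after a write."""

    def __init__(self, writable: int):
        self.writable = writable
        self.value = 0

    def write(self, value: int) -> None:
        self.value = value & self.writable

    def read(self) -> int:
        return self.value


def probe_bank(reg, threshold: int = 1):
    """Return (has_thresholding, has_cmci) for one bank, leaving CMCI enabled if present."""
    reg.write(threshold & THRESHOLD_MASK)
    has_thresholding = (reg.read() & THRESHOLD_MASK) != 0
    reg.write(reg.read() | CMCI_EN)
    has_cmci = bool(reg.read() & CMCI_EN)
    return has_thresholding, has_cmci


print(probe_bank(FakeCtl2(THRESHOLD_MASK | CMCI_EN)))
print(probe_bank(FakeCtl2(0)))
```

The probe must write a non-zero threshold, otherwise a read-back of zero cannot distinguish an unsupported field from a value that was stored correctly.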
CMCI interrupt delivery is configured by writing to the IA32_X2APIC_LVT_CMCI MSR entry in the local APIC register space at the default address of APIC_BASE . A CMCI interrupt can be delivered to more than one logical processor if multiple logical processors are affected by the associated MC errors. For example, if a corrected bit error in a cache shared by two logical processors caused a CMCI, the interrupt is delivered to both logical processors sharing that micro-architectural sub-system. Similarly, package level errors may cause the CMCI to be delivered to all logical processors within the package. The CMCI interrupt is not propagated outside of a given processor in a multi-processor system. The LVT entry allows four delivery modes, an 8-bit interrupt vector, and masking. The operating system is expected to manage this LVT entry.
CMCI 中断传递通过写入本地 APIC 寄存器空间中 IA32_X2APIC_LVT_CMCI MSR 条目来配置,其默认地址为 APIC_BASE 。如果多个逻辑处理器受到相关 MC 错误的影响,CMCI 中断可以传递给多个逻辑处理器。例如,如果由两个逻辑处理器共享的缓存中的校正位错误导致 CMCI 中断,则该中断将传递给共享该微架构子系统的两个逻辑处理器。同样,包级别错误可能导致 CMCI 传递给包内的所有逻辑处理器。在多处理器系统中,CMCI 中断不会传播到给定处理器之外。LVT 条目允许四种传递模式,一个 8 位中断向量和屏蔽。操作系统应该管理这个 LVT 条目。
In the non-firmware first model, the Uncorrected Error No Action (UCNA) type of UCR errors also signals the CMCI. The signaling methodology for such errors is similar to that of corrected errors as described in this section. The following table describes the CMCI capability for the processor.
Table 78. Processor's CMCI Capability

| Processor Module | CMCI on Corrected Error Threshold | CMCI on UCNA Type of Errors |
|---|---|---|
| Core-IFU | Supported | Supported |
| Core-DCU | Supported | Supported |
| Core-DTLB | Not supported | Not supported |
| Core-MLC | Supported | Supported |
| PCU | Not supported | Not supported |
| Intel UPI | Supported | Supported |
| UBOX | Not supported | Not supported |
| CHA | Supported | Supported |
| B2CMI | Supported | Supported |
| IMC | Supported | Supported |

NOTES

  1. The information is preliminary and may change during the design phase of the processor. Consult your local Intel representative prior to finalizing your design.
  2. Supports CMCI on an uncorrected error only when the corrupt data containment mode is enabled.
  3. CMCI is broadcast only to the thread within the core, not to other cores or sockets in the system. If a corrected error is detected outside of the cores (that is, within the uncore), then the CMCI is signaled to all the logical threads within the socket. In either case, CMCI is not signaled to other sockets in a multi-socket system.
A CMCI triggered by a core within the processor is first signaled to its local xAPIC, and then delivered to the destination(s) defined in the CMCI LVT. Uncore-triggered CMCIs are routed to the UBOX, which broadcasts them to all the cores on the socket. There is no provision for one processor to signal a CMCI to another processor within a system.
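The delivery scope described above can be summarized in a short sketch. The `Thread` type and `cmci_recipients` function are hypothetical names introduced for illustration, and the model assumes the core case reaches only the threads of the originating core, ignoring any redirection programmed in the CMCI LVT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    socket: int
    core: int
    thread: int


def cmci_recipients(threads, origin_socket, origin_core=None):
    """Return the logical threads that receive a CMCI.

    origin_core=None models an uncore error, which the UBOX broadcasts to
    every thread on the socket. A core error stays within that core.
    Threads on other sockets never receive the CMCI.
    """
    same_socket = [t for t in threads if t.socket == origin_socket]
    if origin_core is None:
        return same_socket
    return [t for t in same_socket if t.core == origin_core]


# Two sockets with two single-threaded cores each.
system = [Thread(s, c, 0) for s in range(2) for c in range(2)]
print(cmci_recipients(system, origin_socket=0, origin_core=1))
print(cmci_recipients(system, origin_socket=0))
```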

11.2.4.3 Machine Check Architecture Based Uncorrected Recoverable (UCR) Error Signaling

The processor family incorporates UCR error signaling when the corrupt data containment mode is enabled. UCNA-type UCR errors are signaled via the CMCI, and SRAR-type UCR errors are signaled via the MCERR (Machine Check Error).

11.2.4.4 Machine Check Architecture Based Uncorrected (Fatal and Catastrophic) Error Signaling

In the event of an uncorrected error when the corrupt data containment mode is disabled, the entire MCA behavior reverts to the legacy behavior: the producer of the uncorrected error signals over the CATERR_N pin, as a pulse or hold condition reflecting an MCERR (fatal) or IERR (catastrophic) error. Such an event propagates to all the threads within the package via the local xAPIC and activates the CATERR_N pin to propagate to the rest of the platform.
  • MCERR - Machine check architecture is enabled by setting CR4.MCE. MCERR is signaled to all cores in the socket, and the CATERR_N pin is also asserted.
  • IERR - Refers to a catastrophic error in which the processor core may not be able to execute reliably and may not fully process an INT18 handler flow. The following are some possible cases of such catastrophic errors:
    • Retirement watchdog time-out from the core
    • MCERR when CR4.MCE is not set to 1 in the core
    • Power Control Unit (PCU) errors
    • Factory-configuration download failures
In addition, when the processor is configured in viral mode, all core or uncore fatal errors result in an IERR condition.

11.2.4.5 Machine Check Architecture Based External Error Signaling

The processor signals uncorrectable errors on an external pin called CATERR_N. The pin signals a non-recoverable MCERR or an IERR, depending on how long the signal stays active. To signal an MCERR, the processor drives CATERR_N active low for 16 BCLKs (that is, 160 ns with a 100 MHz reference clock). To signal an IERR, CATERR_N is driven active low until a warm reset (without the PWRGOOD signal asserted) or a cold reset (with the PWRGOOD signal asserted). Therefore, recipients of the CATERR_N signal can determine whether the processor is signaling an MCERR or an IERR only some time after the processor asserts CATERR_N, once the MCERR pulse width has elapsed.
The second pin, RMCA_N, is used to signal recoverable errors. The Baseboard Management Controller (BMC) or external management module needs to observe both the CATERR_N pin and the RMCA_N pin to determine whether the error condition is recoverable or a system reset is expected. The BMC or external management module should give the system MCE handling software the opportunity to process the error, and then issue the necessary platform reset.
The processor drives the CATERR_N pin for 16 BCLKs for an MCERR event in the processor core/uncore, and asserts it as a level signal for an IERR event in the processor core/uncore. Due to interconnect loading, the CATERR_N signal may stay low for longer than 16 BCLKs. Any agent that samples the CATERR_N pin should consider sampling the pin for longer than 16 BCLKs to allow for interconnect loading. A reasonable sampling window is 28 BCLKs. If CATERR_N is asserted for less than 28 BCLKs, then the processor is indicating an MCERR event. Otherwise, the processor is indicating an IERR event.
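The sampling rule above can be sketched as a small classifier, such as a BMC might apply to a measured assertion time. The function name and the nanosecond interface are assumptions made for illustration; the 10 ns BCLK period is derived from the stated equivalence of 16 BCLKs and 160 ns.

```python
BCLK_NS = 10               # derived from "16 BCLKs = 160 ns"
SAMPLE_WINDOW_BCLKS = 28   # suggested sampling window from the text


def classify_caterr(low_duration_ns):
    """Classify a CATERR_N assertion by how long the pin stayed low."""
    if low_duration_ns <= 0:
        return "none"
    bclks = low_duration_ns / BCLK_NS
    # Shorter than the sampling window: a (possibly stretched) MCERR pulse.
    # Otherwise the pin is treated as held low, which indicates IERR.
    return "MCERR" if bclks < SAMPLE_WINDOW_BCLKS else "IERR"


print(classify_caterr(160))    # nominal 16-BCLK pulse
print(classify_caterr(5000))   # pin held low
```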
Out-Of-Band (OOB) agents such as the BMC in the platform may observe the CATERR_N pin and query the error registers either via the existing in-band access mechanisms or one of the available OOB access mechanisms.
The following tables summarize error signaling capabilities.
Table 79. MCA Banks Based Error Logging and Signaling

| Error Type | UC [61] | PCC [57] | S [56] | AR [55] | CATERR_N/RMCA_N Pin | Interrupt Type |
|---|---|---|---|---|---|---|
| Corrected | 0 | | | | None | CMCI |
| Uncorrected Recoverable (UCNA) | 1 | 0 | 0 | 0 | None | CMCI |
| Uncorrected Recoverable (SRAR) | 1 | 0 | 1 | 1 | Pulse (RMCA_N in non-LMCE mode) | INT18 |
| Uncorrected Fatal | 1 | 1 | | | Pulse (CATERR_N) | INT18 |
| Uncorrected Catastrophic | 1 | 1 | | | Persistent Low (CATERR_N) | INT18 |

The UC, PCC, S, and AR columns are MCi_STATUS fields, shown with their bit positions.

NOTE

If the EMCA Gen2 feature is enabled, then CMCI is morphed to CSMI, and INT18 is morphed to MSMI.
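The decoding implied by Table 79 can be sketched as follows. The function name and return format are illustrative assumptions; the bit positions are the ones listed in the table. Fatal and catastrophic errors share the same status bits in the table and are told apart by the CATERR_N pin behavior, so the sketch reports them together.

```python
UC = 1 << 61    # uncorrected error
PCC = 1 << 57   # processor context corrupt
S = 1 << 56     # signaling
AR = 1 << 55    # action required


def classify_status(status, emca_gen2=False):
    """Map MCi_STATUS bits to (error type, interrupt type) per Table 79."""
    uc, pcc, s, ar = (bool(status & bit) for bit in (UC, PCC, S, AR))
    if not uc:
        kind, irq = "Corrected", "CMCI"
    elif pcc:
        kind, irq = "Uncorrected Fatal/Catastrophic", "INT18"
    elif s and ar:
        kind, irq = "Uncorrected Recoverable (SRAR)", "INT18"
    elif not s and not ar:
        kind, irq = "Uncorrected Recoverable (UCNA)", "CMCI"
    else:
        # Combination not listed in Table 79.
        return "Not listed in Table 79", None
    if emca_gen2:
        # With EMCA Gen2 enabled, CMCI becomes CSMI and INT18 becomes MSMI.
        irq = {"CMCI": "CSMI", "INT18": "MSMI"}[irq]
    return kind, irq


print(classify_status(UC | S | AR))
print(classify_status(UC, emca_gen2=True))
```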
Table 80. MCA Banks Capability of Reporting Errors

Machine check banks capable of producing errors of each type:

| Error Type | IFU | DCU | DTLB | MLC | PCU | Intel UPI | UBOX | B2CMI | CHA | IMC |
|---|---|---|---|---|---|---|---|---|---|---|
| Corrected (threshold) | | | | | | | | | | |
| Uncorrected No Action (UCNA) | | | | | | | | | | |
| SW Recoverable Action Required (SRAR) | | | | | | | | | | |
| Fatal/Catastrophic | | | | | | | | | | |
NOTE
If the EMCA Gen2 feature is enabled, then corrected error signaling is morphed to CSMI; threshold-based CSMI is implemented only in the Intel UPI MCA bank.

11.2.4.6 MCA Bank Error Reporting Modes

Processor-based systems can be configured in several different modes of MCA bank error reporting, depending on customer needs:
  1. Legacy IA-32 MCA mode
  2. Corrupt Data Containment (CDC) (also known as poison mode)
  3. Corrupt data containment + viral mode
Error signaling can be configured to be in Enhanced MCA Gen 2 (EMCA Gen 2) or legacy mode.
Some of these modes can be enabled simultaneously and some are complementary to each other. The following table shows which operating modes can be enabled at the same time.
Table 81. Operating Mode Mixing Feasible for the Platform

| Operating modes | Legacy IA-32 MCA | Corrupt Data Containment | EMCA Gen 2 | IOMCA Mode | Viral Mode |
|---|---|---|---|---|---|
| Legacy IA-32 MCA Mode | Yes | | | | |
| Corrupt Data Containment | No | Yes | | | |
| EMCA Gen 2 | No | Yes | Yes | | |
| IOMCA Mode | Yes | Yes | Yes | Yes | |
| Viral Mode (see Note 1) | No | Yes | Yes | Yes | Yes |

NOTE
  1. Available only as part of an advanced RAS feature.
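The lower-triangular matrix in Table 81 can be encoded directly for a configuration sanity check. This is a sketch; the mode strings and function names are chosen here for illustration and are not identifiers from the specification.

```python
from itertools import combinations

MODES = ["Legacy IA-32 MCA", "Corrupt Data Containment", "EMCA Gen 2",
         "IOMCA", "Viral"]

# Row i, column j (j <= i) of Table 81: True means the two modes can coexist.
LOWER = [
    [True],
    [False, True],
    [False, True, True],
    [True, True, True, True],
    [False, True, True, True, True],
]


def compatible(a, b):
    """Look up one pair of modes in the symmetric compatibility matrix."""
    i, j = MODES.index(a), MODES.index(b)
    if i < j:
        i, j = j, i
    return LOWER[i][j]


def all_compatible(modes):
    """True if every pair of the requested modes can be enabled together."""
    return all(compatible(a, b) for a, b in combinations(modes, 2))


print(all_compatible(["Corrupt Data Containment", "EMCA Gen 2", "Viral"]))
print(all_compatible(["Legacy IA-32 MCA", "Viral"]))
```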
In each of these modes, error reporting (logging and signaling) differs. The following table provides further details.
Table 82. Various Modes of MCA Bank Error Signaling

| MCA Modes | Corrected | Uncorrected |
|---|---|---|
| Legacy IA-32 MCA | | Triggers MCE at the source of the uncorrected error. Asserts CATERR_N (pulse in case of uncorrectable MCERR, level in case of IERR). |
| Corrupt Data Containment (CDC) | | UCNA: Does not trigger MCE; CMCI may be triggered if enabled. SRAO: Triggers MCE with non-execution path; action is optional. SRAR: Triggers MCE with execution path; take recovery actions. Fatal: Triggers MCE with system reset. |
| EMCA Gen2 | CSMI | MSMI, and asserts RMCA_N for recoverable errors and CATERR_N for non-recoverable errors to remote sockets in the partition. |
| IOMCA | NA | Fatal and non-fatal: Triggers MCE with PCC=1. Signals MSMI or MCE depending on whether EMCA Gen 2 is on or off. |
| Viral | NA | All the sockets enter viral state. The CATERR_N pin is pulled low, indicating IERR. |

11.2.5 Integrated Error Handler (IEH) Based Error Reporting

11.2.5.1 IEH - Functional Description

Distributed IIO root ports follow an architected scheme to propagate an error, either signaling an interrupt or toggling a pin on the processor socket. The IEH creates an architected way of reporting errors from all processor-integrated and attached agents in a format similar to the PCIe standard. With IEH, the IIO root port's internal error reporting also goes through the IEH path, signaling through the satellite IEHs to the global IEH, which is the central resource for logging and error escalation. In the processor, the RCEC is inside the satellite IEH, making it OS-visible. An OS-visible RCEC allows more flexibility because errors and error propagation can be cleared. The RCEC also has an optional signaling path before propagation to the global IEH. The global IEH processes all incoming error messages from satellite IEHs (if present) and local devices (if some devices are connected directly to the global IEH) and signals interrupts as described in the following section.

11.2.5.2 IEH - Error Detection, Correction, and Reporting

Figure 28. IEH Hierarchy
The following is a high level summary of the IEH:
  1. The CPU implements a global IEH, which is the central resource for logging and escalating errors. In addition to the global IEH, the CPU implements multiple satellite IEHs. In such implementations, the global IEH and the satellite IEHs are connected to form an IEH hierarchy as shown in the figure above.
  2. Every IEH in a processor presents itself to the software as a PCIe device (with BDF).
  3. A unique device ID (DID) is assigned to each IEH type (global IEH and satellite IEH); the revision ID (RID) is zero for both.
  4. Error detection and header logging are done for the first uncorrectable error (fatal or non-fatal).
  5. Allows flexible mapping of the detected errors to different error severities.
  6. Allows different customers to choose among various kinds of signaling: SMI/NMI, ERROR_N[2:0] pins, MSI/INTx, and/or IOMCA. The global IEH sends error severity and Bus/Dev/Func information to the IOMCA.
  7. Incorporates PCI Express specification based advanced error reporting with the following key features:
     a. The processor incorporates PCI Express AER as defined in the PCI Express Base Specification, Revision 5.0. It detects, logs, and signals errors received from the downstream devices connected to the processor's PCI Express and DMI2 interfaces. Signaling is done via Message Signaled Interrupt (MSI) at the local root port level.
     b. Provides the capability to mask error detection, thus preventing further reporting to architecturally defined error handling software.
  8. If no satellite IEHs are present, then all the devices in the processor forward error information to the global IEH over sideband.
  9. The satellite IEH supports PCIe, Do_Serr, Correctable_AER, and Uncorrectable_AER.
In satellite IEH, there are three different categories for error sources: RP, RCiEP, and non-OS visible IPs. The category of the error source determines which signaling path is used. Non-OS visible devices report errors to the local error logic of the satellite IEH. Local error logic in the processor supports up to a maximum of 32 error sources. Local error logic has Local Correctable Error Status, Local Correctable Error Mask, Local Uncorrectable Error Status, Local Uncorrectable Error Mask, Local Uncorrectable Error First Error Pointer, Local Prefix Log, and Local Header Log registers. Each local error source has a corresponding Status and Mask register instance. After errors are logged in their corresponding local error logic, they are reported to the RCEC where there is the option of MSI/INTx signaling. The RCEC assigns fatal/non-fatal severity to local uncorrectable errors. RCiEP devices report directly to the RCEC. Errors are then reported to the satellite IEH's global error logic. In addition, errors from RP, RCEC, and other SAT IEH(s) also report to the global error logic of the satellite IEH. After these errors are logged in the global error logic, they are passed to the global IEH's global error logic for signaling.
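The reporting paths described in the paragraph above can be sketched as a lookup from error-source category to the sequence of logging stages. The stage strings and the `reporting_path` function are names chosen here for illustration; the optional MSI/INTx signaling at the RCEC is not modeled.

```python
SATELLITE_GLOBAL = "satellite IEH global error logic"
GLOBAL_GLOBAL = "global IEH global error logic"


def reporting_path(source):
    """Return the ordered stages an error passes through in a satellite IEH."""
    if source == "non-OS visible IP":
        # Logged in the local error logic first, then reported to the RCEC.
        head = ["satellite IEH local error logic", "RCEC"]
    elif source == "RCiEP":
        # RCiEP devices report directly to the RCEC.
        head = ["RCEC"]
    elif source == "RP":
        # Root ports report to the satellite IEH global error logic.
        head = []
    else:
        raise ValueError(f"unknown error source category: {source}")
    return head + [SATELLITE_GLOBAL, GLOBAL_GLOBAL]


for category in ("non-OS visible IP", "RCiEP", "RP"):
    print(category, "->", " -> ".join(reporting_path(category)))
```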
The global IEH can have up to 31 satellite IEHs and PCIe/legacy devices connected to it. For global IEHs, the only local errors being logged are the internal errors of the global IEH and they are treated as fatal severity. All satellite IEHs report to the global error logic of the global IEH. Thresholded corrected iMC events are also reported to the global error logic of global IEH for signaling. After these errors are logged in the global error section, they can be signaled by the NMI/SMI, ERR[2:0] pins, and/or the IOMCA.
In the OS first mode, errors bypass the IEH. The firmware first mode is the only relevant operating mode for IEH.

11.2.6 Intel VT-d Translation Engine Error Reporting

The Intel VT-d translation engine can detect internal faults and log the errors in VTUNCERRSTS.
Details on this topic can be found in the Processor Register Specification.

11.2.7 Intel UPI Error Reporting

Intel UPI links are capable of detecting various types of correctable and uncorrected errors. Once an error is detected, it is reported (logged and signaled) using Intel UPI link MCA banks and platform specific log registers. This section further describes the corrected error and uncorrected error reporting briefly.

11.2.7.1 Intel UPI Corrected Error Reporting

Error reporting register information can be found in the Processor Register Specification.

11.2.7.2 Intel UPI Uncorrected Fatal Error Reporting

Detailed register information for this topic can be found in the Processor Register Specification.

11.2.8 Integrated Voltage Regulator (IVR) Error Reporting

The processor incorporates error reporting for faults originating from the Integrated Voltage Regulator (IVR) in the processor. The processor contains multiple VRs and each is capable of signaling fault conditions. There are three distinct types of IVR faults.
  • Boot IVR Fault: The Boot IVR is removed from the processor. The boot logic is now powered from the motherboard VR.
  • Core IVR Fault: A core IVR fault is signaled by an IERR on the CATERR_N pin, and it is possible to perform Fault Resilient Booting (FRB) with the failing core disabled. Core IVR faults are logged to the CORE_FIVR_ERR_LOG CSR. This register contains a bit for each processor core at a bit position corresponding to the core's logical ID. Any bit set to "1" indicates that an IVR fault has been detected for that core. When a warm reset is performed after a core IVR fault, the processor disables any core with an IVR fault indicated in CORE_FIVR_ERR_LOG. The RESOLVED_CORES CSR is updated to indicate that the failing core has been disabled.
  • Uncore IVR Fault: An uncore IVR fault is signaled by an IERR on the CATERR_N pin. A BMC monitoring CATERR_N can detect this failure and determine the specific cause of the IERR over PECI. Uncore IVR faults are logged to the UNCORE_FIVR_ERR_LOG CSR. Any non-zero value in that CSR indicates that an uncore IVR fault has been detected. Once the BMC determines that a specific socket has asserted IERR as a result of an uncore IVR fault, the BMC may initiate socket-level FRB procedures.
If a FIVR_CATASTROPHIC_OVER_VOLTAGE_FAULT or FIVR_CATASTROPHIC_OVER_CURRENT_FAULT MCA is logged in IA32_MC6_STATUS.MSEC_FW (MSR 419h) and the CORE_FIVR_ERR and UNCORE_FIVR_ERR CSRs are both 0, then this should be treated as an uncore IVR fault.
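The fault classification rules above can be sketched as a decoder over the raw register values. The function names and the dictionary result are illustrative assumptions; how the CSR values are actually read (for example, over PECI) is outside the sketch, and the `mc6_fivr_catastrophic` flag stands for "one of the two FIVR catastrophic fault codes is logged in IA32_MC6_STATUS".

```python
def failing_cores(core_fivr_err_log):
    """Return logical core IDs whose bit is set in CORE_FIVR_ERR_LOG."""
    return [bit for bit in range(core_fivr_err_log.bit_length())
            if (core_fivr_err_log >> bit) & 1]


def classify_ivr_fault(core_log, uncore_log, mc6_fivr_catastrophic=False):
    """Classify IVR faults from the two error-log CSR values."""
    cores = failing_cores(core_log)
    uncore = uncore_log != 0   # any non-zero value means an uncore IVR fault
    # Fallback rule: a FIVR catastrophic MCA with both CSRs at zero is
    # treated as an uncore IVR fault.
    if mc6_fivr_catastrophic and not cores and not uncore:
        uncore = True
    return {"failing_cores": cores, "uncore_fault": uncore}


print(classify_ivr_fault(core_log=0b1010, uncore_log=0))
print(classify_ivr_fault(core_log=0, uncore_log=0, mc6_fivr_catastrophic=True))
```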

11.2.9 Memory Corrected Error Reporting

Memory corrected error reporting provides per-rank corrected error counters with a leaky bucket that can trigger SMI/CMCI/ERROR_N[0] (as controlled via the SMISPARECTL CSR). Platform BIOS/firmware can invoke RAS features (for example, runtime sPPR or ADDDC/ADC) on such notifications. Additionally, such error logs can be used for Predictive Failure Analysis (PFA), failed DIMM isolation [Field Replaceable Unit (FRU) isolation], and additional debugging.
The processor logs corrected memory errors in the CSRs and can also generate an in-band SMI or assert the ERROR_N[0] pin when the corrected memory errors on any memory rank cross a certain threshold. UEFI-FW can configure the threshold independently of the OS-configured CMCI threshold. Note that these CSRs log ECC corrected errors, whereas the MCA bank's corrected error count reflects ECC plus transient corrected errors. (Transient errors are errors for which good data is obtained by retrying the read operation.) The processor can also be configured to signal the event to the platform BMC via the ERROR_N[0] pin. By using this capability, the platform can implement memory RAS via the BMC firmware.
The processor also supports Corrected Error Log Disable which allows masking corrected error logging on a per rank basis. This capability helps the UEFI-FW based error handler to manage the cases where a specific DIMM is known to be reporting persistent corrected errors and does not require additional logging and reporting. See the following figure for a high-level flow.
Figure 29. Platform Memory Corrected Error Reporting Flow
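The per-rank counter with a leaky bucket can be illustrated with a simplified software model. This is a sketch under stated assumptions, not the hardware algorithm: the class name, the abstract "cycles" time unit, and the choice to leak one count per drip interval and flag overflow when the count reaches the threshold are all assumptions made for illustration.

```python
class RankCorrectedErrorCounter:
    """Simplified model of one rank's corrected error counter."""

    def __init__(self, threshold, drip_interval):
        self.threshold = threshold          # cf. CORRERRTHRSHLD
        self.drip_interval = drip_interval  # cycles between leaks (assumed unit)
        self.count = 0                      # cf. CORRERRCNT
        self.elapsed = 0
        self.overflow = False               # cf. CORRERRORSTATUS

    def tick(self, cycles=1):
        """Advance time; leak one count for each full drip interval."""
        self.elapsed += cycles
        drips, self.elapsed = divmod(self.elapsed, self.drip_interval)
        self.count = max(0, self.count - drips)

    def record_corrected_error(self):
        """Count one corrected error; return True once the threshold is hit."""
        self.count += 1
        if self.count >= self.threshold:
            self.overflow = True   # would trigger SMI/CMCI/ERROR_N[0]
        return self.overflow


rank = RankCorrectedErrorCounter(threshold=3, drip_interval=10)
rank.record_corrected_error()
rank.record_corrected_error()
rank.tick(10)                          # one error leaks away
print(rank.count, rank.overflow)
```

The leak means that sparse, isolated corrected errors never reach the threshold, while a burst on a failing rank does.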
The following table provides a list of all relevant registers to configure this feature and any associated error logs.
Table 83. Memory Corrected Error Reporting Configuration and Logging Registers

| Scope | Register | Description |
|---|---|---|
| Configuration Registers | | |
| IMC | LEAKY_BUCKET_CFG | Sets the leaky bucket drip interval threshold using two hot encoding values, leaky_bkt_cfg_hi and leaky_bkt_cfg_lo. |
| Rank | LEAKY_BKT_2ND_CNTR_LIMIT | Sets the secondary leaky bucket counter limit (drip interval threshold). |
| Rank | CORRERRTHRSHLD_<0:7> | Sets the corrected error threshold on a per rank basis. |
| IMC | SMISPARECTL | Sets the option to trigger the SMI, CMCI-proxy, or ERROR_N[0] pin. |
| Rank | DIS_CORR_ERR_LOG | Disables correctable error logging/signaling for rank 0:7 on each channel, and the corrected error count in the iMC's MCi_STATUS. |
| IMC | ERR_CNTR_CTL | Saturating error counters in the B2CMI that track the number of transactions that had a memory controller error. |
| Status Registers | | |
| Rank | CORRERRCNT_<0:7> | Logs the corrected error counts on a per rank basis. |
| Rank | CORRERRORSTATUS | Logs the error overflow status on a per rank basis. |
| IMC | LEAKY_BUCKET_CNTR_LO and LEAKY_BUCKET_CNTR_HI | Logs the leaky bucket lower and upper counter status. |
| IMC Channel | RETRY_RD_ERR_LOG[0/1/2/3] | Memory error logging: patrol/sparing and/or demand access, no overflow, error correctable/UC, correctable error in parity device, ECC mode (1LM, 2LM, ADDDC 1LM). |
| IMC Channel | RETRY_RD_ERR_LOG_PARITY | Logs the parity syndrome (mask) after completing the error correction. |
| IMC Channel | RETRY_RD_ERR_LOG_Misc[0/1/2/3] | Logs the Misc. information about the error correction. |
| IMC Channel | RETRY_RD_ERR_LOG_ADDRESS1 | Logs the address information after completing error correction: failed device ID, chip select, c0-c2 encoded sub bank bit, bank ID, and column address. |
| IMC Channel | RETRY_RD_ERR_LOG_ADDRESS2 | Logs the row address after completing an error correction. |
| IMC Channel | RETRY_RD_ERR_LOG_ADDRESS3 | Logs the system address of the cacheline where the error was detected. |

NOTE

Four sets of retry_rd_err logs are provided, so that errors from demand accesses and from patrol scrub operations can be logged separately. Alternatively, by setting the "noover" bit to 1 in the first set, the second set of error logs can be used to log the second instance of a demand or patrol scrub event.

11.2.10 Error Reporting Via the IOMCA

11.2.10.1 Functional Description

With IOMCA enabled, the processor reports PCIe fatal and non-fatal errors as machine check conditions, so that I/O errors are handled in the MC domain. This feature allows logging and signaling of IIO severity 1 or 2 errors via the MCA using MCA bank 4, which is shared with the UBOX. It provides a uniform error reporting architecture aligned with the MCA. It also allows IIO uncorrected fatal and uncorrected non-fatal error signaling via the MCE to improve platform diagnosability. IOMCA is transparent to the OS in that a legacy OS (unaware of the IOMCA) behaves as expected in the presence of IOMCA, and the OS has no mechanism to determine that the IOMCA is enabled. It is supported as a standard RAS feature.
In cases where the platform assumes the responsibility of handling the PCI Express AER events, it is the responsibility of the platform firmware to signal the OS/software when an uncorrected fatal error occurs. Platform firmware relies on NMI to notify the OS/software, but there are cases where the OS may not respond to the NMI immediately. This feature addresses the issue by allowing the platform firmware to configure the hardware so that IIO-detected uncorrected fatal or non-fatal errors trigger an MCERR event.

11.2.10.2 Error Signaling

Platform firmware/software programs the global IEH so that fatal and non-fatal PCIe errors are signaled to the BMC via CATERR and not via the ERR[2:1] pins. The IOMCA does not impact correctable errors; these are reported to the PCIe AER structure. The IOMCA bank is located within the UBOX, and the error indications are communicated from the global IEH. The implementation leverages the MSCOD and MCi_MISC fields to communicate errors logged in the banks. Once in IOMCA mode, the BIOS/software needs to ensure that the configuration CSRs disable signaling of Sev1/2 errors through SMI/NMI in the UBOX global logic.

11.2.11 First Corrected Error (FCERR) Mode

FCERR mode provides the capability to latch the first corrected error event within all MCA banks. This prevents overwriting of the first error logs (status, address, and miscellaneous registers).
Error logs are often stored in multiple registers, and error handling FW/SW needs to issue multiple read commands to capture the various log registers. During a burst of corrected errors, the hardware is allowed to overwrite a log register. This makes it challenging for error handling FW/SW to capture the error logs associated with one given error event, because the hardware can overwrite the log registers while FW/SW is still reading them. This feature prevents that issue and allows the FW/SW to capture all the error logs reliably.
  1. The Retry_rd_err_log registers can be configured to not overwrite the corrected error through the NOOVER bit in the CSR.
  2. A latched corrected error does not impact the operation of the corrected error counters.
The corrected error count continues to increment and the overflow bit can still be set. All other overwrite rules are unchanged (a fatal error still overwrites the first corrected error).
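The latching behavior can be illustrated with a small model of one bank's log. This is a sketch only: the class, its fields, and the two-level severity are simplifications introduced here, and real banks apply the full architectural overwrite rules.

```python
class McaBankLog:
    """Simplified model of one MCA bank's error log with optional FCERR."""

    def __init__(self, fcerr_enabled):
        self.fcerr_enabled = fcerr_enabled
        self.log = None             # (severity, info) of the logged error
        self.corrected_count = 0

    def report(self, severity, info):
        if severity == "corrected":
            # The counter keeps incrementing even when the log is latched.
            self.corrected_count += 1
            if self.log is None:
                self.log = (severity, info)
            elif self.log[0] == "corrected" and not self.fcerr_enabled:
                # Without FCERR, a later corrected error may overwrite the log.
                self.log = (severity, info)
        elif severity == "fatal":
            # A fatal error always overwrites a corrected error log.
            self.log = (severity, info)
        else:
            raise ValueError(f"unsupported severity: {severity}")


bank = McaBankLog(fcerr_enabled=True)
bank.report("corrected", "first")
bank.report("corrected", "second")
print(bank.log, bank.corrected_count)
```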
The following table provides a list of all relevant registers to configure this feature and any associated error logs.
Table 84. First Corrected Error Mode Configuration Registers

| Scope | Register | Description |
|---|---|---|
| Intel UPI | KTI_MCA_CFG | When set, corrected errors logged in the Intel UPI MC banks do not overwrite other corrected errors. |
| B2CMI banks | EXRAS_CONFIG | Corr2CorrOverwriteDis bit (one for each B2CMI bank). When set, corrected errors logged in the B2CMI machine check banks and shadow registers do not overwrite other corrected errors. |
| IMC banks | MC0_DP_CHKN_BIT | no_over_ce bit. Set to 1 to prevent a CE from overwriting a CE in the MCA bank. |
| IMC Channel | RETRY_RD_ERR_LOGs | Noover bit. When set to 1, it locks the register after the first error and prevents overwriting. |

11.2.12 Error Reporting Through MCA 2.0 (EMCA Gen 2)

This feature incorporates the second generation of enhancements to the legacy IA-32 MCA. The key motivation is to create an architecture-based implementation that can be enabled through industry-leading operating systems (for example, Windows*, Linux*) and to extend the scope of the Firmware First Model (FFM) of error reporting beyond the memory sub-system.
Prior to EMCA Gen 2, legacy IA-32 MCA implemented error handling in which all HW errors are logged in uniform, architecture-based registers (MCA banks) and signaled directly to the OS/VMM, which limited the UEFI-FW capability for fault diagnosis and FRU isolation. As the legacy IA-32 MCA mode has been deployed by customers over the last several generations of platforms, the following limitations have been identified, which the EMCA Gen 2 feature addresses:
  1. Most uncorrected errors (UCE) would require routing to NMI, and the NMI handler may lose fault containment. Additionally, NMI precludes error recovery in cases where such a UCE may be recoverable.
  2. UCNA error reporting via the SCI may not be fast enough.
  3. Some error logs are accessible only at the UEFI-FW/FW level, which constrains OS-based error handling. For example, some error logs or registers are recorded in CSR or MSR shadow registers that may not be fully accessible at the OS/VMM level.
Because of these shortcomings of legacy IA-32 MCA based error handling, EMCA Gen 2 was developed. It allows the firmware to enhance the error logging capabilities of the machine check architecture (for corrected and uncorrected errors). When enabled, the UEFI-FW SMI handler can read the MCA bank registers and other model-specific error logging registers before the operating system machine check handler reads and clears the MCA banks.
The EMCA Gen 2 feature provides enhanced error reporting to support the Firmware First Model (FFM) with the following attributes:
  1. Allows the SMM code to intercept MCERR/CMCI.
  2. Allows the SMM code to write the MCA Status/Addr/Misc registers (Advanced RAS only).
  3. Allows the SMM code to invoke the Int18 handler for the original MCERR condition.
  4. Allows a DSM-based pointer to enhanced error logs.
  5. Adds an IA32_MCG_CAP bit for EMCA support.
The benefits of EMCA Gen 2 are:
  • Provides an opportunity for the platform firmware to deliver richer error logs to higher level software at the time of signaling. Additional error information is delivered to the system software through the ACPI DSM and the Enhanced MCA L1 Directory data structure, synchronous with the MCE or CMCI.
  • Provides an opportunity for the platform firmware to log all errors for its own diagnostic purposes.
  • Advanced RAS additionally enables platform SMM firmware to recover from uncorrected errors (in addition to the opportunity already available at the hardware and OS/VMM level). It also provides the ability for SMM to write non-zero values into the MCi_Status, MCi_Addr, and MCi_Misc registers.
The EMCA2 error handling flow with FFM is shown in the following figure.
Figure 30. EMCA2 Error Handling FW Model
The EMCA Gen 2 infrastructure gives the platform firmware the ability to signal an SMI, on a per machine check bank basis, for all uncorrected and corrected errors signaled by that machine check bank. One bit in the MCi_CTL2 register controls signaling an SMI on errors that would have been signaled as a Machine Check Exception (MCE), and another controls signaling an SMI on errors that would have been signaled as a CMCI. This document refers to a machine check exception that is signaled as an SMI as a machine check SMI (MSMI), and to a CMCI that is signaled as an SMI as a CMCI SMI (CSMI).
The EMCA Gen 2 related capability registers and control registers are as follows.
  • The EMCA Gen 2 MCA capability bit (bit 25) is set in the global IA32_MCG_CAP register to indicate that individual MCA banks may have EMCA Gen 2 capability.
  • A read-only, SMM-visible MSR (SMM_MCA_CAP) provides one bit per bank indicating whether the corresponding bank supports EMCA Gen 2 capabilities.
  • EMCA Gen 2 control bits 32 and 34 are present in each IA32_MCi_CTL2 register. These bits are programmable by UEFI FW and are not visible to the SW/OS/VMM. Therefore, reading bits 32 and 34 in the IA32_MCi_CTL2 register from SW/OS/VMM always returns zeroes, and writing a non-zero value results in a #GP fault.
Table 85. EMCA Gen 2 MCA Capability in the IA32_MCi_CTL2 Register

| IA32_MCi_CTL2 Register | Bit Value = 0 | Bit Value = 1 |
|---|---|---|
| Bit 32: CMCI_CONTROL | Legacy CMCI | Morphed into an SMI (CSMI) |
| Bit 34: MCE_CONTROL | Legacy MCE | Morphed into an SMI (MSMI) |
  • CORE_SMI_ERR_SRC and UNCORE_SMI_ERR_SRC identify the sources of the triggered event. See the Granite Rapids Processor Registers Specification for details of each MSR. More than one MCA bank may have signaled the SMI.
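The capability and control bits listed above reduce to a few bit tests. The sketch below operates on plain integers that are assumed to hold MSR values already read by firmware; the helper names are hypothetical, and the MSR addresses are kept only as documentation constants. It does not access hardware.

```python
from typing import List

IA32_MCG_CAP_MSR = 0x179       # global capability MSR
SMM_MCA_CAP_MSR = 0x17D        # SMM-visible, read-only per-bank capability
MCG_CAP_EMCA_BIT = 25          # Enhanced_MCA capability bit
CTL2_CMCI_CONTROL_BIT = 32     # 1 = CMCI morphed into an SMI (CSMI)
CTL2_MCE_CONTROL_BIT = 34      # 1 = MCE morphed into an SMI (MSMI)
CTL2_EMCA_MASK = (1 << CTL2_CMCI_CONTROL_BIT) | (1 << CTL2_MCE_CONTROL_BIT)


def emca_supported(mcg_cap: int) -> bool:
    """True if IA32_MCG_CAP advertises EMCA Gen 2 (bit 25)."""
    return bool((mcg_cap >> MCG_CAP_EMCA_BIT) & 1)


def emca_capable_banks(smm_mca_cap: int) -> List[int]:
    """Bank numbers whose bit is set in SMM_MCA_CAP[31:0]."""
    return [bank for bank in range(32) if (smm_mca_cap >> bank) & 1]


def morph_to_smi(ctl2: int, csmi: bool = True, msmi: bool = True) -> int:
    """Return an IA32_MCi_CTL2 value with the requested EMCA bits set."""
    if csmi:
        ctl2 |= 1 << CTL2_CMCI_CONTROL_BIT
    if msmi:
        ctl2 |= 1 << CTL2_MCE_CONTROL_BIT
    return ctl2


def ctl2_as_seen_by_os(ctl2: int) -> int:
    """Bits 32 and 34 always read as zero outside UEFI FW/SMM."""
    return ctl2 & ~CTL2_EMCA_MASK
```

For example, `morph_to_smi(0x1)` yields `0x500000001`, while `ctl2_as_seen_by_os` of that value is `0x1` again, mirroring the statement that the OS/VMM always reads these bits as zero.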
The following table provides a list of all relevant registers to configure the EMCA Gen 2 feature and any associated error logs.
Table 86. Error Reporting Through EMCA Gen 2 Configuration Register

| Scope | Register | Description |
|---|---|---|
| EMCA Gen2 Capability | | |
| Global | MCG_CAP (MSR: 0x179) | Bit 25: Enhanced_MCA: set if Enhanced MCA Gen 2 is supported. |
| | SMM_MCA_CAP (MSR: 0x17D) | Bit [31:0]: Bank Support - one bit per MC bank. If the bit is set, then the corresponding MC bank supports the EMCA Gen 2 capability. |
| UBOX | EMCA_EN_CORE_IERR_TO_MSMI | See the Granite Rapids Processor Registers Specification for further details. |
| Per Bank | IA32_MCi_CTL2 | See the Granite Rapids Processor Registers Specification for further details. |
| EMCA Gen2 Error Source Logs | | |
| Global | CORE_SMI_ERR_SRC | See the Granite Rapids Processor Registers Specification for further details. |
| Global | UNCORE_SMI_ERR_SRC | See the Granite Rapids Processor Registers Specification for further details. |
| Global | MCA_ERR_SRC_LOG | See the Granite Rapids Processor Registers Specification for further details. |
| Per Bank | IA32_MCi_STATUS/ADDR/MISC | See the Granite Rapids Processor Registers Specification for further details. |
| Memory sub-system | RETRY_ERR_LOGs | See the Granite Rapids Processor Registers Specification for further details. |
See the following table for a summary of the three modes of operation.
Table 87. Legacy IA-32 MCA and EMCA Gen 2 Modes Summary

| Error Type | Error Reporting | Legacy IA-32 MCA Mode | EMCA Gen 2, MCi_CTL2 Bits 32, 34 Set to 1 |
|---|---|---|---|
| CE | Error Hierarchy | | HW -> UEFI-FW/SMM -> OS (architectural implementation of the firmware first model) |
| CE | Error Signaling | CMCI (threshold based). For memory: SMI (threshold based). | CSMI threshold for all corrected errors from uncore MC banks. |
| CE | Error Logging | Processor logs errors in the MCA banks. Memory controller, B2CMI, and Intel UPI log additional details outside of the MCA banks. | Processor logs errors in the MCA banks. Memory controller, B2CMI, and Intel UPI log additional details outside of the MCA banks. |
| CE | Error Flow | For all core/uncore CE: (1) HW detects an error, logs it, and signals CMCI after the threshold is reached. (2) The OS CMCI handler reads the MCA banks for further error handling. (3) Additional detail logs are not captured. (4) A proprietary driver/SW (for example, the Linux EDAC driver) can be used to capture additional detailed logs (for example, faulty DIMM location). | For all core/uncore CE: (1) HW detects an error, logs it, and signals CSMI. (2) The FW/SMM handler gets a chance to capture the logs first. (3) The FW/SMM handler reads and (optionally) clears the MCA banks and creates an ELOG for OS/SW consumption if enabled. The DSM method is used for this OS/UEFI-FW interface. (4) FW/SMM signals CMCI (optional). The OS reads the MCA banks. The OS also reads the ELOG if one was created and the OS supports ELOG. |
| UCE | Error Hierarchy | | HW -> BIOS/SMM -> OS (architectural implementation of the firmware first model) |
| UCE | Error Signaling | MCERR (INT18) | CSMI (UCNA type at the point of detection); MSMI (SRAR type at the point of consumption); MCERR (INT18) (UEFI-FW to OS) |
| UCE | Error Logging | Processor logs errors in the MCA banks. Memory controller, B2CMI, and Intel UPI log additional details outside of the MCA banks. | Processor logs errors in the MCA banks. Memory controller, B2CMI, and Intel UPI log additional details outside of the MCA banks. |
| UCE | Error Flow | (1) HW detects an error, logs it, and enters the Ucode MCERR handler. (2) The Ucode MCERR handler resumes MCERR (INT18). (3) The OS INT18 handler reads the MCA banks for further error handling. In most cases, this will result in a kernel panic. | (1) HW detects an error, logs it, and triggers MSMI. The FW/SMM handler captures the log and triggers INT18 at the point of RSM. (2) The BIOS can signal an MCA to the OS by writing 0x1 in the EVENTS_CONTROL field in the SMRAM image of the on-chip MSR when the on-chip state save is enabled. EVENTS_CONTROL is a 2-byte field at offset 0xFF04 of SMRAM. EVENTS_CONTROL is a 2-byte field in MSR 0xC1F (b[47:32]) if on-chip state save is enabled. (3) The OS INT18 handler reads the MCA banks for further error handling. In case of an SRAR type of error, the OS attempts to kill the affected thread and resumes operation of the remaining threads, thus recovering from the fault. |
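The EVENTS_CONTROL write mentioned in the UCE flow is a 2-byte field update in either of two locations. The sketch below shows only the bit and byte arithmetic, on an in-memory integer and a `bytearray`. The function names are hypothetical, the little-endian byte order is an assumption based on x86 conventions, and treating 0xFF04 as an offset into a caller-supplied buffer is an assumption about how the SMRAM image is addressed.

```python
EVENTS_CONTROL_SMRAM_OFFSET = 0xFF04   # 2-byte field in the SMRAM image
EVENTS_CONTROL_MSR = 0xC1F             # used when on-chip state save is enabled
EVENTS_CONTROL_SHIFT = 32              # field occupies bits [47:32] of the MSR
EVENTS_CONTROL_MASK = 0xFFFF
SIGNAL_MCA = 0x1                       # value written to request an MCA to the OS


def set_events_control_in_msr(msr_value: int, field: int = SIGNAL_MCA) -> int:
    """Return msr_value with bits [47:32] replaced by field."""
    cleared = msr_value & ~(EVENTS_CONTROL_MASK << EVENTS_CONTROL_SHIFT)
    return cleared | ((field & EVENTS_CONTROL_MASK) << EVENTS_CONTROL_SHIFT)


def set_events_control_in_smram(image: bytearray, field: int = SIGNAL_MCA) -> None:
    """Write field into the 2-byte slot of an SMRAM image buffer (in place)."""
    start = EVENTS_CONTROL_SMRAM_OFFSET
    image[start:start + 2] = (field & EVENTS_CONTROL_MASK).to_bytes(2, "little")
```

Both helpers leave all other bits and bytes untouched, which matters because the same MSR and save-state area carry unrelated state.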

11.2.13 PCI Express Error Reporting

11.2.13.1 Functional Description

The PCI Express interface is an integral part of the IIO module, and all the PCI Express links support link CRC and link level retry in case of a CRC error. The TLP transmission path through the data link layer prepares each TLP for transmission by applying a sequence number, then calculating and appending a Link CRC (LCRC), which is used to ensure the integrity of TLPs during transmission across a link from one component to another. TLPs have a sequence number attached to them by the DLL in the transmitter so that a packet can be re-sent if an error is detected on that packet by the receiver. Each sent packet is moved to a retry buffer until acknowledged as received by the receiver using the Ack/Nak protocol.
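The sequence number, LCRC, and retry buffer interaction described above can be sketched as a simplified software model. Several things here are assumptions made for illustration: the frame layout, the use of `zlib.crc32` as a stand-in for the real LCRC computation, and a Nak that replays every unacknowledged TLP. The class and function names are hypothetical.

```python
import zlib
from collections import OrderedDict
from typing import List, Optional, Tuple

SEQ_MODULO = 4096  # TLP sequence numbers are 12 bits wide


def frame_tlp(seq: int, payload: bytes) -> bytes:
    """Prefix the sequence number and append a 32-bit CRC over both."""
    body = seq.to_bytes(2, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_tlp(frame: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (seq, payload) if the CRC matches, otherwise None (Nak)."""
    body, crc = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != crc:
        return None
    return int.from_bytes(body[:2], "big"), body[2:]


class Transmitter:
    def __init__(self) -> None:
        self.next_seq = 0
        self.retry_buffer: "OrderedDict[int, bytes]" = OrderedDict()

    def send(self, payload: bytes) -> bytes:
        seq = self.next_seq
        self.next_seq = (seq + 1) % SEQ_MODULO
        frame = frame_tlp(seq, payload)
        self.retry_buffer[seq] = frame  # held until acknowledged
        return frame

    def on_ack(self, seq: int) -> None:
        """An Ack releases the named TLP and every older one."""
        if seq not in self.retry_buffer:
            return
        for held in list(self.retry_buffer):
            del self.retry_buffer[held]
            if held == seq:
                break

    def on_nak(self) -> List[bytes]:
        """Simplified replay: resend everything still unacknowledged."""
        return list(self.retry_buffer.values())
```

A receiver that gets `None` from `check_tlp` would return a Nak, and the transmitter replays from its retry buffer, so a transient link error is repaired without involving higher layers.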

11.2.13.2 Error Signaling

The following is an example flow where an error is signaled via the SMI:
  1. The root port detects a corrected error (for example, Bad TLP). The error is logged in the correctable error status register and sent to the IEH subsystem, where it is signaled as an SMI. Note: Any additional errors of the same type on the same root port will not be logged and signaled until the previous error is cleared. See the Integrated Error Handler (IEH) Based Error Reporting on page 237 for more information on IEH error signaling.
  2. The SMM handler collects the log, clears the correctable error status register and various other status registers within the error reporting hierarchy, and resumes normal operation.
  3. When the PCIe corrected error reporting feature is enabled and there is a persistent fault, SMM could set the threshold to a higher value and resume normal operation. If the SMI is triggered again, the SMM handler could declare it a FRU replacement/notification event.
This feature provides the capability to count PCIe corrected errors on a per root-port basis and to signal an event once a pre-determined threshold is reached. Optionally, visibility of these logs can be restricted to SMM mode only.
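The per-root-port counting and the "raise the threshold, then declare a FRU event" policy from step 3 can be modeled in a few lines. This is a sketch of one possible firmware policy, not the hardware counter itself; the class name, the string return values, and the choice to reset the count after each SMI are all assumptions.

```python
from collections import defaultdict
from typing import Dict


class RootPortCeMonitor:
    """Counts corrected errors per root port and applies a two-stage policy."""

    def __init__(self, initial_threshold: int, raised_threshold: int) -> None:
        self.initial_threshold = initial_threshold
        self.raised_threshold = raised_threshold
        self.counts: Dict[str, int] = defaultdict(int)
        self.thresholds: Dict[str, int] = {}  # ports whose threshold was raised

    def record_corrected_error(self, root_port: str) -> str:
        threshold = self.thresholds.get(root_port, self.initial_threshold)
        self.counts[root_port] += 1
        if self.counts[root_port] < threshold:
            return "counted"  # below threshold: no SMI
        self.counts[root_port] = 0  # assumption: handler clears the count
        if root_port not in self.thresholds:
            # First SMI on this port: tolerate more errors before escalating.
            self.thresholds[root_port] = self.raised_threshold
            return "smi_threshold_raised"
        # SMI fired again at the higher threshold: persistent fault.
        return "smi_fru_event"
```

Keeping state per port means a noisy link on one root port does not change when events are signaled for the others.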

11.2.14 CXL* RAS

11.2.14.1 Functional Description

CXL.io leverages the PCIe RAS mechanism; see the PCI Express Error Reporting section. CXL.cachemem protocol errors are logged in CXL-specific registers. These registers are located in the RCRBBAR memory-mapped registers instead of the config space, and they resemble the PCIe Advanced Error Reporting (AER) registers.

11.2.14.2 Error Signaling

CXL.cachemem error signaling uses the CXL.io AER Correctable Internal Error (CIE) or AER Uncorrectable Internal Error (UIE), depending on error severity. See Error Signaling on page 247 for more information.

11.2.14.3 CXL* 1.1 and 2.0 Differences

Figure 31. CXL* 1.1 Error Reporting
In CXL 1.1, the root port is hidden from the OS. Errors from the CXL* DP are reported to the local error status in the IEH and then forwarded to the RCEC as either a CIE or a UIE. This means that all uncorrectable errors, whether non-fatal or fatal, are reported to the RCEC as a UIE with the default fatal severity.
Figure 32. CXL* 2.0 Error Reporting
In CXL 2.0, the root port looks just like a regular PCIe* root port.

11.2.15 Threshold for Corrected Errors

The processor supports corrected error thresholding for all uncore MC banks. Corrected error thresholding modulates the number of CSMIs that a given functional unit triggers. The motivation is to provide proactive notification of corrected errors while managing a potential storm of CSMIs. See the UBOX CSR UBOXCSMITHRES, one per uncore MC bank.

11.2.16 MCA Bank Error Control

This feature gives UEFI FW and out-of-band agents visibility into corrected and UCNA errors while masking those errors from other software layers. The feature has OEM-specific applications. The platform controls when to escalate the error to the OS.
MCA error control allows the firmware to intercept corrected and UCNA (uncorrected no action required) errors before the OS clears the log. Two bits have been added to the MSR (52h) ERROR_CONTROL to enable this feature, which is available in SMM mode only (access outside of SMM causes a #GP fault).
  • Bit 0: CERR_RD_STATUS_IN_SMM_ONLY: When set to 1, a rdmsr to any MCi_STATUS register returns 0 while a corrected error is logged in the register, unless the processor is in SMM mode; Corrected Error: ( )
  • Bit 1: UCNA_RD_STATUS_IN_SMM_ONLY: When set to 1, a rdmsr to any MCi_STATUS register returns 0 while a UCNA error is logged in the register, unless the processor is in SMM mode; UCNA Error: ( , )
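The cloaking behavior of the two control bits can be expressed as a read filter on MCi_STATUS. The sketch below is a software model only. Because the status-bit encodings for "Corrected Error" and "UCNA Error" did not survive in the text above, the classification here follows the general machine check status layout from the Intel 64 and IA-32 Architectures Software Developer's Manual (VAL bit 63, UC bit 61, PCC bit 57, S bit 56, AR bit 55); treat that mapping, and all function names, as assumptions.

```python
MCI_STATUS_VAL = 1 << 63
MCI_STATUS_UC = 1 << 61
MCI_STATUS_PCC = 1 << 57
MCI_STATUS_S = 1 << 56
MCI_STATUS_AR = 1 << 55

CERR_RD_STATUS_IN_SMM_ONLY = 1 << 0
UCNA_RD_STATUS_IN_SMM_ONLY = 1 << 1


def is_corrected(status: int) -> bool:
    """Valid log with UC clear."""
    return bool(status & MCI_STATUS_VAL) and not status & MCI_STATUS_UC


def is_ucna(status: int) -> bool:
    """Valid uncorrected log with PCC, S, and AR all clear."""
    if not (status & MCI_STATUS_VAL and status & MCI_STATUS_UC):
        return False
    return not status & (MCI_STATUS_PCC | MCI_STATUS_S | MCI_STATUS_AR)


def rdmsr_mci_status(status: int, error_control: int, in_smm: bool) -> int:
    """Value a rdmsr of MCi_STATUS would return under the cloaking rules."""
    if in_smm:
        return status  # SMM always sees the real log
    if error_control & CERR_RD_STATUS_IN_SMM_ONLY and is_corrected(status):
        return 0
    if error_control & UCNA_RD_STATUS_IN_SMM_ONLY and is_ucna(status):
        return 0
    return status
```

With both bits set, an OS-level read of a bank holding a corrected or UCNA log returns 0, while more severe logs (for example, with PCC set) remain visible.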
The following table outlines the control bits that are required to enable this feature. Changes in the error reporting (logging and signaling) are also described in this table.
Table 88. MCA Bank Error Control Enabling Registers

| Scope | Register | Description |
|---|---|---|
| Global | ERROR_CONTROL (MSR, bit 4) | CMCI_DISABLE: When set to 1, disables the corrected machine check error interrupt entirely; cleared upon each reset. Typically useful if the OEM offers UEFI-FW based Predicted Failure Analysis (PFA) or a health monitoring capability. |
| Global | MCA_ERROR_CONTROL (MSR, bit 0) | CERR_RD_STATUS_IN_SMM_ONLY: When set to 1, a rdmsr to any MCi_STATUS register returns 0 while a corrected error is logged in the register, unless the processor is in SMM mode. |
| Global | MCA_ERROR_CONTROL (MSR, bit 1) | UCNA_RD_STATUS_IN_SMM_ONLY: When set to 1, a rdmsr to any MCi_STATUS register returns 0 while a UCNA error is logged in the register, unless the processor is in SMM mode. |

Intel UPI Links Corrected Error Control (Cloaking)

| Scope | Register | Description |
|---|---|---|
| Intel UPI links | KTICERRLOGCTRL | Dis_ce_log (bit 0): When set, corrected errors are not logged in the Intel UPI MCA bank status registers (IA32_MCi_STATUS). They are still logged in the BIOS_KTI_ERR_ST registers. CMCI/CSMI is not signaled for corrected errors when DIS_CE_LOG is set, which happens naturally since they are neither logged nor counted. Dis_ucna_log (bit 1): When set, UCNA errors are not logged in the Intel UPI MCA bank status registers (IA32_MCi_STATUS). They are still logged in the BIOS_KTI_ERR_ST registers. CMCI/CSMI is not signaled for UCNA errors when DIS_UCNA_LOG is set. Note: Intel UPI does not log any UCNA cases in this processor, so setting the bit has no effect. |
| Intel UPI links | KTICORERRCNTDIS | Corerrcnt_mask bits: Intel UPI error count disable mask. If a bit is set to 1, it disables incrementing of the Intel UPI MCA bank status register count (IA32_MCi_STATUS.cor_err_cnt and/or BIOS_KTI_ERR_ST.cor_err_cnt) for a given error code. Note: This has no effect on the KTIERRCNT0/1/2_CNTR error counters. |

11.2.17 CSR Error Log Control

DEVHIDE is the mechanism used to hide platform implementation specific CSRs from the OS/SW and to prevent unintended manipulation of the processor configuration. If a custom driver (for example, the EDAC driver) needs access, the platform BIOS must unhide the CSRs. System developers can set the DEVHIDE bit for selected CSR ranges that contain error log information. During boot, these hidden CSR ranges are not enumerated to the OS/SW and are therefore effectively hidden from it. In this processor, the scope of the CSR error log control feature is limited to memory and IIO error logs.
The following table lists the DEVHIDE configuration registers for CSR error log control.
Table 89. DEVHIDE Configuration Registers for CSR Error Log Control

| Type | Register | Description |
|---|---|---|
| CFG | DEVHIDE0 7_CFG | Disable Function |
| CFG | DEVHIDE0 7_1_CFG | Disable Function |

11.2.18 Enhanced SMM Features

Refer to the MSR SMM_MCA_CAP for feature support.
  • Bit 59 - Long and Blocked Flow Indication - If set to 1, indicates that the SMM long and blocked flow indicator is supported.
  • Bit 58 - SMM Code Access Check - If set to 1, indicates that the SMM code access check feature is supported.
  • Bit 57 - SMM CPU Save/Restore - If set to 1, indicates that the on-chip SAVE/RESTORE feature is supported.
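The three feature bits above can be decoded from an SMM_MCA_CAP value with a simple lookup. The sketch assumes the 64-bit MSR value has already been read in SMM and is passed in as an integer; the dictionary keys are hypothetical labels, not register field names.

```python
from typing import Dict

# Bit position -> illustrative feature label (labels are not official names).
SMM_MCA_CAP_FEATURE_BITS = {
    59: "long_and_blocked_flow_indication",
    58: "smm_code_access_check",
    57: "smm_cpu_save_restore",
}


def decode_smm_features(smm_mca_cap: int) -> Dict[str, bool]:
    """Map each enhanced SMM feature label to whether its bit is set."""
    return {
        name: bool((smm_mca_cap >> bit) & 1)
        for bit, name in SMM_MCA_CAP_FEATURE_BITS.items()
    }
```

Firmware would typically check the relevant entry before programming the matching enable field, so that unsupported features are never switched on.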
New feature enhancements with enhanced SMM are as follows. Refer to the Birch Stream Platform BIOS Writer's Guide (BWG) for further details.
  • Long ISA (Instruction Set Architecture) instruction hints to SMM - Can reduce SMM rendezvous loop time by providing long ISA instruction hints to the firmware, which may prevent the SMM from waiting (until time-out) for threads that are executing a long ISA instruction and avoids resource conflicts between the OS and SMM environments. New CSRs (SMM_DELAYED[1-0] and SMM_BLOCKED[1-0]) indicate which logical processors have entered long ISA instructions. SMM_DELAYED0 covers logical processors 0-31. SMM_DELAYED1 covers logical processors 32-35. Long instructions are WBINVD, C6 entry/exit, ratio change/throttle, and patch load. SMM_BLOCKED0 covers logical processors 0-31. SMM_BLOCKED1 covers logical processors 32-35. Blocked instructions are wait-for-SIPI, LT-SENTER sleep state, VMX abort, and error shutdown.
  • In-silicon SMM State Save - Provides an option to save/restore context to/from in-silicon storage, which provides faster access time and immunity from DRAM faults. The new CSR field SMM_FEATURE_CONTROL.SMM_CPU_SAVE_EN enables this feature. When enabled, the following MSRs are available to the processor:
  • SMRAM_CR0, SMRAM_CR3, SMRAM_EFLAGS, SMM_EFER, SMRAM_RIP, SMRAM_DR6, SMRAM_DR7, SMRAM_TR_LDTR, SMRAM_GS_FS, SMRAM_DS_SS, SMRAM_CS_ES, SMRAM_IO_MISC, SMRAM_IO_MEM_ADDR, SMRAM_RDI, SMRAM_RSI, SMRAM_RBP, SMRAM_RSP, SMRAM_RBX, SMRAM_RDX, SMRAM_RCX, SMRAM_RAX, SMRAM_R[15-8], SMRAM_EVENT_CTL_HLT_IO, SMRAM_SMBASE, SMRAM_SMM_REVID, SMRAM_IEDBASE, SMRAM_EPTP_ENABLE, SMRAM_EPTP, SMRAM_LDTR_BASE, SMRAM_IDTR_BASE, SMRAM_GDTR_BASE, SMRAM_CR4, SMRAM_IO_RSI, SMRAM_IO_RCX, SMRAM_IO_RIP, SMM_IO_RDI
  • SMM Security - Provides the ability to detect a potential security issue if the SMM code branches to non-SMM protected memory (outside the SMRR or SMRR2 range). The processor generates an MCE if the SMM handler attempts to execute from an address that does not fall within the System Management Range Registers (SMRR) address range. The new CSR field SMM_FEATURE_CONTROL.SMM_CODE_CHK_EN enables this feature.
  • Second range of SMM protected memory (SMRR2) - Not supported.
  • Spurious SMI Handling - Provides the ability for the BIOS to clear spurious SMIs. A spurious SMI is one where an SMI is signaled but no valid SMI source is logged or identified. To enable this feature, the processor supports two new MSRs. MSR 58h SMM_CFG_OPTIONS[0], when set, clears pending SMI events when a thread exits Wait-for-SIPI. MSR 57h CLEAR_SMI[0], when set, clears the thread's pending SMI.
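For the long ISA instruction hints described earlier in this list, the SMM_DELAYED and SMM_BLOCKED register pairs are bitmaps over logical processors 0-31 and 32-35. The following sketch decodes such a pair and derives which logical processors a rendezvous loop would still wait for. The function names are hypothetical, the register values are assumed to have been read already, and skipping delayed and blocked processors entirely is a simplification of whatever policy the firmware actually applies.

```python
from typing import Iterable, List


def logical_processors(reg0: int, reg1: int) -> List[int]:
    """Decode a register pair: reg0 covers LPs 0-31, reg1 covers LPs 32-35."""
    lps = [lp for lp in range(32) if (reg0 >> lp) & 1]
    lps += [32 + i for i in range(4) if (reg1 >> i) & 1]
    return lps


def processors_to_wait_for(present: Iterable[int],
                           delayed: Iterable[int],
                           blocked: Iterable[int]) -> List[int]:
    """LPs the rendezvous loop should still wait on (simplified policy)."""
    return sorted(set(present) - set(delayed) - set(blocked))
```

For instance, with logical processors 0-3 present, processor 1 delayed, and processor 3 blocked, the loop would wait only for processors 0 and 2 instead of running into the time-out.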

11.2.19 Processor Error Signaling Pins

There are two distinct error reporting domains:
  • MCA for core, Intel UPI, CHA, iMC, and IIO (IOMCA mode)
  • AER for IIO (for example, PCIe)
MCA domain errors are signaled to remote sockets via the CATERR_N pin for unrecoverable/fatal errors and via the RMCA_N pin for recoverable errors.
AER uses three pins, ERROR_N[2:0]:
ERROR_N[0]: Correctable errors and advisory (nonfatal) errors
ERROR_N[1]: Uncorrected but recoverable errors (nonfatal)
ERROR_N[2]: Uncorrected fatal errors
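The pin assignment above amounts to a small lookup from error domain and severity to a pin name. The sketch below encodes that mapping; the enum and function names are hypothetical and exist only to make the table executable.

```python
from enum import Enum


class AerSeverity(Enum):
    CORRECTABLE = "correctable"
    ADVISORY_NONFATAL = "advisory_nonfatal"
    UNCORRECTED_RECOVERABLE = "uncorrected_recoverable"
    UNCORRECTED_FATAL = "uncorrected_fatal"


# AER domain: severity -> index into ERROR_N[2:0].
_ERROR_N_INDEX = {
    AerSeverity.CORRECTABLE: 0,
    AerSeverity.ADVISORY_NONFATAL: 0,
    AerSeverity.UNCORRECTED_RECOVERABLE: 1,
    AerSeverity.UNCORRECTED_FATAL: 2,
}


def aer_pin(severity: AerSeverity) -> str:
    """Pin used to signal an AER-domain error of the given severity."""
    return f"ERROR_N[{_ERROR_N_INDEX[severity]}]"


def mca_pin(recoverable: bool) -> str:
    """Pin used to signal an MCA-domain error to remote sockets."""
    return "RMCA_N" if recoverable else "CATERR_N"
```

A platform monitor could use the same mapping in reverse to interpret which class of error an asserted pin represents.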

11.2.20 S3M Error Reporting

This section covers a high level overview of the logging and signaling of possible S3M errors during runtime (S0 state).
本节涵盖了在运行时(S0 状态)记录和信令可能的 S3M 错误的高级概述。
S3M broadly has two RAS domains with different error signaling. The first domain would be the MCA reporting path (top half of the diagram below) and the second domain would be the IIO reporting path (bottom half of the diagram below).
S3M 广泛地具有两个具有不同错误信令的 RAS 领域。第一个领域将是 MCA 报告路径(下面图表的上半部分),第二个领域将是 IIO 报告路径(下面图表的下半部分)。
For the first domain, the MCA error logged can be due to parity errors from sideband links/bridges, crypto subsystem errors, microcontroller errors, or firmware errors, all of which are highlighted below. The only errors that are signaled outside of S3M within this domain are fatal errors ). The error signature shown below will be the same for any fatal error within this domain. Specific error details are logged in the S3M_HW_ERR_FATAL_STATUS and S3M_FW_ERR_FATAL_STATUS registers.
对于第一个领域,MCA 错误日志可能是由于侧带链接/桥接口的奇偶校验错误、加密子系统错误、微控制器错误或固件错误引起的,所有这些错误都在下面进行了突出显示。在此领域之外信令的唯一错误是致命错误 )。在此领域内的任何致命错误的错误签名如下所示。具体的错误详细信息记录在 S3M_HW_ERR_FATAL_STATUS 和 S3M_FW_ERR_FATAL_STATUS 寄存器中。
For the second domain, error messages sent to the sIEH are the default and preferred signaling path for errors. This domain exists due to the set of integrated PCI devices (highlighted) that were once a part of the PCH. The 6 PCI devices shown send DO_SERR messages in the event of a system error and the primary to sideband (P2SB) bridge can also send UNCORRECTABLE_AER messages in the event of unsupported requests or unclaimed transactions targeting S3M. The sIEH strapped to S3M will log that S3M signaled a system error and the details about the unsupported request in LERRUNCSTSO and LERR* 1 registers.
对于第二个域,发送到 sIEH 的错误消息是错误的默认和首选信令路径。这个域存在是因为一组集成的 PCI 设备(突出显示)曾经是 PCH 的一部分。所示的 6 个 PCI 设备在系统错误发生时发送 DO_SERR 消息,主辅带(P2SB)桥也可以在发生不支持的请求或未声明的事务针对 S3M 时发送 UNCORRECTABLE_AER 消息。连接到 S3M 的 sIEH 将记录 S3M 发出的系统错误以及 LERRUNCSTSO 和 LERR* 1 寄存器中有关不支持请求的详细信息。
All registers mentioned (including registers for the 6 PCI devices) are documented in this processor's register specification.
Figure 33. S3M Runtime Error Reporting Overview

11.3 CPU Core and Uncore RAS Features

This section describes CPU core and uncore RAS features as listed in the following table. All registers are described in more detail in the Granite Rapids Processor Registers Specification.
Table 90. CPU Core and Uncore RAS Features
RAS Feature (Note 1) | Description | Standard RAS SKU (Note 2) | Advanced RAS SKU
Corrupt Data Containment - Uncore | Corrupt Data Containment is a process of routing Uncorrected Data Errors (UCE) synchronous to the transaction, thus enhancing the containment of the fault and improving the reliability of the system. This feature covers the UCEs detected within uncore modules such as M2M, IMC, and CHA. | Yes | Yes
Corrupt Data Containment - Core | Corrupt Data Containment is a process of routing Uncorrected Data Errors (UCE) synchronous to the transaction, thus enhancing the containment of the fault and improving the reliability of the system. This feature covers the UCEs detected and reported within core modules such as MLC. | Yes | Yes
PCI Express Corrupt Data Containment (Data Poisoning) | Corrupt Data Containment is a process of routing Uncorrected Data Errors (UCE) synchronous to the transaction, thus enhancing the containment of the fault and improving the reliability of the system. This feature covers UCEs detected within the IIO module, including PCI Express interfaces. More specifically, it allows detection of the poison bit in inbound transactions and reporting of the error as 'advisory non-fatal'. The poisoned packet is sent to the appropriate sub-module within the processor. | Yes | Yes
Viral Mode of Error Containment | Viral Mode offers error containment above and beyond the containment provided by the "Corrupt Data Containment" feature. In case of a fatal error, a Viral alert indicator is propagated to the Intel UPI and PCIe interfaces. The Intel UPI packet header sets the Viral bit, thus propagating viral status to all other attached CPUs. The PCIe interface enters the viral state, and all transactions to and from PCIe devices are blocked/dropped. It accomplishes this by dropping all outbound posted transactions and responses to inbound requests; it also aborts all non-posted requests. | No | Yes
DCU Scrubbing | DCU Scrubbing improves system uptime by minimizing the impact of high energy particle strikes (Soft Errors) within the core Data Cache Unit (DCU). The DCU is parity protected; however, if a cache-line is in "M" state and a parity error is detected, it is escalated as a fatal error. DCU Scrubbing writes back cache-lines in 'M' state to the Mid Level Cache (MLC), leaving a copy in the DCU in 'E' state, thus minimizing the probability of a fatal error. | Yes |
Time-out Timer Schemes | Time-out timers within various sub-modules are used to report failures as close as possible to the source of the fault, e.g., core retirement watchdog timer, CBO-TOR time-out, and PCI Express Completion Time-Out (CTO). | Yes | Yes
Processor BIST | During power-on initialization, the BIST engine located within the processor checks the MLC and LLC caches and, if an error is detected, reports the error in an uncore CSR called BIST_RESULS. This allows the BIOS to check the health of the MLC and LLC prior to moving to the next step in the boot process. | Yes | Yes

NOTES:

  1. RAS features may not be supported on all SKUs of a processor type.
  2. Socket Workstation RAS follows Standard RAS SKU.

11.3.1 Corrupt Data Containment

Corrupt data containment (also known as data poisoning or CDC) is a process of routing uncorrected data error information synchronous to the transaction thus enhancing the fault containment and improving the reliability of the system. The processor incorporates corrupt data containment capability within the core, uncore, and IIO submodules.

11.3.2 Corrupt Data Containment - Core

This section covers CDC capability within the core.
This feature allows recovery of a system when a HW uncorrected error is detected within the memory or MLC/LLC caches, thus improving the system reliability. It works in conjunction with the feature described in Corrupt Data Containment - Uncore on page 255. If the receiver of corrupted data is a core (for example, an application fetching data from memory), then the data is discarded and the core may trigger either a fatal MCERR or a recoverable MCERR (SRAR event), the latter allowing the SW/OS/VMM layers to attempt to recover the system.
There is no separate configuration register available for enabling this feature besides the one described in Table 91 on page 255. This feature is enabled by the processor hardware internally depending upon the processor SKU and the setting of the IA32_MCG_CONTAIN[0] bit. Once UEFI-FW enables this, features such as MCA recovery - execution path, MCA recovery - non-execution path, and local MCE based recovery are enabled by the processor HW. Refer to System Level RAS Features on page 298 for further details.

11.3.3 Corrupt Data Containment - Uncore

This section covers CDC capability within the uncore and IRPRING.
Once an uncorrected error is detected, the detector sets a "poison" bit which becomes part of the data payload. At the receiving end, the receiver does not consume the corrupted data and triggers an MCERR event. Depending upon the processor SKU, this MCERR event may be a recoverable event or a fatal event requiring a system reset. In the absence of this feature, the system is exposed to data contamination during the time window between the detection of an uncorrected error and the resetting of the system after triggering MCERR.
There is a single register bit used to enable/disable the CDC - Uncore feature. Refer to the following table for configuration register details. Once this feature is enabled, error reporting via the MCA banks changes within the IMC, M2M, CHA, IIO, and Intel UPI sub-modules.
Table 91. Corrupt Data Containment - Uncore Configuration Register
Scope | Register | Description
Global | IA32_MCG_CONTAIN (MSR 0x178) | POISON_ENABLE, bit 0. Setting this bit enables data poisoning capability within the uncore (CHA, M2M, IMC, Intel UPI, and IRPRING). Note: This bit does not configure data poisoning within the PCIe ports. See Table 94 on page 260.
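As a minimal sketch of the bit manipulation implied by Table 91, the Python below sets, clears, and tests bit 0 (POISON_ENABLE) of an IA32_MCG_CONTAIN value. The helper names are hypothetical and the code works on a plain integer; how firmware actually reads and writes the MSR is outside the scope of this sketch.

```python
POISON_ENABLE_BIT = 0          # bit position taken from Table 91
MSR_MASK = (1 << 64) - 1       # MSR values are 64 bits wide


def set_poison_enable(mcg_contain: int, enable: bool) -> int:
    """Return a copy of the register value with POISON_ENABLE set or cleared."""
    if enable:
        return (mcg_contain | (1 << POISON_ENABLE_BIT)) & MSR_MASK
    return (mcg_contain & ~(1 << POISON_ENABLE_BIT)) & MSR_MASK


def poison_enabled(mcg_contain: int) -> bool:
    """Report whether uncore data poisoning is enabled in the given value."""
    return bool((mcg_contain >> POISON_ENABLE_BIT) & 1)
```

All other bits of the value are passed through unchanged, so the helpers can be used in a read-modify-write sequence.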

11.3.3.1 DCU/IFU Uncorrected Error Reporting With Corrupt Data Containment Enabled

The processor supports the CDC feature within the DCU/IFU, and uncorrected data errors may be identified as recoverable by the software (SRAR type of UCR).
Within the processor, the "error containment" bit is carried all the way to the DCU/IFU. This allows isolation of the corrupted data (that is, attached to the affected "load" receiving corrupted data) and assists in potential software recovery. The following is a high-level description of the sequence of events:
  • IFU/DCU error logging. When the IFU/DCU logs the error that results from poisoned data, it logs the error with status flags indicating that the error may be recoverable. In addition, it captures the physical address of the error and sets the corresponding valid bits (ADDRV and MISCV are both 1 in the following table). SW will look for an exact error signature to determine whether the error is recoverable or not. An example of an error signature for the IFU/DCU poisoned data errors is shown in the following table.
Table 92. DCU SRAR Type of UCR Error Log
Register | Bit | Definition | Value
IA32_MCi_STATUS | 63 | VAL | 1
IA32_MCi_STATUS | 62 | OVER | 0
IA32_MCi_STATUS | 61 | UC | 1
IA32_MCi_STATUS | 60 | EN | 1
IA32_MCi_STATUS | 59 | MISCV | 1
IA32_MCi_STATUS | 58 | ADDRV | 1
IA32_MCi_STATUS | 57 | PCC | 0
IA32_MCi_STATUS | 56 | S | 1
IA32_MCi_STATUS | 55 | AR | 1
IA32_MCi_STATUS |  | Corrected error status | See Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | Corrected error count | See Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | Other info | See Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | MSCOD | See Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | MCACOD | 0x0134 (data read error)
IA32_MCi_ADDR |  | Upper address |
IA32_MCi_ADDR | LSB | Lower address bits | Physical address
IA32_MCi_ADDR |  | Lower address bits |
IA32_MCi_MISC |  | Model specific | See Granite Rapids Processor Registers Specification for further details.
IA32_MCi_MISC |  | Address mode | b010
IA32_MCi_MISC |  | LSB | b000110
IA32_MCG_STATUS (Errored Thread) | 1 | EIPV | 1
IA32_MCG_STATUS (Errored Thread) | 0 | RIPV | 0
IA32_MCG_STATUS (Other Thread) | 1 | EIPV | 0
IA32_MCG_STATUS (Other Thread) | 0 | RIPV | 1
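The signature match that software performs against Table 92 can be sketched as follows. This is an illustrative Python model, not handler code: the flag bit positions and the MCACOD value come from Table 92, the placement of MCACOD in bits 15:0 follows the architectural IA32_MCi_STATUS layout, and the function names are hypothetical.

```python
# Bit positions of the IA32_MCi_STATUS flags listed in Table 92.
STATUS_BITS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60, "MISCV": 59,
               "ADDRV": 58, "PCC": 57, "S": 56, "AR": 55}

# Flag values of the DCU SRAR signature in Table 92.
DCU_SRAR_FLAGS = {"VAL": 1, "OVER": 0, "UC": 1, "EN": 1, "MISCV": 1,
                  "ADDRV": 1, "PCC": 0, "S": 1, "AR": 1}
DCU_SRAR_MCACOD = 0x0134  # data read error


def decode_flags(status: int) -> dict:
    """Extract the single-bit flags from a raw status value."""
    return {name: (status >> bit) & 1 for name, bit in STATUS_BITS.items()}


def build_status(flags: dict, mcacod: int) -> int:
    """Compose a raw status value from flags and MCACOD (for testing)."""
    value = mcacod & 0xFFFF  # MCACOD occupies bits 15:0
    for name, bit in STATUS_BITS.items():
        value |= (flags.get(name, 0) & 1) << bit
    return value


def is_dcu_srar(status: int) -> bool:
    """True only if every flag and MCACOD match the Table 92 signature."""
    return (decode_flags(status) == DCU_SRAR_FLAGS
            and (status & 0xFFFF) == DCU_SRAR_MCACOD)
```

The match is exact on purpose: a status value with PCC set, for example, does not qualify as recoverable in this model.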

11.3.3.2 MLC Uncorrected Error Reporting With CDC Enabled

The processor supports the CDC feature within the MLC and allows certain types of uncorrected data errors to be reported as an uncorrected recoverable (UCNA type of UCR) error instead of as a fatal error, enabling an advanced feature called MCA Recovery Execution Path.
When the corrupt data containment feature is enabled, the CHA/LLC behavior with respect to the logging and signaling of uncorrected recoverable data errors changes. There are two main cases with such a change:
  • Case 1 - Uncorrected error within the MLC data array:
  • In this case, if the MLC finds a UCR type error on a lookup to the data array, then this is considered a non-critical error. When such events occur, the MLC attaches an "error containment" bit synchronous to the data and sends it to the requester. Since the MLC is the source of the UCR error, the MLC also logs the error in its machine check bank as a UCNA type of UCR error and signals CMCI if it is enabled (IA32_MCi_CTL2[30]=1). If CMCI is not enabled, this event is not signaled. In either case, the MLC does not signal MCERR.
  • Case 2 - Data with "error containment" bit set arrives from higher up in the cache hierarchy (for example, LLC):
  • In this case, the MLC is not the source of the error; it simply requested data and received data with the "error containment" bit set. The MLC will neither log nor signal the error (the error should have been logged and the CMCI signaled by the source - CHA or iMC for example). It will simply forward the data to its destination with the "error containment" bit set. When writing this corrupted data into the MLC, the MLC will flip the ECC bits to create an uncorrectable error. Note that since the data interface is not at a cacheline granularity, the "error containment" bit may be set for one or more chunks of the data for that specific cacheline. In either case, the whole cacheline will be stored as corrupted data when written into the array.

NOTE

The MLC will ignore the "error containment" bit from other sources if it is not configured in Corrupt Data Containment mode.
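The two MLC cases and the NOTE above can be summarized as a small decision function. This is a behavioral sketch only, with hypothetical names; it assumes Case 1 applies only when CDC is enabled, and it raises an error for the combination the text does not describe (a local uncorrected error with CDC disabled).

```python
from dataclasses import dataclass


@dataclass
class MlcAction:
    forward_poison: bool  # data forwarded with the "error containment" bit set
    log_ucna: bool        # error logged in the MLC bank as UCNA type of UCR
    signal_cmci: bool     # CMCI signaled (requires IA32_MCi_CTL2[30]=1)
    signal_mcerr: bool    # never set in the two cases described


def mlc_handle(local_uce: bool, incoming_poison: bool,
               cdc_enabled: bool, cmci_enabled: bool) -> MlcAction:
    if local_uce:
        if not cdc_enabled:
            raise ValueError("behavior with CDC disabled is not described here")
        # Case 1: the MLC data array is the source of the error.
        return MlcAction(True, True, cmci_enabled, False)
    if incoming_poison and cdc_enabled:
        # Case 2: forward only; the source has already logged and signaled.
        return MlcAction(True, False, False, False)
    # No error, or incoming poison ignored outside CDC mode (per the NOTE).
    return MlcAction(False, False, False, False)
```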

11.3.3.3 B2CMI/IMC Uncorrected Error Reporting With CDC Enabled (Normal Memory Read)

The B2CMI/IMC uncorrected error reporting varies based upon the processor SKU when the CDC feature is enabled. During a normal memory read access, when an uncorrected data error is detected, the B2CMI/IMC modules do not signal an MCERR. Instead, the IMC logs the error with IA32_MCi_STATUS.PCC and IA32_MCi_STATUS.UC=1 (with an appropriate MCACOD field). The address is also reported in the IA32_MCi_ADDR register. The uncorrected (erroneous) data is sent to its destination with the "error containment" bit set. The "error containment" bit is defined such that it stays synchronous with the data.
During a normal write access, if the IMC module receives write data with the "error containment" bit set, then the IMC will store the data in the DRAM with the poison indication.

NOTE

As a general rule, detection of "poisoned data" will not result in an additional error log as the detector of the uncorrectable data logged the error and poisoned the line.
During a subsequent read operation, when the IMC module detects the "error containment signature", the IMC module forwards the data to the B2CMI along with the "error containment" bit. This is an indication to the B2CMI that the data was stored as corrupted data and therefore B2CMI will neither signal nor log an error. The reason is that since this data was stored as corrupted data, the B2CMI/IMC module is not the source of the error and as such, the error should have already been logged at the error source.
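The "log once at the source" rule described in this section can be illustrated with a toy Python model. The class and its methods are hypothetical; the model only tracks whether a line carries a poison indication and which addresses this IMC logged as the error source.

```python
class ImcModel:
    """Toy model of poison storage and single-point error logging."""

    def __init__(self):
        self.lines = {}      # address -> (data, poisoned)
        self.error_log = []  # addresses logged by this IMC as error source

    def write(self, addr, data, poison=False):
        # Write data that arrives with the "error containment" bit set is
        # stored with the poison indication; nothing is logged here.
        self.lines[addr] = (data, poison)

    def read(self, addr, fresh_uce=False):
        data, poisoned = self.lines[addr]
        if fresh_uce and not poisoned:
            # The IMC is the detector: log the error and poison the data
            # returned to the requester. No MCERR is signaled.
            self.error_log.append(addr)
            return data, True
        # Previously poisoned data is forwarded without a new log entry.
        return data, poisoned
```

A read of a line that was stored as poisoned returns the poison indication but leaves `error_log` untouched, which mirrors the B2CMI/IMC behavior described above.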
As described earlier, the uncorrected error reporting methodology depends on whether the CDC feature is enabled. Similarly, the IMC patrol scrub is another feature whose behavior depends on whether CDC is enabled or disabled. The following section describes the IMC's patrol scrubbing feature in more detail.

11.3.3.4 IMC Uncorrected Error Reporting With CDC Enabled (Patrol Scrubbing)

When the CDC feature is enabled and an uncorrected error is detected during a patrol scrubbing operation, the error is treated as a separate class of error, defined as an uncorrected no action required (UCNA) type of UCR error. The OS/software action remains the same for error types that previously signaled SRAO and now signal UCNA.
Table 93. IMC UCNA Type of UCR Error Log during Patrol Scrubbing
Register | Bit | Definition | Value
IA32_MCi_STATUS | 63 | VAL | 1
IA32_MCi_STATUS | 62 | OVER | 0
IA32_MCi_STATUS | 61 | UC | 1
IA32_MCi_STATUS | 60 | EN | 1
IA32_MCi_STATUS | 59 | MISCV | 1
IA32_MCi_STATUS | 58 | ADDRV | 1
IA32_MCi_STATUS | 57 | PCC | 0
IA32_MCi_STATUS | 56 | S | 0
IA32_MCi_STATUS | 55 | AR | 0
IA32_MCi_STATUS |  | Corrected error status | See the Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | Corrected error count | See the Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | Other Info | Refer to Granite Rapids Processor Registers Specification for further details.
IA32_MCi_STATUS |  | MSCOD |
IA32_MCi_STATUS |  | MCACOD | b0000 0000 1100 CCCC, where CCCC is the channel number.
IA32_MCi_ADDR |  | Upper address |
IA32_MCi_ADDR |  | Lower address bits | Physical address
IA32_MCi_ADDR |  | Lower address bits |
IA32_MCi_MISC |  | Model specific | See the Granite Rapids Processor Registers Specification for further details.
IA32_MCi_MISC |  | Address mode | b010
IA32_MCi_MISC |  | REC_ERR_LSB | b001100
IA32_MCG_STATUS (All Threads) | 1 | EIPV | 0
IA32_MCG_STATUS (All Threads) | 0 | RIPV | 1
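The patrol scrub log can be decoded with a few bit operations. The sketch below is illustrative Python with hypothetical function names. The S/AR combinations for UCNA (0, 0) and SRAR (1, 1) match the error log tables in this section; the SRAO combination (1, 0) and the placement of MCACOD in bits 15:0 follow the architectural machine check definitions rather than this section.

```python
def classify_ucr(status: int) -> str:
    """Classify an uncorrected error from the UC, PCC, S, and AR flags."""
    uc = (status >> 61) & 1
    pcc = (status >> 57) & 1
    s = (status >> 56) & 1
    ar = (status >> 55) & 1
    if not uc:
        return "not-uncorrected"
    if pcc:
        return "fatal"
    return {(0, 0): "UCNA", (1, 0): "SRAO", (1, 1): "SRAR"}.get((s, ar), "unknown")


def patrol_scrub_channel(status: int):
    """Return the channel number if MCACOD matches b0000 0000 1100 CCCC."""
    mcacod = status & 0xFFFF
    if (mcacod & 0xFFF0) == 0x00C0:
        return mcacod & 0xF
    return None
```

For example, a status value with VAL, UC, EN, MISCV, and ADDRV set and MCACOD 0x00C3 classifies as UCNA on channel 3.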
In summary, the following are the possible scenarios in which the IMC poisons the cacheline:
  • The IMC detects UCE within its own pipelines during write operations.
  • The IMC detects UCE data during demand, patrol, or sparing read operations.
  • B2CMI for mirroring operation

11.3.3.5 Intel UPI Uncorrected Error Reporting with CDC Enabled

When the CDC is enabled, the Intel UPI is merely a conduit for the poisoned indication. It simply passes the FLIT to the destination with the poison bit set without signaling any error. The intent is to allow the corrupt data to flow through the interface all the way to its destination and then allow the recipient of the corrupt data to make the appropriate decision. By default, CDC is disabled.

11.3.4 PCIe* CDC

This feature is part of the PCI Express Specification and allows forwarding of the packets with error information (synchronous error reporting). The processor implements this feature as per the PCI Express Specification. As per the specification, it is an optional capability; therefore, the hardware default is to leave it disabled, and it is the responsibility of SW/FW to enable this capability during the system initialization phase. Error reporting is incorporated as part of PCI Express AER logic.
This feature is primarily used to maintain data integrity in both directions at the transaction level by attaching an EP bit to the header any time an uncorrected error is detected prior to forwarding the packet to the next agent. The receiver detects the poisoned TLP and reroutes the error event as an advisory non-fatal (a corrected error event) rather than signaling it as an uncorrected error that would have otherwise resulted in a system reset. Since the system continues to operate, it improves the system uptime or reliability. The following is the action taken by all the participating agents:
Transmitting agent: The agent detecting an uncorrectable data error will set the EP bit in the header, indicating that the data is poisoned, thus maintaining the data integrity.
Intermediate forwarding agents: These agents will forward the data and will not report any error. In Intel Xeon processor based servers, the IIO module can be configured as an intermediate agent for inbound transactions, thereby forwarding the poisoned data on the ring to the destination, which could be main memory or another PCI Express port.
Receiving agent: Upon detecting the EP bit in the header, the receiver may decide to replay the affected transaction or simply signal a fatal event, thus resetting the system. For inbound transactions destined to main memory, the IIO module can be configured to forward the poison information, allowing the memory controller to store the poisoned data in the main memory. Further handling of such poisoned data is described in Corrupt Data Containment - Uncore on page 255.
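A receiving agent's first step, detecting the EP bit, can be sketched in Python as below. The bit position (bit 6 of byte 2 of the first header DW) follows the TLP header layout in the PCI Express Base Specification; the function names and the returned action strings are hypothetical and only mirror the options described above.

```python
def tlp_is_poisoned(header: bytes) -> bool:
    """Check the EP bit in the first DW of a TLP header (byte 2, bit 6)."""
    if len(header) < 4:
        raise ValueError("need at least the first header DW")
    return bool(header[2] & 0x40)


def receiver_action(header: bytes, poison_as_fatal: bool) -> str:
    """Choose how a receiver treats the packet, per its configuration."""
    if not tlp_is_poisoned(header):
        return "consume"
    if poison_as_fatal:
        return "signal-fatal"
    return "report-advisory-non-fatal"
```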
The PCIe* CDC scope includes both inbound data poison detection (Rx PTL) and outbound data poison setting (Tx PTL).
Enabling Rx PTL relies on several configuration parameters (see Table 94 on page 260 for various configuration and status reporting options):
  1. Configuration of the rest of the system in legacy IA-32 MCA versus corrupt data containment mode
  2. Configuring the inbound Poisoned Transport Layer (PTL) packet handling as fatal or an advisory non-fatal event thus allowing the SW error handler to take appropriate action, for example, replay the transaction
  3. Enabling/disabling of logging and signaling of the PTL error at the IIO root port
Enabling of Tx PTL relies on several register settings and configuration parameters:
  1. Setting of POISOCFEN (bit 36 in IIOMISCCTRL register). This bit enables setting of the poison bit in outbound completion forwarding if the corrupted data is received by the IRP.
  2. Clearing of disable_ob_parity_check (bit 12 in MISCCTRLSTS register). Clearing this bit allows setting of the poison bit in an outbound transaction if the parity error is detected within the outbound buffers.
  3. Enabling of PCIe* EDPC feature. See the PCIe* / CXL.io Enhanced Downstream Port Containment (EDPC) on page 290 for further details.
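The Tx PTL settings listed above can be checked with a short Python sketch. The bit positions come from the list; the function name, the dictionary keys, and the choice to treat Tx PTL as fully enabled only when all three conditions hold are assumptions made for illustration. EDPC state is passed in as a flag because its registers are described elsewhere.

```python
POISOCFEN_BIT = 36                 # IIOMISCCTRL, item 1
DISABLE_OB_PARITY_CHECK_BIT = 12   # MISCCTRLSTS, item 2


def tx_ptl_status(iiomiscctrl: int, miscctrlsts: int, edpc_enabled: bool) -> dict:
    """Summarize which Tx PTL conditions are met for the given register values."""
    completion_poison = bool((iiomiscctrl >> POISOCFEN_BIT) & 1)
    # The parity check is active when the disable bit is cleared.
    outbound_parity_poison = not bool((miscctrlsts >> DISABLE_OB_PARITY_CHECK_BIT) & 1)
    return {
        "completion_poison": completion_poison,
        "outbound_parity_poison": outbound_parity_poison,
        "edpc": bool(edpc_enabled),
        "all_enabled": completion_poison and outbound_parity_poison and bool(edpc_enabled),
    }
```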
The following table lists various configuration and status reporting options.
Table 94. PCIe* CDC Configuration and Status Registers
Register settings are listed in this column order: ENABLE_IO_MCA (NCEVENTS_CR_UBOXERRCTL2_CFG[22]), POISON_ENABLE (MSR 0x178[0]), POISFEN (IIOMISCCTRL[37]), PTLPE (UNCERRMSK[12]), PTLPES (UNCERRSEV[12]), ANFEM (CORERRMSK[13]).
Configuration Options | Register Settings | Rx PTLP Error Reporting
1. IA32-Legacy MCA. Rx PTLP as fatal event. | 0 0 0 0 1 | Logging: Uncorrected error logged in UNCERRSTS[12]. Header is logged. Signaling: Fatal event, either via MSI, SMI/NMI, or ERROR_N[2] pin.
2. CDC mode enabled. IIO forwards poison but does not report any event. | 1 1 1 | Logging: Disabled. Signaling: Disabled.
3. CDC mode enabled. IIO forwards poison and logs as an advisory non-fatal event. Header logging enabled. Applicable when SW based error handler is available or BMC based reporting is enabled. | 0 1 1 0 0 0 | Logging: Advisory non-fatal error logged in CORRERRSTS[13]. Uncorrected error logged in UNCERRSTS[12] and header is logged. Signaling: Corrected event, either via MSI or ERROR_N pin.
4. CDC mode enabled. IIO forwards poison and logs as an uncorrected fatal event. Header logging enabled. Applicable when FFM is used with notification via the SMI. | 0 1 1 0 1 0 | Logging: Uncorrected error logged in UNCERRSTS[12]. Header is logged. Signaling: Fatal event, either via SMI or ERROR_N pin.
5. IA32-Legacy MCA. Rx PTLP as fatal event. | 1 0 0 0 1 | Logging: Uncorrected error logged in IIO MCA bank and UNCERRSTS[12]. Header is logged. Signaling: Fatal MCERR or ERROR_N[2] pin. If EMCA2 mode is enabled, then MCERR will be morphed to MSMI.
6. CDC mode enabled. IIO forwards poison and logs as an uncorrected fatal event. Header logging enabled. Applicable when FFM is used with notification via the SMI. | 1 1 1 0 1 0 | Logging: Uncorrected error logged in IIO MCA bank and UNCERRSTS[12]. Header is logged. Signaling: Fatal MCERR or ERROR_N[2] pin. If EMCA2 mode is enabled, then MCERR will be morphed to MSMI.

11.3.5 Viral Mode of Error Containment

This feature further enhances the scope of fault containment provided by features such as Corrupt Data Containment (CDC). While the scope of CDC is to contain the data faults detected within the memory and caches such as LLC/MLC, Viral Mode extends the coverage of containment when other types of uncorrected fatal faults are detected. It also provides enhanced containment in multi-socket systems and minimizes propagation of corrupted data outside of the processor socket.
In the absence of this feature, the system is exposed to potential data contamination during the time window between the detection of an uncorrected fatal error (other than the ones covered by the CDC feature) and the resetting of the system.
Viral Mode provides enhanced error containment in case of fatal errors, and the Viral condition is propagated to remote sockets via the UPI protocol. It is important to note that errors that result in a viral condition are not recoverable or continuable. A system reset is required. The viral condition is cleared on reset. However, error logs in the machine check banks are preserved across warm reset. Transactions may continue to flow within the Intel UPI fabric and to main memory. However, because IIO aborts transactions to PCIe devices, the ability to display messages to the screen or write to disk is impacted because these require access to PCIe devices. The expectation is that a system reset must be performed when a fatal error/viral condition is detected.
The following generic rules apply when Viral Mode is enabled:
  1. All uncorrected fatal errors trigger a Viral event.
  2. IIO AER Severity 2 errors trigger Viral. In IOMCA mode, a PCC condition can trigger Viral.
  3. When the Viral event is asserted for any reason, IIO and Intel UPI will capture and hold the viral status.
Triggering of the Viral event upon detection of a fatal error within the core, PCU, Intel UPI, IIO sub-system, and memory sub-system is described next.

11.3.5.1 Fatal Error Detected Within the Cores

  1. Core sends an IERR message to UBOX.
  2. PCU asserts the CATERR_N pin low and keeps it low until system reset.
  3. UBOX also sends an IERR message to IIO and Intel UPI, which assert the viral alert, thus containing the potentially corrupted data.

11.3.5.2 External Fatal Events Detected by PCU

  1. PCU sends an MCERR message with IERR-semantics to UBOX.
  2. The UBOX broadcasts fatal and viral messages internally to all modules.

11.3.5.3 Fatal Error Detected Within Intel UPI

When an Intel UPI link observes an internal fatal error or detects an incoming viral packet, the agent enters the viral state.
When an Intel UPI agent enters the viral state, it propagates the viral state to other sockets by setting the viral alert bit in the Intel UPI packet header, and it continues to send the viral alert until the viral state is cleared by error handling FW or the system is reset.
Intel UPI viral state is indicated by KTIVIRAL[31]: kti_viral_state. This bit is set when:
  1. Intel UPI agent observes an internal fatal error.
  2. Intel UPI agent receives an incoming viral packet from Intel UPI ports.
  3. A socket fatal or viral condition is observed by the Intel UPI agent.

NOTE 注意

Incoming viral packets on Intel UPI link do not cause assertion of CATERR_N pin, nor do they cause MCERR signaling, on the receiving socket. The CATERR_N hold indication is cleared on warm-reset. Intel UPI links only serve as a medium to broadcast the Viral state from one socket to another; traffic on the Intel UPI link is not restricted.
Intel UPI 链路上的传入病毒数据包不会导致 CATERR_N 引脚断言,也不会导致接收插座上的 MCERR 信号。CATERR_N 保持指示在热复位时被清除。Intel UPI 链路仅用作从一个插座向另一个插座广播 Viral 状态的媒介;Intel UPI 链路上的流量不受限制。
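As a minimal sketch of how error handling software might interpret the status bit described above, the Python snippet below tests bit 31 of a raw KTIVIRAL value. The function name and the sample register values are hypothetical, and how the register is actually read (CSR access method, address) is outside the scope of this sketch.

```python
KTI_VIRAL_STATE_BIT = 31  # KTIVIRAL[31]: kti_viral_state, per the text above


def kti_viral_state(ktiviral):
    """Return True when bit 31 is set in a raw 32-bit KTIVIRAL value.

    The value is assumed to have been read elsewhere; this helper only
    decodes the bit position named in the text.
    """
    return bool((ktiviral >> KTI_VIRAL_STATE_BIT) & 1)


# Hypothetical sample values for two links.
samples = {"link0": 0x80000000, "link1": 0x00000000}
viral_links = [name for name, value in samples.items() if kti_viral_state(value)]
```

Here `viral_links` would contain only the links whose status bit is set, which is the information error handling FW needs before clearing the viral state.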

11.3.5.4 Fatal Error Detected Within IIO Sub-System

When IIO observes an internal fatal error indication, the agent enters the viral state and signals the other agents within the socket. IIO aborts traffic to PCIe*.
The IIO viral state is indicated by VIRAL_CFG[31]: IIO_VIRAL_STATE. This bit is set when:
  1. IIO detects an internal fatal error (Severity 2).
  2. A socket fatal or viral condition is observed.
While IIO is in the viral state (VIRAL_CFG[31] is set), transactions are handled as follows:
  1. All outbound requests are master-aborted (including all requests in the transaction layer queues).
  2. All outbound completion packets are converted to Completer Abort completions.
  3. All inbound requests are Completer Aborted (posted requests are dropped).
  4. Transactions from the Coherent Interface targeting internal registers (configuration space, memory-mapped registers, and so on) are completed as normal.
  5. Packets that contain the error condition that caused the viral state are aborted according to these rules.
  6. When the IIO enters viral mode, it can drop legacy interrupts.
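The handling rules above can be summarized as a lookup table. The following Python sketch is a behavioral model only, not the hardware implementation; the enum names and the `iio_action` helper are hypothetical and exist purely for illustration.

```python
from enum import Enum


class Txn(Enum):
    """Transaction classes named in the rules above (labels are illustrative)."""
    OUTBOUND_REQUEST = 1
    OUTBOUND_COMPLETION = 2
    INBOUND_NON_POSTED = 3
    INBOUND_POSTED = 4
    COHERENT_TO_INTERNAL_REG = 5


# Outcome of each transaction class while VIRAL_CFG[31] is set.
VIRAL_ACTION = {
    Txn.OUTBOUND_REQUEST: "master abort",
    Txn.OUTBOUND_COMPLETION: "convert to Completer Abort completion",
    Txn.INBOUND_NON_POSTED: "completer abort",
    Txn.INBOUND_POSTED: "drop",
    Txn.COHERENT_TO_INTERNAL_REG: "complete as normal",
}


def iio_action(txn, viral_cfg):
    """Model how a transaction is treated for a given raw VIRAL_CFG value."""
    in_viral_state = bool((viral_cfg >> 31) & 1)
    return VIRAL_ACTION[txn] if in_viral_state else "complete as normal"
```

Keeping register accesses on the normal path matches rule 4: firmware still needs to reach configuration space to read logs and clear the viral state.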

11.3.5.5 Fatal Error Detected Within Memory Sub-System

IMC's action after viral is triggered:
  1. When the IMC observes a viral event, it drains its transactions and returns poison for all read requests.
The following table lists all the viral mode configuration and status registers.
Table 95. Viral Mode of Error Containment Enabling Registers

Viral Mode of Error Containment: configuration registers within IIO, Intel UPI, Memory, UBOX, and PCU

| Scope | Register | Description |
|---|---|---|
| Global | IA32_MCG_CONTAIN (MSR 0x178[1]) | Bit 1 (VIRAL_EN) - Used to enable the viral feature in a system. |
| IIO | IIOVIRAL | Repeated for each IIO link. Controls propagation of viral from IIO to/from the fatal and viral wires. Must be programmed when enabling viral. |
| IIO | R2GLERRCFG | Repeated for each M2PCIe block. Used to control propagation of viral from IIO and UBOX to the fatal and viral wires. Must be programmed when enabling viral. See the instance list below. |
| IIO/Intel UPI | M3GLERRCFG | Repeated for each M3KTI block. Used to control propagation of viral from IIO and UBOX to the fatal and viral wires. Must be programmed when enabling viral. |
| Intel UPI | KTIVIRAL | Repeated for each Intel UPI link. Controls propagation of viral from the Intel UPI link (including internal errors and viral transmitted on the link) to/from the fatal and viral wires. Must be programmed when enabling viral. |
| UBOX | UBOXGLERRCFG | Controls propagation of viral between UBOX, PCU, pins, and various other places. |
| PCU | VIRAL_CONTROL | Controls how the PCU responds to viral and EMCA2 signaling. |
| Memory | DDR4_VIRAL_CTL | Used to disable viral for specific IMC errors. |
| Memory | IMC[0,1]_VIRALCTL | Used to clear viral status in IMC (note that there is no accompanying viral status indication in IMC). |

R2GLERRCFG instances:
  1. M2PCIE0 R2GLERRCFG (PCI 0xA8) - Controls viral propagation for bus 2 (PSTACK1) and UBOX
  2. M2PCIE1 R2GLERRCFG (PCI 3:22.0 0xA8) - Controls viral propagation for bus 0 (CSTACK) and bus 1 (PSTACK0)
  3. M2PCIE2 R2GLERRCFG (PCI 3:23.0 0xA8) - Controls viral propagation for bus 4 (PSTACK3)
  4. M2PCIE3 R2GLERRCFG (PCI 3:21.4 0xA8) - Not used
  5. M2PCIE4 R2GLERRCFG (PCI 3:22.4 0xA8) - Controls viral propagation for bus 3 (PSTACK2)

Viral Mode of Error Containment: status registers

| Scope | Register | Description |
|---|---|---|
| Various | MCA Banks | See Table 76 on page 230 and Table 77 on page 231. |
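To make the enabling requirements in Table 95 concrete, the sketch below sets the VIRAL_EN bit in a raw IA32_MCG_CONTAIN value and checks that every register the table marks "must be programmed when enabling viral" has been handled. It is an illustration only: the helper names are hypothetical, the set of programmed registers is assumed to be tracked by the caller, and no real MSR or CSR access is performed.

```python
IA32_MCG_CONTAIN_MSR = 0x178  # MSR address from Table 95
VIRAL_EN_BIT = 1              # bit 1 (VIRAL_EN)

# Registers that Table 95 says must be programmed when enabling viral.
MUST_PROGRAM = ("IIOVIRAL", "R2GLERRCFG", "M3GLERRCFG", "KTIVIRAL")


def set_viral_en(msr_value):
    """Return the MSR value with VIRAL_EN set; other bits are preserved."""
    return msr_value | (1 << VIRAL_EN_BIT)


def missing_viral_registers(programmed):
    """List the must-program registers absent from the caller's set."""
    return [name for name in MUST_PROGRAM if name not in programmed]
```

A firmware flow might refuse to set VIRAL_EN until `missing_viral_registers` returns an empty list, so that viral propagation is configured consistently before the feature is turned on.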

11.3.6 DCU Scrubbing

Data Cache Unit (DCU) scrubbing improves system uptime by minimizing the impact of high-energy particle strikes (soft errors) within the core DCU (L1D cache).
The DCU structure is protected by parity. When a soft error affects the DCU structure, the parity error is detected when the data is accessed. If the data is not in the "M" state when the parity error is detected, the line is simply invalidated and the event is not considered fatal. If the data is in the "M" state when the parity error is detected, a fatal MCERR is triggered.
DCU scrubbing writes back cachelines in the "M" state to the Mid Level Cache (MLC), leaving a copy in the DCU in the "E" state. The write-back algorithm is configured to have minimal impact on performance (for example, it runs periodically with lower priority than demand requests). Since the MLC is ECC protected, the probability of a fatal event is minimized.
This feature is available as part of standard RAS across all the processor SKUs.
By default, the DCU scrubbing feature is enabled and no additional FW/SW enabling is required.
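The reason scrubbing helps can be shown with a small behavioral model. In the Python sketch below, a parity error is fatal only for a line in the "M" state, and the scrubber converts "M" lines to "E" by writing them back to the MLC. The dictionary layout and function names are assumptions made for illustration; the real mechanism is implemented in hardware.

```python
def on_parity_error(state):
    """Outcome of a DCU parity error for a line in the given coherence state."""
    return "fatal MCERR" if state == "M" else "invalidate"


def scrub_dcu(dcu, mlc):
    """Write back modified lines to the MLC and downgrade them to 'E'.

    dcu maps address -> {"state": str, "data": int}; mlc maps address -> data.
    Returns the number of lines scrubbed.
    """
    scrubbed = 0
    for addr, line in dcu.items():
        if line["state"] == "M":
            mlc[addr] = line["data"]   # the ECC-protected MLC now holds the data
            line["state"] = "E"        # the DCU copy is clean
            scrubbed += 1
    return scrubbed
```

After a scrub pass, a parity error on a previously modified line resolves to an invalidation instead of a fatal MCERR, because the current data is recoverable from the MLC.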

11.3.7 Timeout Timer Schemes

Timeout timers within various sub-modules are used to report failures as close as possible to the source of the fault, for example, the core retirement watchdog (core 3-strike) timer and the CBO-TOR timeout timer. This improves system serviceability and diagnosability.
This feature implies that the timeout timer hierarchy must cause the timer nearest to the non-responsive entity to expire first and complete the transaction with an error (or trigger the fault). This avoids a core 3-strike timer expiry being the "only" indication, which makes diagnosis extremely difficult because the log information captured with a core 3-strike timeout alone is often not sufficient to identify the source of the fault.
Examples of such timers include the core 3-strike timer, the CBO-TOR timer, and the PCIe* completion timeout. The goal is to allow the system to attempt recovery or signal a fatal error, for example, a Machine Check Exception (MCE), rather than ending in a core 3-strike timeout "only", which makes diagnosis extremely difficult without a system reset.
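The hierarchy requirement amounts to an ordering constraint on the programmed timeout values. The Python sketch below checks that constraint for a chain of timers listed from nearest the potential fault source to farthest. The timer names, their ordering, and the numeric values are hypothetical; actual values depend on the platform configuration.

```python
def check_timeout_nesting(chain):
    """Return a list of violations for a chain of (name, timeout_seconds).

    The chain is ordered from the timer nearest the fault source to the
    outermost timer. Each timer must expire strictly before the next one.
    """
    problems = []
    for (inner, inner_t), (outer, outer_t) in zip(chain, chain[1:]):
        if inner_t >= outer_t:
            problems.append(f"{inner} ({inner_t}s) must expire before {outer} ({outer_t}s)")
    return problems


# Hypothetical example values, innermost timer first.
example_chain = [
    ("pcie_completion_timeout", 0.05),
    ("cbo_tor_timeout", 1.0),
    ("core_3_strike", 5.0),
]
violations = check_timeout_nesting(example_chain)
```

An empty result means the timer nearest the failing entity fires first, so the error log points at that entity instead of only at the core 3-strike event.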
The following timeout features have been implemented in the processor:
  1. Core 3-strike
  2. CBO-TOR timeout
  3. Intel UPI Link Level Retry timeout
  4. Mesh-to-Memory (B2CMI) timeout (formerly called HA BT timeout)
  5. Primary lock timeout timer
  6. IRP Config_retry_time-out
  7. PCI Express port completion timeout (CTO)
The following table lists all the relevant registers for configuring various timeout timers.
Table 96. Time-Out Timer Configuration Registers

| Timer | Scope | Register | Description |
|---|---|---|---|
| Core 3-strike | Global | MISC_FEATURE_CONTROL.disable_three_strike_cnt (MSR 0x1A4 bit 11) | Prevents the 3-strike counter from incrementing, thus disabling the triggering of the core 3-strike event. |
| Caching Home Agent (CHA) Table of Request (TOR) timeout | Global | QPI_TIMEOUT_CTRL | Sets the values for the multiple levels of TOR timeout. |
| Caching Home Agent (CHA) Table of Request (TOR) timeout | Global | QPI_TIMEOUT_CTRL2 | Sets the values for the last levels of TOR timeout. |
| Intel UPI LLR Retry Timeout | Global | KTILCL | Recommend leaving the default value of 000b. |
| B2CMI | Global | TIMEOUT | Repeated for each IMC. Sets the B2CMI timeout value. |
| PCIe* completion timeout | IIO, per port | DEVCTRL2.COMPLTOVAL | Completion timeout value on non-posted TX that the IIO issues on the PCIe*. In devices that support completion timeout programmability, this field allows system software to modify the completion timeout range. The following encodings and corresponding timeout ranges are defined: [...]. Two of the encodings are reserved, and the IIO aliases them to another encoding. |
| PCIe* completion timeout | IIO, per port | CTOCTRL.xp_to_pcie_timeout_select | When OS/UEFI-FW selects a timeout range of [...] to [...] for a given PCIe* port (which affects non-posted TX sent to the PCIe*) using the root port's DEVCTRL2 register, this field selects the sub-range within that larger range for additional controllability. Encodings: [...]; 10: [...]; 11: Reserved. Notes: 1. This field is not used at all when NTB is enabled on [...], since there is no programmability of completion timeout in that mode. 2. Repeated on each device. |
| PCI Express config retry timeout | IIO, Global | IRP_MISC_DFX1.CFG_RETRY_TIMEOUT | The lower bound timeout. The upper bound timeout is twice that value (for example, for a setting of 3, the timeout range would be 2 us to 4 us). Encodings: [...]; 10: disabled; 11: 2 us. Repeated for each bus. |
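The relationship between the lower and upper bound of the config retry timeout can be worked through in a few lines. The Python sketch below covers only the two CFG_RETRY_TIMEOUT encodings that are legible in Table 96 (10b and 11b); the function name is hypothetical, and any other encoding is rejected instead of guessed.

```python
# Lower-bound values for the encodings that appear in Table 96.
# None represents "disabled". Other encodings are not covered by this sketch.
CFG_RETRY_LOWER_BOUND_US = {0b10: None, 0b11: 2.0}


def cfg_retry_window_us(encoding):
    """Return (lower, upper) in microseconds, or None when the timer is disabled."""
    if encoding not in CFG_RETRY_LOWER_BOUND_US:
        raise ValueError("encoding not covered by this sketch")
    lower = CFG_RETRY_LOWER_BOUND_US[encoding]
    if lower is None:
        return None
    return (lower, 2 * lower)  # upper bound is twice the lower bound
```

For a setting of 3 (11b) this yields a window of 2 us to 4 us, which matches the example given in the register description.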

11.3.8 Processor BIST

After power-up or an assertion of the RESET# pin, each processor on the system bus performs a hardware initialization sequence (a hardware reset) and an optional Built-In Self-Test (BIST). The results of the BIST are stored in the BIST_RESULTS CSR. UEFI-FW tracks the BIST results and takes corrective action as needed. In the processor, there are two major components of BIST: MLC BIST (handled by UCODE) and LLC BIST (handled by PCODE). The BIST flow is enabled through the BIST_ENABLE pin, described in the Reset chapter of this document.
The hardware BIST engine provides the primary input to the core disable for the FRB.

11.4 Memory Sub-System RAS Features

This section describes RAS features associated with the memory sub-system.
Table 97. Memory Sub-System RAS Features

| RAS Feature | Description | Standard RAS SKU | Advanced RAS SKU |
|---|---|---|---|
| Memory single device data correction (SDDC) | The SDDC feature allows managing DRAM persistent (hard) failures where the whole DRAM device has failed. See the processor ECC specification for details. | Yes | Yes |
| Adaptive data correction - single region (ADC-SR) | The ADC-SR feature allows managing single DRAM persistent (hard) failures through adaptive virtual lockstep (AVLS), at per DDR channel granularity. | Yes | Yes |
| Adaptive double device data correction - multiple region (ADDDC-MR) | The ADDDC-MR feature allows managing multiple DRAM persistent (hard) failures, at DDR channel granularity. Up to two DRAM hard failures can be corrected within different banks/ranks. The technology applies to x4 DRAM devices only. | No | Yes |
| DDR command/address parity check and retry | DDR CMD/ADDR parity check and retry with the following attributes: 1. CMD/ADDR parity error "address" logging. 2. CMD/ADDR retry. 3. In mirror mode, the controller will NOT fail over to the secondary channel. | Yes | Yes |
| DDR write data CRC check and retry | DDR write data CRC check within the DRAM device, signaling an event back to the CPU/IMC for retry. The DIMM signals the CRC mismatch using the PAR_ALERT signal. When enabled, two additional bursts are added (a total of 10 bursts) to transfer the write CRC bits. | Yes | Yes |
| Memory data scrambling with command and address | Scrambles the data with address and command in the "write cycle" and unscrambles the data in the "read cycle". Addresses reliability by improving signal integrity at the physical layer. Additionally, assists with detection of an address bit error. | Yes | Yes |
| Memory demand and patrol scrubbing | Demand scrubbing is the ability to write corrected data back to memory once a correctable error is detected on a read transaction. Patrol scrubbing proactively searches the system memory, repairing correctable errors. This prevents accumulation of single-bit errors that may result in an uncorrected error. | Yes | Yes |
| Memory mirroring | Memory mirroring is a method of keeping a duplicate (secondary or mirrored) copy of the contents of memory as a redundant backup for use if the primary memory fails. The mirrored copy of the memory is stored in memory behind the same integrated memory controller (IMC). Dynamic (without reboot) fail-over to the mirrored DIMMs is transparent to the SW/OS. | Yes | Yes |
| Address range memory mirroring | In this mode, a subset of memory is mirrored, leaving the rest of the memory in non-mirror mode. This allows mirroring of a critical memory address range (called 'more reliable memory') without incurring the cost of full memory mirroring. Address ranges need to be within the same IMC. Provides an OS-level interface for requesting 'more reliable memory' as a percentage of the full visible memory address space. Optionally, platform FW can configure the 'more reliable memory' range during boot time using EFI utilities. | No | Yes |
| DDR power up and runtime post package repair (PPR) | Starting with DDR technology there is an additional capability available known as PPR. PPR offers additional spare capacity within the DDR DRAM that can be used to replace faulty cell areas detected during system boot time. DDR5 boot time hard PPR, boot time soft PPR (test mode), and run time soft PPR are supported. | Yes | Yes |
| Memory SMBus hang recovery | This RAS feature allows system recovery in case the SMBus fails to respond during run-time, thus preventing a system crash. UEFI FW is notified via an SMI and may be able to recover the SMBus by resetting and re-activating the link. The memory SMBus is actively used by the processor for thermal monitoring and implementing CLTT. | Yes | Yes |
| Memory disable/map-out for FRB | Allows memory initialization and completion of the boot flow even when a memory fault occurs, accomplishing FRB. | Yes | Yes |

NOTES:

  1. RAS features may not be supported on all SKUs of a processor type.
This section describes the memory RAS features used to maintain memory data integrity and protect data stored in DDR DIMMs against various types of faults, such as high-energy particle strikes, stuck-at faults, persistent DRAM device failures, and DDR link failures.

11.4.1 DDR Device Data Protection Overview

The DDR5 DIMM data and the DDR5 link are protected by a combination of RAS features.
Sophisticated Error Check and Correction (ECC) code techniques mitigate the impact of high-energy particle strikes (soft errors), link transient errors, and various types of DRAM device failures.
Error accumulation prevention techniques are also incorporated; for example, patrol scrub clears out soft errors. Sparing techniques, such as device sparing as part of the Adaptive Data Correction (ADC) and Adaptive Double Device Data Correction (ADDDC) features, map out hard failures. Preventing error accumulation considerably reduces the probability of two independent errors on the same cacheline, which improves the probability of correction using existing ECC techniques.

Error Check and Correction Code (ECC)

  • The processor supports 128-bit, 125-bit, and 96-bit ECC for 10x4 and 5x8 DDR5 DIMMs. Different ECC techniques are applied depending on the DDR5 configuration and the operating state, that is, SDDC or adaptive virtual lockstep (AVLS). The following table describes the error detection and correction capability.
  • The ECC logic makes use of Permanent Fault Detection (PFD) on demand and patrol read operations. In sparing flows such as ADDDC/ADC, the ECC logic uses the "failed device" programmed in the "adddc failed device in adddc failed region" register to correct the single-device uncorrected error. If the register is not configured correctly and a single-device uncorrectable error occurs, the hardware mis-corrects.
Table 98. Processor DDR5 ECC Coverage

| Mode of Operation | ECC Type | ECC Mode | DDR5 DRAM Device Coverage (see notes 3, 4) |
|---|---|---|---|
| Normal or adaptive virtual lockstep | Custom code derived from Reed-Solomon code with Permanent Fault Detect (PFD) and correction capability | 10x4 DRAM DIMM | A. Three ECC modes: 128 bit, 125 bit, and 96 bit. Permanent error: [...] detection and [...] correction of all errors. Non-permanent error: [...] detection and better than [...] correction. The 125-bit ECC mode provides more error correction coverage than the 96-bit ECC mode. In virtual lockstep, ECC coverage is 32 bits, and detection and correction is always [...]. |
| Normal or adaptive virtual lockstep | Custom code derived from Reed-Solomon code with Permanent Fault Detect (PFD) and correction capability | 5x8 DRAM DIMM | C. Three ECC modes: 128 bit, 125 bit, and 96 bit. Permanent error: [...] detection and [...] correction of all errors that fall exclusively in the right or left half of the x8 DRAM device. Errors that fall on both the right and left half of the device are outside the scope of the ECC and are not corrected, but are detected for about [...] of error patterns (see note 5). Non-permanent error: [...] detection and better than [...] correction of all errors that fall exclusively in the right or left half of the x8 DRAM device. Errors that fall on both the right and left half of the device are outside the scope of the ECC and are not corrected, but are detected for about [...] of error patterns. The 125-bit ECC mode provides more error correction coverage than the [...]-bit ECC mode (see note 5). Virtual lockstep is not supported in this DIMM configuration. |

NOTES:

  1. Customization is done to handle storage of directory/poison/tag in the ECC DRAM space.
  2. Detection is not [...] for device errors that affect both the primary and the non-failed bank/rank. Device errors can hit both the primary and the non-failed rank/bank in the intra-rank VLS solutions. a: Errors on the same column bit from the primary failed and the non-failed DRAM device are detected. [...] column correction occurs on all columns storing data bits. For columns storing the metadata bit, correction is not [...].
  3. Detection of errors beyond the scope of the ECC is not guaranteed.
  4. In 2LM-NM, a single DRAM device error that lands on metadata is corrected.
  5. In the x8 DRAM configuration, half device covers data bits that fall exclusively on DQ0-3 or DQ4-7, but not both. In the x4 DRAM configuration, it covers data bits that fall exclusively on DQ0-1 or DQ2-3, but not both.

Prevention of Error Accumulation - Patrol Scrub

The patrol scrub prevents accumulation of cosmic soft errors and link transient errors. It runs across the entire memory space in a pre-defined interval (the patrol scrub period is typically 24 hours). A write-back is done on the cacheline, which clears the error if the error is soft/transient. The main goal of the patrol scrub is to prevent the collision of a soft error (which affects a single cacheline) with a hard failure, such as a full bank failure, that affects a large number of cachelines. The patrol scrub also prevents the collision of two soft errors on the same cacheline. However, the probability of two soft errors on the same cacheline is already negligible, since soft errors are typically single bit and the number of cachelines in a system is large.
  • See Memory Demand and Patrol Scrubbing on page 273 for further details of the patrol scrubber.
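The read, correct, and write-back loop of a patrol scrubber can be illustrated with a toy model. The Python sketch below stands in for real ECC with a simple three-copy majority vote, which is an assumption chosen only to keep the example short; the processor uses the Reed-Solomon-derived code described earlier, and the function names here are hypothetical.

```python
from collections import Counter


def read_corrected(copies):
    """Majority-vote read over three stored copies (toy stand-in for ECC).

    Returns (value, had_error). Raises ValueError when no majority exists,
    which models an uncorrectable error.
    """
    value, votes = Counter(copies).most_common(1)[0]
    if votes < 2:
        raise ValueError("uncorrectable error")
    return value, votes < len(copies)


def patrol_scrub(memory):
    """Walk every address, and write corrected data back where an error was found."""
    repaired = 0
    for addr, copies in memory.items():
        value, had_error = read_corrected(copies)
        if had_error:
            memory[addr] = [value] * 3  # write-back clears the soft error
            repaired += 1
    return repaired
```

Running the scrub periodically means a single soft error is repaired before a second, independent fault can land on the same line and exceed what the code can correct.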

Error Accumulation Prevention Technique - Sparing (Mapping Out Hard Failures)

  • The primary objective of sparing is to map out any hard/persistent fault. Note that the patrol scrubber cannot clear a persistent fault within a device. The processor incorporates a sophisticated set of sparing techniques that cover various types of hard failures, as described in the following table.
Table 99. Processor Sparing Techniques

| Sparing Technique | Memory RAS Feature | Boot-Time Versus Run-Time | Faults Covered | Comments |
|---|---|---|---|---|
| Spare rows within DRAM device | DDR runtime Post Package Repair (PPR) | Run-time | DRAM row | DIMM vendor provided resources. No additional cost. No performance impact. |
| Spare rows within DRAM device | DDR power up PPR | Boot time | DRAM row | DIMM vendor provided resources. No additional cost. No performance impact. |
| Device sparing | ADC-SR, ADDDC (MR) | Run-time | Row, bank, rank | Uses the AVLS concept to map out the faulty region. Some performance impact is expected after VLS is triggered. No additional cost. |

11.4.2 Memory Single Device Data Correction (SDDC)

The processor's IMC module implements an error check and correction technique. The processor also implements a "read retry" algorithm to recover from a failure (for example, transient errors). This entire flow of error detection and correction is defined as "Single Device Data Correction (SDDC)".
The SDDC feature allows managing DRAM failures where the whole DRAM device has failed.
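The "read retry" step can be pictured as a bounded loop around the ECC-checked read. The Python sketch below is a simplified model under stated assumptions: the `read_once` callable, the status strings, and the retry count are all hypothetical, since the text does not specify how many retries the IMC performs or how results are reported.

```python
def read_with_retry(read_once, max_retries=1):
    """Retry a read that reports an uncorrected error.

    read_once() returns (data, status) with status one of
    "ok", "corrected", or "uncorrected". Returns (data, status, attempts_used).
    """
    attempts = 0
    for _ in range(max_retries + 1):
        attempts += 1
        data, status = read_once()
        if status != "uncorrected":
            return data, status, attempts  # a transient error cleared on retry
    return None, "uncorrected", attempts   # persistent: escalate to error handling


# Hypothetical transient failure: the first read fails, the retry succeeds.
results = iter([(None, "uncorrected"), (0xAB, "ok")])
outcome = read_with_retry(lambda: next(results))
```

A transient fault disappears on the retry, while a persistent device failure keeps failing and is left to the ECC correction and sparing flows.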

11.4.3 Adaptive Data Correction - Single Region (ADC-SR)

ADC-SR, available in all processor SKUs, improves system uptime and reduces service cost in the presence of memory faults.
The ADC-SR feature allows managing DRAM persistent (hard) failures using a technique called Adaptive Virtual Lockstep (AVLS), provided the hard failure is confined to a bank region. If the first hard failure within a DRAM device occurs at bank region granularity, it is mapped out using the 'bank level AVLS' method. UEFI-FW invokes this method using a 'buddy bank'. Platform-specific FW is required to identify a DRAM failure at bank region granularity. ADC-SR is supported only with x4 DRAM based DIMMs and can be activated on a per-channel basis. The UEFI-FW must enforce this restriction.
The following table summarizes the ADC-SR basic flows covered in this section.
Table 100. ADC-SR Capability Summary

Feature Name: Single-region, adaptive data correction (ADC-SR)

Feature Description:
- The ADC-SR feature allows managing DRAM persistent (hard) failures through the IMC adaptive virtual lockstep capability, defined previously.
- UEFI-FW invokes virtual lockstep (VLS) by establishing lockstep operation using a 'non-failed/buddy bank' on the channel. Further error correction continues.
- If a second hard failure occurs, no other VLS action can be taken. The DIMM is presumed to have gone bad and needs to be replaced.
- The operation requires DIMMs with x4 DRAM devices. A firmware-based implementation is required to identify the hard failure at bank region granularity. ADC-SR can be activated on a per-channel basis.

UEFI FW Design Considerations:
- The UEFI-FW, in the role of actor, keeps track of the various error correction events. Once the UEFI-FW determines a persistent fault within a DRAM device, it activates bank-level adaptive virtual lockstep along with a designated, pre-configured buddy bank.
- The UEFI-FW code can better characterize the nature of the error by capturing multiple snapshots of the error at smaller thresholds, and invoke the VLS on a DRAM row failure and/or another proprietary error condition designed in the UEFI-FW handler.
- The UEFI-FW continues to monitor corrected error events. Additional failures point to DIMMs that need to undergo repair.

11.4.4 Adaptive Double Device Data Correction - Multiple Region (ADDDC-MR)

The 'ADDDC-MR' feature allows managing DRAM persistent (hard) failures more efficiently compared to the ADC-SR feature described in Adaptive Data Correction - Single Region (ADC-SR) on page 270. Up to two DRAM hard failures can be corrected, provided they fall within different banks or ranks and occur in a time-staggered manner. Each DRAM hard failure maps out the affected bank/rank region by invoking the adaptive virtual lockstep (AVLS) algorithm, thus creating a spare bank/rank region for replacement. Platform-specific FW is required to identify a hard failure at bank/rank region granularity. ADDDC-MR is supported only with x4 DRAM based DIMMs and can be activated on a per-channel basis. The UEFI-FW must enforce this restriction.
Table 101. ADDDC(MR) Summary
Feature Name: Adaptive, multi-region double device data correction, 'ADDDC-MR'
Feature Description:
- 'ADDDC-MR' allows recovery from persistent (hard) DRAM device failures through an algorithm called adaptive virtual lockstep.
- If the first hard failure within a DRAM device occurs at bank/rank region granularity, it is mapped out using the 'bank/rank level adaptive virtual lockstep' method. UEFI-FW invokes virtual lockstep using a non-failed (buddy) bank or rank on the DDR channel. Error correction then continues.
- 'ADDDC-MR' allows map-out of up to two bank/rank regions on each DDR channel.
- 'ADDDC-MR' is applicable to x4 DIMMs.
- 'ADDDC-MR' is available as part of 'advanced RAS'.
UEFI FW Design Considerations:
- Virtual lockstep is effective only as long as errors are correctable. With uncorrectable errors, BIOS won't be able to identify the failing device or map out the failing bank in that device. The UEFI-FW needs to invoke the map-out before errors turn uncorrectable.
- The UEFI-FW, in the role of actor, keeps track of the various error correction events. Once the UEFI-FW determines that a persistent fault exists within a DRAM device, it activates bank- or rank-level adaptive virtual lockstep with a designated, pre-configured buddy bank or rank.
- The UEFI-FW code can better characterize the nature of the error by capturing multiple snapshots of the error at smaller thresholds, and can invoke VLS on a DRAM row failure and/or other proprietary error conditions designed into the UEFI-FW handler.
- The UEFI-FW continues to monitor corrected error events and reacts to subsequent DRAM failures by using the available VLS regions. Once the available VLS regions are exhausted, additional failures point to DIMMs that need repair.
- A DDR channel with an impaired lockstep gang cannot recover from further device failures.
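The firmware bookkeeping described in Table 101 can be sketched as follows. This is a minimal illustration and not the actual UEFI-FW handler: the class name, the corrected-error threshold, and the way a region is keyed by (rank, bank) are all assumptions. Only the limit of two mapped-out regions per DDR channel comes from the text above.

```python
from collections import Counter

MAX_REGIONS_PER_CHANNEL = 2   # from the text: up to two bank/rank regions per channel
HARD_FAILURE_THRESHOLD = 4    # hypothetical corrected-error count that marks a hard failure


class AdddcChannelTracker:
    """Hypothetical per-channel bookkeeping for ADDDC-MR map-out decisions."""

    def __init__(self):
        self.corrected = Counter()   # (rank, bank) -> corrected error count
        self.mapped_out = []         # regions already placed in virtual lockstep

    def on_corrected_error(self, rank, bank):
        """Record one corrected error and return the action firmware would take."""
        region = (rank, bank)
        if region in self.mapped_out:
            return "already-mapped-out"
        self.corrected[region] += 1
        if self.corrected[region] < HARD_FAILURE_THRESHOLD:
            return "monitor"
        if len(self.mapped_out) >= MAX_REGIONS_PER_CHANNEL:
            # No VLS region left on this channel: the DIMM needs repair.
            return "service-dimm"
        self.mapped_out.append(region)
        return "activate-vls"
```

Acting while errors are still being counted as corrected reflects the design consideration that map-out must happen before errors turn uncorrectable.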

11.4.5 DDR Command and Address Parity Check and Retry

The processor supports DDR command and address parity check and retry when enabled. Parity checking is enabled in the DRAM during training. When a command or address parity error is detected, the DDR DIMM drops the transaction, logs the address, and asserts the DDRn_PAR_ERR_N pin. The IMC retries all transactions in its pending queue; if the retry succeeds (that is, the error was transient), the condition is corrected and the system continues. If the CMD/ADDR parity error is persistent, the IMC logs a fatal error and can signal MSMI when operating in EMCA gen 2 mode. UEFI FW can access the error logs for FRU isolation.
CMD/ADDR parity errors on the CMD/ADDR bus are signaled on the Alert pin and are recoverable. The Alert signal is a single signal per channel: the alert signals from all DRAMs and ranks are combined in a wired-OR connection, so any rank can assert it. When the Alert signal is asserted, the IMC therefore cannot immediately identify which rank/DIMM asserted it or for which command. The error source is identified during the error recovery process that follows the assertion of the Alert signal.
The high-level DDR CMD/ADDR parity recovery flow is as follows:
  1. The DDR register device detects a CMD/ADDR parity error. It drops the transaction, logs the status bit and address information, asserts DDRn_ALERT_N, puts the DIMMs in self-refresh, and waits for the IMC to send the channel RESTART command.
  2. The IMC holds the pending transactions and performs the following steps:
    a. The IMC logs the CMD/ADDR parity error in the machine-check bank as a corrected error (CorrErr+). It triggers CMCI if the threshold is reached, and triggers CSMI if EMCA2 is enabled.
    b. The IMC identifies the DIMM that logged the error by reading the register device and updates the information in the CORRERRSTATUS[10:8] register field.
  3. The IMC clears the DIMM's register device error log and resumes normal transactions, including all previously pending transactions.
  4. In case of a persistent fault:
    a. The IMC makes multiple attempts to read the register device information (the default is 16 attempts). If none succeeds, it declares a persistent fault and triggers MCERR with the Link_fail condition. The IMC also updates the LINK_MCA_CTL.erro_log (bit 1) register bit if this is enabled through the LINK_MCA_CTL.erro_en (bit 0) register bit. The IMC does not update the CORRERRSTATUS register.

NOTE

Simply pulling DDRn_ALERT_N low does not create a fatal event.
  1. UEFI FW enables the CMD/ADDR parity check feature by default. However, if CMD/ADDR parity check is disabled during a debugging scenario, CMD/ADDR parity errors are still detected but not corrected, and they are reported as fatal errors.
  2. In mirror mode, persistent CMD/ADDR parity errors on the primary channel will NOT cause a fail-over to the secondary channel.
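The distinction between a transient and a persistent CMD/ADDR parity fault in the flow above can be sketched as a bounded retry loop. This is an illustrative model only: `read_register_log` is a hypothetical callable standing in for the IMC's read of the DIMM register device, and the returned dictionary is an invented convenience. The default of 16 attempts is the only value taken from the text.

```python
MAX_REGISTER_READ_ATTEMPTS = 16  # default attempt count given in the text


def recover_from_alert(read_register_log, max_attempts=MAX_REGISTER_READ_ATTEMPTS):
    """Model the IMC's reaction after DDRn_ALERT_N is asserted.

    read_register_log is a hypothetical callable that returns the index of the
    DIMM whose register device logged the error, or None if the read failed.
    """
    for attempt in range(1, max_attempts + 1):
        dimm = read_register_log()
        if dimm is not None:
            # Transient fault: log as corrected, clear the log, replay.
            return {"result": "corrected", "dimm": dimm, "attempts": attempt}
    # Every read failed: persistent fault, reported through MCERR.
    return {"result": "persistent-fault", "dimm": None, "attempts": max_attempts}
```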

11.4.6 DDR Write Data CRC Check and Retry

The processor implements DDR5 write data CRC check and retry for debug and in-lab use. DDR5 write data CRC protection detects DDR5 data bus faults during write operations.
  • Sequence of steps during DDR5 CRC operation:
    • The IMC generates a CRC checksum and adds it to the write data frame.
    • The DDR5 DRAM checks for a CRC error and reports it to the IMC via the DDRn_ALERT_N signal.
    • The IMC replays the data write transaction, thus correcting the transient data bus fault.
  • When DDR5 write data CRC protection is enabled, additional DDR cycles are required for CRC bit transition and checking by the DIMM, which has some performance impact. This feature is most beneficial on platforms that are known to have suboptimal link health and where the performance impact is an acceptable trade for error recovery.
When Memory Mirroring, ADDDC-MR, or ADDDC-SR is enabled, this feature must be disabled.
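The generate, check, and replay sequence above can be sketched in a few lines. This is a toy model under stated assumptions: the polynomial 0x07 is a common CRC-8 polynomial used here as a stand-in because this section does not specify one, `bus` is a hypothetical callable that returns the frame as the DRAM sees it, and the retry limit is invented.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, zero initial value (stand-in polynomial)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def write_with_crc_retry(data: bytes, bus, max_retries: int = 3) -> int:
    """Send data plus CRC over `bus`; replay while the receiver flags a mismatch.

    Returns the number of replays that were needed.
    """
    frame = data + bytes([crc8(data)])
    for replays in range(max_retries + 1):
        received = bus(frame)
        if crc8(received[:-1]) == received[-1]:   # DRAM-side check passes
            return replays
        # Mismatch: the DRAM would assert DDRn_ALERT_N and the IMC replays.
    raise RuntimeError("write CRC error persisted after retries")
```

A transient bus fault costs one replay, while a fault that persists across all retries surfaces as an error, which matches the performance-for-recovery trade described above.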

11.4.7 Memory Data Scrambling with Command and Address

The processor incorporates a data scrambling feature to minimize the impact of excessive di/dt on the DDR VRs caused by successive 1s and 0s on the data bus. Because the data is protected via ECC, any address bit error can also be detected.
Past experience has demonstrated that traffic on the data bus is not random and can have energy concentrated at specific spectral harmonics, creating very high di/dt. As a result, Vddq power delivery is generally limited by a worst-case data pattern that excites resonance between the package inductance and the on-die capacitances. The processor uses data scrambling to create a pseudo-random pattern on the data bus.
The data is randomized by hashing it with the 16-bit result of a polynomial calculation based on certain selected Memory Address, Bank Address, and Chip Select bits of the CAS command. The selected MA, CS, and BA bits form a 16-bit seed, which is used to generate an LFSR (Linear Feedback Shift Register) sequence. This LFSR sequence is then XORed with the WRITE data bits to generate the pseudo-random data pattern that is actually written to the DIMM. The memory addresses for a read and a write may differ in the least significant bits, which select the critical chunk in the cache line, but these address bits are excluded from the hash. When this feature is enabled, an access to an address with an even number of bits in error can be detected; such an error would otherwise escape address parity detection. The polynomial calculation is performed every QCLK, producing a different hash with every transfer. Any data bit corruption that occurs in DRAM produces errors in the same data bits after descrambling, so scrambling does not affect the data correction capabilities of the ECC code.
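The XOR-with-LFSR scheme described above can be illustrated with a small model. Everything specific here is an assumption made for the sketch: the LFSR taps (16, 14, 13, 11), the way MA, BA, and CS bits are packed into the seed, and the 16-bit word size. The real hardware recomputes the hash every QCLK with a polynomial that this section does not give.

```python
def lfsr16_stream(seed: int, nwords: int):
    """Yield nwords 16-bit values from a Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF or 0xACE1       # avoid the all-zero lock-up state
    for _ in range(nwords):
        word = 0
        for _ in range(16):
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            word = (word << 1) | (state & 1)
        yield word


def seed_from_address(ma: int, ba: int, cs: int) -> int:
    """Hypothetical packing of selected MA, BA and CS bits into a 16-bit seed."""
    return ((ma & 0x3FF) << 6) | ((ba & 0x7) << 3) | (cs & 0x7)


def scramble(words, ma, ba, cs):
    """XOR each 16-bit data word with the address-seeded LFSR sequence.

    Because XOR is its own inverse, the same call also descrambles.
    """
    seed = seed_from_address(ma, ba, cs)
    return [w ^ k for w, k in zip(words, lfsr16_stream(seed, len(words)))]
```

Two properties from the text follow directly from the XOR structure: a bit flipped in DRAM shows up as the same flipped bit after descrambling, so ECC sees an ordinary data error, and descrambling with a different address seed yields different data, which is what lets ECC expose an address error.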

11.4.8 Memory Demand and Patrol Scrubbing

Memory demand and patrol scrubbing features prevent the accumulation of errors within the DIMM. The following subsections describe demand scrubbing and patrol scrubbing in more detail.

11.4.8.1 Memory Demand Scrubbing

Demand scrubbing is the ability to write corrected data back to DDR memory once a correctable error is detected on a read transaction. This corrects the data in memory at the point of detection and decreases the chance that a second error at the same address accumulates into a multi-bit error condition.
The demand scrubbing action depends on whether memory is configured in independent channel mode or mirror mode. In independent channel mode (also known as non-mirrored mode), when the IMC returns data with an error, the transaction is retried. If the retry succeeds without any error, the data is written back, completing the demand scrub. At this time, a corrected error is logged in the IMC MCA bank.
Demand scrub in mirrored mode is also known as "mirror scrub". In mirrored mode, when the primary channel returns data with an error, the B2CMI retries the request to the primary channel. If the retry succeeds (no error returned), the B2CMI writes the data back, completing the demand scrub, and increments its corrected error count. If the primary channel returns an error on the retry, the B2CMI issues a request to the secondary channel. If that request succeeds (no error returned), the B2CMI writes the data back to the primary channel, completing the demand scrub, and increments its corrected error count. If the secondary channel returns an error, the B2CMI retries on the secondary channel. If that retry succeeds (no error returned), the B2CMI again writes the data back to the primary channel, completing the demand scrub, and increments its corrected error count. If the retry to the secondary channel also returns an error, the B2CMI reports an uncorrected data error. How the B2CMI reports the uncorrected data error depends on whether 'data poisoning' is enabled. Refer to Error Reporting (MCA, AER) - Core, Uncore and IIO on page 228 for further details of error reporting in "data poisoning" mode.
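The order of attempts in mirrored mode can be summarized in a short sketch. It models only the sequencing described above and assumes the initial primary read has already failed. The three callables are hypothetical: each `read_*` returns the data on success or None when the channel returns an error.

```python
def mirror_demand_scrub(read_primary, read_secondary, write_primary):
    """Follow the mirrored-mode retry order described in the text.

    Returns a (status, data) tuple.
    """
    # After the failed initial read, the attempts are:
    # primary retry, secondary request, secondary retry.
    for read in (read_primary, read_secondary, read_secondary):
        data = read()
        if data is not None:
            write_primary(data)          # write-back completes the demand scrub
            return "corrected", data
    return "uncorrected", None           # reported per the data-poisoning setting
```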

11.4.8.2 DDR Memory Patrol Scrubbing

Patrol scrubbing is accomplished using an engine that generates requests to memory addresses in a stride. The engine generates a memory read request at a preprogrammed interval. If a correctable error is detected, the error is corrected, and the patrol scrubber writes the corrected data back, then logs and signals the error. Uncorrected errors are poisoned and written back to DRAM.
Patrol scrubs are intended to ensure that data with a correctable error does not remain in DRAM long enough to stand a significant chance of degrading into an uncorrectable error through further corruption from particle-induced errors. Patrol scrubbing and device sparing are managed by the same hardware logic, so only one of them can be active at a time. While rank sparing is in progress, patrol scrubbing operations are disabled. During device sparing, the IMC alternates (ping-pongs) between performing sparing operations and servicing normal requests.
It is recommended to configure the patrol scrub rate so that the memory range managed by a given IMC is scrubbed once every 24 hours.
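As a rough worked example of what the 24-hour recommendation implies, the interval between consecutive patrol reads can be estimated from the amount of memory behind one IMC. The 64-byte scrub granularity and the 64 GiB capacity are assumptions chosen for illustration; the actual register encoding of the scrub interval is not covered here.

```python
CACHE_LINE_BYTES = 64            # assumed scrub granularity
SECONDS_PER_DAY = 24 * 60 * 60


def scrub_interval_seconds(memory_bytes: int, period_s: int = SECONDS_PER_DAY) -> float:
    """Time between patrol reads so the whole range is covered once per period."""
    lines = memory_bytes // CACHE_LINE_BYTES
    return period_s / lines


# Example: 64 GiB behind one IMC is 2**30 cache lines, so one line must be
# scrubbed roughly every 80 microseconds to finish a pass in 24 hours.
interval = scrub_interval_seconds(64 * 2**30)
```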
Patrol scrubbing operation is controlled by the SCRUBADDRESSHI/LO channel address bit fields. The "last scrub address" captured is determined by this setting.
The patrol scrubber logs the system address in the MC bank for corrected and UC errors.
  • Legacy IA-32 MCA mode: UCEs detected by the patrol scrubber are reported as fatal in the machine check bank.
  • Corrupt data containment mode: UCEs detected by the patrol scrubber are reported as UCNA, and CSMI can be triggered. The UEFI BIOS can signal the event to the OS via SCI as a recoverable uncorrectable error.
  • Mirror mode: patrol scrub errors can be configured to be reported as corrected errors, allowing good data from the secondary to overwrite the UC data detected by the scrubber.

11.4.9 Memory Mirroring

Memory mirroring is the mechanism by which memory is duplicated across two channels; it is handled by the B2CMI. When mirroring is enabled, primary and secondary copies of data are kept for the mirrored region:
  • All of memory in the case of full memory mirroring
  • Parts of memory in the case of partial mirroring
Writes are duplicated to both the primary and the secondary, but reads are issued only to the primary. If the read from the primary fails due to an uncorrectable error, a read is then issued to the secondary.
If data is retrieved from the secondary copy due to a UC error on the primary, the HW state machine issues a "mirror scrub" to test whether the failure was a "soft error" or a "hard failure". A mirror scrub writes the error-free data from the secondary to the primary and then reads it back. If the read returns data with an error, this was a hard failure and results in the mirror being broken (an event also known as a mirror failover). The HW issues an SMI when this happens to notify the BIOS of the event. If the read returns data with no error, the error was a soft error and mirroring is not broken.
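The soft-versus-hard classification performed by the mirror scrub reduces to a write followed by a read-back check. The sketch below is illustrative only; `write_primary` and `read_primary` are hypothetical accessors, with `read_primary` returning None when the read-back still reports an error.

```python
def mirror_scrub(good_data, write_primary, read_primary):
    """Classify a primary-channel UC error as a soft error or a hard failure."""
    write_primary(good_data)             # copy error-free data from the secondary
    if read_primary() == good_data:
        return "soft-error"              # mirroring stays intact
    return "hard-failure"                # mirror is broken (failover), SMI raised
```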

11.4.9.1 Address Range Memory Mirroring

In this mode, a subset of memory is mirrored, leaving the rest of the memory in non-mirrored mode. This allows a critical memory address range (called 'more reliable memory') to be mirrored without incurring the cost of full memory mirroring. Address ranges need to be within the same iMC. The feature provides an OS-level interface for requesting 'more reliable memory' as a percentage of the full visible memory address space. Optionally, platform FW can configure the 'more reliable memory' range at boot time using EFI utilities.
The OS/VMM can also use partial mirroring to keep critical code/data in the mirrored region, limiting the amount of memory sacrificed. A BIOS setup policy can be used to select the size of the mirrored DDR memory. BIOS communicates persistent memory to the OS in the ACPI and E820 tables. The OS can use the mirrored region for critical code/data.
Memory mirroring occurs across two DDR channels, and transactions are handled by the B2CMI. This implies that a given IMC handles transactions for both mirrored and non-mirrored memory. Address range mirroring gives BIOS the flexibility to place system memory in both mirrored and non-mirrored address ranges. The RAS section of the Birch Stream platform BIOS Writer's Guide discusses the Intel BKC approach to placing memory in address range mirrors across multiple sockets.
All error detection, signaling, and correction operations available in full mirroring mode also apply within the mirrored region in partial mirroring mode.

11.4.9.2 Supported Memory Mirroring Configurations

  • DDR5 mirroring occurs across adjacent DDR channels, that is, IMC 0&1, 2&3, and so on.
  • Identical memory density must be installed across mirrored channels (the total memory size on the primary and secondary channels must be the same).
  • Mirroring is supported in mode only.
  • Not supported in Flat 2LM and CXL memory.
  • Mirroring is supported with and DIMMs.
  • No mirroring support in heterogeneous interleave.

NOTE

Additional limitations may apply due to cross DIMM/RAS/CXL/security support plans.

11.4.10 DDR Power Up and Runtime Post Package Repair (PPR)

DRAM Post Package Repair (PPR) is a JEDEC concept supported by the platform. The processor supports PPR at system power-up/reboot and at runtime. Both soft and hard PPR are supported by the memory reference code; only soft PPR is applicable at runtime. As DRAM processes evolve and capacities increase, the probability of DRAM cell-level defects rises. The processor provides several features to manage such faults (for example, memory map-out for FRB). Memory disable/map-out for fault resilient booting further improves memory repairability coverage by using the DDR DRAM capability called PPR. The following describes PPR and how it is used to implement memory disable/map-out for fault resilient booting.
Runtime soft PPR is a new feature on this platform. Runtime error detection, correction, and recovery are becoming increasingly important because higher-density, higher-speed DRAMs may fail at a higher rate in the field. A better and cheaper way to manage failing DRAM devices is needed, and using the spare row in runtime PPR has no performance impact. Runtime PPR allows the system to manage failing DRAM devices without taking the system down. Runtime soft PPR is triggered when the memory controller's correctable error counter reaches its threshold and triggers an SMI. SMM BIOS then analyzes the error record and identifies the row failure.
DDR devices provide reserved (unused) rows that can be enabled via PPR. PPR uses this spare capacity within the DDR DRAM to replace faulty cell areas detected during system boot. The PPR capability allows a failing row to be replaced with a reserved row available in the bank group.
With the PPR capability, platforms can remap a failed cell to a spare cell through software (MRS commands) that is part of the Memory Reference Code (MRC). DDR allows two command sequence options for software, depending on whether the repaired row needs to be kept in auto-precharge mode. The first sequence relies on WRA (write with auto precharge) and issues refreshes to banks other than the one being repaired; the second sequence uses the WR command (write without auto precharge) with no refresh sequence.

11.4.11 Memory SMBus Hang Recovery

Memory SMBus hang recovery allows the system to recover when the SMBus fails to respond at runtime, preventing a system crash. UEFI FW is notified via an SMI and may be able to recover the SMBus by resetting and re-activating the link. The processor actively uses the memory SMBus for thermal monitoring and for implementing CLTT.

11.4.12 Memory Disable/Map-Out for FRB

MRC implements a series of tests and policies to help the system boot despite faulty DIMMs. The BIOS writer's guide documents the MRC code flow and operation. The following is a brief description of the fault handling capability during the three phases of memory initialization:
During the discovery phase:
  1. SPD read protocol error detected: retry the transaction three times. If the error persists, disable the channel.
  2. SPD data error detected: treat as uncorrected and disable the channel.
  3. DIMM population mismatch against the POR configuration: downgrade to the next supported configuration.
During the memory training phase:
  1. Single device data errors: tolerated.
  2. Uncorrected data errors: disable the channel.
  3. CMD/ADDR parity error detected: disable the channel.
During the memory test phase:
  1. Test each DQ bit using the converged pattern generator checker (CPGC).
  2. Apply PPR to map out a failing row (a faulty row identified by MemTest or identified by the BIOS in a prior boot cycle).
  3. Map out failing ranks.
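The discovery-phase policy for SPD protocol errors (retry three times, then disable the channel) can be sketched as below. This is not MRC code: `read_spd` is a hypothetical callable that raises IOError on a protocol error, and the return convention is invented for the example.

```python
SPD_READ_RETRIES = 3   # from the text: retry the transaction three times


def read_spd_or_disable(read_spd, retries=SPD_READ_RETRIES):
    """Return (spd_data, channel_enabled) for one channel."""
    for _ in range(1 + retries):          # first attempt plus the retries
        try:
            return read_spd(), True
        except IOError:
            continue
    return None, False                    # error persisted: disable the channel
```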

11.4.13 MCR DIMM RAS

Multiplexed Combined Rank or Monument Creek (MCR) DIMMs are a special type of DDR5 LRDIMM with double the bandwidth of a regular DDR5 DIMM. The following are key RAS details related to MCR DIMMs:
  • iMC supports the same RAS capabilities across MCR DIMMs and standard DDR5 RDIMMs.
  • iMC views MCR DIMMs the same way it views their DDR5 RDIMM counterparts.
  • Error logging/signaling follows DDR5 RDIMM error logging/signaling; software/BIOS error handling flows remain unchanged.
  • The compatibility of DDR5 RDIMM RAS features with other RAS features and security features also applies to MCR DIMMs.
  • For MCR DIMMs, there is a change in the encoded 3-bit chip-select value:
    • Native mode (1LM DDR5 RDIMMs):
      • CHIP_SELECT[2:0] = {DIMM index, rank index, sub-channel index}
    • MCR DIMM mode:
      • CHIP_SELECT[2:0] = {rank index, pseudo-channel index, sub-channel index}
  • This 3-bit chip-select is commonly seen in the iMC's MCi_MISC register (bits 48:46) and RETRY_RD_ERR_LOG_ADDRESS1 (bits 8:6).
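A decoder for the 3-bit chip-select field might look like the sketch below. The bit position (MCi_MISC bits 48:46) comes from the text above; the function name, the dictionary keys, and the assumption that the first field listed for each mode occupies the most significant bit are illustrative and should be checked against the register definition.

```python
def decode_chip_select(mci_misc: int, mcr_mode: bool) -> dict:
    """Extract CHIP_SELECT[2:0] from MCi_MISC bits 48:46 and name its fields."""
    cs = (mci_misc >> 46) & 0x7
    hi, mid, lo = (cs >> 2) & 1, (cs >> 1) & 1, cs & 1
    if mcr_mode:
        return {"rank": hi, "pseudo_channel": mid, "sub_channel": lo}
    return {"dimm": hi, "rank": mid, "sub_channel": lo}
```

The same raw value therefore names a different physical location depending on the DIMM type, which is why error-handling software needs to know the mode before interpreting the field.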
The following table lists the key Intel UPI RAS features.
Table 102. Summary of Intel UPI RAS Features and Resources
RAS Feature: Intel UPI protocol protection via CRC (16 bit)
  Description: This function allows detection of transient data errors using a 16-bit checksum.
  Standard RAS SKU: Yes
  Advanced RAS SKU: Yes
RAS Feature: Intel UPI link level retry
  Description: The Intel UPI link retransmits when a transient error (CRC mismatch) is detected on the link.
  Standard RAS SKU: Yes
  Advanced RAS SKU: Yes
RAS Feature: Intel UPI dynamic link width reduction
  Description: Dynamic Link Width Reduction (also called Intel UPI Self-healing) is recovery from a hard failure of one or more data lanes on a physical Intel UPI link, by dynamically resizing the link to a reduced width.
  Standard RAS SKU: No
  Advanced RAS SKU: Yes
RAS Feature: Intel UPI system quiescence
  Description: Mechanism to transition the Intel UPI link from L0 to L1, thereby draining outstanding transactions that are in flight within the Intel UPI interface buffers and preventing other agents from issuing new transactions, without resetting the whole system. The OS may have a dependency on how long the system can stay in quiescence mode. This is a basic building block for several system serviceability and scalability features, for example, OS CPU on-lining/off-lining. Intel UPI quiesce is required prior to on/off-lining the link.
  Standard RAS SKU: No
  Advanced RAS SKU: Yes

NOTES:

  1. RAS features may not be supported on all SKUs of a processor type.
  2. The two-socket workstation follows the standard RAS SKU.

11.5.1 Intel UPI Protocol Protection via 16-bit CRC

Link-level retry is supported using a circular FIFO retry queue into which every info or idle flit being sent is placed. A flit is removed from the queue only when an acknowledgment is returned from the receiver, indicating that the target link layer received the info or idle flit error-free. If the target receives a flit with a CRC error, it returns a link-level retry indication. The processor has a time-out counter for LLR_Req to LLR_Ack. Its value must exceed the round-trip flight time between the two components, and it is programmable in terms of link layer flits received. To avoid deadlock, the L1 power state is exited in both Rx and Tx whenever a link-level retry is active in either the Rx or the Tx.
Cyclic Redundancy Check (CRC) is a mechanism for protecting the data integrity of a serial stream. The sender generates a CRC from the data pattern and a polynomial equation (typically an XOR of certain bit positions). The resulting CRC is a checksum computed from that specific data. When the data arrives at the receiver, the receiver performs the same CRC calculation using the same polynomial equation. The two CRCs are compared to detect bad data. When a CRC error is detected, the receiver asks the sender to retransmit the data. This action is termed "link level retry". The retry is "link level" because it is performed by the link layer logic; the protocol layer is unaware of it.
For reliable transmission, the Intel UPI link layer uses a 16b CRC for transmission error detection. Each Intel UPI flit is 192 bits and contains a 16b CRC field.
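The flit layout above (176 payload bits plus a 16-bit CRC in a 192-bit flit) can be illustrated with a generic bitwise CRC-16. The polynomial 0x1021 is a widely used CRC-16 polynomial chosen here only as a stand-in, since this section does not give the Intel UPI polynomial; placing the CRC in the low 16 bits of the flit is likewise an assumption.

```python
FLIT_BITS, CRC_BITS = 192, 16
PAYLOAD_BITS = FLIT_BITS - CRC_BITS      # 176


def crc16(payload: int, poly: int = 0x1021) -> int:
    """Bitwise CRC-16 over a 176-bit payload, MSB first, zero initial value."""
    crc = 0
    for i in reversed(range(PAYLOAD_BITS)):
        top = ((crc >> 15) & 1) ^ ((payload >> i) & 1)
        crc = (crc << 1) & 0xFFFF
        if top:
            crc ^= poly
    return crc


def build_flit(payload: int) -> int:
    """Append the CRC to the payload to form a 192-bit flit."""
    return (payload << CRC_BITS) | crc16(payload)


def check_flit(flit: int) -> bool:
    """Receiver side: strip the CRC, recompute it, and compare."""
    payload, received = flit >> CRC_BITS, flit & 0xFFFF
    return crc16(payload) == received
```

A failed `check_flit` is the condition that would start the link-level retry flow.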
Intel UPI agent (actor) initiates link-level retry (LLR) process for handling CRC error detected in the received _FLIT payload.
英特尔 UPI 代理(执行者)发起链路级重试(LLR)过程,以处理接收到的 _FLIT 负载中检测到的 CRC 错误。
Terminology: 术语:
Intel UPI Ultra Path Interconnect, next generation of Intel QPI
英特尔 UPI 超级路径互连,英特尔 QPI 的下一代
FLIT Unit of transfer at the link layer. 192 bits, 176-bit payload and 16-bit CRC
FLIT 链路层传输单位。192 位,176 位有效载荷和 16 位 CRC
receive module. Each Intel UPI agent has both and Tx modules.
接收模块。每个英特尔 UPI 代理都有 Rx 和 Tx 模块。
Tx transmit module. Each Intel UPI agent has both _Rx and Tx modules.
Tx 发射模块。每个英特尔 UPI 代理都有 Rx 和 Tx 模块。
Intel UPI LLR feature uses a circular FIFO queue, where every header, data, and idle flit transmitted is copied into the Link level retry queue (LLRQ, aka retry-buffer). The transmitter only removes a flit from the queue when the receiver returns an acknowledgment, which indicates that the target Link layer received the flit error-free. In addition, the link layer appends of CRC to the incoming flits payload.
Intel UPI LLR 功能使用循环 FIFO 队列,其中每个头部、数据和空闲 flit 传输都被复制到链路级重试队列(LLRQ,又称重试缓冲区)。当接收方返回确认时,表示目标链路层已无误接收 flit 时,发射机才从队列中移除一个 flit。此外,链路层在传入 flits 负载中附加 个 CRC。
On the receive side of the link, the CRC bits are stripped off and the 176b of payload is used to calculate the expected CRC of the flit. If the flit's calculated CRC matches the received , then the flit passes the CRC check and is placed into the RxQ. However, CRC comparison fails, the LLR flow is initiated.
在链路的接收端,CRC 位被剥离,176b 的负载用于计算 flit 的预期 CRC。如果 flit 的计算 CRC 与接收到的 匹配,则 flit 通过 CRC 检查并放入 RxQ。然而,CRC 比较失败,LLR 流程就会被启动。
Recovery is accomplished through a retransmission called LLR. Both the Tx and the Rx count the number of protocol-valid flits sent, which gives each flit a sequence number. When a CRC error is detected at a given endpoint, that endpoint's Tx sends a Replay Request (LLR.Req) carrying an embedded copy of the bad flit's sequence number. When the remote endpoint receives the LLR.Req, it indexes into its LLRQ at the given sequence number and begins replaying flits. It also sends a Replay Acknowledgment (LLR.Ack), indicating that the LLR.Req was serviced.
If the endpoint that initiates the replay request does not receive a replay acknowledgment within a specified number of flits (configurable in KTILCL[6:4]), it sends another LLR.Req. If the replay is still unsuccessful after additional replay requests (configurable in KTILCL[11:8]), the link layer requests that the physical layer perform a re-initialization. If re-INIT is still unsuccessful, the Intel UPI agent signals an uncorrectable error, LRSM Abort (Link Layer Replay State Machine - Abort), indicating that the link layer has exhausted all actions available to recover the link. If the processor supports the Advanced RAS capability, then instead of entering LRSM Abort, the link attempts to survive by initiating the 'dynamic link width reduction' feature. Refer to Figure 39 on page 281 for an example of this flow.
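The retry queue and the escalation order described above can be sketched as a small model. Everything here is hypothetical: the class and function names, the queue depth, the omission of sequence-number wrap-around, and the exact way failed attempts are counted against `max_num_retry` and `max_num_phy_reinit` are assumptions made for illustration.

```python
from collections import deque


class LlrTransmitter:
    """Tx side: keeps a copy of every sent flit until it is acknowledged."""

    def __init__(self, depth: int = 8):
        self.depth = depth
        self.llrq = deque()  # (sequence number, flit) pairs, oldest first
        self.next_seq = 0

    def send(self, flit):
        if len(self.llrq) == self.depth:
            raise RuntimeError("retry queue full: wait for acknowledgments")
        seq = self.next_seq
        self.llrq.append((seq, flit))
        self.next_seq += 1
        return seq

    def ack(self, seq):
        """The receiver got everything up to and including seq error-free."""
        while self.llrq and self.llrq[0][0] <= seq:
            self.llrq.popleft()

    def replay_from(self, seq):
        """Service an LLR.Req: resend this flit and all later ones."""
        return [flit for s, flit in self.llrq if s >= seq]


def next_action(failed_llr_requests, failed_phy_reinits,
                max_num_retry=15, max_num_phy_reinit=3, advanced_ras=False):
    """Pick the next recovery step once earlier attempts have failed."""
    if failed_llr_requests < max_num_retry:
        return "send LLR.Req"
    if failed_phy_reinits < max_num_phy_reinit:
        return "request PHY re-init"
    return "dynamic link width reduction" if advanced_ras else "LRSM Abort"
```

The defaults of 15 and 3 mirror the hardware defaults 1111b and 11b listed for KTILCL[11:8] and KTILCL[13:12].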
The following figures provide example transaction flows:
  1. Normal link layer flow. No fault.
  2. Transient fault with LLR successful.
  3. Transient fault with LLR failure and successful PHY-layer re-INIT.
  4. In-band PHY-INIT flow (L0c).
  5. Persistent fault resulting in link failure and triggering fatal MCERR/MCSMI (Standard RAS).
  6. Persistent fault triggering dynamic link width reduction so that the Intel UPI link survives, that is, self-healing (Advanced RAS).
Figure 34. Intel UPI Link Example 1 - Normal Flow (No Fault)
Figure 35. Intel UPI Link Example 2 - Transient Fault with LLR Successful
Figure 36. Intel UPI Link Example 3 - Transient Fault with LLR Failure and Successful PHY-Layer Re-INIT
Figure 37. Intel UPI Link Example 4 - PHY-Layer Initialization (L0c)
Figure 38. Intel UPI Link Example 5 - Persistent Fault Resulting in Link Failure (Standard RAS)
Figure 39. Intel UPI Link Example 6 - Persistent Fault Resulting in Dynamic Link-Width Reduction (Advanced RAS)
NOTE
Example 6 applies to processors with Advanced RAS. Integration and validation of this feature are described in Intel UPI Dynamic Link Width Reduction on page 283.
Table 103 on page 282 describes the configuration and status registers for LLR feature.
Table 103. Intel UPI LLR Feature Configuration and Status Registers
Configuration Register. Scope: Global. Register: KTILCL
  • Bits 13:12 - max_num_phy_reinit
    Number of consecutive PHY reInits before entering RETRY_ABORT.
    Note: HW default setting is 11b.
  • Bits 11:8 - max_num_retry
    Number of consecutive LLRs before a physical layer reInit.
    Note: HW default setting is 1111b.
  • Bits 6:4 - llr_retry_timeout
    LLR timeout value in terms of flits received:
    - 000 - 4095 flits
    - 001 - 2047 flits
    - 010 - 1023 flits
    - 011 - 511 flits
    - 100 - 255 flits
    - 101 - 127 flits
    - 110 - 63 flits
    - 111 - 31 flits
    Note: HW default is [value missing in source]. This timeout value must be set higher than the round-trip delay between this device and the remote device. Values below 4095 are intended for validation purposes and are not expected for use in production systems.

Status Register. Scope: Global. Register: KTILS
  • Bits 12:10 - retry_state
    Reflects the current state of the local retry state machine:
    - 000 - Retry_Local_Normal
    - 001 - Retry_LLRReq
    - 010 - Retry_Local_Idle
    - 011 - Retry_Phy_Reinit
    - 100 - Retry_Abort
  • Bits 9:8 - init_state
    Reflects the current initialization state of the link layer, including any stall conditions controlled by KTILCL[3:2]:
    - 00 - NOT_RDY_FOR_INIT
    - 01 - PARAM_EX
    - 10 - CRD_RETURN_STALL
    - 11 - INIT_DONE
  • Bits 7:0 - link_layer_retry_queue_consumed
    Link Layer Retry Queue Consumed: number of Retry Queue entries currently consumed while waiting for ACK.
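As an illustration of the field layout in Table 103, the following sketch decodes raw KTILCL and KTILS values. The bit positions and encodings come from the table; the function names are hypothetical, and how the 32-bit register values are read from hardware is outside the scope of this example.

```python
# Timeout encoding from Table 103: 000 -> 4095 flits ... 111 -> 31 flits.
LLR_TIMEOUT_FLITS = {code: (1 << (12 - code)) - 1 for code in range(8)}

RETRY_STATES = {
    0: "Retry_Local_Normal",
    1: "Retry_LLRReq",
    2: "Retry_Local_Idle",
    3: "Retry_Phy_Reinit",
    4: "Retry_Abort",
}
INIT_STATES = {
    0: "NOT_RDY_FOR_INIT",
    1: "PARAM_EX",
    2: "CRD_RETURN_STALL",
    3: "INIT_DONE",
}


def bits(value: int, high: int, low: int) -> int:
    """Extract the inclusive bit field value[high:low]."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_ktilcl(value: int) -> dict:
    return {
        "max_num_phy_reinit": bits(value, 13, 12),
        "max_num_retry": bits(value, 11, 8),
        "llr_retry_timeout_flits": LLR_TIMEOUT_FLITS[bits(value, 6, 4)],
    }


def decode_ktils(value: int) -> dict:
    return {
        "retry_state": RETRY_STATES.get(bits(value, 12, 10), "reserved"),
        "init_state": INIT_STATES[bits(value, 9, 8)],
        "retry_queue_consumed": bits(value, 7, 0),
    }
```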
11.5.3 Intel UPI Dynamic Link Width Reduction
Intel UPI dynamic link width reduction allows the system to continue operating even when a hard failure is detected in some of the lanes, thus improving system up-time and reliability. A link that is experiencing problems has its width reduced from 24 lanes to 8 lanes. Link reduction can be initiated by either transmitter (Tx) or receiver (Rx) side DFT. Rx-initiated reduction is generally the easier of the two and tests the logic portion of the reduction. Tx-initiated reduction additionally requires that the Rx circuits correctly drop the lane and pass the information to the logic portion of the PHY, on top of testing the same logic as Rx-initiated reduction. Rx is used during enabling, with Tx providing the bulk of coverage. UPI failover can use only the lower or upper lane chunk, [7:0] or [23:16]. See the Intel UPI Specification for a description of this feature.
A multi-lane failure is recoverable as long as the failures do not affect both [7:0] and [23:16]. L0p is supported only as a power-saving transition from a full-width Intel UPI link, and L0p is disabled if a RAS condition results in a degraded port due to dynamic link width reduction.
Refer to Figure 39 on page 281 for an example flow.
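The recoverability rule for lane failures can be sketched as follows. This is an illustrative model only: the function name is hypothetical, and preferring the lower chunk when both chunks are healthy is an assumption, since the text does not say which chunk the hardware selects.

```python
LOWER = range(0, 8)    # lanes [7:0]
UPPER = range(16, 24)  # lanes [23:16]


def surviving_chunk(failed_lanes):
    """Return the lane group the link can run on, or None if it cannot recover."""
    failed = set(failed_lanes)
    if not failed:
        return "full width [23:0]"
    if failed.isdisjoint(LOWER):
        return "[7:0]"
    if failed.isdisjoint(UPPER):
        return "[23:16]"
    return None  # failures hit both usable chunks
```

For example, failures on lanes 3 and 10 still leave [23:16] usable, while failures on lanes 3 and 20 leave no usable 8-lane chunk.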

11.6 IIO Sub-System RAS Features

The following table summarizes the key IIO RAS features.
Table 104. Summary of IIO Module RAS Features and Resources
RAS Feature: PCIe link CRC error check and retry
Description: Ability to detect a PCIe LCRC error using a 32-bit pattern and retry the transaction. It is used to ensure data integrity at the data link layer. An LCRC error is reported as "Bad DLLP". See the PCI Express 4.0 Base Specification for more details.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

RAS Feature: PCIe Link Retraining and Recovery
Description: Ability to retrain the link at either a smaller width or a lower speed when corrected errors reach a predetermined threshold. See the PCI Express 4.0 Base Specification for more details.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

RAS Feature: PCIe End to End CRC (ECRC)
Description: ECRC is used to ensure end-to-end data integrity detection in systems that require high data reliability. A Transaction Layer end-to-end 32-bit CRC (ECRC) can be placed in the transaction layer packet (TLP) Digest field at the end of a TLP. The ECRC covers all fields that do not change as the TLP traverses the path from requester to completer.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

RAS Feature: PCIe Enhanced Downstream Port Containment (EDPC)
Description: DPC allows halting of PCIe traffic below a downstream port after an unmasked uncorrectable error is detected at or below the port, avoiding the potential spread of any data corruption, and permitting error recovery if supported by software. EDPC is an enhancement to DPC that adds root port programmable IO (RPPIO) errors. RPPIO errors enable fine-grained DPC control when non-posted requests detect certain uncorrected or advisory non-fatal errors. See section 6.2.10 of the PCI Express Specification for further details.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

RAS Feature: PCIe Card Hot-Plug (Add/Remove/Swap)
Description: Allows removal or addition of a PCIe card while the system is running. This feature is as per the PCIe hot-plug specification and requires an OOB SMBus mechanism for hot-plug/removal operation. This does not include PCIe surprise removal.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

RAS Feature: PCIe Card Hot-Plug Surprise
Description: Allows removal or addition of a PCIe card while the system is running. This feature is as per the PCIe Specification and does not require an OOB SMBus mechanism for hot-plug/removal operation. Requires appropriate SW handling of surprise removal.
Standard RAS SKU: Yes. Advanced RAS SKU: Yes.

NOTES:

  1. RAS features may not be supported on all SKUs of a processor type.
  2. Two-socket workstations follow the Standard RAS SKU.

11.6.1 PCIe* / CXL.io Leaky Bucket

The following sections describe the PCIe* and CXL.io leaky bucket implementation.

11.6.1.1 Functional Description

The 3rd Gen Intel Xeon Scalable Processor introduced a new PCIe* correctable error filtering mechanism that allows a platform to tolerate bursts of receiver correctable errors and lets the errors deplete over time when the link is in L0 state, and pauses in L1 state. The goal of the PCIe* leaky bucket hardware is autonomous modulation of a link's correctable error reporting to enable protection from a large number of errors in a short time span, which is called burst-protection. Additionally, the hardware lets the counters deplete per the firmware programmed bit error rate for the link. This mechanism is based on a leaky bucket algorithm and allows the firmware to program the acceptable correctable error burst duration as well as the error depletion or leak rate in accordance with the design link's bit error rate target. The Leaky Bucket Calculator - Memory and PCIe, document number 655405 is available as a calculator for programming leaky bucket parameters.

11.6.1.2 Error Handling

Once the burst-protected, bit-error-rate-adjusted error count hits a threshold, link survival actions can be requested. The above-threshold leaky bucket error count is also referred to as a "prolonged-error" event. If the errors are not prolonged, no action is necessary. If the errors are prolonged, the link's physical layer survival actions, such as re-equalization or link rate degradation, may be requested to keep the link active. The software stack may also initiate a recovery back to the original link speed.
At the heart of the leaky bucket hardware are 16 per-lane counters of 5 bits each (ERRCNTx). These counters serve as the buckets that accumulate and deplete errors on each lane. The counters are cleared by any of the following:
  • Setting the error threshold field to 0
  • Entering Phase 1 of link equalization
  • Completing a link speed change
  • Entering the Detect state

Bucket Fill or Burst Protection

A small programmable window called *AGGRERR is used to monitor the occurrence of correctable errors at the receiver. Two separate burst protection windows can be programmed: a 5-bit G3AGGRERR for Gen3 and Gen4 links and a 20-bit AGGRERR for Gen1 and Gen2 links. This is due to error burst differences between the 128b/130b and 8b/10b encodings used for these link rates, respectively. The purpose of this window is to collapse a burst of errors into a single error event. An error-observed flag is therefore set for any number of errors observed during the burst protection window. If the error-observed flag is set when the burst protection window expires, it is counted as one error event: the per-lane counter for the lane is incremented by 1, the burst protection window timer and the error-observed flag are reset, and error observation restarts. The leaky bucket algorithm monitors the link for the following types of errors:
  • Decoding violations
  • Framing violations
  • Elastic buffer errors (overflow or underflow)
  • Symbol lock and lane de-skew loss
  • BAD_DLLP (DLLP CRC error)
  • BAD_TLP (TLP ECRC error)

Bucket Leak or Error Depletion

The PCIe* Base Specification allows a link to tolerate a certain ratio of errors to transmitted bits (the bit error rate, BER). A larger 50-bit programmable window (EXP_BER) is used to deplete the error counter at the rate specified as the link's bit error rate. This free-running timer decrements (leaks) the per-lane error counter (bucket) by one error each time it expires.
Figure 40. PCIe* Leaky Bucket Algorithm
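The fill and leak behavior can be modeled for a single lane as shown below. This is a behavioral sketch, not the hardware design: time is measured in abstract ticks rather than in the units of the *AGGRERR and EXP_BER fields, the class and attribute names are hypothetical, and saturating the counter at 31 is an assumption based on its 5-bit width.

```python
class LaneLeakyBucket:
    """One per-lane error counter (ERRCNTx) with burst protection and leak."""

    def __init__(self, burst_window: int, leak_window: int, threshold: int):
        self.burst_window = burst_window  # ticks; stands in for *AGGRERR
        self.leak_window = leak_window    # ticks; stands in for EXP_BER
        self.threshold = threshold
        self.count = 0                    # 5-bit counter
        self.error_observed = False
        self.burst_timer = 0
        self.leak_timer = 0
        self.faulty = False               # sticky lane status flag

    def tick(self, error: bool = False) -> None:
        """Advance one tick; `error` marks a correctable error in this tick."""
        if error:
            self.error_observed = True
        self.burst_timer += 1
        if self.burst_timer == self.burst_window:
            # Any number of errors in the window counts as one event.
            if self.error_observed:
                self.count = min(self.count + 1, 31)
            self.error_observed = False
            self.burst_timer = 0
        self.leak_timer += 1
        if self.leak_timer == self.leak_window:
            # Free-running leak: drop one error per expiry.
            self.count = max(self.count - 1, 0)
            self.leak_timer = 0
        if self.count > self.threshold:
            self.faulty = True
```

With `burst_window=4`, four consecutive error ticks add only one count, which is the burst-protection effect; the lane is flagged only when error events arrive faster than the leak removes them.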
Once the error count in any of the per-lane counters (ERRCNTx) exceeds a firmware-programmable count threshold, that lane is flagged as faulty in the respective link rate's lane status register (G3ERRLNSTS/ERRLNSTS). Two separate threshold values can be programmed for Gen4/Gen3 (128b/130b encoding rates) and Gen2/Gen1 (8b/10b rates) links to account for the encoding differences discussed previously. The above-threshold leaky bucket error count is also referred to as a "prolonged-error" event. Once a link experiences a prolonged-error event, the leaky bucket algorithm supports the following two types of optional link actions, together with the recovery and signaling behavior listed after them:
  • Link re-equalization request for the link rate that had the error. Once the request bit is set and system agents opt to process the re-equalization, they must follow the PCIe* Base Specification criteria. This action is supported only for link rates at or above 8.0 GT/s (Gen3/Gen4).
  • Disable the link rate that had the error, thus directing the link training and status state machine (LTSSM) to retrain the link at a lower speed. This action is supported for all link rates. Once the link speed is disabled, the link status register's link bandwidth management bit is also set. If link management interrupts are enabled, an MSI may be triggered. System software agents can monitor the interrupt and process the request by taking software-layer actions such as quiescing the link.
  • Recovery: System software can optionally clear a link rate disable bit in the link status register. This in turn directs the LTSSM to retrain the link at the highest available link rate.
  • SMI/IEH Signaling: Leaky bucket hardware does not support generation of an SMI or ERR_COR message to the IIO error signaling hierarchy. However, the firmware handler for the previously described correctable error thresholding mechanism may be modified to signal that a leaky bucket action was taken. In the current Intel system BIOS reference code, a System Event Log (SEL) hook is created for BMC reporting of the leaky bucket action. The BIOS also clears the lane faulty status (prolonged-error event) bit.
PCIe* interface is an integral part of the IIO module and all the PCIe* links support link CRC and link level retry in case of a CRC error. This feature is implemented as specified in the PCI Express Base Specification, Revision 4.0. Refer to this specification for further details.

11.6.2.1 Functional Description

PCIe* interface is an integral part of the IIO module and all the PCIe* links support link CRC and link level retry in case of a CRC error. The TLP transmission path through the Data Link Layer prepares each TLP for transmission by applying a sequence number, then calculating and appending a Link CRC (LCRC), which is used to ensure the integrity of TLPs during transmission across a Link from one component to another. TLPs have a sequence number attached to them by the DLL in the transmitter so that a packet can be re-sent if an error is detected on that packet by the receiver. Each sent packet is moved to a Retry Buffer until acknowledged as received by the receiver using the Ack/Nak protocol.

11.6.2.2 Error Logging and Handling

High-speed interconnects can be affected by a variety of issues that degrade signal integrity (such as jitter, noise, poor connections, and trace lengths). The specification requires a BER of no worse than 10^-12. Errors still happen at the minimum allowed BER, or even at a better one. Error protection is provided on the PCIe* link on a per-packet basis. The CRC protects the entire packet.
TLPs are protected by a 32-bit CRC, which the Data Link Layer uses to detect errors in TLPs, and DLLPs are protected by a 16-bit CRC. When operating in firmware-first mode, a CRC error is signaled to the SIEH as long as the Root Control Register is set to report correctable errors. When operating in OS-first mode, it is signaled through MSI/INTx.
  • TLP Transaction Types:
    - Memory Read/Write
    - IO Read/Write
    - Configuration Read/Write
    - Completion
    - Message
    - AtomicOp
  • DLLP Transaction Types:
    - TLP Ack/Nak
    - Power Management
    - Link Flow Control
    - Vendor-specific
This feature is implemented as specified in the PCI Express* Base Specification. Refer to this specification for further details. Three PCIe* status registers are involved with this feature: the Device Status register should show that the correctable error field was triggered, the Correctable Error Status register should show the Bad DLLP trigger, and the Root Error Status register should show that a correctable error was received.
The PCIe* interface incorporates a recovery mechanism that, when certain link degradation occurs, retrains the link without impacting pending transactions. If the degradation is observed on a specific lane, the recovery mechanism reduces the link width as per the defined link-width degradation rules (for example, an x16 link degrades to a narrower link). Refer to the respective Platform Design Guide (PDG) for all the defined link degradation rules. If the degradation is observed randomly across multiple lanes, the recovery algorithm attempts to retrain at the next lower allowed speed. For further details, refer to the PCI Express* Base Specification.

11.6.3.1 Functional Description

Link Retraining and Recovery is the ability of the CPU to initiate retraining of the PCIe* link in response to error conditions. Error scenarios that induce a retrain include excessive Link Layer replays, inferred electrical idle, and excessive physical layer errors. The PCIe* interface incorporates recovery mechanisms that, when certain link degradation occurs, retrain the link without impacting pending transactions. If the degradation is observed on a specific lane, the recovery mechanism reduces the link width as per the defined link-width degradation rules (for example, an x16 link degrades to a narrower link). Refer to the respective Platform Design Guide (PDG) for all the defined link degradation rules. If the degradation is observed randomly across multiple lanes, the recovery algorithm attempts to retrain at the next lower allowed speed.

11.6.3.2 Signaling

The transmitter's Data Link Layer takes a TLP from the transaction layer and assigns it a sequence number from a 12-bit counter. The transmitter then generates the 32-bit LCRC and appends it to the same packet. The CRC can detect errors at the receiver but cannot correct them. The transmitter then places a copy of the TLP in the retry buffer, so that the TLP can be resent from the buffer if the receiver reports that it received the packet with an error. When the transmitter receives an Ack with a sequence number, it discards the replay buffer entries up to and including that sequence number. This ensures that packets are seen in order: successful receipt of a TLP also confirms receipt of all TLPs before it.

On the receive side, a packet is passed from the PHY layer to the data link layer, where the LCRC check happens. If the check is successful, the NEXT_RCV_SEQ counter is used to ensure that packets are received in the correct order. Once the sequence number is verified, the TLP is considered error-free and is sent to the transaction layer of the receiver. If the LCRC check or the sequence number check fails, a Nak is generated with the sequence number of the last good TLP received. The transmitter then resends all packets in the replay buffer that follow the sequence number carried in the Nak, because packets up to and including that sequence number are assumed to have been received successfully. Those earlier packets are no longer needed and can be discarded, because the Nak acknowledges successful receipt of all packets up to that sequence number. A few registers are directly related to this feature:
  • The device status will show that the correctable error field was triggered.
  • The correctable error status should show the error type trigger.
  • The root error status should also show that a correctable error or multiple correctable errors were received.
  • The link status may show the link bandwidth management status bit set, indicating that a link retraining has completed.
Assuming signaling was enabled, correctable errors are signaled with ERR_COR.
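The transmitter-side handling of Ack and Nak can be sketched with a small replay buffer model. The 12-bit sequence counter comes from the text; the class, its method names, and the way stale sequence numbers are ignored are hypothetical simplifications, and timers and flow control are left out.

```python
SEQ_MOD = 1 << 12  # sequence numbers come from a 12-bit counter


class ReplayBuffer:
    """Holds a copy of each transmitted TLP until the receiver acknowledges it."""

    def __init__(self):
        self.next_seq = 0
        self.entries = []  # (sequence number, TLP) pairs, oldest first

    def transmit(self, tlp):
        seq = self.next_seq
        self.entries.append((seq, tlp))
        self.next_seq = (seq + 1) % SEQ_MOD
        return seq

    def _purge(self, seq):
        """Drop every entry up to and including `seq`, modulo 4096."""
        if not self.entries:
            return
        oldest = self.entries[0][0]
        acked = (seq - oldest) % SEQ_MOD + 1
        if acked <= len(self.entries):  # otherwise `seq` is older than the buffer
            del self.entries[:acked]

    def on_ack(self, seq):
        self._purge(seq)

    def on_nak(self, seq):
        """`seq` is the last good TLP; everything after it is replayed."""
        self._purge(seq)
        return [tlp for _, tlp in self.entries]
```

The modulo arithmetic in `_purge` lets an Ack or Nak work across the wrap from sequence number 4095 back to 0.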

11.6.4 PCIe* / CXL.io End-to-End CRC (ECRC)

The processor supports End-to-End CRC (ECRC). ECRC is an optional PCIe* data integrity protection field, intended to protect a Transaction Layer Packet (TLP) through different hierarchies in PCIe*. In platforms that utilize PCIe* switches and bridges, an ECRC is generated by one end point and is checked by the other endpoint. Refer to the PCI Express* Base Specification, Revision 4.0 for further details.

11.6.4.1 ECRC - Functional Description

The processor supports ECRC. ECRC is an optional PCIe* data integrity protection field, intended to protect a TLP through different hierarchies in PCIe*. To ensure end-to-end data integrity detection in systems that require high data reliability, a transaction layer end-to-end 32-bit CRC can be placed in the TLP Digest field at the end of a TLP. Every device in the path needs to support ECRC. The ECRC covers all fields that do not change as the TLP traverses the path (invariant fields).
There are two bits that can change their value as the TLP traverses the path:
  • Bit 0 of the Type field: changes when a configuration transaction is forwarded across a bridge and is converted from Type 1 to Type 0 because it has reached its destination bus.
  • EP bit: may be set by an intermediate device that forwards a TLP whose data is known to be corrupted (poisoned).
These two bits are referred to as variant bits and are therefore excluded from ECRC coverage.
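One way to picture the exclusion of the two variant bits is to force them to a fixed value before computing the CRC, so that their actual value cannot affect the result. The sketch below rests on several assumptions: forcing the bits to 1 and the bit positions (header byte 0 bit 0 for the Type field, header byte 2 bit 6 for EP) reflect a reading of the PCI Express Base Specification that should be verified against it, and `zlib.crc32` is only a stand-in because its bit ordering differs from the ECRC defined by PCIe.

```python
import zlib

TYPE_BIT0_MASK = 0x01  # header byte 0, bit 0: bit 0 of the Type field (assumed)
EP_MASK = 0x40         # header byte 2, bit 6: EP bit (assumed)


def ecrc_input(tlp: bytes) -> bytes:
    """Return the TLP with both variant bits forced to 1."""
    masked = bytearray(tlp)
    masked[0] |= TYPE_BIT0_MASK
    masked[2] |= EP_MASK
    return bytes(masked)


def ecrc(tlp: bytes) -> int:
    """Stand-in ECRC: CRC-32 over the TLP with its variant bits masked."""
    return zlib.crc32(ecrc_input(tlp))
```

With this masking, a bridge that changes bit 0 of the Type field or sets the EP bit leaves the ECRC valid, while a change to any invariant field is detected.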
A switch that supports ECRC must check the ECRC on TLPs that target the switch itself. It can also optionally check the ECRC on TLPs that it forwards. On the TLPs that the switch forwards, the switch must preserve the ECRC (forward it untouched) as an integral part of the TLP, regardless of whether the switch checks the ECRC or whether the ECRC check fails. Checking the LCRC verifies that there are no transmission errors across a given link, but the LCRC is regenerated at the egress port of each routing element. For data integrity, the ECRC is carried forward unchanged on its journey between the Requester and the Completer. When the target device checks the ECRC, any errors introduced along the way have a high probability of being detected. Refer to the PCI Express* Base Specification, Section 2.7 for further details.

11.6.5 PCIe* / CXL.io Enhanced Downstream Port Containment (EDPC)

EDPC is an optional feature and its main function is to abort PCIe* traffic below a downstream port after an unmasked uncorrectable error or poison is detected at or below the port. This guarantees error containment and allows for possible recovery via software.
A downstream port indicates support for EDPC by implementing a DPC Extended Capability structure, which contains all EDPC-related control and status bits. EDPC is disabled by default. A platform that wishes to enable EDPC must do so by writing the trigger enable field in the DPC Control Register. DPC may be triggered by receipt of an uncorrectable error message, by an unmasked uncorrectable (UC) error, or by the DPC software trigger. When DPC is triggered by receipt of an uncorrectable error message, the Requester ID from the message is recorded in the DPC Error Source ID register and the message is then discarded. When DPC is triggered by an unmasked uncorrectable error, the error will not be signaled with an uncorrectable error message or ERR_COR message if enabled. When DPC is triggered, the downstream port behaves in the following manner:
  • Immediately sets the DPC Trigger Status bit.
  • Populates the DPC Trigger Reason field (with one of Unmasked UC, ERR_FATAL, ERR_NONFATAL, RP PIO error, or SW triggered).
  • Disables its link (the Link Training and Status State Machine, or LTSSM, is directed to the Disabled state). Once the link is in this state, it remains so until the DPC Trigger Status bit is cleared. To ensure that the LTSSM has time to reach the Disabled state, or at least to bring the link down under a variety of error conditions, SW must leave the downstream port in DPC until the Data Link Layer Link Active bit in the Link Status Register reads 0b.
When the DPC Trigger Status bit is set and the DPC RP Busy bit is set, software must leave the Root Port in DPC until the DPC RP Busy bit reads 0b. Once SW releases the downstream port from DPC, the affected link will attempt to retrain. SW can use one or both of the following to learn when the link reaches the DL_Active state again:
  • Data Link Layer State Changed interrupts
  • DL_Active ERR_COR signaling
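The release conditions described above can be sketched as a small predicate. This is a minimal illustration, not platform code: the `PortState` class and its field names are hypothetical stand-ins for the DPC Trigger Status bit, the DPC RP Busy bit, and the Data Link Layer Link Active bit, which real software would read from the port's configuration space.

```python
from dataclasses import dataclass


@dataclass
class PortState:
    """Hypothetical snapshot of the bits software polls before leaving DPC."""
    dpc_trigger_status: bool  # DPC Trigger Status bit
    dpc_rp_busy: bool         # DPC RP Busy bit
    dll_link_active: bool     # Data Link Layer Link Active bit (Link Status)


def may_release_from_dpc(state: PortState) -> bool:
    """Return True when software may clear DPC Trigger Status."""
    if not state.dpc_trigger_status:
        return False  # the port is not in DPC, so there is nothing to release
    if state.dll_link_active:
        return False  # link has not gone down yet; keep the port in DPC
    if state.dpc_rp_busy:
        return False  # Root Port is still busy; keep the port in DPC
    return True
```

In a polling loop, software would re-read the three bits until the predicate holds, then clear the trigger status so that the link can retrain.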
Support for DL_ACTIVE_ERR_COR signaling is indicated via a bit with the same name in the DPC Capability register. It is enabled by setting the DL_ACTIVE_ERR_COR_Enable bit in the DPC Control Register. The DL_ACTIVE state is indicated by the Data Link Layer Link Active bit in the Link Status register. DL_ACTIVE_ERR_COR signaling is managed independently of Data Link Layer State Changed interrupts, and it is permitted to use both mechanisms concurrently.
If the DL_ACTIVE_ERR_COR_Enable bit is set and the Correctable Error Reporting Enable bit in the Device Control register is set, the port must send an ERR_COR message each time the link transitions into the DL_ACTIVE state. Since this event is not handled as an error, the Correctable Error Detected bit in the Device Status Register must not be set due to DL_ACTIVE_ERR_COR signaling.
DL_ACTIVE_ERR_COR is signaled only when the link enters the DL_ACTIVE state. It is not signaled when the link exits that state (as would be the case with Data Link Layer State Changed interrupts).
For a given DL_ACTIVE event, if a Port is going to send both an ERR_COR Message and an MSI/MSI-X transaction, then the Port must send the ERR_COR Message prior to sending the MSI/MSI-X transaction. There is no corresponding requirement if the INTx mechanism is being used to signal DL_ACTIVE interrupts, since INTx Messages will not necessarily remain ordered with respect to ERR_COR Messages when passing through routing elements.
It is recommended that operating systems use the Data Link Layer State Changed interrupts method for signaling when DL_ACTIVE changes state. The DL_ACTIVE_ERR_COR signaling method represents a subset of the same events and is primarily intended for use by platform SW when it needs to be notified to do Downstream Port configuration or provide Firmware First services.
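The signaling rules above (ERR_COR only on entry to DL_ACTIVE, both enables required, and ERR_COR ordered before MSI/MSI-X) can be modeled in a few lines. This is an illustrative sketch only; the function name, parameter names, and string labels are assumptions made for the example and do not correspond to any real driver API.

```python
from typing import List


def dl_active_signals(was_active: bool, is_active: bool,
                      dl_active_err_cor_enable: bool,
                      correctable_error_reporting_enable: bool,
                      dll_state_changed_msi_enable: bool) -> List[str]:
    """Return the signals a port would emit, in order, for one link transition."""
    if was_active == is_active:
        return []  # no DL_ACTIVE state change, nothing to signal
    signals: List[str] = []
    entering = is_active
    # ERR_COR is sent only on entry to DL_ACTIVE, and only with both enables set.
    if entering and dl_active_err_cor_enable and correctable_error_reporting_enable:
        signals.append("ERR_COR")
    # The state-changed interrupt fires on entry and on exit. When both are
    # sent for the same event, ERR_COR must go first, so MSI is appended second.
    if dll_state_changed_msi_enable:
        signals.append("MSI")
    return signals
```

The model makes the "subset of the same events" remark concrete: an exit from DL_ACTIVE yields only the interrupt, never an ERR_COR message.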

11.6.5.1 Functional Description

DPC allows halting of PCIe* traffic below a downstream port after an unmasked uncorrectable error is detected at or below the port, avoiding the potential spread of any data corruption and permitting error recovery if supported by software. EDPC is an enhancement to DPC that adds Root Port Programmed I/O (RP PIO) error handling. RP PIO error controls enable fine-grained DPC control when non-posted requests encounter certain uncorrectable or advisory non-fatal errors. EDPC is an optional feature whose main function is to abort PCIe* traffic below a downstream port after an unmasked uncorrectable error or poison is detected at or below the port. This guarantees error containment and allows for possible recovery via software. A downstream port indicates support for EDPC by implementing a DPC Extended Capability structure, which contains all EDPC-related control and status bits. To enable EDPC, a platform must write to the trigger enable field in the DPC Control Register. EDPC provides robust error containment and an opportunity for recovery, enables software to recover from asynchronous removal events, and gives the platform the opportunity to gracefully handle fatal error conditions.

11.6.5.2 DPC Interrupts

An EDPC-capable downstream port must support the generation of DPC interrupts. DPC interrupts are enabled by the DPC Interrupt Enable bit in the DPC Control register. When an interrupt is signaled, the event is indicated by the DPC Interrupt Status bit in the DPC Status register.
If the port is enabled for level-triggered interrupt signaling using INTx messages, the virtual INTx wire must be asserted whenever and as long as all of the following conditions are satisfied:
  • The Interrupt Disable bit in the Command register is clear (0b).
  • The DPC Interrupt Enable bit is set (1b).
  • The DPC Interrupt Status bit is set (1b).
If the port is enabled for edge-triggered interrupt signaling using MSI or MSI-X, an interrupt message must be sent every time the logical AND of the following conditions transitions from FALSE to TRUE:
  • The associated vector is unmasked.
  • The DPC Interrupt Enable bit is set (1b).
  • The DPC Interrupt Status bit is set (1b).
The port may optionally send an interrupt message if interrupt generation has been disabled and the logical AND of the previous conditions is TRUE when interrupt generation is subsequently enabled. The interrupt message uses the vector indicated by the DPC Interrupt Message Number field in the DPC Capability register. This vector may be the same as or different from the vectors used by other interrupt sources within this function.
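The difference between the two signaling styles can be shown with a short model: INTx is a level that follows the AND of its conditions, while MSI/MSI-X sends one message per FALSE-to-TRUE transition of that AND. The names `intx_asserted` and `MsiEdgeDetector` are hypothetical, and the sketch does not model the optional message that may be sent when interrupt generation is re-enabled.

```python
def intx_asserted(interrupt_disable: bool,
                  dpc_interrupt_enable: bool,
                  dpc_interrupt_status: bool) -> bool:
    """Level-triggered: the virtual INTx wire follows the conditions directly."""
    return (not interrupt_disable) and dpc_interrupt_enable and dpc_interrupt_status


class MsiEdgeDetector:
    """Edge-triggered: report a message only on a FALSE -> TRUE transition."""

    def __init__(self) -> None:
        self._previous = False  # last observed value of the logical AND

    def update(self, vector_unmasked: bool,
               dpc_interrupt_enable: bool,
               dpc_interrupt_status: bool) -> bool:
        current = vector_unmasked and dpc_interrupt_enable and dpc_interrupt_status
        send_message = current and not self._previous
        self._previous = current
        return send_message
```

Calling `update` repeatedly with all three conditions true returns True once and False afterward, which mirrors why software must clear the status bit before a later event can raise a new MSI.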

11.6.5.3 Root Port Programmed IO (RP PIO) Error Handling

A set of RP PIO error control registers is provisioned to give fine-grained control over what happens when Non-Posted requests tracked by the Root Port encounter certain uncorrectable or Advisory Non-Fatal errors. This set of control and status bits provides more precise error handling for the following subset of uncorrectable errors:
  • Completion with Unsupported Request status (UR Cpl)
  • Completion with Completer Abort status (CA Cpl)
  • Completion Timeout (CTO)
The controls and status are further specified per request type:
  • Configuration Requests
  • I/O Requests
  • Memory Requests
As an example of what this fine-grained control allows, one could configure UR Cpl errors on memory read requests to trigger DPC for proper containment and error handling, while configuration requests that receive a UR Cpl return all 1s without triggering DPC, as needed for normal probing and enumeration.
Of the RP PIO log registers, only the RP PIO Header Log is implemented in the processor. How each unmasked RP PIO error is handled is determined by the settings in the RP PIO Severity register. For a given error, if the associated severity bit is set in the RP PIO Severity register, the error is handled as uncorrectable and triggers DPC if DPC is enabled. A DPC interrupt and/or ERR_COR is also signaled if enabled.
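The per-request-type, per-error controls can be pictured as two lookup tables keyed by (request type, error). The sketch below is an assumption-laden model: the function name, the dictionary layout, and the returned labels are invented for illustration, and the `"no_dpc"` label is only a placeholder because this section does not describe what happens when the severity bit is clear.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (request type, error), e.g. ("memory", "UR")


def handle_rp_pio_error(request: str, error: str,
                        mask: Dict[Key, bool],
                        severity: Dict[Key, bool],
                        dpc_enabled: bool) -> str:
    """Classify one RP PIO error according to hypothetical mask/severity tables."""
    key = (request, error)
    if mask.get(key, False):
        return "masked"  # masked errors are not handled by this logic
    if severity.get(key, False):
        # Severity bit set: handled as uncorrectable, triggers DPC if enabled.
        return "trigger_dpc" if dpc_enabled else "uncorrectable_no_dpc"
    return "no_dpc"  # placeholder: behavior not specified in this section


# Configuration from the example in the text: memory-read UR completions
# trigger DPC, configuration-request UR completions do not.
example_severity = {("memory", "UR"): True, ("config", "UR"): False}
```

With `example_severity`, a UR completion on a memory read yields `"trigger_dpc"` when DPC is enabled, while the same error on a configuration request leaves DPC untouched so enumeration can proceed.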

11.6.5.4 System Implications

It is recommended that software not set the Link Disable bit in the Link Control register while DPC is enabled but not triggered. Setting the Link Disable bit causes the Link to be directed to DL_Down, invoking some semantics similar to those in DPC but lacking others. The subsequent arrival of Posted Requests will likely trigger DPC soon anyway.
Similarly, it is recommended that a Port that supports DPC not set the Hot-Plug Surprise bit in the Slot Capabilities register. Having this bit set blocks the reporting of Surprise Down errors, which prevents DPC from being triggered by this important error and greatly reduces the benefit of DPC.
Software will need to implement the extended capability and associated semantics. If recovery is to be implemented, SW is responsible for taking the appropriate steps to perform the error recovery.

11.6.5.5 EDPC Error Flow Examples

This section describes key DPC/EDPC error flow diagrams.
Figure 41 on page 293 illustrates the DPC error flow when a Malformed TLP is detected in an upstream transaction.
Figure 41. DPC Error Flow Diagram, Malformed TLP Detected in Upstream Transaction
11.6.6 PCIe* / CXL.io Hot-Plug (Add/Remove/Swap)

11.6.6.1 Functional Description

The IO module has hot-plug/removal support only for PCIe* devices. This feature allows physical hot-plug/removal of a PCIe* device connected to the IO module. In addition, physical hot-plug/removal of other I/O devices downstream of the IO module may be supported by downstream bridges. PCIe* hot-plug is supported through the standard PCIe* native hot-plug. The IO module supports only the sideband hot-plug signals and does not support the in-band hot-plug messages. The IO module contains a Virtual Pin Port (VPP) that serially shifts the sideband PCIe* hot-plug signals in and out. External platform logic is required to convert the IO module serial stream to parallel. The virtual pin port is implemented via a dedicated SMBus port.
Hot-plug refers to the action of adding or removing devices so that failed cards can be replaced in a running system, with the OS continuing to run while the repair takes place. It also makes it possible to shut down and restart the software associated with the failed device. Hot-add is the addition of a PCIe* device to a running system. It is executed under OS control, typically to replace a failing device or to add capacity or capability. Hot-remove is the removal of a PCIe* device from a running system. Removal is done under OS control. A device can be removed because it has been flagged as failing, in preparation for replacement, or for an upgrade.
Hot-plug support increases platform availability and reduces unplanned downtime. A hot-plug operation can be initiated by software or by physical removal. Additional usage models include replacing misbehaving devices and adding capacity.
The IO module PCIe* hot-plug support is summarized as follows:
  • Support for hot-plug slots selectable by the BIOS
  • Support for serial mode hot-plug only, using SMBus devices such as the PCA9555
  • A single SMBus is used to control hot-plug slots.
  • Support for CEM/SIOM/Cable form factors
  • Support for MSI or ACPI paths for hot-plug interrupts
  • The IO module does not support in-band hot-plug messages on PCIe*. It does not issue them and silently discards them if received.
  • A hot-plug event cannot change the number of ports of the PCIe* interface (for example, through bifurcation).

11.6.6.2 Flow

OS and driver configuration:
  • All unpopulated slots are in Detect Mode after POST.
  • Disable PCIe* active state power management.
  • Enable PCIe* hot-plug ACPI tables.
  • PCIe* topology and link/slot capability discovery.
  • Device drivers register with the OOB service to be notified of events.
  • Program slot capabilities and slot control registers.
Table 105. Hot-Plug Interface

| Signal Name | Description | Action |
|---|---|---|
| ATNLED | This indicator is connected to the Attention LED on the baseboard. For a precise definition see the PCI Express Base Specification, Revision 5.0. | Indicator can be off, on, or blinking. The required state for the indicator is specified with the attention indicator register. IO module blinks this LED at . |
| PWRLED | This indicator is connected to the Power LED on the baseboard. For a precise definition see the PCI Express Base Specification. | Indicator can be off, on, or blinking. The required state for the indicator is specified with the power indicator register. IO module blinks this LED at . |
| BUTTON# | Input signal per slot which indicates that the user wishes to hot remove or hot add a PCIe* card/module. | If the button is pressed (BUTTON# is asserted), the Attention Button Pressed Event bit is set and either an interrupt or a general-purpose event message Assert/Deassert_HPGPE is sent. |
| PRSNT# | Input signal that indicates if a hot-pluggable PCIe* card/module is currently plugged into the slot. | When a change is detected in this signal, the Presence Detect Event Status register is set and either an interrupt or a general-purpose event message Assert/Deassert_HPGPE is sent. |
| PWRFLT# | Input signal from the power controller to indicate that a power fault has occurred. | When this signal is asserted, the Power Fault Event Register is set and either an interrupt or a general-purpose event message Assert/Deassert_HPGPE is sent. |
| PWREN# | Output signal allowing software to enable or disable power to a PCIe* slot. | If the Power Controller Register is set, the IO module asserts this signal. |
| MRL#/EMILS | Manual retention latch status or electromechanical latch status input that indicates whether the retention latch is closed or open. The manual retention latch is used on the platform to mechanically hold the card in place and can be opened or closed manually. The electromechanical latch is used to electromechanically hold the card in place and is operated by software. MRL# is used for card-edge and EMLSTS# is used for SIOM form factors. | Supported for the serial interface. MRL change detection results in either an interrupt or a general-purpose event message Assert/Deassert_HPGPE being sent. Note: IIO must not invert this bit before reflecting these bits in the appropriate software registers. |
| EMIL | Electromechanical retention latch control output that opens or closes the retention latch on the board for this slot. A retention latch is used on the platform to mechanically hold the card in place. See the PCIe* Server/Workstation Module Electromechanical Spec for details of the timing requirements of this pin output. | Supported for the serial interface and is used only for the SIOM form factor. |

NOTE

For legacy operating systems, the Assert_HPGPE/Deassert_HPGPE mechanism described above is used to interrupt the platform for PCIe* hot-plug events. For newer operating systems, this mechanism is disabled and the IO module uses the MSI capability instead.
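The relationship between the input signals in Table 105 and the notification path described in the note can be sketched as a lookup. This is a simplified model under stated assumptions: the `notify` function, the `acpi_mode` flag, and the dictionary are hypothetical, and only the three input signals whose status bits are named in the table are included.

```python
from typing import Tuple

# Status bit or register set when each input signal changes (from Table 105).
STATUS_FOR_SIGNAL = {
    "BUTTON#": "Attention Button Pressed Event",
    "PRSNT#": "Presence Detect Event Status",
    "PWRFLT#": "Power Fault Event",
}


def notify(signal: str, acpi_mode: bool) -> Tuple[str, str]:
    """Return (status that is set, notification path) for a hot-plug input."""
    status = STATUS_FOR_SIGNAL[signal]  # raises KeyError for unknown signals
    # Legacy operating systems use the HPGPE message; newer ones use MSI.
    path = "Assert/Deassert_HPGPE" if acpi_mode else "MSI"
    return status, path
```

For instance, `notify("PRSNT#", acpi_mode=False)` pairs the presence detect status with the MSI path, which is the combination a natively hot-plug aware OS would see.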
Table 106. Setup Actions

| Register | Description | Required Fields |
|---|---|---|
| PCI Link Capabilities Register | Enable surprise down error condition and status of data link layer so software can obtain status or otherwise knows when it can enumerate a device on the link. | Surprise down error reporting capable; Data link layer link active reporting capable |
| Miscellaneous Control and Status Register | When this bit is set, all hot-plug events from the PCIe* port are handled via HPGPE messages and no MSI/INTx messages are ever generated for hot-plug events at the root port. | Enable ACPI mode for hot-plug = 1 |
| Slot Capabilities Register | Enable hot-plug support capabilities for the PCIe* port and hot-plug surprise, which indicates that a device in this slot may be removed from the system without prior knowledge. | Hot-plug surprise = 1 (change from default setting); Hot-plug capable = 1 (change from current setting); Physical slot number = xxxx (following SlotID format) |
| Slot Control Register | Enable hot-plug interrupts or wake messages and software notification when the data link layer active bit in the Link Status register changes state. | Presence detect changed enable (default set to 0); Hot-plug interrupt enable (default set to 0); Data link layer state changed enable (default set to 0) |

Hot-Remove Flow

  • System preparation: This is done at boot time, as described in the earlier section.
  • Initial failure detection: System software detects or predicts the device failure. System software may trigger recovery actions and possibly a service call. The failed device is no longer accessed.
  • Prepare for device replacement: System software notifies the device driver and the PCI bus driver that the device should be disabled. The drive is disabled, the device driver is unloaded, and any clean-up is performed. Prior to removing the card, the driver must stop accessing the card and the card must stop initiating or responding to new requests.
  • Hot-removal HW event: The human operator removes the drive that has failed. This generates interrupt(s) to indicate that the device was removed. The presence change interrupt and possibly the PCIe* link down interrupt are sent to the PCI bus driver. Optionally, the presence change interrupt (OOB) may be sent to the system management software.
  • Low level recovery: The PCI bus driver provides "Plug-and-Play" capabilities, possibly with assistance from the BIOS drivers. When the removal of the device is detected, any required clean-up is completed, such as unloading the device driver, reclaiming the assigned resources (address ranges, MSI), and resetting any hardware state. The PCI bus driver notifies software that had earlier registered for notification of the event. The device is no longer in the system and the empty slot is ready for a new device to be added.
As in the case of hot-add, in practice interrupts, errors, or user interaction may change the order of operations or trigger redundant steps. The actions taken, however, are the same.

Hot-Add Flow

Once the system has been set up in preparation for a hot-plug event (that is, the system is hot-plug enabled), it continues to run normally until a hot-plug event is triggered. Hot-add steps may vary depending on the usage model and system-specific aspects.
  • System preparation: This is done at boot time, as described in the earlier section.
  • Operator prepares system: This is an optional step where a human operator may use the system software interface or the Attention Button to prepare for hot-add. System SW blinks the Attention Indicator LED to show the operator the location where the device should be added. This step is not required, but in many systems it can lower the chance of operator error.
  • Hot-add hardware event: When the device is physically inserted, the platform HW generates an interrupt to notify the system. The presence change interrupt (in-band) is sent to the PCI bus driver. Optionally, the presence change interrupt (OOB) can be sent to the system management software. The device performs PCIe* training automatically if that was pre-configured; otherwise, training happens when low level configuration software enables it.
  • Low level configuration: The PCI bus driver provides "Plug and Play" capabilities, with possible assistance from BIOS drivers. Low level configuration includes ensuring PCIe* training has happened, providing PCIe* configuration cycles, and assigning slot resources from the pre-allocated space reserved in the system preparation step. The device driver is loaded, and any system SW that had registered to be notified of a hot-add event is notified.

11.6.7 PCIe* / CXL.io Hot-Plug Surprise

11.6.7.1 Functional Description

NVMe* SSD PCIe* cards can be hot-added to or removed from the PCIe* slots directly connected to the processor without prior notification to the system. The flow depends on software interrupts. BIOS must correctly configure "presence_detect_changed_enable" for hot-plug and link_down conditions.
The processor's IIO supports termination of outstanding transactions on a PCIe* link down condition. The software stack, including an OS with hot-plug aware drivers, is required to properly manage the hot insertion or removal flows. The most common case of hot-removal is software managed, where software ensures the drive is quiesced and then disabled. In the software managed case, there is also a failed drive indicator LED to guide the operator to remove the correct device. The surprise hot-removal use case is the removal of a drive that is actively being used. This section describes mechanisms that create hot-plug behavior similar to what is common for SAS enclosures today. Specifically, the user can remove a single drive (SAS, SATA, or PCIe*) from a RAID array without crashing the system or requiring a reboot. The most common cause of surprise hot-removal is an operator removing the wrong drive or removing the drive before software preparation is complete. Some IO device failures also manifest as a surprise hot-removal, so addressing surprise hot-removal also covers a class of SSD failures. For example, a failing voltage regulator in the SSD or a catastrophic PCIe* controller lock-up both result in unstable accessibility.

11.6.7.2 Hot-Plug Removal Surprise High Level Flow

The following is the high level flow for surprise hot-removal of a PCIe* SSD:
  • Surprise hot-removal hardware event: The human operator removes an Enterprise PCIe* SSD. This generates interrupts to notify the system that the drive is removed. The presence change interrupt (in-band) and possibly the PCIe* link down interrupt are sent to the PCI bus driver, and optionally the presence change interrupt (out-of-band) is sent to the system management software. There may also be a number of PCIe* errors created by system and connector noise as the drive is being removed. The removed Enterprise PCIe* SSD is assumed to be in an undefined state. Previously committed writes remain, but any in-flight I/O transactions (including updates to metadata) are undefined. This is the same behavior as for existing SSDs and HDDs.
  • Low level recovery: The PCI bus driver provides "Plug and Play" capabilities with possible assistance from BIOS drivers. When the PCI bus driver detects the removal of the Enterprise PCIe* SSD, it completes any additional clean-up required such as unloading drivers, reclaims the assigned resources (address ranges, MSI), resets any hardware state, and notifies the storage system (for example, RAID) software.
  • Higher level recovery: The storage system (for example, RAID) software has to recover from the unexpectedly removed device, often using RAID mechanisms. Recovery and application survival depend on having enough RAID coverage. If the surprise hot-removal is the first drive error on this system and the drive was protected by RAID, then the system should recover and continue operating. If there was no PCIe* SSD storage redundancy or there was a previous error, then the surprise hot-removal may be fatal to the application because of lost access to critical data, even if the OS does not panic.
This high-level flow is similar to the software managed hot-removal, but the fact that it is unexpected means that all layers of the software stack could be in a wide range of states.
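The higher level recovery rule in the flow above reduces to a two-input decision. The sketch below only restates that rule; the function name and the returned labels are invented for illustration, and real storage software would weigh far more state than two flags.

```python
def application_outcome(raid_protected: bool, prior_drive_error: bool) -> str:
    """Expected outcome of a surprise hot-removal for the application."""
    if raid_protected and not prior_drive_error:
        # First drive error on a RAID-protected drive: the system should recover.
        return "recover_and_continue"
    # No redundancy, or redundancy already consumed by an earlier error.
    return "may_be_fatal_to_application"
```

The second branch covers both cases the text calls out: no PCIe* SSD redundancy at all, and a previous error that has already used up the available RAID coverage.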

11.6.8 CXL .cachemem Viral Feature

CXL links and CXL devices are expected to be Viral compliant. Viral support capability and the control for enabling it are reflected in the DVSEC. Viral mode is an error containment mechanism (see Viral Mode of Error Containment). Viral is not a replacement for existing error-reporting mechanisms. Instead, it serves as an additional error-containment mechanism.
Viral needs to be communicated in both directions. When Viral is enabled and the CPU runs into a Viral condition, it communicates Viral across CXL.cache and/or CXL.mem to all downstream components. If a CXL device signals Viral to the CPU, it is propagated to the IIO subsystem and so forth.

11.7 System Level RAS Features

The following table summarizes the key system level RAS features. Two-socket workstations follow the Standard RAS SKU.
Table 107. Summary of System Level RAS Features

| Feature Name | Description | Standard RAS SKU | Advanced RAS SKU |
|---|---|---|---|
| MCA Recovery (Execution, non-execution path) | Software layer assisted recovery from uncorrectable data errors. Enables software layers (OS, VMM, DBMS, Application) to assist in system recovery from errors that are not correctable at the hardware level. Requires enabling CDC mode, which allows the propagation of corrupted data to the CPU core. Requires UEFI FW support if the EMCA Gen 2 feature is enabled. | Yes | Yes |
| MCA Recovery 2.0 (based on EMCA gen2) | Software layer assisted recovery from uncorrected data errors as defined by the MCA 2.0 (EMCA gen2) specification. | No | Yes |
| Local Machine Check (LMCE) Based Recovery | LMCE allows the SRAR type of UCR event to be delivered only to the affected logical processor receiving the corrupted data (poison). It is an optional feature. Upon signaling of an MCE, SW is able to determine whether the MCE event was delivered to only one logical processor, in which case global rendezvous participation of the rest of the logical processors is not required. | Yes | Yes |
| Failed DIMM Isolation | Ability to identify a specific failing DIMM, thereby enabling the user to replace only the failed DIMM(s). The processor iMC hardware does not isolate a DIMM with a CAP or WrDataCRC error. | Yes | Yes |
| OOB Access to Error Logs through Sideband | Allows BMC-based RAS implementation. Uses a sideband (for example, PECI) interface to access MCA bank registers (core/uncore), memory error logs, Intel UPI error logs, and IIO error logs. A workstation may not have a BMC but may use CSME/IE. PECI access allows atomic access to 64-bit registers. This is an OOB access to error logs feature. | Yes | Yes |
| Core Disable For Fault Resilient Boot (FRB) | The capability to disable core(s) at boot time, allowing the system to power on despite failing core(s). The platform uses BIST (Built-in Self Test) results from each core to detect the failing core(s) and disables the impacted core(s) upon subsequent boot. | Yes | Yes |
| Surprise Reset (aka AWR/DWR) | With Surprise Reset, the platform can continue to harvest error logs. | Yes | Yes |
| Pre-Go S1_IERR Warming / replaced with Suppress shutdown cycle | MSR 0x60 helps suppress the shutdown cycle from in-band, allowing the BMC to control when to invoke the sysreset after it is done harvesting error logs. | Yes | Yes |
| Error Injection | This capability is used to inject faults within various sub-systems and allow verification of RAS features/flows. The processor includes error injection capability within main memory, PCIe/IIO, and Intel UPI. | Yes | Yes |
Asynchronous MCA Error 异步 MCA 错误
Injection
Asynchronous MCA error injection (aka error spoofing) is a mechanism
异步 MCA 错误注入(又称错误欺骗)是一种机制
by which UEFI FW can write MCA Banks and trigger an event simulating
UEFI 固件可以通过它写入 MCA Bank 并触发模拟错误事件,而无需在硬件中注入真实错误。主要用于软件/操作系统和固件验证。
an error without injecting a real error within the HW. It is primarily useful
一个错误,而不在硬件中注入真实错误。主要用于软件/操作系统和固件验证。
for SW/OS and FW verification.
用于软件/操作系统和固件验证。
No Yes
Predictive Failure Analysis
预测性故障分析
OS/SW, BIOS/SMM, or BMC based failure prediction using various
使用各种校正错误日志和随时间变化的趋势进行基于 OS/SW、BIOS/SMM 或 BMC 的故障预测。
corrected error logs and trends over time. PFA algorithm can be
PFA 算法可以对其进行。
implemented for memory sub-system, Intel UPI, CPU Caches, and PCIe
为内存子系统、Intel UPI、CPU 高速缓存和 PCIe 实现
cluster.
Yes Yes

11.7.1 MCA Recovery - Execution Path

This feature refers to software/hardware recoverability from an error condition in which poisoned data is delivered to the DCU/IFU.
The processor supports the CDC feature within the DCU/IFU and allows certain types of uncorrected data errors to be reported as uncorrected recoverable (SRAR type of UCR) errors instead of fatal errors, enabling the advanced feature called MCA Recovery - Execution Path. Within the processor, the "error containment" bit is carried all the way to the DCU/IFU. This allows isolation of the corrupted data (that is, the bit is attached to the affected "load" receiving the corrupted data) and assists in potential software recovery. The following is a high level description of the sequence of events:
  • The DCU/IFU receives data from the MLC with the "error containment" bit set.
  • The DCU/IFU logs the error in the MC1/MC0 MC bank with PCC=0 and other logging requirements (shown in the "DCU Error Logging" and "IFU Error Logging" sub-bullets) and signals MCERR.
      • DCU Error Logging: When the DCU logs the SRAR type of UCR error, it logs the error with to indicate that the error may be recoverable. In addition, it captures the physical address of the error. The DCU also sets the S and AR bits to indicate that the error was signaled with MCERR and that SW must take action right away. SW looks for an exact error signature to determine whether the error is recoverable.
      • IFU Error Logging: If an instruction fetch hits the IFU and an uncorrected error is detected, the IFU sets PCC=1 if it cannot correct the error. If the instruction fetch misses the IFU and the delivered data is poisoned, the IFU may log the error with to indicate that the error may be recoverable. In addition, it captures the physical address of the error and sets the S and AR bits. SW looks for an exact error signature to determine whether the error is recoverable.
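The signature check that software performs on a logged error can be sketched as follows. This is a minimal illustration, not the OS handler: the bit positions (VAL, UC, PCC, S, AR) are taken from the public Intel SDM description of IA32_MCi_STATUS rather than from this document, and the function name is hypothetical.

```python
# Bit positions in IA32_MCi_STATUS per the public Intel SDM (assumption:
# this processor follows the architectural layout).
VAL, UC, PCC, S, AR = 63, 61, 57, 56, 55


def bit(value: int, pos: int) -> int:
    """Extract a single bit from a 64-bit register value."""
    return (value >> pos) & 1


def classify_mci_status(status: int) -> str:
    """Return a coarse error class for one IA32_MCi_STATUS value."""
    if not bit(status, VAL):
        return "none"        # nothing logged in this bank
    if not bit(status, UC):
        return "corrected"
    if bit(status, PCC):
        return "fatal"       # processor context corrupt: not recoverable
    s, ar = bit(status, S), bit(status, AR)
    if s and ar:
        return "SRAR"        # signaled, software action required
    if s:
        return "SRAO"        # signaled, software action optional
    if not ar:
        return "UCNA"        # logged only, no action required
    return "reserved"        # S=0 with AR=1 is not a defined UCR class
```

A recovery handler would typically combine this class with the error code and address-valid information before deciding whether the affected page or process can be isolated.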

MLC Uncorrected Error Reporting with CDC Enabled:

The processor supports the CDC feature within the MLC and allows certain types of uncorrected data errors to be reported as uncorrected recoverable (UCNA type of UCR) errors instead of fatal errors. When the CDC feature is enabled by the BIOS, the MLC behavior with respect to the logging and signaling of uncorrected data errors changes. There are two main cases in which the "error containment" bit is set within the MLC:
  • Case 1 - Uncorrected error within the MLC data array: If the MLC finds a UCR type error on a lookup to the data array, it is considered a non-critical error. When such an event occurs, the MLC attaches an "error containment" bit synchronous to the data and sends it to the requester. Since the MLC is the source of the UCR error, the MLC also logs the error in its Machine Check bank as a UCNA type of UCR error and signals CMCI if it is enabled (IA32_MCi_CTL2[30]=1). If CMCI is not enabled, this event is not signaled. In either case, the MLC does not signal MCERR.
  • Case 2 - Data with the "error containment" bit set arrives from higher up in the cache hierarchy: The MLC is not the source of the error; it simply requested data and received data with the "error containment" bit set. The MLC neither logs nor signals the error (the error should have been logged and CMCI signaled by the source, for example the CHA or IMC). It simply forwards the data to its destination with the "error containment" bit set. When this error data is written into the MLC, it is written with a poison indication.
  • Note that the MLC should never get data with the "error containment" bit set if the CDC feature is disabled (Legacy IA32-MCA mode, selected by setting POISON_ENABLE=0). In legacy IA32-MCA mode, the MLC ignores the "error containment" bit from other sources. No error is signaled or logged by the MLC.
  • Since such errors are considered UCNA, the errors are logged as , , and , and MISCV=1. IA32_MCi_ADDR and IA32_MCi_MISC log additional information about the address.
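The two CDC-enabled cases above can be summarized as a small decision function. This is a sketch of the described behavior only; the type and function names are hypothetical, and the only register detail used is the CMCI enable bit IA32_MCi_CTL2[30] cited in the text.

```python
from typing import NamedTuple

CMCI_EN_BIT = 30  # IA32_MCi_CTL2[30], as cited in the text


class MlcAction(NamedTuple):
    log_ucna: bool        # MLC logs a UCNA error in its own MC bank
    signal_cmci: bool     # MLC signals CMCI
    signal_mcerr: bool    # MLC signals MCERR (never, in either case)
    forward_poison: bool  # data is forwarded with the containment bit set


def mlc_action(mlc_is_source: bool, mci_ctl2: int) -> MlcAction:
    """Model MLC behavior for poisoned data when CDC is enabled."""
    cmci_enabled = bool((mci_ctl2 >> CMCI_EN_BIT) & 1)
    if mlc_is_source:
        # Case 1: error found in the MLC data array.
        return MlcAction(True, cmci_enabled, False, True)
    # Case 2: poison arrived from higher up; the source already logged it.
    return MlcAction(False, False, False, True)
```

Legacy IA32-MCA mode is deliberately left out of the model, since in that mode the MLC is not expected to see the containment bit at all.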

11.7.2 MCA Recovery Non-execution Path

The processor logs UCNA for the LLC EWL and iMC patrol scrubber uncorrected error detection events. The SRAO error type is not supported.
The OS/software action remains the same for error types that used to signal SRAO and now signal UCNA.

11.7.3 MCA Recovery 2.0 (based on EMCA gen2)

This feature allows software layer assisted recovery from uncorrected data errors as defined by the MCA 2.0 (EMCA gen2) specification.
EMCA gen2 is a capability that allows firmware to enhance the error logging capabilities of the Machine Check Architecture (corrected and uncorrected errors), enabling a Firmware First Model (FFM) of error handling and possible recovery. This feature is only available in the Advanced RAS SKU.

11.7.4 Local Machine Check Exception (LMCE) Based Recovery

LMCE allows the SRAR type of UCR event to be delivered only to the affected logical processor receiving the corrupted data (poison). LMCE addresses cases in which multiple recoverable faults detected in close proximity result in a fatal MCERR event and therefore prevent system recovery.
It is an optional feature. Upon signaling MCE, SW is able to determine whether the MCE event was delivered to only one logical processor, in which case global rendezvous participation of the rest of the logical processors is not required.
LMCE implements the following (additional details to be provided in a future release of this document):
  • Enumeration: Software mechanism to identify HW support for the LMCE feature.
  • Control Mechanism: Ability for the BIOS to enable/disable LMCE. Ability for SW to opt in to LMCE.
  • Identification of LMCE: Upon MCE delivery, SW is able to determine whether the delivered MCE went to only one logical processor and global rendezvous participation is not required.
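The enumeration, control, and identification steps can be illustrated with the architectural bits that the public Intel SDM defines for LMCE. The bit positions below come from the SDM, not from this document, and the helper names are hypothetical; the MSR values are passed in as plain integers so the sketch stays independent of any MSR access method.

```python
# Architectural LMCE bits per the public Intel SDM (assumption: unchanged
# on this processor).
MCG_CAP_LMCE_P = 1 << 27           # IA32_MCG_CAP: HW supports LMCE
FEATURE_CONTROL_LMCE_ON = 1 << 20  # IA32_FEATURE_CONTROL: BIOS enabled LMCE
MCG_EXT_CTL_LMCE_EN = 1 << 0       # IA32_MCG_EXT_CTL: SW opt-in
MCG_STATUS_LMCE_S = 1 << 3         # IA32_MCG_STATUS: this MCE was local


def lmce_can_opt_in(mcg_cap: int, feature_control: int) -> bool:
    """Enumeration + BIOS control: may software opt in to LMCE?"""
    return bool(mcg_cap & MCG_CAP_LMCE_P) and bool(
        feature_control & FEATURE_CONTROL_LMCE_ON
    )


def opt_in_value(mcg_ext_ctl: int) -> int:
    """Return the IA32_MCG_EXT_CTL value with the opt-in bit set."""
    return mcg_ext_ctl | MCG_EXT_CTL_LMCE_EN


def mce_is_local(mcg_status: int) -> bool:
    """Identification: True if no global rendezvous is required."""
    return bool(mcg_status & MCG_STATUS_LMCE_S)
```

A machine check handler would call `mce_is_local` first and skip the cross-processor rendezvous when it returns True.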

11.7.5 Failed DIMM Isolation

The processor incorporates several error logs to help the UEFI FW identify the source of an error. See the BIOS RAS Specification for the detailed flow. The processor IMC hardware does not isolate to the DIMM for CAP or WrDataCRC errors.

11.7.6 OOB Access to Error Logs

This feature is used on RAS implementations that are based on the BMC. The BMC firmware uses an available sideband access interface (PECI, I3C) to access MCA bank registers (core/uncore), memory error logs, Intel UPI error logs, and IIO AER logs. PECI access allows atomic access to 64-bit registers. This is an OOB access to error logs feature. The key use cases are:
  1. Implementation of FRB: The BMC monitors CATERR_N and, if it is asserted low on multiple boot attempts, reads various error log registers (refer to Error Reporting (MCA, AER) - Core, Uncore and IIO on page 228), including the MCA bank registers, using the PECI interface commands. It then determines the source of the error and takes further action as described in the FRB feature description (Core Disable For Fault Resilient Boot (FRB) on page 302).
  2. BMC based RAS implementation: Some processor architectures allow BMC based availability features, such as memory mirroring and error reporting. Implementing these features depends on OOB access to the MCA banks and to various other platform specific registers (CSRs).
  3. In-field Diagnosability: When MCERR is asserted with , host based methods for accessing the MCA bank registers may not be available. An Intel In-Target Probe tool may not be feasible in the field. Therefore, BMC based OOB access to the MCA bank registers is a viable solution to collect and log all the useful information prior to resetting the system.

NOTE

Users must use caution when reading the MCA bank registers while the system is in the S0 state to prevent any race condition between host-based reads/writes and OOB access to the same register.
OOB access to error logs requires an external agent such as a Baseboard Management Controller (BMC). The interface used to connect the processor with the BMC is called the Platform Environment Control Interface (PECI). The key commands for accessing processor error logs are RdIAMSR() and RdPCIConfigLocal().
The OOB access to error logs feature is also used when integrating the following RAS features:
  1. BMC based FRB implementation:
    a. Core disable for FRB
  2. BMC based RAS implementation:
    a. PCI Express* Card Surprise Hot-Plug
  3. BMC based in-field diagnosis/maintenance:
    a. Failed DIMM isolation
    b. Pre and post reset error log capturing
    c. PIROM for System Information Storage
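A BMC-side harvesting loop over the MCA banks might look like the following sketch. The `read_msr` callable is a hypothetical stand-in for whatever transport issues the PECI RdIAMSR() command; its signature is an assumption and no real PECI library API is implied. The bank MSR numbering (four MSRs per bank starting at 0x400) and the VAL/ADDRV/MISCV bit positions follow the public Intel SDM.

```python
from typing import Callable, Dict

IA32_MC0_CTL = 0x400  # architectural base; each bank owns 4 consecutive MSRs
VAL_BIT, MISCV_BIT, ADDRV_BIT = 63, 59, 58


def bank_msrs(bank: int) -> Dict[str, int]:
    """Return the MSR addresses of one machine check bank."""
    base = IA32_MC0_CTL + 4 * bank
    return {"CTL": base, "STATUS": base + 1, "ADDR": base + 2, "MISC": base + 3}


def harvest_banks(
    read_msr: Callable[[int], int], bank_count: int
) -> Dict[int, Dict[str, int]]:
    """Collect STATUS (and ADDR/MISC when valid) from every bank with VAL=1.

    read_msr is a hypothetical wrapper around an OOB 64-bit MSR read.
    """
    logs: Dict[int, Dict[str, int]] = {}
    for bank in range(bank_count):
        msrs = bank_msrs(bank)
        status = read_msr(msrs["STATUS"])
        if not (status >> VAL_BIT) & 1:
            continue  # nothing logged in this bank
        entry = {"STATUS": status}
        if (status >> ADDRV_BIT) & 1:
            entry["ADDR"] = read_msr(msrs["ADDR"])
        if (status >> MISCV_BIT) & 1:
            entry["MISC"] = read_msr(msrs["MISC"])
        logs[bank] = entry
    return logs
```

Because each register is read atomically but the bank as a whole is not, a real implementation running while the host is in S0 would still need to guard against the race described in the note above.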

11.7.7 Core Disable For Fault Resilient Boot (FRB)

As the number of cores increases in each generation, the single point of failure shifts from the entire processor to smaller blocks in the processor, such as a single core or part of the LLC. In this scenario, it is desirable to shift the focus of Fault Resilient Boot (FRB) so that, in addition to the capability to disable the entire processor, it is also possible to disable a specific core or cores.
The processor supports core disabling. Core disabling is allowed only at a core level, and a minimum of one core must be active to complete the system boot process. Core disabling can be used either to disable a bad core or to reduce the core count for normal operation. The methodology used to disable a core is as follows:
  1. The BMC performs "error discovery" by interrogating error status registers via the PECI interface (an OOB access mechanism). This is commonly done by reading the MCA_ERR_SRC_LOG register or by consulting the processor BIST results reported via the EAX register.
  2. Once this information is gathered and narrowed down to specific failing cores, the BMC is expected to store the "failed core" information in non-volatile memory.
  3. The BIOS (or the BMC in BMCINIT mode) is then responsible for programming the CSR_DESIRED_CORE register to set the appropriate 'cores off mask' field and for initiating a warm reset, during which the desired cores are disabled. After system initialization, the BIOS reads the RESOLVED_CORE register to confirm which cores are active and that the desired core(s) are disabled. The BIOS (or the BMC in BMCINIT mode) can also use a trial-and-error scheme to recursively disable a processor and see if the system initializes successfully.
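Building the 'cores off mask' from the stored failed-core list can be sketched as below. The one-bit-per-core layout is an assumption made for illustration; the actual field layout of CSR_DESIRED_CORE is not given in this section, and the function name is hypothetical.

```python
from typing import Iterable


def cores_off_mask(failed_cores: Iterable[int], core_count: int) -> int:
    """Return a bitmask with one bit set per core to disable.

    Assumes (hypothetically) that bit N of the mask corresponds to core N.
    Refuses to produce a mask that would leave no core active.
    """
    mask = 0
    for core in failed_cores:
        if not 0 <= core < core_count:
            raise ValueError(f"core {core} is out of range")
        mask |= 1 << core
    all_cores = (1 << core_count) - 1
    if mask == all_cores:
        raise ValueError("at least one core must remain active")
    return mask
```

For example, `cores_off_mask([1, 3], 8)` yields `0b1010`, while passing every core index raises an error, reflecting the rule that at least one core must stay active to complete boot.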

11.7.8 Surprise Reset (AWR)

Surprise Reset is the term used from an SoC perspective, and AWR is the term used at the platform level. The AWR scheme overrules graceful warm reset flows and is used for post-reset error harvesting in response to IERR. AWR is an observed assertion of xxRESET# without the PLTRST_SYNC# signal from S3M. With S3M, all fabric and PECI interfaces are reset when xxRESET (PLTRST) is asserted. S3M also resets itself when PLTRST follows a warm reset. The platform is required to enable IERR/ERR2.
DWR terminology has been deprecated. Asynchronous Warm Reset is the only reset path taken on the BHS platform in response to IERR.
  1. Removes the HPR TO dependency. Execution is faster because there is no wait for HPRTO to expire.
  2. More robust SoC IP operation after a warm reset. S3M, OOBMSM, and the fabrics fully reset after a warm reset. There is no dependency on PCH ME/PMC requiring a Global Reset.

NOTE

See the technical paper Birch Stream Debug: Using Crash Log and Asynchronous Reset Flows for IERR and S3M Error Triage, document number 779866, for further details.

11.7.9 Suppress Special Cycle Reset

MSR 0x60 bit 0 (Suppress_shutdown) allows the BMC to control system resets and collect error logs with no potential for an in-band reset to disrupt error log harvesting.

11.7.10 Error Injection Capabilities

The processor provides an error injection feature within several modules. The primary objective of this feature is to test and debug error handling software and to ensure that the SW behavior matches system level requirements.

NOTE

Delayed Authentication Mode (DAM) must be enabled for the hardware to allow an injection to occur with production ucode and Intel SGX enabled.
This section mainly describes injecting UCR types of errors within the memory subsystem and within the IIO module.
The error injection feature is designed to meet the Advanced Configuration and Power Interface (ACPI) Specification. This specification outlines an ACPI table mechanism, called EINJ, which provides a generic interface through which OSPM can inject hardware errors into the platform without requiring platform specific OSPM level software. The primary goal of this mechanism is to support testing of the OSPM error handling stack by enabling the injection of hardware errors. Through this capability, OSPM is able to implement a simple interface for diagnosis and validation of error handling on the system.

11.7.10.1 Memory Error Injection

The memory error injection mechanism can be used to verify various RAS features, including UCR types of errors. The basic HW to support this functionality is shown in the following figure.
Figure 42. High Level View of the Memory Error Injection Mechanism
The IMC module incorporates the registers RSP_FUNC_ADDR_MATCH_LO and RSP_FUNC_ADDR_MATCH_HI, which hold the physical address of the location where error injection is required. In addition, register RSP_FUNC_ADDR_MATCH_HI contains a bit called RSP_FUNC_ADDR_MATCH_EN, which is used to arm this mechanism. Finally, there is a method to lock these registers so that malicious SW cannot access them during run time.
Once the Physical Address (PA) has been programmed and the mechanism is armed, the iMC matches the PA in the CSRs against the addresses of write requests that come in to the IMC. If there is a match on the address, the iMC causes the DFT error injection logic to inject an uncorrectable data error into memory. Note that this is an uncorrected data error as opposed to writing an "error containment code". This is an important distinction: if an "error containment code" were written, the behavior of the HW on a subsequent read would be different (refer to Corrupt Data Containment on page 254). As an example, no CMCI would be signaled by the IMC if an "error containment code" is read out of memory.
Once the data has been corrupted, it is sent to the write pending queue, from where it is written to the memory DIMM.
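The arm-and-match behavior can be pictured with a toy software model. This is purely illustrative: the class and method names are hypothetical, the real logic is in hardware, and the model assumes matching is done at 64-byte line granularity.

```python
LINE_BYTES = 64  # assumption: matching is done per 64-byte cache line


class AddrMatchInjector:
    """Toy model of the iMC address-match injection logic."""

    def __init__(self) -> None:
        self.locked = True    # the mechanism comes up disabled by default
        self.match_pa = None  # models RSP_FUNC_ADDR_MATCH_LO/HI
        self.armed = False    # models RSP_FUNC_ADDR_MATCH_EN

    def unlock(self) -> None:
        """Model the unlock step performed before programming."""
        self.locked = False

    def arm(self, pa: int) -> None:
        """Program the match address and arm the mechanism."""
        if self.locked:
            raise PermissionError("injection registers are locked")
        if pa % LINE_BYTES:
            raise ValueError("PA must be 64-byte aligned")
        self.match_pa, self.armed = pa, True

    def write_corrupts(self, pa: int) -> bool:
        """Return True if a write to `pa` would hit the armed match."""
        line = pa - (pa % LINE_BYTES)
        return self.armed and line == self.match_pa
```

The model only answers whether a given write would be selected for corruption; it does not attempt to represent the DFT logic or the write pending queue.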

11.7.10.2 Unlocking the Memory Error Injection Mechanism

By default, the error injection mechanism comes up disabled. Since the usage model is for the BIOS to perform the error injection, the BIOS must write to the ERR_INJ_LOCK MSR to unlock the mechanism. This MSR can be written only from SMM code.

11.7.10.3 Memory Error Injection Flow

The memory error injection flow is implemented to maintain architectural compatibility with the previous generation of platforms and to maintain the same interface with the OS/BIOS. The following is the basic memory error injection flow.
  1. The OS indicates the Physical Address (PA) for the error (to be injected) to the BIOS using the ACPI EINJ table entry.
    a. The physical address picked by the OS must be an OS visible, DRAM backed address in the system address space.
    b. The PA must be aligned on a 64-byte boundary.
    c. The OS needs to make sure that this address is not targeted by DMA.
  2. The BIOS converts the PA to a memory linear address (MA), identifies the corresponding memory controller, and identifies the nearest TAD address corresponding to the MA.
    a. It is the responsibility of the BIOS to validate the error injection address and negotiate with the OS for the appropriate distance from the TAD address (important for patrol scrub errors).
  3. The BIOS identifies the target socket based on the MA.
    a. The BIOS can run from the existing monarch since remote CSR read/write is supported.
  4. The BIOS stops the patrol scrub engine (if patrol scrub is enabled).
  5. The BIOS writes the MA to the scratch pad.
  6. The BIOS does a CLFLUSH on the PA to make sure that the line is not in cache.
  7. The BIOS writes to the error injection MSR to unlock the injection mechanism (a write to this MSR is only possible within SMI).
  8. The BIOS writes to the Error_Injection CSRs to program the PA for injection and enable the mechanism.
  9. The BIOS writes any value to the target PA to trigger the error injection (a CLFLUSH is needed to make sure memory is written).
The steps after this point depend on the type (or purpose) of the error to be triggered.
  10. If the purpose of the injection is to cause a patrol scrub error, jump to step 11. To cause an error on a read:
    a. The BIOS returns control to the OS. The OS causes a read to be issued to the PA.
    b. An error is logged as either a UCNA or SRAR type of UCR error in the appropriate machine check bank, and a machine check exception is triggered since the core is attempting to consume corrupt data.
  11. To inject an error for the purpose of causing a patrol scrub error:
    a. The BIOS programs the corresponding CSRs to arm the scrubber to start in a region close to the error injection PA.
    b. The BIOS restarts the patrol scrub engine.
    c. The BIOS exits and returns control to the OS.
    d. The patrol scrubber eventually reaches the targeted memory address and the error is triggered and logged as a SRAO type of UCR error in the appropriate machine check bank.
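The address checks from step 1 can be sketched as a small validation helper. The representation of DRAM-backed regions as a list of `(start, end)` tuples is an assumption for illustration, as is the function name; the DMA condition cannot be checked from an address alone and is left to the caller.

```python
from typing import Iterable, Tuple

LINE_BYTES = 64  # the flow requires 64-byte alignment


def is_valid_injection_pa(
    pa: int, dram_ranges: Iterable[Tuple[int, int]]
) -> bool:
    """Check the step 1 constraints that can be verified statically.

    dram_ranges holds hypothetical (start, end) pairs, end exclusive,
    describing OS visible, DRAM backed regions.
    """
    if pa < 0 or pa % LINE_BYTES:
        return False  # not aligned on a 64-byte boundary
    # The whole 64-byte line must fall inside one DRAM backed region.
    return any(
        start <= pa and pa + LINE_BYTES <= end for start, end in dram_ranges
    )
```

With a single region `(0x100000, 0x200000)`, the address `0x100040` passes, while `0x100020` (misaligned) and `0x200000` (outside the region) are rejected.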
The following table provides a summary of the memory error injection flow.
Table 108. Memory Error Injection Flow
| Step | Agent | Description |
| --- | --- | --- |
| 1 | BIOS | The BIOS sets up the EINJ ACPI tables during boot time. These tables describe what errors can be injected and the method for injecting the error. The standard method consists of generating a software SMI (port 0xb2) with a specific command code. |
| 2 | APEI/WHEA/OS | The OS specifies the PA where the error needs to be injected and the APIC ID of the processor that should see the error, as defined in the Advanced Configuration and Power Interface (ACPI) Specification. |
| 3 | APEI/WHEA/OS | The OS instructs the BIOS to inject the error. This is accomplished by the EXECUTE_OPERATION command as defined in the EINJ table. |
| 4 | | The SMI handler locates the logical processor with the specified APIC ID. It runs the following sequence on that processor: unlock error injection by writing to the ERR_INJ_LOCK MSR; issue CLFLUSH on the PA to make sure that the line is not in cache; arm the error injection registers in the memory controller (for example, program the error address); write to the specified PA; resume from SMM. |
| 5 | APEI/WHEA/OS | The OS issues a read on the specified logical processor to the specified address. This read results in an error being logged and signaled. |
| 6 | APEI/WHEA/OS | The OS informs the BIOS that it is done with this operation. |
| 7 | SMM | SMM locks out the error injection capability by writing 1 to bit 0 of the ERR_INJ_LOCK MSR. |

11.7.10.4 PCI Express Error Injection

An error injection methodology allows the software stack to test how it handles an error. For example, WHEA (Windows Hardware Error Architecture) is defined to inject one corrected and one uncorrectable error into the PCI Express interface within the IIO module. The key features are:
  1. A vendor specific error injection capability structure is defined for software recognition. The Intel defined Vendor Specific ID is .
  2. Programmatic way to inject correctable and uncorrectable errors.
      • Two bits are provided to allow software to inject correctable and uncorrectable errors. Errors are injected when software sets one of the bits. Software must clear the bit before setting it again to generate another error.
      • This causes normal error logging and interrupt triggering. The error handler code is expected to be invoked. Error injection software should not preclude the invocation of the error handler.
      • Sets the correct status registers so that the OS handler can go through its flows.
  3. Security aspect
      • A write-once bit is provided to disable error injection. The BIOS sets this bit on every boot to prevent a denial of service attack. The BIOS keeps this bit clear only when error injection is being tested.
  4. Leverage aspect
      • A vendor specific error injection capability structure is defined for software recognition. The Intel defined Vendor Specific ID is . This allows for long term software support.
      • Error injection does not depend on any external component.
WHEA allows for the assertion of a pseudo-error. It is used to assert one corrected and one uncorrectable error at the local PCI Express link error assertion point. A standardized vendor specific capability structure is instantiated for each port. This register contains two bits for injection selection and a third bit to disable the feature. The error disable bit is a write-once bit that can be set after testing is complete or set by the BIOS in normal boot flows. Refer to Figure 43 on page 307 for a partial illustration of the error injection logic.
Figure 43. PCI Express Error Injection Logic Within the IIO Module
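The set, clear, set-again rule and the write-once disable bit can be modeled in a few lines. The bit positions below are hypothetical, since the register layout is not given in this section; the model also assumes that once the disable bit is set, all later writes to the register are ignored until reset.

```python
class PcieErrInjReg:
    """Toy model of the per-port error injection control register."""

    # Hypothetical bit assignments, for illustration only.
    INJ_CORR = 1 << 0
    INJ_UNCORR = 1 << 1
    INJ_DISABLE = 1 << 2  # write-once

    def __init__(self) -> None:
        self.value = 0
        self.injected = []  # record of injected error kinds

    def write(self, new: int) -> None:
        if self.value & self.INJ_DISABLE:
            return  # write-once disable already set: ignore writes
        if new & self.INJ_DISABLE:
            self.value = self.INJ_DISABLE  # disable wins, nothing injected
            return
        for flag, kind in (
            (self.INJ_CORR, "correctable"),
            (self.INJ_UNCORR, "uncorrectable"),
        ):
            # Only a 0 -> 1 transition of an injection bit injects an error.
            if new & flag and not self.value & flag:
                self.injected.append(kind)
        self.value = new & (self.INJ_CORR | self.INJ_UNCORR)
```

Writing the correctable bit twice in a row injects only one error; software has to write it back to 0 before the next injection, which matches the rule stated in the list above.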

11.7.11 Asynchronous MCA Error Injection
11.7.11 异步 MCA 错误注入

Asynchronous MCA Error Injection (AMEI) is a mechanism by which OS/BIOS code can write MCA banks to simulate an error. The processor supports 2 MSRs for Error Spoofing capability and control. Refer to the Birch Stream Platform BIOS Writer's Guide (BWG) for details on the BIOS/OS software interface and usage flows for AMEI.
异步 MCA 错误注入 (AMEI) 是一种机制,通过该机制,操作系统/BIOS 代码可以写入 MCA bank 以模拟错误。处理器支持 2 个 MSR 用于错误欺骗功能和控制。有关 BIOS/OS 软件接口和 AMEI 的使用流程的详细信息,请参阅 Birch Stream 平台 BIOS 编写指南 (BWG)。
  • DEBUG_ERR_INJ_CTL (MSR 1E3H) - This register has controls to enable MC bank write capability and MCA/CMCI signaling capability. It is accessible only from SMM; accessing it outside SMM causes a general-protection (#GP) exception. This register is cleared by CPU reset.
    • MCBW_E (bit 0) - Machine Check bank write capability enable flag - If this bit is set, writes of non-zero values to the MCi_STATUS, MCi_MISC, and MCi_ADDR registers are allowed from SMM and ring 0 (they do not cause a general-protection (#GP) exception). If this bit is clear, the regular MC bank access rules apply to the MCi_STATUS, MCi_MISC, and MCi_ADDR registers (that is, non-zero value writes cause a general-protection (#GP) exception).
    • MCA_CMCI_SE (bit 1) - MCA/CMCI signaling enable flag - If this bit is set, MCA and CMCI signaling through the DEBUG_ERR_INJ_CTL2 register is enabled.
  • DEBUG_ERR_INJ_CTL2 (MSR 1E4H) - This register has controls to generate MCA and CMCI signaling. It is accessible only from SMM; accessing it outside SMM causes a general-protection (#GP) exception. This register is cleared by CPU reset.
    • MCA_G (bit 0) - Generate MCA - Setting this bit causes MCA to be broadcast to all the threads in the system.
    • CMCI_G (bit 1) - Generate CMCI - Setting this bit causes CMCI to be signaled to all the threads in the socket.
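The interaction between the two MSRs can be illustrated with a small software model. This is a sketch under stated assumptions, not the BWG flow: the MSR addresses and bit positions come from the list above, while the class, the exception type, the single stand-in bank register, and the choice to ignore generate requests when MCA_CMCI_SE is clear are hypothetical. The model also assumes that bank writes come from SMM or ring 0.

```python
class GeneralProtectionFault(Exception):
    """Stands in for a #GP exception in this model."""


MSR_DEBUG_ERR_INJ_CTL = 0x1E3
MSR_DEBUG_ERR_INJ_CTL2 = 0x1E4

MCBW_E = 1 << 0       # DEBUG_ERR_INJ_CTL bit 0: MC bank write enable
MCA_CMCI_SE = 1 << 1  # DEBUG_ERR_INJ_CTL bit 1: MCA/CMCI signaling enable
MCA_G = 1 << 0        # DEBUG_ERR_INJ_CTL2 bit 0: generate MCA
CMCI_G = 1 << 1       # DEBUG_ERR_INJ_CTL2 bit 1: generate CMCI


class AmeiModel:
    """Toy model of the AMEI control MSRs and one MC bank register."""

    def __init__(self):
        self.msrs = {MSR_DEBUG_ERR_INJ_CTL: 0, MSR_DEBUG_ERR_INJ_CTL2: 0}
        self.bank_status = 0  # stands in for a single MCi_STATUS register
        self.signals = []     # record of signals the model has generated

    def write_ctl(self, msr, value, in_smm):
        """Write one of the two control MSRs; only allowed from SMM."""
        if msr not in self.msrs:
            raise KeyError(hex(msr))
        if not in_smm:
            raise GeneralProtectionFault("control MSR accessed outside SMM")
        self.msrs[msr] = value
        signaling_enabled = self.msrs[MSR_DEBUG_ERR_INJ_CTL] & MCA_CMCI_SE
        # Assumption: generate requests are ignored unless MCA_CMCI_SE is set.
        if msr == MSR_DEBUG_ERR_INJ_CTL2 and signaling_enabled:
            if value & MCA_G:
                self.signals.append("MCA")
            if value & CMCI_G:
                self.signals.append("CMCI")

    def write_bank_status(self, value):
        """Write the stand-in bank register from SMM or ring 0."""
        write_enabled = self.msrs[MSR_DEBUG_ERR_INJ_CTL] & MCBW_E
        if value != 0 and not write_enabled:
            raise GeneralProtectionFault("non-zero MC bank write not enabled")
        self.bank_status = value
```

A typical sequence in this model is to set MCBW_E and MCA_CMCI_SE from SMM, write the simulated error into the bank register, and then set MCA_G or CMCI_G to raise the signal.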

11.7.12 Predictive Failure Analysis (PFA)

PFA is OS/SW-, BIOS/SMM-, or BMC-based failure prediction that uses various corrected error logs and their trends over time. A PFA algorithm can be implemented for the memory sub-system, Intel UPI, CPU caches, and the PCIe cluster.
SW or firmware monitors the error logs that the processor provides to determine the risk of a possible future failure. Based on this analysis, SW or firmware can recommend corrective action.
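One simple way to turn corrected error logs into a prediction is a sliding-window rate check per component. The sketch below is only an example of the idea. The class name, window length, and threshold are hypothetical, and a real PFA policy would be tuned per component type.

```python
from collections import defaultdict, deque


class PfaMonitor:
    """Flags a component when its corrected error rate exceeds a threshold."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds  # hypothetical tuning value
        self.threshold = threshold            # hypothetical tuning value
        self.events = defaultdict(deque)      # component -> recent timestamps

    def record_corrected_error(self, component, timestamp):
        """Log one corrected error; return True if the component looks at risk."""
        recent = self.events[component]
        recent.append(timestamp)
        # Drop events that have aged out of the observation window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold
```

A caller would feed each logged corrected error into `record_corrected_error` and, when it returns True, recommend a corrective action such as sparing or replacing the component.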

11.8 RAS Offload

Introduction

In current systems, Correctable Error RAS handlers execute in the System Management Mode (SMM) environment. The processor and platform components generate a System Management Interrupt (SMI) to invoke these handlers, which communicate with the processor and platform components through memory-mapped I/O registers in the processor address space. An alternative approach is to execute the handlers in the BMC environment. The primary goal in Birch Stream/Granite Rapids is to keep SMIs from interfering with the workload, and offloading certain correctable error events to the BMC accomplishes that. When RAS Offload is enabled, the processor and platform components assert the ERR0 pin when Correctable Errors occur. The BMC monitors the state of the ERR0 pin and executes RAS handlers when the signal is asserted.

NOTE

Some sources of Correctable Errors, namely Intel Ultra Path Interconnect (Intel UPI) Correctable Errors, do not result in ERR0 assertion. In these situations, the BMC polls for error events.

Architecture

The term RAS handling has two aspects. The first is error reporting, where the handler collects diagnostic information, formulates an error message, and makes it available to the OS. The second is recovery, where the firmware tries to recover from an erroneous or potentially erroneous scenario. Supported RAS features are listed below.
| Feature | Trigger |
|---|---|
| Memory CE Reporting | Rank level leaky bucket configured to trigger ERR0. |
| ADC/ADDDC | Rank level leaky bucket configured to trigger ERR0 and spare complete. |
| SPPR | Rank level leaky bucket configured to trigger ERR0. |
| DDR ECS | Polls (default 24 hours interval) MR registers. |
| Mirror Failover | Polls (default 24 hours interval) M2M bank (misc) shadow registers. |
| Intel UPI CE Reporting | Polls (default 60 second interval) Intel UPI shadow registers. |
| PCIe CE Reporting | IEH configured to trigger ERR0. |
| PCIe eDPC | IEH configured to trigger ERR0. |
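The two trigger styles in the table, a leaky-bucket threshold and interval polling, can be sketched as follows. This is an illustrative software model only. The leaky bucket in the processor is a hardware counter whose threshold and leak rate are not given here, so the class, its parameters, and the `features_due` helper are assumptions. The polling intervals are the defaults listed in the table.

```python
# Default polling intervals from the table, in seconds.
POLL_INTERVALS_SECONDS = {
    "DDR ECS": 24 * 3600,
    "Mirror Failover": 24 * 3600,
    "Intel UPI CE Reporting": 60,
}


class LeakyBucket:
    """Counter that gains one per error and leaks one per leak interval."""

    def __init__(self, threshold, leak_interval):
        self.threshold = threshold          # hypothetical value
        self.leak_interval = leak_interval  # hypothetical value, in seconds
        self.count = 0
        self.last_leak = 0

    def _leak(self, now):
        # Remove one count for every full leak interval that has elapsed.
        leaks = (now - self.last_leak) // self.leak_interval
        if leaks > 0:
            self.count = max(0, self.count - leaks)
            self.last_leak += leaks * self.leak_interval

    def add_error(self, now):
        """Count one corrected error; return True when ERR0 would trigger."""
        self._leak(now)
        self.count += 1
        return self.count >= self.threshold


def features_due(last_polled, now):
    """Return the polled features whose interval has elapsed, sorted by name."""
    return sorted(
        feature
        for feature, interval in POLL_INTERVALS_SECONDS.items()
        if now - last_polled.get(feature, 0) >= interval
    )
```

The leak keeps isolated, widely spaced errors from ever reaching the threshold, so only a sustained error rate on a rank raises the trigger.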
RAS handling functionality has been added to an OpenBMC* service named ras-manager. At a high level, the ras-manager service interfaces with several OpenBMC* components to support RAS: host-error-monitor, which monitors ERR0; RasLib, which contains the majority of the RAS handling logic; the Intelligent Platform Management Interface (IPMI) service, which facilitates the Handshake Flow; and the Memory Mapped BMC Interface (MMBI), which signals and transfers messages to the Host at run time. The RAS library is responsible for RAS handling and uses PECI for register read and write accesses.

Runtime Flow

  1. Error Signaling/Trigger: The CPU Error[0] pin asserts, causing an interrupt to the BMC.
  2. Error Handling: RAS handlers in the BMC environment communicate with the processor and platform components via PECI (wire or MCTP).
  3. OS Error Reporting/Logging: The error information is formatted into the standard ACPI error log format, and the BMC signals an SCI to the host OS.
  4. Host OS Error Log Transfer: Platform ASL code gets the error record from the BMC via MMBI (implemented over eSPI), posts it in system memory, and notifies the OS.
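The four steps above can be walked through with a small simulation. Everything in this sketch is a stand-in: the class and function names are hypothetical, the PECI read is modeled as a dictionary copy, the MMBI transfer as a list hand-off, and the record layout is a placeholder rather than a real ACPI error log structure.

```python
from dataclasses import dataclass, field


@dataclass
class HostModel:
    system_memory: list = field(default_factory=list)  # posted error records
    os_notifications: int = 0
    sci_pending: bool = False


@dataclass
class BmcModel:
    registers: dict  # stands in for processor state read over PECI
    pending_records: list = field(default_factory=list)

    def on_err0_asserted(self, host):
        """Steps 1-3: handle the interrupt, build a record, signal SCI."""
        # Step 2: collect diagnostic state (modeled as a dictionary copy).
        raw = dict(self.registers)
        # Step 3: format the record (placeholder layout) and signal the host.
        record = {"format": "ACPI error log (placeholder)", "data": raw}
        self.pending_records.append(record)
        host.sci_pending = True


def host_handle_sci(host, bmc):
    """Step 4: fetch records from the BMC, post them, and notify the OS."""
    if not host.sci_pending:
        return
    while bmc.pending_records:
        host.system_memory.append(bmc.pending_records.pop(0))
        host.os_notifications += 1
    host.sci_pending = False
```

In this model, no SMI is involved at any point: the host only does work in step 4, after the BMC has already collected and formatted the error data.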
