
Deep learning of left atrial structure and function provides link to atrial fibrillation risk

Received: 3 September 2021

Accepted: 24 April 2024
Published online: 21 May 2024


James P. Pirruccello$^{1,2,3,4}$, Paolo Di Achille$^{5,6}$, Seung Hoan Choi$^{7}$, Joel T. Rämö$^{5,8}$, Shaan Khurshid$^{5,9,10,11,12}$, Mahan Nekoui$^{5,12}$, Sean J. Jurgens$^{5,13,14}$, Victor Nauffal$^{5,15}$, Shinwan Kany$^{5,16}$, FinnGen*, Kenney Ng$^{17}$, Samuel F. Friedman$^{5,6}$, Puneet Batra$^{6}$, Kathryn L. Lunetta$^{18}$, Aarno Palotie$^{8,19,20}$, Anthony A. Philippakis$^{6}$, Jennifer E. Ho$^{6,12,21}$, Steven A. Lubitz$^{5,9,10,12}$ & Patrick T. Ellinor$^{5,9,10,12}$

Atrial fibrillation (AF) is a common arrhythmia that is projected to affect up to 12 million Americans by 2050$^{1}$. Because AF is a leading cause of stroke$^{2,3}$, its risk factors have been the subject of extensive investigation$^{4-6}$. Enlargement of left atrial (LA) volumes is commonly observed with hypertension$^{7}$, heart failure$^{8}$, or after a diagnosis of AF$^{9,10}$, and AF plays a causal role in this process$^{11}$. Enlargement of the LA and decreased LA function have also been identified as independent risk factors for AF$^{10,12-17}$ and stroke$^{18-20}$. Together, these atrial structural, contractile, or electrophysiological changes that have clinical consequences have been termed atrial cardiomyopathies$^{21,22}$.
The link between LA function and AF risk has prompted interest in determining the heritability and common genetic basis for variation in LA measurements. A large-scale genome-wide association study (GWAS) in 30,201 individuals with LA measurements ascertained by echocardiography did not identify any loci with P < 5E-08$^{23}$. Recently, a GWAS of deep learning-derived diastolic measurements in 34,245 UK Biobank participants identified one variant associated with LA volume near NPR3$^{24,25}$, and a GWAS of a biplanar estimate of LA volume and function identified 14 unique loci in 35,658 participants$^{26}$.
Taking advantage of the precision of cardiovascular magnetic resonance imaging (MRI), we developed deep learning models to produce two-dimensional measurements of the LA in 40,558 participants in the UK Biobank$^{27,28}$, and applied a surface reconstruction technique to integrate these data into three-dimensional LA volume estimates. We reproduced prior observational associations between LA measurements and AF, heart failure, hypertension, and stroke. We then undertook analyses to identify common genetic variants associated with LA volumes in over 35,000 UK Biobank participants. Finally, using common genetic variants as instruments for Mendelian randomization, we performed bidirectional causal analyses between LA volume and AF.

Results

Reconstruction of LA volumes from cardiovascular magnetic resonance images

We trained deep learning models to annotate the LA and left ventricular blood pools in four views (distinct models for the short-axis view, and the two-, three-, and four-chamber long-axis views). We then applied these models to all available UK Biobank cardiovascular magnetic resonance imaging (MRI) data (Methods)$^{27-29}$. The quality of the deep learning models for measuring the LA was higher for the long-axis views and lower for the short-axis views, which were not designed to capture the LA (Supplementary Note). We integrated the data from these separate cross-sections to compute the surface of a three-dimensional representation of the LA (Supplementary Note), yielding LA volume estimates at 50 timepoints throughout the cardiac cycle for 40,558 participants (Fig. 1). We conducted analyses on the maximum LA volume (LAmax), the minimum LA volume (LAmin), the difference between those two volumes (stroke volume; LASV), and the emptying fraction (LASV/LAmax; LAEF), as well as their body surface area (BSA)-indexed counterparts (Supplementary Fig. 1).
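The four traits defined above follow directly from a per-participant volume curve. The following is a minimal Python sketch, not the authors' code: the names `LATraits`, `la_traits`, and `index_to_bsa` and the toy numbers are hypothetical, volumes are assumed to be in mL, and BSA indexing is assumed to mean division by BSA in m^2.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LATraits:
    la_max: float  # maximum LA volume over the cardiac cycle (mL)
    la_min: float  # minimum LA volume over the cardiac cycle (mL)
    la_sv: float   # LA stroke volume, LAmax - LAmin (mL)
    la_ef: float   # LA emptying fraction, LASV / LAmax (dimensionless)


def la_traits(volumes_ml: Sequence[float]) -> LATraits:
    """Summarize a volume-versus-time curve into LAmax, LAmin, LASV, and LAEF."""
    if not volumes_ml:
        raise ValueError("need at least one volume estimate")
    la_max = max(volumes_ml)
    la_min = min(volumes_ml)
    if la_max <= 0:
        raise ValueError("volumes must be positive")
    la_sv = la_max - la_min
    return LATraits(la_max, la_min, la_sv, la_sv / la_max)


def index_to_bsa(volume_ml: float, bsa_m2: float) -> float:
    """Index a volume to body surface area (assumed: simple division, mL per m^2)."""
    if bsa_m2 <= 0:
        raise ValueError("BSA must be positive")
    return volume_ml / bsa_m2


# Toy curve with six timepoints; the study used 50 timepoints per participant.
toy_curve = [70.0, 60.0, 45.0, 32.0, 40.0, 55.0]
traits = la_traits(toy_curve)
print(traits)
print(index_to_bsa(traits.la_max, 2.0))
```

LAEF is a ratio of two volumes, so it has no BSA-indexed counterpart in this sketch.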

LA traits are associated with AF, heart failure, hypertension, and stroke

We analyzed the pattern of cardiac chamber volumes throughout the cardiac cycle in order to identify individuals with abnormal atrial contraction (Supplementary Note; Supplementary Fig. 2). Interestingly, a subset of 1,013 participants with abnormal cardiac filling patterns had markedly elevated LA volumes, similar to those with pre-existing AF (Fig. 2), and were excluded from downstream analyses.
In the remaining 39,545 participants, we evaluated the association between LA measurements and prevalent or incident AF (Supplementary Note). The LA phenotype most strongly associated with AF was the LA minimal volume (LAmin). The 813 individuals with pre-existing AF had a greater LAmin (+8.8 mL, P = 9.2E-117). Over a mean follow-up of 2.2 years after MRI acquisition, the risk of incident AF was increased among those with greater LAmin (293 cases; HR 1.73 per standard deviation [SD] increase; 95% CI 1.60-1.88; P = 4.0E-39). We also observed significant associations between LA measurements and hypertension, heart failure, and stroke (Fig. 3 and Supplementary Tables 1-3), as well as continuous traits such as blood pressure, creatinine, and pack years of tobacco use (Supplementary Data 1).

Fig. 1 | Surface reconstruction for left atrial volume. Study overview. Top left panel: orientation of the different planes in which images of the atrium were captured. The art in this panel is derived from Servier Medical Art (licensed under Creative Commons Attribution, CC-BY-4.0 [https://creativecommons.org/licenses/by/4.0/]). Right panel: example images from each of the four imaging planes; after interpretation with the deep learning model, the left atrium is colored blue. Reproduced by kind permission of UK Biobank ©. Bottom left panel: schematic overview representing reconstruction of the left atrium based on information obtained from the deep learning output from the four imaging planes.

Fig. 2 | Left atrial volume variation based on AF history and cardiac filling patterns. In the left panel, a flow diagram breaks down the imaged population into groups with and without AF, and then further into groups that do and do not appear to have normal cardiac filling patterns. In the right panel, the LAmin volume is depicted for these groups with violin plots; the median for each group is demarcated with a vertical line. Source data are provided as a Source Data file.

Common genetic variant analysis of LA size and function identifies 20 loci

After establishing that the LA measurements replicated previously established clinical associations, we examined the association between common genetic variants and seven LA traits: LAmax, LAmin, LAEF, and LASV, as well as the BSA-indexed LA volumes. We conducted these analyses in 35,049 participants with genetic data and without a history of AF, coronary artery disease, or heart failure (Table 1; Supplementary Fig. 3). First, we examined the SNP-heritability of the LA traits, which ranged from 0.14 (LAEF) to 0.37 (LAmax; Supplementary Table 4). Genetic correlation between the LA measurements ranged
from -0.72 (between LAmin and LAEF) to 0.95 (between LAmax and LAmin; Supplementary Table 4).

Fig. 3 | Epidemiological relationships between left atrial volume and disease. Left panel (“Prevalent disease”): the difference in LA volumes (Y axis) between UK Biobank participants with atrial fibrillation (“AF”), heart failure (“CHF”), hypertension (“HTN”), or stroke occurring prior to MRI compared to participants without disease (X axis). N = 39,545 participants; 813 with AF, 149 with stroke, 210 with CHF, and 11,852 with HTN. Right panel (“Incident disease”): hazard ratios for incidence of AF, CHF, HTN, and stroke (Y axis) occurring after MRI per 1 standard deviation increase in LA volumes (X axis). N = 36,900 (fewer due to prevalent disease for CHF and HTN; Supplementary Table 3); 293 with incident AF, 98 with stroke, 125 with CHF, 469 with HTN. Mean volume difference (left panel) or hazard ratio per standard deviation (right panel) estimates are represented by a circle; 95% confidence intervals for the estimate are represented by error bars. Source data are provided as a Source Data file.

Table 1 | Participant characteristics

| | Women | Men | Both |
| :--- | :---: | :---: | :---: |
| N | 18,916 | 16,133 | 35,049 |
| Age at time of MRI | 64 (8) | 65 (8) | 64 (8) |
| BMI ($\mathrm{kg/m^2}$) | 26 (5) | 27 (4) | 26 (4) |
| Height (cm) | 163 (6) | 176 (7) | 169 (9) |
| Weight (kg) | 69 (13) | 83 (13) | 75 (15) |
| Systolic blood pressure (mmHg) | 136 (19) | 142 (17) | 139 (19) |
| Diastolic blood pressure (mmHg) | 77 (10) | 81 (10) | 79 (10) |
| Left atrium maximum volume ($\mathrm{cm}^3$) | 64 (15) | 79 (19) | 71 (18) |
| Left atrium minimum volume ($\mathrm{cm}^3$) | 28 (9) | 37 (12) | 32 (11) |
| Left atrium stroke volume ($\mathrm{cm}^3$) | 36 (8) | 43 (11) | 39 (10) |
| Left atrium emptying fraction (%) | 57 (8) | 54 (7) | 56 (8) |
| Mitral regurgitation (%) | 10 (0) | 9 (0) | 19 (0) |
| Mitral stenosis (%) | 3 (0) | 0 (0) | 3 (0) |
| Heart failure (%) | 0 (0) | 0 (0) | 0 (0) |
| Hypertrophic cardiomyopathy (%) | 0 (0) | 0 (0) | 0 (0) |
| Congenital heart disease (%) | 3 (0) | 1 (0) | 4 (0) |
| Aortic valve disease (%) | 18 (0) | 21 (0) | 39 (0) |
| Atrial fibrillation or flutter (%) | 0 (0) | 0 (0) | 0 (0) |

Characteristics of the participants who contributed to the GWAS are listed as mean (standard deviation). Count data are listed as number (%).

Next, we performed GWAS for all seven LA traits (Table 2), and as a sensitivity analysis, we also performed GWAS of LA volumes after indexing on left ventricular end-diastolic volume (Supplementary Materials and Supplementary Fig. 4). For all analyses, linkage disequilibrium score regression intercepts were near 1, indicating no significant evidence of inflation due to population stratification (Supplementary Table 5)$^{30}$. No lead SNPs deviated from Hardy-Weinberg equilibrium (HWE) at a threshold of P < 1E-06 (Supplementary Data 2)$^{31}$.
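As an illustration of the HWE check, here is a minimal Python sketch of a one-degree-of-freedom chi-square test on genotype counts. The text does not say which HWE test was used, so the chi-square form, the function name `hwe_chi2_pvalue`, and the toy counts are assumptions made only for illustration.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) P value for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic site: the test is uninformative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Toy counts: the first SNP sits exactly at HWE, the second has no heterozygotes.
for counts in [(100, 200, 100), (200, 0, 200)]:
    p_value = hwe_chi2_pvalue(*counts)
    print(counts, p_value, "flag" if p_value < 1e-6 else "ok")
```

A SNP would be flagged under the threshold quoted above when the returned P value falls below 1E-06.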
In the GWAS of LA traits conducted without indexing to BSA, we identified five loci associated with LAmax, eight with LAmin, four with LAEF, and two with LASV (Fig. 4). Four loci were shared between LAmax and LAmin, with lead SNPs near HLA-B, IRAK1BP1, BEND3, and FBXO32/RSPH6A. LAmax was additionally associated with SNPs at the HMGA2 locus, and LAmin was associated with SNPs near ANKRD1, SSSCA1, IGF1R, and MYO18B. The four LAEF loci were located near FAF1, CASQ2, MYH6, and MYO18B. The two LASV-associated loci included SNPs near HLA-C and MYH6.
Indexing on BSA yielded three additional loci shared by both LAmax and LAmin (TTN, PITX2, and NPR3), as well as MYO18B for LAmax, UQCRB, HTR7, and GOSR2 for LAmin, and OBP2B for LASV. Additional loci were identified in a sensitivity analysis that accounted for left ventricular end diastolic volume (LVEDV; Supplementary Data 3). Because adjustment for heritable covariates can induce spurious association signals, interpretation of these loci requires caution 32 32 ^(32){ }^{32}. Other sensitivity analyses (retaining participants with abnormal cardiac filling patterns; retaining only individuals with inlier genetic identities) are detailed in the Supplementary Note.

Genetic relationship between AF risk and LA dysfunction

To gain more insight into the genetic relationship between LA measurements and AF, we first evaluated their genetic correlations. Using ldsc, we found the strongest genetic correlation between LAmin and AF (rg 0.37, P = 2.0E-10), a direction of effect that corresponds to a positive correlation between LA dysfunction (i.e., increased LAmin) and risk for AF (Supplementary Table 6)$^{33,34}$. This relationship was minimally attenuated after indexing on BSA (rg 0.33, P = 7.7E-09). We also tested for association between LA measurements and stroke (all-cause or cardioembolic) from MEGASTROKE; the strongest association was between LAmin and all-cause stroke with nominal significance (rg 0.21, P = 0.01), which was directionally concordant with increased AF risk$^{35}$.
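For reference, the genetic correlation reported as rg is conventionally defined as the genetic covariance between two traits scaled by their SNP-heritabilities. The notation below is introduced here for illustration and does not appear in the text: for standardized traits $X$ and $Y$, $\sigma_g(X, Y)$ is the genetic covariance and $h^2_X$, $h^2_Y$ are the SNP-heritabilities.

```latex
r_g = \frac{\sigma_g(X, Y)}{\sqrt{h^2_X \, h^2_Y}}
```

A positive $r_g$ between LAmin and AF therefore means that variants raising LAmin tend, on average across the genome, to raise AF risk as well.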
We then assessed the overlap between the 20 distinct LA loci identified in our study and 134 loci previously found to be associated with AF$^{34}$. We found that 8 of the 20 LA loci overlapped with an AF locus, which was a significant enrichment based on permutation testing (P = 1E-04, which was the minimum possible P value; see Methods)$^{36}$. The 8 loci found in both the LA GWAS and the AF GWAS are nearest to FAF1/C1orf85, CASQ2, TTN, PITX2, MYH6/MYH7, IGF1R, GOSR2, and MYO18B. At all 8 loci, the effect of each SNP on AF risk was in opposition to its effect on LAEF, and in most cases the effect of each SNP on AF was concordant with its effect on LAmin (Fig. 5). None of the loci that were linked with both LA measurements and AF were associated at genome-wide significance with LAmax.
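The idea behind a permutation-based enrichment test can be sketched in a few lines of Python. This is a generic illustration, not the procedure described in the Methods: the universe of 1,000 candidate loci, the uniform random sampling, the 10,000 permutations, and the function name `permutation_enrichment_p` are all assumptions. Reporting 1/n_perm when no permutation reaches the observed overlap is one convention under which 10,000 permutations would give a floor of 1E-04.

```python
import random
from typing import Iterable


def permutation_enrichment_p(
    n_candidates: int,
    trait_loci: Iterable[int],
    observed_overlap: int,
    n_draw: int,
    n_perm: int = 10_000,
    seed: int = 0,
) -> float:
    """Empirical P value for seeing at least `observed_overlap` overlapping loci.

    Each permutation draws `n_draw` loci uniformly from `n_candidates` candidates
    and counts how many fall in `trait_loci`.
    """
    rng = random.Random(seed)
    trait = set(trait_loci)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(n_candidates), n_draw)
        overlap = sum(1 for locus in sample if locus in trait)
        if overlap >= observed_overlap:
            hits += 1
    # The smallest reportable P value is 1 / n_perm (assumed convention).
    return max(hits, 1) / n_perm


# Hypothetical setup: 134 of 1,000 candidate loci are AF loci; 20 LA loci, 8 overlapping.
p_value = permutation_enrichment_p(1000, range(134), observed_overlap=8, n_draw=20)
print(p_value)
```

In practice the null draws would need to be matched to the real loci (for example on allele frequency and gene density), which this sketch does not attempt.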
Because the genetic correlation analysis suggested that the strongest cross-trait association was between LAmin and AF, we performed bidirectional Mendelian randomization (MR) analyses to assess whether this relationship was causal. First, we assessed the causal effects of LAmin on the risk for AF. Variants that were associated with LAmin with P < 1E-06 were clumped and ambiguous alleles were excluded, leaving 19 SNPs. These variants were cross-referenced in summary statistics from a prior AF GWAS without UK Biobank participants to model the outcome$^{37}$. The inverse variance weighted (IVW) model identified a significant association between LAmin and AF (OR 1.77 per SD increase in LAmin, 95% CI 1.3-2.3, P = 4.7E-05). Simple median, weighted median and MR-Egger showed the same direction of effects (Supplementary Fig. 5). There was significant effect heterogeneity (P = 2.9E-05 by Cochran Q), so the contamination mixture model approach and MR-PRESSO were applied, both of which showed a significant, positive relationship between LAmin and AF with the same direction of effects (Supplementary Data 4; Supplementary Fig. 5). MR-Egger results did not reach nominal significance, nor did they yield evidence for horizontal pleiotropy (intercept P = 0.48). Within the GWAS participants, three of the 19 SNPs had evidence for pleiotropic association with AF risk factors that were derived from the CHARGE-AF risk score (Supplementary Fig. 6)$^{4}$; a sensitivity analysis excluding these three variants yielded similar results (IVW OR 1.89 per SD increase in LAmin, P = 7.3E-06; Supplementary Data 4; Supplementary Fig. 7).
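The IVW estimate and the Cochran Q heterogeneity statistic can be computed from per-SNP summary statistics. The sketch below is a minimal fixed-effect version with first-order weights, written for illustration and not taken from the authors' pipeline; the function name `ivw_estimate` and the toy effect sizes are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float, float]:
    """Fixed-effect IVW causal estimate, its standard error, and Cochran's Q.

    Each SNP contributes a Wald ratio beta_outcome / beta_exposure, weighted by
    beta_exposure**2 / se_outcome**2 (first-order weights).
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / total
    se_ivw = math.sqrt(1.0 / total)
    # Under homogeneity, Q follows a chi-square distribution with (n_snps - 1) df.
    q_stat = sum(w * (r - beta_ivw) ** 2 for w, r in zip(weights, ratios))
    return beta_ivw, se_ivw, q_stat


# Toy summary statistics for three instruments (SD units for the exposure,
# log odds for the outcome); these are not values from the study.
beta, se, q = ivw_estimate([0.10, 0.20, 0.30], [0.05, 0.12, 0.14], [0.01, 0.01, 0.02])
print(beta, se, q, math.exp(beta))  # exp(beta) is the odds ratio per SD of exposure
```

A large Q relative to its degrees of freedom signals heterogeneity across instruments, which is the situation that motivates outlier-robust methods such as those named above.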
Analyses treating each LA measurement as an exposure, using only instruments with P < 5E-08, revealed that the strongest statistical relationship was between LAEF and AF (OR 0.36 per SD increase in LAEF, P = 1.6E-06; Supplementary Data 5). Expanding the tested outcomes to heart failure$^{38}$ and stroke$^{35}$ revealed a nominal relationship between greater LAmin and increased risk for heart failure (OR 1.23 per SD increase in LAmin, P = 0.03), and between greater LAEF and reduced risk for cardioembolic stroke (OR 0.56 per SD increase in LAEF, P = 5.3E-03) but not all ischemic stroke (P = 0.5; Supplementary Data 5).
We then tested the causal effect of AF on LAmin. Thirty-eight instruments that were also present in the LAmin summary statistics were taken from the 2017 AF GWAS that was conducted without UK Biobank participants$^{37}$. Increasing genetic risk of AF was significantly associated with LAmin (0.086 SD increase per unit increase in the log odds of AF liability, 95% CI 0.049-0.123 SD, P = 6.2E-06) using the IVW approach. The simple median, weighted median, MR-Egger bootstrap, MR-PRESSO, and contamination mixture models exhibited similar directional effects and nominal significance (Supplementary Data 4). The intercepts of the MR-Egger and MR-Egger bootstrap models were not significantly different from zero (MR-Egger intercept P = 0.83, MR-Egger bootstrap intercept P = 0.39; Supplementary Data 4, Supplementary Fig. 8).

Fig. 4 | Genome-wide association study Manhattan plots. Manhattan plots showing the chromosomal position (X axis) and the strength of association (-log10 of the P value, Y axis) for all LA measurements and the BSA-indexed counterparts (except for LAEF, which is dimensionless). Loci that contain SNPs with two-tailed BOLT-LMM P < 5E-08 are colored red and labeled with the name of the nearest gene to the most strongly associated variant.

Fig. 5 | Variants associated with left atrial structure and function and AF. The 8 loci associated with LA measurements and AF are displayed. All loci (except those near CASQ2 and PITX2) have multiple patterns of linkage disequilibrium and are therefore represented multiple times. Black boxes represent an association with two-tailed BOLT-LMM P < 5E-8; lighter gray boxes represent P < 5E-6. Effect sizes are oriented with respect to the minor allele. Effect size for AF loci represents the logarithm of the odds ratio. Source data are provided as a Source Data file.

A polygenic risk score for AF is associated with LA phenotypes

We constructed a 1.1-million SNP polygenic risk score (PRS) with PRScs using summary statistics from the Christophersen et al. AF GWAS, and applied this score in the 35,049 LA GWAS participants$^{37,39}$. The AF PRS was statistically significantly associated with all measures of LA size and function, with a small effect size (Supplementary Table 7). The strongest association was with LAmin (0.052 SD increase in LAmin per SD increase in the PRS; 95% CI 0.042-0.061; P = 1.1E-25).
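To make the "SD increase in LAmin per SD increase in the PRS" unit concrete, the sketch below shows how a polygenic score is applied once per-SNP weights are available: a weighted sum of allele dosages, standardized across participants, then related to a standardized trait. It is a simplified illustration with hypothetical names (`polygenic_score`, `standardize`, `slope_per_sd`) and toy data; it does not derive weights the way PRScs does, and it omits any covariate adjustment the actual analysis may have used.

```python
import statistics
from typing import List, Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2 per SNP) for one participant."""
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(values: Sequence[float]) -> List[float]:
    """Rescale values to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def slope_per_sd(score_z: Sequence[float], trait_z: Sequence[float]) -> float:
    """Unadjusted least-squares slope of a standardized trait on a standardized score."""
    mx = statistics.fmean(score_z)
    my = statistics.fmean(trait_z)
    sxy = sum((x - mx) * (y - my) for x, y in zip(score_z, trait_z))
    sxx = sum((x - mx) ** 2 for x in score_z)
    return sxy / sxx


# Toy data: three SNP weights, four participants, and made-up LAmin values in mL.
weights = [0.10, -0.20, 0.05]
genotypes = [[0, 1, 2], [2, 0, 0], [1, 1, 1], [2, 2, 0]]
la_min_ml = [30.0, 36.0, 31.0, 29.0]

scores_z = standardize([polygenic_score(g, weights) for g in genotypes])
print(slope_per_sd(scores_z, standardize(la_min_ml)))
```

Because both variables are standardized, the slope reads directly as SD of trait per SD of score.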

Polygenic estimates of LA volume predict AF, stroke, and heart failure

We created a 1.1-million SNP genome-wide polygenic score for each LA trait using PRScs${ }^{39}$ and tested each score in up to 423,821 UK Biobank participants who did not participate in the LA GWAS, of whom 417,881 did not have an AF diagnosis at enrollment and 21,147 developed AF afterwards. The strongest association was with the BSA-indexed LAmin polygenic score, which was linked to a modestly increased risk for incident AF or atrial flutter ($\mathrm{HR}=1.09$ per 1 SD increase in the score; $P=7.4 \mathrm{E}-32$) (Fig. 6; Supplementary Table 8). This score was also associated with small increases in risks of incident all-cause stroke (7,753 cases; $\mathrm{HR}=1.04$ per SD; $P=4.7 \mathrm{E}-04$), ischemic stroke (5,444 cases; $\mathrm{HR}=1.04$ per SD; $P=4.7 \mathrm{E}-03$), and heart failure (11,035 cases; $\mathrm{HR}=1.05$ per SD; $P=7.9 \mathrm{E}-08$). Those in the top 5% of the score had a greater risk of AF ($\mathrm{HR}=1.19$, $P=7.9 \mathrm{E}-10$), ischemic stroke ($\mathrm{HR}=1.12$, $P=0.06$), and heart failure ($\mathrm{HR}=1.14$, $P=1.2 \mathrm{E}-03$; Supplementary Data 6). In a sensitivity analysis that censored participants who developed AF prior to a diagnosis of heart failure, the magnitude of effect and strength of association between the LAmin score and heart failure were attenuated (7,888 cases; $\mathrm{HR}=1.03$ per SD; $P=0.01$; Supplementary Data 6). Sensitivity analyses using lead SNP scores, different covariate adjustments, or different population subgroups yielded similar results (Supplementary Data 6).
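
The top 5% versus remaining 95% comparison relies on a binary indicator derived from the score. The following is a minimal sketch of one way to build such an indicator; the function name and the tie-handling rule are assumptions, and the Cox model that would consume the indicator is not shown.

```python
def top_fraction_flags(scores, fraction=0.05):
    """Return True for participants whose score falls in the top `fraction`."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    n_top = max(1, round(len(scores) * fraction))
    # Rank indices from highest to lowest score; ties keep their input order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:n_top])
    return [i in top for i in range(len(scores))]


# Toy scores (assumed): 100 participants with evenly spaced values.
scores = [i / 100 for i in range(100)]
flags = top_fraction_flags(scores)
```

With 100 toy participants, exactly five are flagged, and they are the five with the highest scores.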

External validation of the LAmin polygenic score in FinnGen and All of Us

In FinnGen${ }^{40}$ study participants (Supplementary Data 7), comparable associations were observed between the BSA-indexed LAmin polygenic score and incident AF or atrial flutter (20,422 cases, $\mathrm{HR}=1.08$ per SD, $P=2.4 \mathrm{E}-30$), ischemic stroke excluding subarachnoid hemorrhage (13,392 cases, $\mathrm{HR}=1.03$ per SD, $P=3.0 \mathrm{E}-03$), ischemic stroke excluding all hemorrhage (11,822 cases, $\mathrm{HR}=1.03$ per SD, $P=5.6 \mathrm{E}-04$), and heart failure (13,771 cases, $\mathrm{HR}=1.04$ per SD, $P=4.4 \mathrm{E}-06$). Compared with the remaining 95% of FinnGen participants, those in the top 5% of genetically predicted BSA-indexed LAmin had an increased risk of AF ($\mathrm{HR}=1.19$ per SD, $P=8.4 \mathrm{E}-09$). Those in the top 5% also had elevations in risk that were not statistically significant for ischemic stroke excluding subarachnoid hemorrhage ($\mathrm{HR}=1.04$ per SD, $P=0.36$) and heart failure ($\mathrm{HR}=1.07$, $P=0.08$).

Fig. 6 | Incident atrial fibrillation risk, stratified by left atrial polygenic score. Disease incidence curves for the 417,881 participants who were unrelated to within three degrees of the participants who underwent MRI in the UK Biobank. Those in the top 5% for the BSA-indexed LAmin PRS are depicted in red; the remaining 95% are in gray. The lighter-shaded bands around each line represent the 95% confidence interval. X axis: years since enrollment in the UK Biobank. Y axis: cumulative incidence of AF (19,875 cases in the bottom 95% and 1,272 cases in the top 5%). Those in the top 5% of genetically predicted BSA-indexed LAmin had an increased risk of AF (Cox HR 1.19, $P=7.9 \mathrm{E}-10$) compared with those in the remaining 95% in up to 12 years of follow-up time after UK Biobank enrollment.
In the US national biobank, All of Us${ }^{41}$, the BSA-indexed LAmin polygenic score remained significantly associated with AF (4,859 incident cases, $\mathrm{HR}=1.06$ per SD, $P=1.7 \mathrm{E}-04$) and heart failure (5,712 incident cases, $\mathrm{HR}=1.04$ per SD, $P=2.0 \mathrm{E}-02$), but not ischemic stroke (66 cases, $P=0.3$; Supplementary Data 8). In logistic models that included all cases regardless of biobank enrollment date, more cases were identified and the statistical evidence was stronger (13,399 AF cases, $\mathrm{OR}=1.10$ per SD, $P=4.9 \mathrm{E}-19$; 14,572 heart failure cases, $\mathrm{OR}=1.04$ per SD, $P=1.5 \mathrm{E}-04$).
In addition, 680 participants in All of Us with genetic data had BSA-indexed LAmin volume measurements. The BSA-indexed LAmin polygenic score was associated with these measurements (0.10 SD per SD of the polygenic score, $P=8.5 \mathrm{E}-03$). This relationship remained nominally significant when restricted to only the largest subset of participants by genetic identity ($N=619$ participants with genetic identity similar to Europeans; 0.09 SD per SD, $P=1.5 \mathrm{E}-02$).

Discussion

We used a unique resource of more than 40,000 cardiac MRI studies available in the UK Biobank to enable a large, high-resolution assessment of LA structure and function. We trained deep learning models to segment LA cross-sections from cardiovascular MRI data and then derived estimates of LA volume from their 3-dimensional reconstructions. In turn, we performed an extensive series of epidemiological, genetic, polygenic, and Mendelian randomization analyses to link these LA traits to cardiovascular outcomes. Our findings permit at least five primary conclusions.
First, we were able to replicate previous observations demonstrating associations between greater LA volume and cardiovascular diseases${ }^{7-10,19,20}$. Participants with a history of AF had larger LA volumes, and participants with larger LA volumes were more likely to be subsequently diagnosed with AF, stroke, or heart failure.
Second, these measurements enabled a large genetic analysis of LA measurements. In this work, 20 distinct genetic loci were associated with LAmax, LAmin, LAEF, LASV, or the BSA-indexed versions of these phenotypes. To our knowledge, one locus (near NPR3) has previously been associated at genome-wide significance with LA measurements in a study of diastolic function${ }^{25}$, while 14 were recently identified in association with LA structure and function${ }^{26}$. Comparing the genetic findings in the present study with those of Ahlberg et al., six loci were shared across both studies (near CASQ2, MYO18B, TTN, UQCRB, ANKRD1, and RSPH6A/FBXO46/SIX5); eight were unique to Ahlberg et al. (near CITED4, C9orf3, BEND7, MGAT1, DSP, CILP, COL8A1, and EIF2D); and fourteen were unique to the present study (near HLA-B, IRAK1BP1, BEND3, HMGA2, PITX2, NPR3, FAF1, MYH6, SSSCA1, IGF1R, DCDC2C, DHX15, GOSR2, and OBP2B). We considered this overlap in loci to be substantial, particularly since the studies used completely different deep learning models to identify the LA, and different formulas to compute LA volume from the deep learning model output (biplane vs. surface reconstruction). Forty percent of the loci in our study (eight of 20) were previously associated with AF${ }^{34}$, significantly more than expected by chance. At all eight loci, the allele associated with increased AF risk was directionally associated with a lower LAEF, and generally with greater LA volumes (Fig. 5). The opposed effect directions of these SNPs for AF risk and LAEF may be consistent with the concept of atrial cardiomyopathy${ }^{22}$.
As an example of the pattern of opposed SNP effects on LAEF and AF risk, we identified a missense variant within CASQ2 (rs4074536; p.Thr66Ala) as a lead SNP for LAEF on chromosome 1. The T allele of this SNP (encoding Thr66) corresponds with a reduced LAEF in our GWAS, and with reduced expression of CASQ2 in the right atrial appendage and left ventricle in GTEx${ }^{42}$. This variant is also in LD ($r^{2}=1.0$ in non-African 1KG populations) with the AF lead SNP rs4484922${ }^{34,43}$. In the study by Roselli and colleagues, the rs4484922-G allele was associated with an increased risk for AF; notably, that risk-increasing allele corresponds to the LAEF-reducing T allele of rs4074536. The rs4074536-T allele has also previously been associated with a longer QRS complex duration${ }^{44,45}$. CASQ2 encodes calsequestrin 2, which resides in the sarcoplasmic reticulum in abundance and binds to calcium ions during the cardiac cycle. Missense variants in this gene have also been associated with catecholamine-induced polymorphic ventricular tachycardia, typically following a recessive inheritance pattern${ }^{46,47}$.
Even among LA-associated loci that were not previously associated with AF, several showed the same consistent pattern of inverse effect between AF risk and LAEF (e.g., near NPR3, SSSCA1, and HMGA2). However, this pattern did not uniformly hold. For example, at the gene-dense locus near FBXO46/DMWD/RSPH6A, the LA volume-increasing (and LAEF-decreasing) variants were weakly associated with decreased AF risk.
Also notable was the PITX2 locus, which was the first locus associated with AF. In the present GWAS, SNPs at that locus were associated with BSA-indexed LAmax and LAmin. The lead SNP for AF (rs2129977 from Roselli et al. 2018) was in close LD with the lead SNP for LAmax and LAmin (rs2634073; $r^{2}=0.85$)${ }^{34,43}$. Consistent with clinical expectations, the AF risk allele was associated with greater LA maximum and minimum volumes. These analyses excluded participants with a history of AF or abnormal cardiac filling patterns on MRI; therefore, these results support the hypothesis that the PITX2 locus may be associated with an increase in LA volume that occurs prior to AF onset, which would be consistent with experimental data showing atrial enlargement during embryonic development in mice with knocked-down PITX2${ }^{48}$.
Fourth, we developed polygenic scores to gain additional insight into the relationship between LA volumes and cardiovascular diseases. A genome-wide 1.1-million variant AF PRS derived from Christophersen et al. 2017 was associated with all of the LA phenotypes (most strongly with LAmin), even after excluding participants known to have AF${ }^{37}$. This genetic evidence is consistent with and extends prior observational evidence, and suggests that some of the genetic drivers of AF risk may manifest in ways that are detectable in LA size and function.
A 1.1-million variant polygenic predictor of BSA-indexed LAmin was modestly associated with incident AF (Fig. 6), and weakly with stroke, in the UK Biobank. The score was also associated with heart failure, an association that was almost completely attenuated after excluding participants who were diagnosed with AF prior to heart failure. This attenuation suggests that much of the heart failure association may be mediated through AF. The association of greater genetically predicted BSA-indexed LAmin volume with heart failure and atrial fibrillation was validated externally in FinnGen and All of Us, and the weak but statistically significant increased risk of ischemic stroke was also confirmed in FinnGen.
Finally, we found evidence of substantial genetic correlation between LA phenotypes and AF. We pursued Mendelian randomization analyses to more formally assess the hypothesis of bidirectional causation between LA phenotypes and AF. These revealed strong evidence of a causal effect of AF on LAmin, as has been previously observed${ }^{11}$. There was also evidence that LA volumes, particularly LAmin, may be causal for AF. The causal effect persisted even after excluding three variants associated with at least one risk factor from CHARGE-AF${ }^{4}$. However, because AF can be paroxysmal and remain undiagnosed, we cannot exclude the possibility of cryptic reverse causation: namely, that some participants may have had larger atria because of undiagnosed paroxysmal AF, such that AF itself induced the genetic association with LA volumes.
In future work, it will be interesting to determine whether targeting the genes and pathways associated with abnormalities in LA function can help reduce the risk of AF, heart failure, and stroke.
This study has several limitations. All LA measurements were derived from deep learning models of cardiovascular MRI. Because a complete trans-axial stack of atrial images was not part of the UK Biobank imaging protocol, the LA measurements are estimates that are interpolated from cross-sections of the LA. Because contrast protocols were not used during image acquisition, we were not able to ascertain atrial fibrosis. The deep learning models have not been tested outside of the specific devices and imaging protocols used by the UK Biobank and are unlikely to generalize to other data sets without fine tuning. Disease labels were determined by diagnostic and procedural codes; because AF can be paroxysmal and may go undetected, it is likely that a subset of the participants had undiagnosed AF prior to MRI, which would bias causal estimates of the impact of LA volume on disease risk away from the null. The study population was largely composed of people of European ancestries, limiting generalizability of the findings to global populations. The participants who underwent MRI in the UK Biobank tended to be healthier than the remainder of the UK Biobank population, which itself is likely to be healthier than the general population. At present, there is little follow-up time subsequent to the first MRI visit for most UK Biobank participants.
In conclusion, measures of LA structure and function are heritable traits that are associated with AF, stroke, and heart failure. Genetic predictors of LA volume are linked to an elevated risk of AF and, to a lesser extent, stroke and heart failure.

Methods

Study design

Access to UK Biobank was provided under application #7089 and approved by the Partners HealthCare institutional review board (protocol 2019P003144). All UK Biobank participants provided written informed consent${ }^{49}$. Analysis of All of Us was considered exempt by the UCSF IRB (#22-37715). Each All of Us biobank participant provided written informed consent${ }^{41}$. The FinnGen analysis and approvals are detailed in the Supplementary Note. Study protocols complied with the tenets of the Declaration of Helsinki. Except where otherwise stated, all analyses were conducted in the UK Biobank, which is a richly phenotyped, prospective, population-based cohort that recruited 500,000 participants aged 40-69 years in the UK via mailer from 2006 to 2010${ }^{50}$. We analyzed 487,283 participants with genetic data who had not withdrawn consent as of February 2020.
Statistical analyses were conducted with R version 3.6 (R Foundation for Statistical Computing, Vienna, Austria). All statistical tests were two-tailed unless otherwise specified.

Definitions of diseases and medications

We defined disease status based on self-report, ICD codes, death records, and procedural codes from the UK Biobank's hospital episode statistics data (Supplementary Data 9). These data were obtained from the UK Biobank in June 2020, at which time the recommended phenotype censoring date was March 31, 2020. The UK Biobank defines that date as the last day of the month for which the number of records is greater than 90% of the mean of the number of records for the previous three months (https://biobank.ndph.ox.ac.uk/ukb/exinfo.cgi?src=Data_providers_and_dates).
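
The censoring rule described above can be sketched as follows. This is one reading of the rule, assuming that the latest month meeting the 90% criterion is the one selected; the monthly record counts and the function name are hypothetical, and the UK Biobank's own implementation may differ.

```python
import calendar
from datetime import date


def censoring_date(monthly_counts, threshold=0.9):
    """Pick a censoring date from chronologically ordered monthly record counts.

    monthly_counts: list of ((year, month), n_records).
    Returns the last day of the latest month whose count exceeds `threshold`
    times the mean count of the three preceding months, or None.
    """
    chosen = None
    for i in range(3, len(monthly_counts)):
        (year, month), n_records = monthly_counts[i]
        prior_mean = sum(count for _, count in monthly_counts[i - 3:i]) / 3
        if n_records > threshold * prior_mean:
            chosen = (year, month)
    if chosen is None:
        return None
    last_day = calendar.monthrange(chosen[0], chosen[1])[1]
    return date(chosen[0], chosen[1], last_day)


# Toy counts (assumed): record volume falls off sharply after March 2020.
counts = [
    ((2019, 11), 1000), ((2019, 12), 980), ((2020, 1), 1010),
    ((2020, 2), 990), ((2020, 3), 950), ((2020, 4), 400), ((2020, 5), 120),
]
result = censoring_date(counts)
```

In this toy series, April and May fall below 90% of the trailing three-month mean, so the selected date is the last day of March.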
We identified participants taking antihypertensive medications based on the Anatomical Therapeutic Classification (ATC)${ }^{51}$. Medications taken by UK Biobank participants were previously mapped to ATC codes${ }^{52}$. We considered medications with ATC codes beginning with C02, C09, C08CA, C03AA, C08CA01, or C03BA04 to be antihypertensives (medication names enumerated in Supplementary Data 10).
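
The prefix rule above amounts to a simple string match. A minimal sketch follows; the function names and the example codes passed to them are illustrative assumptions. Note that C08CA01 is already covered by the shorter prefix C08CA, so listing it separately does not change the result.

```python
# ATC prefixes listed in the text as defining antihypertensive medications.
ANTIHYPERTENSIVE_PREFIXES = ("C02", "C09", "C08CA", "C03AA", "C08CA01", "C03BA04")


def is_antihypertensive(atc_code):
    """True if a single ATC code starts with one of the listed prefixes."""
    code = atc_code.strip().upper()
    return code.startswith(ANTIHYPERTENSIVE_PREFIXES)


def takes_antihypertensive(atc_codes):
    """True if any of a participant's medication codes qualifies."""
    return any(is_antihypertensive(code) for code in atc_codes)
```

For example, `takes_antihypertensive(["C10AA05", "C09AA02"])` is true because the second code begins with C09.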

Cardiovascular MRI protocols

At the time of this study, the UK Biobank had released images from over 45,000 participants in an ongoing imaging substudy${ }^{27,28}$. Cardiovascular MRI was performed with 1.5 Tesla scanners (Syngo MR D13 with MAGNETOM Aera scanners; Siemens Healthcare, Erlangen, Germany), with electrocardiographic gating for synchronization${ }^{28}$. Several cardiac views were obtained. For this study, four views (the long axis two-, three-, and four-chamber views, as well as the short axis view) were used. In these views, balanced steady-state free precession CINEs, consisting of a series of 50 images throughout the cardiac cycle for each view, were acquired for each participant${ }^{28}$. For the three long-axis views, only one imaging plane was available for each participant, with an imaging plane thickness of 6 mm and an average pixel width and height of 1.83 mm. For the short-axis view, several imaging planes were acquired. Starting at the base of the heart, 8-mm-thick imaging planes were acquired with ~2 mm gaps between each plane, forming a stack perpendicular to the longitudinal axis of the left ventricle to capture the ventricular volume. For the short axis images, the average pixel width and height was 1.86 mm.

Semantic segmentation

We labeled pixels using a process similar to that described in our prior work evaluating the thoracic aorta${ }^{53}$, which we describe here. Cardiac structures were manually annotated in images from the short axis view and the two-, three-, and four-chamber long axis views from the UK Biobank by a cardiologist (JPP) using the traceoverlay software v0.1${ }^{54}$. When present, the LA appendage was excluded, as were the pulmonary vein openings; the atrial and ventricular blood pools were distinguished by tracing a linear boundary at the base of the atrioventricular ring. To produce the models used in this manuscript, 714 short axis images were chosen, manually segmented, and used to train a deep learning model with PyTorch and fastai v1.0.61${ }^{29,55}$. The same was done separately with 98 two-chamber images, 66 three-chamber images, and 445 four-chamber images. The models were based on a U-Net-derived architecture constructed with a ResNet34 encoder that was pre-trained on ImageNet${ }^{56-59}$. The Adam optimizer was used${ }^{60}$. The models were trained with a cyclic learning rate training policy${ }^{61}$. Of the samples, 80% were used to train the model and 20% were used for validation. Held-out test sets with images that were not used for training or validation were used to assess the final quality of all models.
Four separate models were trained: one for each of the three long axis views, and one for the short axis view. During training, random perturbations of the input images (augmentations) were applied, including affine rotation, zooming, and modification of the brightness and contrast.
For the short axis images, all images were resized initially to $104 \times 104$ pixels during the first half of training, and then to $224 \times 224$ pixels during the second half of training. The model was trained with a mini-batch size of 16 (with small images) or 8 (with large images). Maximum weight decay was 1E-03. The maximum learning rate was 1E-03, chosen based on the learning rate finder$^{29,62}$. A focal loss function was used (with alpha 0.7 and gamma 0.7), which can improve performance in the case of imbalanced labels$^{63}$. When training with small images, 60% of iterations were permitted to have an increasing learning rate during each epoch, and training was performed over 30 epochs while keeping the weights for all but the final layer frozen. Then, all layers were unfrozen, the learning rate was decreased to 1E-07, and the model was trained for an additional 10 epochs. When training with large images, 30% of iterations were permitted to have an increasing learning rate, and training was done for 30 epochs while keeping all but the final layer frozen. Finally, all layers were unfrozen, the learning rate was decreased to 1E-07, and the model was trained for an additional 10 epochs.
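For orientation, the focal loss referred to above can be sketched in its binary, per-pixel form. This is an illustrative NumPy version and not the implementation used in the study: the function name `binary_focal_loss` is hypothetical, and the segmentation models may have used a multi-class variant.

```python
import numpy as np

def binary_focal_loss(probs, targets, alpha=0.7, gamma=0.7, eps=1e-7):
    """Mean binary focal loss over pixels (hypothetical helper, for illustration)."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=float)
    # p_t: predicted probability assigned to the true class of each pixel
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha_t: weight alpha for positive pixels, 1 - alpha for negative pixels
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma down-weights pixels that are already classified well
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
    return float(loss.mean())
```

With gamma set to 0 the modulating factor disappears and the expression reduces to a class-weighted cross entropy, which is why the loss is often described as a generalization of cross entropy for imbalanced labels.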
For the two-chamber long axis images, all images were resized initially to $104 \times 92$ pixels during the first half of training, and then to $208 \times 186$ pixels during the second half of training. The model was trained with a mini-batch size of 8 (with small images) or 4 (with large images). Maximum weight decay was 1E-03. Per-pixel cross entropy loss was minimized$^{64}$. 30% of iterations were permitted to have an increasing learning rate during each epoch. When training with small images, the maximum learning rate was initially 1E-03, and training was performed over 30 epochs while keeping all weights frozen except for the final layer. When training with large images, the maximum learning rate was set to 1E-03, and the model was trained for 12 epochs while keeping all but the final layer frozen. Finally, all layers were unfrozen, the learning rate was decreased to 1E-06, and the model was retrained for an additional 8 epochs.
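The fraction of iterations "permitted to have an increasing learning rate" corresponds to the warm-up portion of a one-cycle schedule. The sketch below shows one way such a schedule can be written. The cosine ramps and the `div` and `div_final` divisors are assumptions chosen for illustration and are not reported in the text; the function name `one_cycle_lr` is hypothetical.

```python
import math

def one_cycle_lr(step, total_steps, lr_max, pct_start=0.3, div=25.0, div_final=1e5):
    """Learning rate at a 0-indexed step of a one-cycle schedule (illustrative).

    pct_start is the fraction of iterations during which the rate increases.
    div and div_final are assumed values, not taken from the text.
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    peak = pct_start * (total_steps - 1)          # step at which lr_max is reached
    lr_start, lr_end = lr_max / div, lr_max / div_final
    if step <= peak:
        frac = step / peak if peak > 0 else 1.0
        # cosine ramp from lr_start up to lr_max
        return lr_start + (lr_max - lr_start) * (1 - math.cos(math.pi * frac)) / 2
    frac = (step - peak) / (total_steps - 1 - peak)
    # cosine ramp from lr_max down to lr_end
    return lr_max + (lr_end - lr_max) * (1 - math.cos(math.pi * frac)) / 2
```

Under these assumptions, `pct_start=0.3` with `lr_max=1e-3` would reproduce the shape described for the long axis models: the rate rises for the first 30% of iterations and decays for the remainder.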
For the three-chamber long axis images, all images were resized initially to $128 \times 128$ pixels during the first half of training, and then to $256 \times 256$ pixels during the second half of training. The model was trained with a mini-batch size of 4 (with small images) or 2 (with large images). Maximum weight decay was 1E-02. Per-pixel cross entropy loss was minimized$^{64}$. 30% of iterations were permitted to have an increasing learning rate during each epoch. When training with small images, the maximum learning rate was initially 1E-03, and training was performed over 20 epochs while keeping all weights frozen except for the final layer. Then, all layers were unfrozen, the learning rate was decreased to 3E-05, and the model was trained for an additional 20 epochs, with 80% of iterations permitted to have an increasing learning rate during each epoch. When training with large images, the maximum learning rate was set to 3E-04, and the model was trained for 15 epochs while keeping all but the final layer frozen; 20% of iterations were permitted to have an increasing learning rate during each epoch. Finally, all layers were unfrozen, the learning rate was decreased to 1E-07, and the model was retrained for an additional 7 epochs.
For the four-chamber long axis images, all images were resized initially to $76 \times 104$ pixels during the first half of training, and then to $150 \times 208$ pixels during the second half of training. The model was trained with a mini-batch size of 4 (with small images) or 2 (with large images). Maximum weight decay was 1E-02. Per-pixel cross entropy loss was minimized$^{64}$. 30% of iterations were permitted to have an increasing learning rate during each epoch. When training with small images, the maximum learning rate was initially 1E-03, and training was performed over 50 epochs while keeping all weights frozen except for the final layer. Then, all layers were unfrozen, the learning rate was decreased to 3E-05, and the model was trained for an additional 15 epochs. When training with large images, the maximum learning rate was set to 3E-04, and the model was trained for 50 epochs while keeping all but the final layer frozen. Finally, all layers were unfrozen, the learning rate was decreased to 1E-07, and the model was retrained for an additional 15 epochs.
Each model was applied to all images from its respective view that were available in the UK Biobank as of November 2020.

Semantic segmentation model quality assessment

The quality of the deep learning segmentation output was assessed against manually annotated segmentations in held-out test samples using the Sørensen-Dice coefficient, the Hausdorff distance, and the mean contour distance$^{65,66}$. The Sørensen-Dice coefficient addresses the total segmentation area of the left atrium, and is a dimensionless value that ranges from 0 for an image where no pixels overlap between human and machine labels, to 1 for an image with perfect overlap between human and machine labels. The Sørensen-Dice coefficient was calculated by dividing twice the number of overlapping pixels between the two sets (the intersection) by the sum of the numbers of pixels considered to be left atrium in each set.
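As an illustration of the Sørensen-Dice calculation described above, a minimal NumPy sketch follows. The function name `dice_coefficient` and the choice to return NaN when both masks are empty are assumptions made here, not details taken from the study.

```python
import numpy as np

def dice_coefficient(manual_mask, auto_mask):
    """Sørensen-Dice coefficient between two binary left atrium masks."""
    manual = np.asarray(manual_mask, dtype=bool)
    auto = np.asarray(auto_mask, dtype=bool)
    # Number of pixels labeled as left atrium in both masks
    intersection = np.logical_and(manual, auto).sum()
    # Sum of the left atrium pixel counts of the two masks
    total = manual.sum() + auto.sum()
    if total == 0:
        return float("nan")  # undefined when neither mask labels any pixel (assumed convention)
    return float(2.0 * intersection / total)
```

For example, a manual mask of two pixels and an automated mask covering one of them share one pixel, giving 2 × 1 / (2 + 1) ≈ 0.67.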
The Hausdorff distance and the mean contour distance address the perimeter of the manual and automated segmentations, and to obtain this perimeter the binary_erosion function from the python3 library scikit-image version 0.19.3 was used. The Hausdorff distance represents the maximum distance in millimeters (mm) for any point in the perimeter of the automated segmentation output to its nearest point in the perimeter of the manually annotated segmentation. The Hausdorff distance was calculated using the directed_hausdorff function from the scipy.spatial.distance python3 library, version 1.11.4. The mean contour distance represents the average distance in mm of each point on the automated segmentation output to its nearest point in the perimeter of the manually annotated segmentation. The mean contour distance was calculated for each point in the automated segmentation perimeter by testing the distance to every point in the perimeter of the manually annotated data; retaining the minimum distance for each point; and then taking the average for all points in the automated segmentation perimeter.
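The two perimeter-based metrics can be sketched as follows. To keep the example dependent on NumPy and SciPy only, it uses `scipy.ndimage.binary_erosion` in place of the scikit-image function named above; the helper names and the `pixel_spacing_mm` argument are hypothetical.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import cdist, directed_hausdorff

def perimeter_points(mask, pixel_spacing_mm=(1.0, 1.0)):
    """Coordinates (in mm) of the boundary pixels of a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    # Boundary pixels are those removed by a single erosion step
    boundary = mask & ~binary_erosion(mask)
    return np.argwhere(boundary) * np.asarray(pixel_spacing_mm, dtype=float)

def contour_distances(auto_mask, manual_mask, pixel_spacing_mm=(1.0, 1.0)):
    """Directed Hausdorff distance and mean contour distance, automated to manual."""
    auto_pts = perimeter_points(auto_mask, pixel_spacing_mm)
    manual_pts = perimeter_points(manual_mask, pixel_spacing_mm)
    # Largest nearest-neighbor distance from the automated to the manual perimeter
    hausdorff_mm = directed_hausdorff(auto_pts, manual_pts)[0]
    # For each automated perimeter point, keep the minimum distance, then average
    nearest = cdist(auto_pts, manual_pts).min(axis=1)
    return float(hausdorff_mm), float(nearest.mean())
```

Both metrics are directed from the automated to the manual perimeter, matching the description above; a symmetric variant would take the larger (or the mean) of the two directions.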

Poisson surface reconstruction

To integrate the output from each of the four models into one LA volume estimate, Poisson surface reconstruction was performed$^{67,68}$. Among the views included in the UK Biobank cardiac MRI data set, none fully captures the 3D anatomical structure of the LA. The short axis stack only occasionally included the lower portion of the chamber, while the three long axis (i.e., two-, three-, and four-chamber) views provided only single-slice cross-sections of the LA at different orientations. To integrate information from the four incomplete MRI views into a consistent 3D representation of the LA anatomy, we followed a procedure similar to Pirruccello et al. (2021)$^{69}$. Briefly, we first co-rotated the LA segmentation maps from the MRI views into the same reference system (shared 3D space) using standard DICOM metadata from the Image Position (Patient) [0020,0032] and Image Orientation (Patient) [0020,0037] tags. Then, the perimeters of each 2D atrial segmentation map were extracted, yielding a sparse 3D point cloud. In addition to the point coordinates, the reconstruction algorithm requires as input a vector representing the local normal direction at each point, which is used to constrain the curvature of the reconstructed surface. In our approach, we assumed that each perimeter point's normal vector lay on the MRI view plane and was radially oriented outwards from the center of gravity of the LA segmentation from which the point was extracted. Using three inputs, consisting of the points, the normals, and the depth argument of 16 (representing the maximum depth of the tree that the library will use for reconstruction), we applied the Poisson surface reconstruction algorithm$^{67}$ with the pypoisson python binding for the Screened Poisson Surface Reconstruction C++ library v6.13$^{68}$. This yielded interpolated 3D surfaces from the sparse 3D point cloud. This approach is tolerant to missing segmentation data (e.g., from the frequently missing short axis data) as long as not all available points are coplanar. 3D surfaces of the LA were reconstructed for each of the 50 MRI frames acquired during the cardiac cycle. At each timepoint, the volume of the LA was computed from the reconstructed surface model using the GetVolume routine for triangulated meshes included in the VTK library (Kitware Inc.). From the reconstructed volume traces, we estimated the maximum and minimum LA volumes, as well as LA stroke volume and emptying fraction.
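The mapping from 2D segmentation pixels into the shared 3D space, and the radial normals described above, can be sketched as below. This is a minimal illustration under stated assumptions rather than the study's code: the function names are hypothetical, and the Pixel Spacing values are taken as an additional input that the text does not mention explicitly.

```python
import numpy as np

def pixels_to_patient(rows, cols, image_position, image_orientation, pixel_spacing):
    """Map pixel (row, col) indices to 3D patient coordinates in mm.

    image_position:    Image Position (Patient), 3 values
    image_orientation: Image Orientation (Patient), 6 direction cosines
    pixel_spacing:     (spacing between rows, spacing between columns), assumed input
    """
    ipp = np.asarray(image_position, dtype=float)
    iop = np.asarray(image_orientation, dtype=float)
    along_row, along_col = iop[:3], iop[3:]   # directions of increasing column / row index
    d_row, d_col = pixel_spacing
    rows = np.asarray(rows, dtype=float)[:, None]
    cols = np.asarray(cols, dtype=float)[:, None]
    return ipp + cols * d_col * along_row + rows * d_row * along_col

def radial_normals(perimeter_xyz, segmentation_xyz):
    """Unit vectors pointing from the segmentation's center of gravity to each perimeter point."""
    centroid = np.asarray(segmentation_xyz, dtype=float).mean(axis=0)
    vecs = np.asarray(perimeter_xyz, dtype=float) - centroid
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.where(norms == 0, 1.0, norms)
```

Because the perimeter points and the center of gravity all lie in the same imaging plane, the resulting normals lie in that plane as well, as assumed in the text. The points and normals from all views would then be concatenated and passed to the reconstruction library.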

Quality control after segmentation and reconstruction

Automated quality control was performed on the segmentation output to flag putatively invalid segmentations separately for each view. Studies were flagged based on the following heuristics: (a) if they had more than 1 connected component (i.e., if there were pixels in more than one connected surface that were being labeled as left atrium); (b) if the maximum single frame-to-frame change in pixels segmented as left atrium during the 50-frame CINE sequence was greater than five standard deviations beyond the population mean; (c) if no pixels were segmented as the left atrium; or (d) if the number of images in the CINE was not 50. The presence or absence of these flags was then tested for association with 3D surface reconstruction failure using logistic regression.
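The four heuristics can be expressed as a small function over one view's CINE segmentation. The sketch below is illustrative only: the function name, the use of `scipy.ndimage.label` with its default 4-connectivity, and the reading of heuristic (c) as "no left atrium pixels anywhere in the series" are assumptions, and the population mean and standard deviation must be supplied by the caller.

```python
import numpy as np
from scipy import ndimage

def qc_flags(cine_masks, pop_mean_max_change, pop_sd_max_change, expected_frames=50):
    """Flag a CINE series of binary left atrium masks, shape (frames, height, width)."""
    masks = np.asarray(cine_masks, dtype=bool)
    areas = masks.reshape(masks.shape[0], -1).sum(axis=1)
    # (a) number of connected components in each frame
    n_components = [ndimage.label(frame)[1] for frame in masks]
    # (b) largest absolute frame-to-frame change in segmented pixel count
    max_change = np.abs(np.diff(areas)).max() if len(areas) > 1 else 0
    return {
        "multiple_components": any(n > 1 for n in n_components),
        "large_frame_change": bool(max_change > pop_mean_max_change + 5 * pop_sd_max_change),
        "no_la_pixels": bool(areas.sum() == 0),                   # (c), assumed series-level
        "wrong_frame_count": masks.shape[0] != expected_frames,   # (d)
    }
```

The resulting boolean flags could then serve as predictors in the logistic regression against reconstruction failure mentioned above.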

Identification of abnormal cardiac filling patterns

In order to focus our analyses on normal variation, we sought to exclude participants from the GWAS if they had an abnormal atrial contraction at the time of acquisition of the MRI. Although MRI uses an electrocardiographic (ECG) signal for image acquisition, the underlying ECG signal from the time of MRI signal acquisition is not available for analysis. Therefore, we sought to identify participants who appeared to have abnormal cardiac filling patterns during the MRI as a proxy for this. We trained a deep-learning model to identify the presence or absence of typical patterns of cardiac filling throughout the cardiac cycle.
To create a training set for such a model, we first fetched CINE videos from the 2-, 3-, and 4-chamber long axis views of all participants with a history of atrial fibrillation. A cardiologist (JPP) evaluated whether the videos appeared to represent a typical cardiac cycle including an atrial contraction. A deep learning model was then trained to classify filling patterns as representing normal cardiac filling or not based on the segmentation output from the semantic segmentation deep learning models. Each input channel represented the pixel counts of a cardiac chamber from a different long axis view, normalized by the maximum number of pixels seen for each channel for that participant, over the entire cardiac cycle. The normalization step prevented the model from accessing information about the absolute size of the chambers, forcing it instead to identify patterns based on relative size changes throughout the cardiac cycle. In total, 8 channels were used as input: four from the 4-chamber long axis images (left atrium, right atrium, left ventricle, right ventricle), two from the 3-chamber long axis images (left atrium, left ventricle), and two from the 2-chamber long axis images (left atrium, left ventricle). Cases were excluded if all 8 channels were not available. Therefore, the shape of the input was $50 \times 8$ (8 channels for 50 time steps). Training was performed with FastAI version 2.2.5$^{29}$, using the TimeseriesAI library version 0.2.15 (github.com/timeseriesAI/tsai) to train an InceptionTime model$^{70}$. The Ranger optimization function was used with cross entropy loss, and the number of filters in the InceptionTime model was 32, all of which are the software defaults in the TimeseriesAI library. Ranger incorporates RAdam and Lookahead to improve training stability early and later during training, respectively$^{71,72}$. 20% of samples were randomly chosen as the validation set. The model was trained with a batch size of 32. Variable learning rates from 5E-06 to 5E-03 were permitted during training. Training was conducted using the One-Cycle policy for 20 epochs$^{61,62}$.
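The per-channel normalization that produces the $50 \times 8$ input can be sketched as follows. The function name `build_filling_input` and the error handling are hypothetical; the column order is whatever order the eight chamber traces are stacked in.

```python
import numpy as np

def build_filling_input(pixel_counts):
    """Normalize chamber pixel counts, shape (50 time steps, 8 channels), per channel."""
    counts = np.asarray(pixel_counts, dtype=float)
    if counts.shape != (50, 8):
        raise ValueError("expected 50 time steps x 8 channels")
    # Maximum pixel count of each channel over the whole cardiac cycle
    channel_max = counts.max(axis=0, keepdims=True)
    if np.any(channel_max == 0):
        raise ValueError("a channel with no segmented pixels cannot be normalized")
    # Each channel now peaks at 1, which removes absolute chamber size
    return counts / channel_max
```

Scaling all counts of a participant by a constant leaves the output unchanged, which is the property the text relies on to hide absolute chamber size from the classifier.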
To evaluate the accuracy of the deep learning model, manual evaluation of the cardiac filling patterns was conducted by one cardiologist (JPP) for 100 participants flagged as having abnormal cardiac filling patterns and 100 flagged as having normal cardiac filling patterns, sampled at random from participants without a history of atrial fibrillation. Sensitivity and specificity and their confidence intervals were calculated with the binom.test function in R.
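For readers working in Python rather than R, an exact (Clopper-Pearson) binomial interval of the kind returned by binom.test can be obtained from `scipy.stats.binomtest`. The sketch below is an illustration and not the study's code; the helper names are hypothetical and any counts passed in are the caller's own.

```python
from scipy.stats import binomtest

def proportion_with_ci(successes, total, confidence_level=0.95):
    """Proportion with an exact (Clopper-Pearson) confidence interval."""
    ci = binomtest(successes, total).proportion_ci(
        confidence_level=confidence_level, method="exact"
    )
    return successes / total, ci.low, ci.high

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity, each as (estimate, lower bound, upper bound)."""
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
    }
```

Here `tp`, `fn`, `tn`, and `fp` are the counts from cross-tabulating the model's flags against the cardiologist's manual review.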

Evaluation of the relationship between the LA, phenotypes, and cardiovascular diseases

For epidemiologic analyses of continuous traits, we performed linear regression with each LA phenotype as the dependent variable and the phenotype of interest as the independent variable, adjusting for sex, the first five principal components of ancestry, the genotyping array, the MRI scanner, and a third-degree spline of age at the time of imaging to account for possible nonlinear effects of age.
For the disease-based analyses, we focused on four disease definitions related to LA structure and function: AF or flutter, ischemic stroke, hypertension, and heart failure (defined below). For prevalent disease that was diagnosed prior to the time of imaging, linear models were used to test for an association between each disease (as a binary independent variable) and LA phenotypes (as the dependent variables), adjusting for the MRI serial number to account for inter-site differences, sex, age, and the interaction between sex and age.
For incident disease, participants with pre-existing diagnoses prior to the MRI were excluded from the analysis. A Cox proportional hazards model was used, with survival defined as the time between MRI and either the time of censoring or disease diagnosis. The model was adjusted for the MRI serial number, sex, age, the interaction between sex and age, the cubic natural spline of height, the cubic natural spline of weight, and the cubic natural spline of BMI. As a sensitivity analysis, adjustment was additionally made for heart rate, P duration, QRS duration, P-Q interval, QTc interval, left ventricular end-systolic volume, left ventricular end-diastolic volume, and left ventricular ejection fraction.

Genotyping, imputation, and genetic quality control

UK Biobank samples were genotyped on either the UK BiLEVE or UK Biobank Axiom arrays and imputed into the Haplotype Reference Consortium panel and the UK10K + 1000 Genomes panel$^{73}$. Variant positions were keyed to the GRCh37 human genome reference. Genotyped variants with genotyping call rate < 0.95 and imputed variants with INFO score < 0.3 or minor allele frequency ≤ 0.005 in the analyzed samples were excluded. After variant-level quality control, 11,253,549 imputed variants remained for analysis.
Participants without imputed genetic data, or with a genotyping call rate < 0.98, mismatch between self-reported sex and sex chromosome count, sex chromosome aneuploidy, excessive third-degree relatives, or outliers for heterozygosity were excluded from genetic analysis$^{73}$. Participants were also excluded from genetic analysis if they had a history of AF or flutter, hypertrophic cardiomyopathy, dilated cardiomyopathy, heart failure, myocardial infarction, or coronary artery disease documented prior to the time they underwent cardiovascular MRI at a UK Biobank assessment center. Our definitions of these diseases in the UK Biobank are provided in Supplementary Data 9.

GWAS of the left atrium

We analyzed the four unadjusted LA phenotypes, as well as LAmax, LAmin, and LASV estimates that were adjusted for BSA or LVEDV (rationale detailed in the Supplementary Note), yielding 10 traits that underwent GWAS. Before conducting genetic analyses, a rank-based inverse normal transformation was applied$^{74}$. All traits were adjusted for sex, age at enrollment, age and age$^{2}$ at the time of MRI, the first 10 principal components of ancestry, the genotyping array, and the MRI scanner's unique identifier.
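A rank-based inverse normal transformation of the kind mentioned above can be sketched as follows. The offset `c = 3/8` (the Blom offset) is a common choice but is an assumption here, since the text does not report which offset was used; the function name is hypothetical.

```python
import numpy as np
from scipy.stats import norm, rankdata

def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation; missing values (NaN) are preserved."""
    x = np.asarray(values, dtype=float)
    out = np.full(x.shape, np.nan)
    observed = ~np.isnan(x)
    n = observed.sum()
    # Average ranks for ties, then map rank quantiles onto the standard normal
    ranks = rankdata(x[observed], method="average")
    out[observed] = norm.ppf((ranks - c) / (n - 2 * c + 1))
    return out
```

The transformed trait keeps the rank order of the original measurements while taking an approximately standard normal distribution, which limits the influence of outlying volumes on the association tests.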
BOLT-REML v2.3.4 was used to assess the SNP-heritability of the phenotypes, as well as their genetic correlation with one another using the directly genotyped variants in the UK Biobank$^{75}$. GWAS for each phenotype were conducted using BOLT-LMM version 2.3.4 to account for cryptic population structure and sample relatedness$^{75,76}$. We used the full autosomal panel of 714,577 directly genotyped SNPs that passed quality control (minor allele frequency ≥ 0.001; maximum genotype missingness ≤ 5% for each variant; maximum sample missingness ≤ 2%) to construct the genetic relationship matrix (GRM), with covariate adjustment as noted above. Associations on the X chromosome were also analyzed, using all autosomal SNPs and X chromosomal SNPs to construct the GRM ($N = 732{,}214$ SNPs), with the same covariate adjustments and significance threshold as in the autosomal analysis. In this analysis mode, BOLT treats individuals with one X chromosome as having an allelic dosage of 0/2 and those with two X chromosomes as having an allelic dosage of 0/1/2. Variants with association $P < 5 \times 10^{-8}$ were considered to be genome-wide significant$^{77}$.
We identified lead SNPs for each trait. Linkage disequilibrium (LD) clumping was performed with PLINK-1.9$^{31}$ using the same participants used for the GWAS. We outlined a 5-megabase window (--clump-kb 5000) and used a stringent LD threshold ($r^{2}$ 0.001) in order to account for long LD blocks. With the independently significant clumped SNPs, distinct genomic loci were then defined by starting with the SNP with the strongest $P$ value, excluding other SNPs within 500 kb, and iterating until no SNPs remained. Independently significant SNPs that defined each genomic locus are termed the lead SNPs.
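The iterative locus definition described above amounts to a greedy selection over the clumped SNPs. A minimal sketch follows; the tuple layout and the function name are hypothetical, and the SNP identifiers in the usage example are placeholders rather than results from the study.

```python
def define_loci(snps, window_bp=500_000):
    """Greedy locus definition.

    snps: iterable of (snp_id, chrom, pos, p_value) for independently significant SNPs.
    Returns the lead SNPs, one per locus, in order of selection.
    """
    remaining = sorted(snps, key=lambda s: s[3])   # strongest association first
    leads = []
    while remaining:
        lead = remaining.pop(0)
        leads.append(lead)
        # Drop every SNP on the same chromosome within the window of the new lead
        remaining = [
            s for s in remaining
            if s[1] != lead[1] or abs(s[2] - lead[2]) > window_bp
        ]
    return leads

example = [
    ("snpA", "1", 1_000_000, 1e-20),
    ("snpB", "1", 1_300_000, 1e-10),
    ("snpC", "1", 2_000_000, 1e-9),
    ("snpD", "2", 1_000_000, 1e-12),
]
lead_ids = [s[0] for s in define_loci(example)]
```

In this toy input, `snpB` lies 300 kb from the stronger `snpA` and is absorbed into its locus, while `snpC` and `snpD` each define their own locus.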
Hardy-Weinberg equilibrium (HWE) for GWAS lead variants was tested using the statistical library available at https://github.com/chrchang/stats (commit @67c3f71), which was written as part of PLINK$^{31}$.
Linkage disequilibrium (LD) score regression analysis was performed using ldsc version 1.0.0$^{30}$. With ldsc, the genomic control factor (lambda GC) was partitioned into components reflecting polygenicity and inflation, using the software's defaults.

Genetic correlation with atrial fibrillation

We used ldsc version 1.0.1 to perform cross-trait LD score regression to estimate genetic correlation between the LA measurements, atrial fibrillation (from Roselli et al. 2018), and all-cause or cardioembolic stroke (from Malik et al. 2018)$^{33-35}$. Summary statistics were pre-processed with the munge_sumstats.py script from ldsc 1.0.1 using the default settings, filtering out variants with imputation INFO scores less than 0.9 or minor allele frequencies below 0.01, as well as strand-ambiguous variants.

Overlap of LA loci with atrial fibrillation loci

We identified the lead SNPs associated with AF from Supplementary Table 16 of Roselli et al.$^{34}$. For this exercise, we used each of the 134 SNPs that achieved association P < 5E-8 in the primary GWAS (column 'I') or in the meta-analysis (column 'AD'). We counted the number of AF lead SNPs that fell within 500 kb of an LA lead SNP from our study. We used SNPsnap to generate 10,000 sets of SNPs that matched the LA lead SNPs based on parameters including minor allele frequency, SNPs in linkage disequilibrium, distance from the nearest gene, and gene density$^{36}$. We then repeated the same counting procedure for each of the 10,000 synthetic SNPsnap lead SNP lists, to set a neutral expectation for the number of overlapping AF lead SNPs based on chance.
This allowed us to compute a one-tailed permutation P value (with the most extreme possible P value based on 10,000 randomly chosen sets of SNPs being 1E-04).
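The overlap count and the one-tailed permutation P value can be sketched as below. The function names and the (chromosome, position) tuple layout are hypothetical. Flooring the P value at 1/N is one convention that reproduces the stated minimum of 1E-04 for 10,000 sets; it is an assumption, and (r + 1)/(N + 1) is a common alternative.

```python
def count_overlaps(af_snps, la_snps, window_bp=500_000):
    """Number of AF lead SNPs within window_bp of any LA lead SNP.

    Each SNP is a (chrom, pos) tuple.
    """
    return sum(
        any(chrom == la_chrom and abs(pos - la_pos) <= window_bp
            for la_chrom, la_pos in la_snps)
        for chrom, pos in af_snps
    )

def permutation_p(observed, null_counts):
    """One-tailed P value: share of matched sets with at least the observed overlap."""
    exceed = sum(1 for n in null_counts if n >= observed)
    # Floor at 1/N so that the smallest reportable value is 1E-04 for N = 10,000
    return max(exceed, 1) / len(null_counts)
```

In use, `count_overlaps` would be applied once to the observed LA lead SNPs and once to each of the 10,000 matched SNPsnap sets, and the resulting null counts passed to `permutation_p`.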

Mendelian randomization

We sought to assess a potential causal relationship between LAmin and AF using Mendelian randomization (MR). We considered LAmin as the exposure and AF as the outcome. The genetic instruments for LAmin were generated using the genome-wide association results from this analysis. The variants from the exposure summary statistics were clumped with P < 1E-06, $r^{2} < 0.001$, and a radius of 5 megabases using the TwoSampleMR package v0.5.7 in R$^{78}$. These stringent clumping thresholds were intended to reduce the risk of including modestly correlated variants as if they were truly distinct instruments despite tagging the same underlying signal (e.g., having an $r^{2}$ of 0.1 with one another). The variants with ambiguous alleles were removed. 19 variants were harmonized with a large AF GWAS that did not include UK Biobank participants$^{37}$. The inverse variance weighted (IVW) method was performed as the primary MR analysis. We also performed simple median, weighted median, MR-Egger, and MR-PRESSO to account for violations of the instrumental variable assumptions$^{79,80}$. Since MR-Egger provides robust estimates under the InSIDE (Instrument Strength Independent of Direct Effect) assumption, we additionally conducted the MR-Egger bootstrap method to confirm the results from MR-Egger. Heterogeneity was tested with Cochran's Q$^{81}$. Because of effect heterogeneity, the contamination mixture model approach, which performs robust Mendelian randomization in the presence of invalid instruments, was also employed$^{82}$.
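For intuition, the textbook fixed-effect IVW estimate and Cochran's Q can be computed directly from harmonized summary statistics. The sketch below is not the TwoSampleMR implementation, whose standard error handling may differ; the function name is hypothetical and first-order weights are assumed.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted MR estimate with Cochran's Q.

    Returns (causal estimate, standard error, Q); Q has n_variants - 1 degrees of freedom.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    ratio = by / bx                      # per-variant Wald ratio
    weights = bx ** 2 / se ** 2          # inverse variance of each Wald ratio (first order)
    beta = np.sum(weights * ratio) / np.sum(weights)
    se_ivw = np.sqrt(1.0 / np.sum(weights))
    q = np.sum(weights * (ratio - beta) ** 2)
    return float(beta), float(se_ivw), float(q)
```

A large Q relative to its degrees of freedom indicates that the per-variant ratios disagree more than sampling error would explain, which is the effect heterogeneity that motivated the additional robust methods.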
To assess risk of pleiotropy of the LA genetic instruments through known pathways, each SNP was tested for association with risk factors from CHARGE-AF$^{4}$, an atrial fibrillation risk score, within the same participants in which the GWAS was conducted. Association between each of the 19 variants and seven risk factors (height, weight, systolic blood pressure, diastolic blood pressure, use of antihypertensive medications, diagnosis of diabetes, and current smoking) was tested in a linear regression model that accounted for age and age$^{2}$ at the time of MRI, sex, the MRI serial number, the genotyping array, and genetic principal components 1-10. Associations were considered significant if they exceeded Bonferroni significance ($P<3.8 \mathrm{E}-04$).
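The Bonferroni threshold quoted above can be reproduced as a short worked calculation, assuming a family-wise error rate of 0.05 (this value is not stated in the text, but it is consistent with the reported threshold) divided across 19 variants and 7 risk factors:

$$
\alpha_{\text{Bonferroni}} = \frac{0.05}{19 \times 7} = \frac{0.05}{133} \approx 3.8 \times 10^{-4}
$$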
To understand the bidirectional causal effects, we also performed an MR analysis using AF variants from the 2017 GWAS as the exposure and LA measurements as the outcome. After applying the same clumping threshold and filtering methods to AF summary statistics, 38 remaining variants were harmonized with the LAmin association results and used to construct the instrumental variable. The primary and sensitivity analyses were then conducted in the same manner as described above.
Additional Mendelian randomization analyses were conducted using each LA measurement as an exposure constructed from SNPs with $P<5 \mathrm{E}-08$, tested against AF$^{37}$, heart failure from HERMES$^{38}$, and the trans-ancestry ischemic and cardioembolic stroke summary statistics from MEGASTROKE$^{35}$.

Polygenic score for atrial fibrillation

We constructed a 1.1-million SNP PRS using PRScs based on summary statistics from Christophersen et al. 2017, a large AF GWAS that did not incorporate UK Biobank participants$^{37,39}$. The score was constructed from 1,108,410 sites from the summary statistics that overlapped with the HapMap3 sites available in the UK Biobank as precomputed by the PRScs authors. The score was applied to the GWAS participants with LA measurements and tested for association using linear regression (Supplementary Table 7). For comparability, the score and the LA measurements were both standardized to a mean of zero and a standard deviation of 1.
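To show why standardizing both variables aids comparability, here is a minimal, unadjusted sketch: once the score and the measurement each have mean 0 and standard deviation 1, the simple regression slope is expressed in standard-deviation units and equals their Pearson correlation. The simulated data and the helper names (`standardize`, `ols_slope`) are hypothetical, and the study's own models may have included covariates not shown here.

```python
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Simulated, hypothetical data: a score and a measurement it partly explains
random.seed(0)
prs = [random.gauss(0, 1) for _ in range(500)]
la_volume = [0.3 * p + random.gauss(0, 1) for p in prs]

# Change in the measurement (in SD) per 1-SD increase in the score
slope_sd_units = ols_slope(standardize(prs), standardize(la_volume))
print(f"SD change in measurement per SD of score: {slope_sd_units:.3f}")
```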

Derivation of LA measurement polygenic scores

A polygenic score for each LA GWAS was computed using PRScs with a UK Biobank European ancestry linkage disequilibrium panel$^{39}$. This method applies a continuous shrinkage prior to the SNP weights. PRScs was run in ‘auto’ mode on a per-chromosome basis. This mode places a standard half-Cauchy prior on the global shrinkage parameter and learns the global scaling parameter from the data; as a consequence, PRScs-auto does not require a validation data set for tuning. Based on the software default settings, only the 1.1-million SNPs found at HapMap3 sites that were also present in the UK Biobank were permitted to contribute to the score. Other polygenic scores were produced as sensitivity analyses (Supplementary Note).

Internal validation of LA polygenic scores in non-imaging participants

The LA polygenic scores were applied to the entire UK Biobank. Participants who had undergone MRI or were related within 3 degrees of kinship to those who had undergone MRI, based on the precomputed relatedness matrix from the UK Biobank, were excluded from analysis$^{73}$. We analyzed the relationship between this polygenic prediction of each LA measurement and incident disease (defined by self-report and diagnostic and procedural codes) in the UK Biobank using a Cox proportional hazards model as implemented by the R survival package$^{83}$. The primary disease analyzed was atrial fibrillation. For each tested disease, we excluded participants with disease that was diagnosed prior to enrollment in the UK Biobank. We counted survival as the number of years between enrollment and disease diagnosis (for those with disease) or until death, loss to follow-up, or end of follow-up time (for those without disease).
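The follow-up definition above can be sketched as a small helper that turns a participant's dates into the (time, event) pair a Cox model expects. This is an illustrative sketch only: the function name `follow_up`, the default `study_end` date, and the 365.25-day year are assumptions, not details from the study.

```python
from datetime import date


def follow_up(enrolled, diagnosed=None, died=None, lost=None,
              study_end=date(2021, 1, 1)):
    """Return (years_of_follow_up, event) or None for prevalent disease.

    event is 1 if the disease was diagnosed during follow-up, otherwise 0
    (censored at death, loss to follow-up, or the end of follow-up).
    study_end is a hypothetical administrative censoring date.
    """
    if diagnosed is not None and diagnosed < enrolled:
        return None  # disease diagnosed before enrollment: excluded
    # Earliest censoring date among those that apply to this participant
    censor = min(d for d in (died, lost, study_end) if d is not None)
    if diagnosed is not None and diagnosed <= censor:
        end, event = diagnosed, 1
    else:
        end, event = censor, 0
    return (end - enrolled).days / 365.25, event


incident = follow_up(date(2010, 1, 1), diagnosed=date(2015, 1, 1))
censored = follow_up(date(2010, 1, 1), died=date(2012, 1, 1))
prevalent = follow_up(date(2010, 1, 1), diagnosed=date(2009, 6, 1))
print(incident, censored, prevalent)
```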
We adjusted for covariates including sex, the cubic basis spline of age at enrollment, the interaction between the cubic basis spline of age at enrollment and sex, the genotyping array, the first five principal components of ancestry, and the cubic basis splines of height (cm), weight (kg), BMI (kg/m$^{2}$), diastolic blood pressure (mmHg), and systolic blood pressure (mmHg). Sensitivity analyses included restricting participants to a genetic inlier population with European genetic identity (precomputed by the UK Biobank); adjusting for genetic principal components derived from the GWAS samples instead of the entire cohort; adjusting only for age and sex; applying score weights derived from the clumped lead variants with $P<5 \mathrm{E}-08$ from each trait instead of PRScs; and thresholding the cohort into the top 5% for each polygenic score compared to the bottom 95% for the score.
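The last sensitivity analysis above replaces the continuous score with a binary indicator. A minimal sketch of that dichotomization follows; the function name `top_fraction_indicator` is hypothetical, and ties at the cutoff are all assigned to the top group, which is one of several reasonable conventions.

```python
def top_fraction_indicator(scores, fraction=0.05):
    """Return 1 for scores in the top `fraction` of the cohort, else 0."""
    n_top = max(1, round(len(scores) * fraction))
    # The n_top-th largest score is the cutoff for membership in the top group
    cutoff = sorted(scores, reverse=True)[n_top - 1]
    return [1 if s >= cutoff else 0 for s in scores]


# Hypothetical cohort of 100 participants with distinct scores 0..99
example_scores = list(range(100))
flags = top_fraction_indicator(example_scores)
print(sum(flags), "participants flagged as top 5%")
```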

External validation of the BSA-indexed LAmin polygenic score in FinnGen

FinnGen is a collection of prospective Finnish epidemiological and disease-based cohorts and hospital biobank samples$^{40}$. The FinnGen data used here comprise 377,277 individuals from FinnGen Data Freeze 9 (https://www.finngen.fi/en). The data were linked by unique national personal identification numbers to the registries of national hospital discharges (available from 1968), cause of death (1969-), medication reimbursement (1964-) and purchase (1995-), specialist outpatient visits (1998-) and primary care visits (2011-). Data comprised in FinnGen Data Freeze 9 are administered by regional biobanks (Auria Biobank, Biobank of Central Finland, Biobank of Eastern Finland, Borealis Biobank, Helsinki Biobank, Tampere Biobank), the Blood Service Biobank, the Terveystalo Biobank, and biobanks administered by the Finnish Institute for Health and Welfare (THL) for the following studies: Botnia, Corogene, FinHealth 2017, FinIPF, FINRISK 1992-2012, GeneRisk, Health 2000, Health 2011, Kuusamo, Migraine, Super, T1D, and Twins. Consortium members are listed in Supplementary Note.
Patients and control subjects in FinnGen provided informed consent for biobank research, based on the Finnish Biobank Act. Alternatively, separate research cohorts, collected before the Finnish Biobank Act came into effect (in September 2013) and before the start of FinnGen (August 2017), were collected based on study-specific consents and later transferred to the Finnish biobanks after approval by Fimea (Finnish Medicines Agency), the National Supervisory Authority for Welfare and Health. Recruitment protocols followed the biobank protocols and were approved by Fimea. The Coordinating Ethics Committee of the Hospital District of Helsinki and Uusimaa (HUS) statement number for the FinnGen study is Nr HUS/990/2017.
The FinnGen study is approved by the Finnish Institute for Health and Welfare (permit numbers: THL/2031/6.02.00/2017, THL/1101/5.05.00/2017, THL/341/6.02.00/2018, THL/2222/6.02.00/2018, THL/283/6.02.00/2019, THL/1721/5.05.00/2019 and THL/1524/5.05.00/2020), Digital and population data service agency (permit numbers: VRK43431/2017-3, VRK/6909/2018-3, VRK/4415/2019-3), the Social Insurance Institution (permit numbers: KELA 58/522/2017, KELA 131/522/2018, KELA 70/522/2019, KELA 98/522/2019, KELA 134/522/2019, KELA 138/522/2019, KELA 2/522/2020, KELA 16/522/2020), Findata permit numbers THL/2364/14.02/2020, THL/4055/14.06.00/2020, THL/3433/14.06.00/2020, THL/4432/14.06/2020, THL/5189/14.06/2020, THL/5894/14.06.00/2020, THL/6619/14.06.00/2020, THL/209/14.06.00/2021, THL/688/14.06.00/2021, THL/1284/14.06.00/2021, THL/1965/14.06.00/2021, THL/5546/14.02.00/2020, THL/2658/14.06.00/2021, THL/4235/14.06.00/202, Statistics Finland (permit numbers: TK-53-1041-17 and TK/143/07.03.00/2020 (earlier TK-53-90-20), TK/1735/07.03.00/2021, TK/3112/07.03.00/2021) and Finnish Registry for Kidney Diseases permission/extract from the meeting minutes on 4th July 2019.
The Biobank Access Decisions for FinnGen samples and data utilized in FinnGen Data Freeze 9 include: THL Biobank BB2017_55, BB2017_111, BB2018_19, BB_2018_34, BB_2018_67, BB2018_71, BB2019_7, BB2019_8, BB2019_26, BB2020_1, Finnish Red Cross Blood Service Biobank 7.12.2017, Helsinki Biobank HUS/359/2017, HUS/248/2020, Auria Biobank AB17-5154 and amendment #1 (August 17 2020), AB205926 and amendment #1 (April 23 2020) and its modification (Sep 22 2021), Biobank Borealis of Northern Finland_2017_1013, Biobank of Eastern Finland 1186/2018 and amendment 22 § /2020, Finnish Clinical Biobank Tampere MH0OO4 and amendments (21.02.2020 & 06.10.2020), Central Finland Biobank 1-2017, and Terveystalo Biobank STB 2018001 and amendment 25th Aug 2020.
FinnGen samples were genotyped using Illumina and Affymetrix arrays (Illumina Inc., San Diego, and Thermo Fisher Scientific, Santa Clara, CA, USA). Genotype imputation was performed using a population-specific SISu v3 imputation reference panel comprising high-coverage (25-30x) whole genome sequences from 3775 participants, as described in a separate protocol (https://doi.org/10.17504/protocols.io.xbgfijw).
PRS weights were applied using PLINK v1.9$^{31,84}$. Case and control statuses for atrial fibrillation or flutter, ischemic stroke excluding subarachnoid hemorrhage, ischemic stroke excluding all hemorrhages, and heart failure were defined based on events in the hospital, cause of death, specialist outpatient, primary care, and medication reimbursement registries at any point during registry follow-up, as detailed in Supplementary Data 7. The association of the PRS with each outcome was assessed using Cox proportional hazards models on a follow-up time scale, with sex, baseline age, baseline age squared, 5 genomic principal components, and the genotyping array as fixed-effects covariates.

External validation of the BSA-indexed LAmin polygenic score in All of Us

All of Us is an ongoing, diverse national biobank project in the United States$^{41}$. Data include those from physical examination, biospecimen collection, the electronic health record (EHR), and surveys. All participants provided written, informed consent. At the time of analysis, the controlled-access data release version was 7. Within this release, we identified 245,149 participants with whole genome sequencing data.
At the time of analysis, whole genome sequencing (WGS) had been completed in 245,400 participants. Sequencing and sample quality control in All of Us has been detailed previously$^{85,86}$. In brief, sequencing was performed with Illumina NovaSeq 6000, reads were aligned to GRCh38, and variants were called by DRAGEN v3.4.12. A joint call set was prepared centrally by All of Us. Sample-level quality control was performed centrally by All of Us: exclusion criteria included fingerprint concordance log likelihood ratio $\leq-3$; sex discordance between self-report and WGS-based chromosomal sex call (if sex reported at birth was either “Male” or “Female”); contamination rate $\geq 3 \%$; or mean coverage $<30 \times$, or $<90 \%$ of bases at $20 \times$ coverage, or <8E10 aligned Q30 bases, or $<95 \%$ of bases in 59 hereditary disease risk genes with $20 \times$ coverage. Fingerprint concordance was checked at 114 sites using Picard v2.23.9. Variant-level filtration removed sites with no high-quality genotypes, with ExcessHet $<54.69$, or with QUAL $<60$ for SNPs or $<69$ for Indels. Ancestry prediction was performed centrally by All of Us; briefly, Human Genome Diversity Project and 1000 Genomes samples were used to train a random forest to identify ancestry labels based on PCA from high-quality variant sites, and these loadings were then applied in All of Us.
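The sample-level exclusion criteria listed above can be read as a single boolean rule. The sketch below encodes that reading; it is not All of Us code, and the dictionary field names (for example `fingerprint_llr` and `frac_hdr_gene_bases_20x`) and the encoding of the WGS-based sex call are hypothetical.

```python
def fails_sample_qc(sample):
    """Return True if a sample meets any exclusion criterion from the text."""
    # Sex discordance is only assessed when sex at birth is Male or Female
    sex_checked = sample["sex_at_birth"] in ("Male", "Female")
    return bool(
        sample["fingerprint_llr"] <= -3
        or (sex_checked and sample["sex_at_birth"] != sample["wgs_sex"])
        or sample["contamination"] >= 0.03            # contamination rate >= 3%
        or sample["mean_coverage"] < 30               # mean coverage < 30x
        or sample["frac_bases_20x"] < 0.90            # < 90% of bases at 20x
        or sample["aligned_q30_bases"] < 8e10         # < 8E10 aligned Q30 bases
        or sample["frac_hdr_gene_bases_20x"] < 0.95   # < 95% of 59-gene bases at 20x
    )


# Hypothetical sample that passes every criterion
passing_sample = {
    "fingerprint_llr": 12.0,
    "sex_at_birth": "Female",
    "wgs_sex": "Female",
    "contamination": 0.001,
    "mean_coverage": 35.0,
    "frac_bases_20x": 0.97,
    "aligned_q30_bases": 1.1e11,
    "frac_hdr_gene_bases_20x": 0.99,
}
print(fails_sample_qc(passing_sample))
```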
PRScs-based polygenic score weights from the UK Biobank were lifted over from GRCh37 to GRCh38$^{87}$. Polygenic scores were then applied to all participants with WGS as an allelic sum, with an average taken over all of the weights. The UK Biobank GWAS in-sample PCA loadings were applied to the All of Us participants in the same way. These were then tested for association with the presence or absence of disease at any point prior to enrollment or during follow-up in a logistic regression model after adjustment for age at enrollment, whether the individual’s self-reported sex was male, and the first five principal components of ancestry. Similarly, the association with incident disease was tested with a Cox model with the same covariate adjustments after excluding individuals with disease prior to enrollment. All individuals with available data were analyzed. Sensitivity analyses examining only individuals with the “EUR” ancestry label were also conducted.
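One plausible reading of "an allelic sum, with an average taken over all of the weights" is sketched below: each effect-allele count is multiplied by its weight, the products are summed, and the total is divided by the number of weighted variants. The function name, the variant identifiers, and the handling of missing genotypes (none are allowed here) are assumptions for illustration, not details from the study.

```python
def polygenic_score(dosages, weights):
    """Average weighted effect-allele count for one participant.

    dosages: dict of variant id -> effect-allele count (0, 1, or 2)
    weights: dict of variant id -> per-allele weight
    Every weighted variant is assumed to be genotyped (hypothetical simplification).
    """
    total = sum(weight * dosages[variant] for variant, weight in weights.items())
    return total / len(weights)


# Hypothetical variants and weights (not from the published score)
example_weights = {"variant_a": 0.5, "variant_b": -0.2}
example_dosages = {"variant_a": 2, "variant_b": 1}
example_score = polygenic_score(example_dosages, example_weights)
print(example_score)
```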
Atrial fibrillation was defined to be present starting on the first date any of the following diagnostic or procedural codes were reported:
  • ICD10-CM: I48, I48.0, I48.1, I48.11, I48.19, I48.2, I48.20, I48.21, I48.3, I48.4, I48.9, I48.91, I48.92;
  • ICD9-CM: 427.31;
  • SNOMED: 49436004, 282825002, 426749004, 440059007, 440028005;
  • CPT4: 92960.
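The code-based definition above ("present starting on the first date any of the following codes were reported") can be sketched as a lookup over a participant's coded records. The code sets are copied from the list above; the record layout, the vocabulary labels used as dictionary keys, and the function name `first_af_date` are hypothetical.

```python
from datetime import date

# Code lists as given in the text, keyed by hypothetical vocabulary labels
AF_CODES = {
    "ICD10CM": {"I48", "I48.0", "I48.1", "I48.11", "I48.19", "I48.2", "I48.20",
                "I48.21", "I48.3", "I48.4", "I48.9", "I48.91", "I48.92"},
    "ICD9CM": {"427.31"},
    "SNOMED": {"49436004", "282825002", "426749004", "440059007", "440028005"},
    "CPT4": {"92960"},
}


def first_af_date(records):
    """Return the earliest date with a qualifying code, or None if there is none.

    records: iterable of (vocabulary, code, date) tuples for one participant.
    """
    dates = [when for vocab, code, when in records
             if code in AF_CODES.get(vocab, set())]
    return min(dates) if dates else None


# Hypothetical participant history
history = [
    ("ICD10CM", "I10", date(2015, 3, 1)),     # not an AF code
    ("CPT4", "92960", date(2019, 7, 4)),      # qualifying procedural code
    ("ICD10CM", "I48.91", date(2018, 5, 20)), # earlier qualifying diagnosis
]
af_onset = first_af_date(history)
print(af_onset)
```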
Heart failure was defined by the following codes:
  • SNOMED: 84114007, 42343007, 441530006, 441481004, 194779001, 15781000119107, 88805009, 5148006, 92506005, 10633002, 698296002, 426263006, 82523003, 96311000119109, 194781004, 698594003, 426611007, 15629541000119106, 23341000119109, 48447003, 10335000, 7411000175102, 424404003, 418304008, 443343001, 46113002, 417996009, 443254009, 120871000119108, 120861000119102, 56675007, 49584005, 359617009, 7421000175106, 722095005, 443344007, 153951000119103, 153931000119109, 85232009, 367363000, 83291003, 79955004, 16838951000119100, 44313006, 446221000, 703272007, 703273002
Ischemic stroke was defined by the following codes:
  • SNOMED: 371041009, 9901000119100, 422504002
The only volumetric LA measurement available in All of Us was the BSA-indexed LAmin volume (labeled “Left atrial End-systolic volume/ Body surface area [Volume/Area] by US.2D+Calculated by area-length method”). This was analyzed as a continuous trait and was tested for association with the BSA-indexed LAmin polygenic score with adjustment for age at the time of measurement acquisition, sex, and the first five principal components of ancestry.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

GWAS summary statistics have been deposited in the GWAS Catalog under accession #GCP000842. Polygenic score weights have been deposited at doi:10.5281/zenodo.10814404$^{88}$. LA measurements have been returned to the UK Biobank for use by any approved researcher. UK Biobank data are made available to researchers from research institutions with genuine research inquiries, following IRB and UK Biobank approval. All of Us data are available for analysis to qualified researchers on the All of Us research platform. FinnGen Freeze 9 GWAS summary statistics are available at https://www.finngen.fi/en/access_results. All other data are contained within the article and its supplementary information. Source data are provided with this paper.

Code availability

Manual annotation for semantic segmentation was performed using traceoverlay v0.1.0$^{54}$. The deep learning models have been returned to the UK Biobank for use by other researchers. The mri_la_poisson.py script used to perform Poisson surface reconstruction from segmentation output may be downloaded from Zenodo (doi:10.5281/zenodo.10811233) and is actively developed at https://github.com/broadinstitute/ml4h, available under an open-source BSD license$^{89}$.

References

  1. Miyasaka, Y. et al. Secular trends in incidence of atrial fibrillation in Olmsted County, Minnesota, 1980 to 2000, and implications on the projections for future prevalence. Circulation 114, 119-125 (2006).
  2. Marini, C. et al. Contribution of atrial fibrillation to incidence and outcome of ischemic stroke. Stroke 36, 1115-1119 (2005).
  3. Wolf, P. A., Abbott, R. D. & Kannel, W. B. Atrial fibrillation as an independent risk factor for stroke: the Framingham Study. Stroke 22, 983-988 (1991).
  4. Alonso, A. et al. Simple risk model predicts incidence of atrial fibrillation in a racially and geographically diverse population: the CHARGE-AF consortium. J. Am. Heart Assoc. 2, e000102 (2013).
  5. Hulme, O. L. et al. Development and validation of a prediction model for atrial fibrillation using electronic health records. JACC Clin. Electrophysiol. 5, 1331-1341 (2019).
  6. Li, Y.-G. et al. A simple clinical risk score (C2HEST) for predicting incident atrial fibrillation in Asian subjects: derivation in 471,446 Chinese subjects, with internal validation and external application in 451,199 Korean subjects. Chest 155, 510-518 (2019).
  7. Vaziri, S. M., Larson, M. G., Lauer, M. S., Benjamin, E. J. & Levy, D. Influence of blood pressure on left atrial size. Hypertension 25, 1155-1160 (1995).
  8. Cioffi, G. et al. Left atrial size and force in patients with systolic chronic heart failure: comparison with healthy controls and different cardiac diseases. Exp. Clin. Cardiol. 15, e45-e51 (2010).
  9. Sanfilippo, A. J. et al. Atrial enlargement as a consequence of atrial fibrillation. A prospective echocardiographic study. Circulation 82, 792-797 (1990).
  10. Sardana, M. et al. Association of left atrial function index with atrial fibrillation and cardiovascular disease: the Framingham offspring study. J. Am. Heart Assoc. 7, e008435 (2018).
  11. van de Vegte, Y. J., Siland, J. E., Rienstra, M. & van der Harst, P. Atrial fibrillation and left atrial size and function: a Mendelian randomization study. Sci. Rep. 11, 8431 (2021).
  12. Henry, W. L. et al. Relation between echocardiographically determined left atrial size and atrial fibrillation. Circulation 53, 273-279 (1976).
  13. Jin, X., Pan, J., Wu, H. & Xu, D. Are left ventricular ejection fraction and left atrial diameter related to atrial fibrillation recurrence after catheter ablation? A meta-analysis. Medicine (Baltimore) 97, e10822 (2018).
  14. Lim, D. J. et al. Change in left atrial function predicts incident atrial fibrillation: the multi-ethnic study of atherosclerosis. Eur. Heart J. Cardiovasc. Imaging 20, 979-987 (2019).
  15. Park, J. J. et al. Left atrial strain as a predictor of new-onset atrial fibrillation in patients with heart failure. JACC Cardiovasc. Imaging 13, 2071-2081 (2020).
  16. Tsang, T. S. et al. Left atrial volume: important risk marker of incident atrial fibrillation in 1655 older men and women. Mayo Clin. Proc. 76, 467-475 (2001).
  17. Vaziri, S. M., Larson, M. G., Benjamin, E. J. & Levy, D. Echocardiographic predictors of nonrheumatic atrial fibrillation. The Framingham Heart Study. Circulation 89, 724-730 (1994).
  18. Benjamin, E. J., D’Agostino, R. B., Belanger, A. J., Wolf, P. A. & Levy, D. Left atrial size and the risk of stroke and death. The Framingham Heart Study. Circulation 92, 835-841 (1995).
  19. Bouzas-Mosquera, A. et al. Left atrial size and risk for all-cause mortality and ischemic stroke. CMAJ 183, E657-E664 (2011).
  20. Xu, Y. et al. Left atrial enlargement and the risk of stroke: a meta-analysis of prospective cohort studies. Front. Neurol. 11, 26 (2020).
  21. Fatkin, D., Huttner, I. G. & Johnson, R. Genetics of atrial cardiomyopathy. Curr. Opin. Cardiol. 34, 275-281 (2019).
  22. Goette, A. et al. EHRA/HRS/APHRS/SOLAECE expert consensus on atrial cardiomyopathies: definition, characterization, and clinical implication. Heart Rhythm 14, e3-e40 (2017).
  23. Wild, P. S. et al. Large-scale genome-wide analysis identifies genetic variants associated with cardiac structure and function. J. Clin. Invest. 127, 1798-1812 (2017).
  24. Bai, W. et al. A population-based phenome-wide association study of cardiac and aortic structure and function. Nat. Med. 1-9 https://doi.org/10.1038/s41591-020-1009-y (2020).
  25. Thanaj, M. et al. Genetic and environmental determinants of diastolic heart function. medRxiv 2021.06.07.21257302 https://doi.org/10.1101/2021.06.07.21257302 (2021).
  26. Ahlberg, G. et al. Genome-wide association study identifies 18 novel loci associated with left atrial volume and function. Eur. Heart J. https://doi.org/10.1093/eurheartj/ehab466 (2021).
  27. Petersen, S. E. et al. Imaging in population science: cardiovascular magnetic resonance in 100,000 participants of UK Biobank - rationale, challenges and approaches. J. Cardiovasc. Magn. Reson. 15, 46 (2013).
  28. Petersen, S. E. et al. UK Biobank’s cardiovascular magnetic resonance protocol. J. Cardiovasc. Magn. Reson. 18, 8 (2016).
  29. Howard, J. & Gugger, S. Fastai: a layered API for deep learning. Information 11, 108 (2020).
  30. Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291-295 (2015).
  31. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
  32. Aschard, H., Vilhjálmsson, B. J., Joshi, A. D., Price, A. L. & Kraft, P. Adjusting for heritable covariates can bias effect estimates in genome-wide association studies. Am. J. Hum. Genet. 96, 329-339 (2015).
  33. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236-1241 (2015).
    布利克 - 苏利文, B. 等人. 人类疾病和特征的遗传相关性图谱. Nat. Genet. 47, 1236-1241 (2015).
  34. Roselli, C. et al. Multi-ethnic genome-wide association study for atrial fibrillation. Nat. Genet. 50, 1225-1233 (2018).
    罗赛利,C.等人。多族裔基因组范围关联研究心房颤动。Nat.Genet.50,1225-1233(2018)。
  35. Malik, R. et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. 50, 524-537 (2018).
    马利克等, 2018 年,跨多种人群基因组规模联合分析 520,000 个个体,发现 32 个与中风及其亚型相关的位点。
  36. Pers, T. H., Timshel, P. & Hirschhorn, J. N. SNPsnap: a web-based tool for identification and annotation of matched SNPs. Bioinformatics 31, 418-420 (2015).
    佩尔斯, T. H., 提姆谢尔, P. & 赫希霍恩, J. N. SNPsnap: 一个用于识别和注释匹配 SNP 的 web 工具. Bioinformatics 31, 418-420 (2015).
  37. Christophersen, I. E. et al. Large-scale analyses of common and rare variants identify 12 new loci associated with atrial fibrillation. Nat. Genet. 49, 946-952 (2017).
    克里斯托弗森,I.E.等。大规模分析普通和罕见变体确定了 12 个与房颤相关的新位点。自然遗传学,49,946-952(2017)。
  38. Shah, S. et al. Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure. Nat. Commun. 11, 1-12 (2020).
    沙阿等。基因组范围关联和门捷罗随机分析为心力衰竭的发病机制提供了洞见。Nat. Commun. 11, 1-12 (2020)。
  39. Ge, T., Chen, C.-Y., Ni, Y., Feng, Y.-C. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
    葛特、陈昌义、倪永、冯永灿、斯莫勒利用贝叶斯回归和连续缩减先验的多基因预测。自然通讯,10, 1776 (2019)。
  40. Kurki, M. I. et al. FinnGen: unique genetic insights from combining isolated population and national health register data. medrxiv https://doi.org/10.1101/2022.03.03.22271360 (2022).
    库尔基、M.I.等人。FinnGen: 通过结合隔离人群和国家健康登记数据获得独特的遗传洞见。medrxiv https://doi.org/10.1101/2022.03.03.22271360 (2022)。
  41. Denny, J. C. et al. The “All of Us” research program. N. Engl. J. Med. 381, 668-676 (2019).
    丹尼, J. C. 等. "全民参与"研究计划. 新英格兰医学杂志. 381, 668-676 (2019).
  42. Lonsdale, J. et al. The genotype-tissue expression (GTEx) project. Nature Genetics 45, 580-585 (2013).
    隆斯代尔等。基因型-组织表达(GTEx)项目。Nature Genetics 45, 580-585 (2013)。
  43. Machiela, M. J. & Chanock, S. J. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics 31, 3555-3557 (2015).
    马里·特雅(M. J. Machiela)和张云(S. J. Chanock)。LDlink:一个用于探索特定人群的单体相位结构和链接可能功能变体相关等位基因的网络应用程序。生物信息学 31,3555-3557(2015)。
  44. Prins, B. P. et al. Exome-chip meta-analysis identifies novel loci associated with cardiac conduction, including ADAMTS6. Genome Biol 19, 87 (2018).
    普林斯,B. P.等人。外显子芯片元分析发现与心脏传导相关的新位点,包括 ADAMTS6。基因组生物学 19, 87(2018)。
  45. Sotoodehnia, N. et al. Common variants in 22 loci are associated with QRS duration and cardiac ventricular conduction. Nat. Genet. 42, 1068-1076 (2010).
    佐图德尼亚等人。22 个位点的普通变异体与 QRS 持续时间和心室传导有关。Nat. Genet. 42, 1068-1076 (2010)。
  46. Lahat, H. et al. A missense mutation in a highly conserved region of CASQ2 is associated with autosomal recessive catecholamineinduced polymorphic ventricular tachycardia in bedouin families from Israel. Am. J. Hum. Genet. 69, 1378-1384 (2001).
    拉哈特,H. 等人。在 CASQ2 高度保守区域的一个错义突变与以色列贝多因家庭自主性隐性儿茶酚胺诱发多形性室性心动过速相关。《美国人类遗传学杂志》, 69, 1378-1384 (2001)。
  47. Ng, Kevin et al. An international multicenter evaluation of inheritance patterns, arrhythmic risks, and underlying mechanisms of CASQ2-catecholaminergic polymorphic ventricular tachycardia. Circulation 142, 932-947 (2020).
    吴, 凯文等. 关于 CASQ2-儿茶酚胺性多形性室性心动过速的遗传模式、心律失常风险和潜在机制的国际多中心评估. 循环 142, 932-947 (2020).
  48. Chinchilla, A. et al. PITX2 insufficiency leads to atrial electrical and structural remodeling linked to arrhythmogenesis. Circ. Cardiovasc. Genet. 4, 269-279 (2011).
    秦希娜,A.等人。PITX2 不足导致与心律失常相关的心房电和结构重塑。循环心血管遗传学。4, 269-279 (2011)。
  49. Collins, R. UK Biobank Protocol. https://www.ukbiobank.ac.uk/ media/gnkeyh2q/study-rationale.pdf (2007).
    柯林斯,R. 英国生物银行协议。https://www.ukbiobank.ac.uk/媒体/gnkeyh2q/study-rationale.pdf(2007)。
  50. Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
    索德洛,C.等. UK 生物库:一个用于确定中年及老年人广泛复杂疾病病因的开放接入资源.PLoS Med. 12, e1001779 (2015).
  51. Santos, R. et al. A comprehensive map of molecular drug targets. Nat. Rev. Drug Discov. 16, 19-34 (2017).
    桑托斯,R.等人。一个全面的分子药物靶标映射。Nat. Rev. Drug Discov. 16, 19-34 (2017)。
  52. Wu, Y. et al. Genome-wide association study of medication-use and associated disease in the UK Biobank. Nat. Commun. 10, (2019).
    吴等人.生物信息学联合分析英国生物库中药物使用与相关疾病. 自然通讯. 10,(2019).
  53. Pirruccello, J. P. et al. Deep learning enables genetic analysis of the human thoracic aorta. bioRxiv https://doi.org/10.1101/2020.05.12 . 091934 (2020).
    皮鲁切罗,J. P. 等。深度学习使人类胸主动脉的遗传分析成为可能。bioRxiv https://doi.org/10.1101/2020.05.12 。 091934 (2020)。
  54. Pirruccello, J. carbocation/traceoverlay: traceoverlay v0.1.0. Zenodo https://doi.org/10.5281/zenodo . 10811511 (2024).
    皮鲁切罗, J. 碳正离子/跟踪叠加层: 跟踪叠加层 v0.1.0. Zenodo https://doi.org/10.5281/zenodo . 10811511 (2024).
  55. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. arXiv https://arxiv.org/abs/1912.01703 (2019).
    帕兹克,A.等人。PyTorch:一种命令式风格,高性能深度学习库。arXiv https://arxiv.org/abs/1912.01703(2019)。
  56. Deng, J. et al. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition 248-255. https://doi.org/10.1109/CVPR . 2009. 5206848 (2009).
    邓, J.等. ImageNet: 一个大规模层次化图像数据库. 在: 2009 IEEE 计算机视觉和模式识别会议 248-255. https://doi.org/10.1109/CVPR . 2009. 5206848 (2009).
  57. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. arXiv https://ieeexplore.ieee.org/document/ 7780459 (2015).
    何凯明、张星、任宋和孙剑. 用于图像识别的深度残差学习. arXiv https://ieeexplore.ieee.org/document/ 7780459 (2015).
  58. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84-90 (2017).
    克里日夫斯基、苏斯凯夫和霍顿. 采用深度卷积神经网络进行 ImageNet 分类. 《ACM 通讯》60, 84-90 (2017).
  59. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. arXiv https://arxiv.org/ abs/1505.04597 (2015).
    朗尼贝尔, O., 菲舍, P. 和 布罗克斯, T. U-Net: 用于生物医学图像分割的卷积网络. arXiv https://arxiv.org/ abs/1505.04597 (2015).
  60. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. arXiv https://arxiv.org/abs/1412.6980 (2017).
    金马, D. P. 和 Ba, J. Adam:随机优化的一种方法。 arXiv https://arxiv.org/abs/1412.6980 (2017)。
  61. Smith, L. N. Cyclical learning rates for training neural networks. arXiv https://arxiv.org/abs/1506.01186 (2015).
    史密斯, L. N. 用于训练神经网络的循环学习率. arXiv https://arxiv.org/abs/1506.01186 (2015).
  62. Smith, L. N. A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay. arXiv https://arxiv.org/abs/1803.09820 (2018).
    史密斯,L. N. 对神经网络超参数的纪律性方法:第 1 部分 - 学习率、批大小、动量和权重衰减。arXiv https://arxiv.org/abs/1803.09820 (2018)。
  63. Lin, T.-Y., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. arXiv https://arxiv.org/abs/1708.02002 (2018).
    林, T.-Y., Goyal, P., Girshick, R., 何, K. 和 Dollár, P. 密集目标检测的焦点损失. arXiv https://arxiv.org/abs/1708.02002 (2018).
  64. D. R. Cox. The regression analysis of binary sequences. J. R. Stat. Soc. Ser. B Methodol. 20, 215-232 (1958).
    考克斯.二元序列的回归分析.J.R.Stat.Soc.Ser.B Methodol.20, 215-232 (1958).
  65. Dice, L. R. Measures of the amount of ecologic association between species. Ecology 26, 297-302 (1945).
    骰子, L. R.种群生态关联量的度量. 生态学 26, 297-302 (1945).
  66. Huttenlocher, D. P., Klanderman, G. A. & Rucklidge, W. J. Comparing images using the Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 15, 850-863 (1993).
    胡特洛赫, D. P., Klanderman, G. A. 和 Rucklidge, W. J. 使用 Hausdorff 距离比较图像. IEEE 模式分析和机器智能交易. 15, 850-863 (1993).
  67. Kazhdan, M., Bolitho, M. & Hoppe, H. Poisson surface reconstruction. (The Eurographics Association). https://doi.org/10.2312/SGP/ SGP06/061-070 (2006).
    卡赞安、柏立托和霍普 波松曲面重建 (欧洲图形学协会) https://doi.org/10.2312/SGP/ SGP06/061-070 (2006).
  68. Kazhdan, M. & Hoppe, H. Screened poisson surface reconstruction. ACM Trans. Graph. 32, 29:1-29:13 (2013).
    卡兹丹,M. & 霍普,H. 遮蔽泊松表面重建。ACM Trans. Graph. 32, 29:1-29:13 (2013)。
  69. Pirruccello, J. P. et al. Genetic analysis of right heart structure and function in 40,000 people. bioRxiv https://doi.org/10.1101/2021.02 . 05.429046 (2021).
    皮鲁切罗,J.P.等人。40,000 名人员右心结构和功能的遗传分析。bioRxiv https://doi.org/10.1101/2021.02 。05.429046 (2021)。
  70. Fawaz, H. I. et al. InceptionTime: finding AlexNet for time series classification. Data Min. Knowl. Discov. 34, 1936-1962 (2020).
    法瓦兹,H. I.等。 InceptionTime:为时间序列分类找到 AlexNet。数据挖掘与知识发现。 34,1936-1962 (2020)。
  71. Liu, L. et al. On the variance of the adaptive learning rate and beyond. arXiv https://arxiv.org/abs/1908.03265 (2020).
    刘,L.等。关于自适应学习率的方差及其更多。arxiv https://arxiv.org/abs/1908.03265(2020)。
  72. Zhang, M. R., Lucas, J., Hinton, G. & Ba, J. Lookahead optimizer: k steps forward, 1 step back. arXiv https://arxiv.org/abs/1907. 08610 (2019).
    张, M. R., 卢卡斯, J., 汉斯顿, G. & 巴, J. Lookahead 优化器: k 步前进, 1 步后退. arXiv https://arxiv.org/abs/1907. 08610 (2019).
  73. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203 (2018).
    贝克罗夫特等人。2018 年 10 月 11 日发表在《自然》杂志上的一篇论文中,介绍了英国生物银行的资源,涵盖了深度表型和基因组数据。
  74. Yang, J. et al. FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267-272 (2012).
    杨等人。FTO 基因型与体质指数(BMI)表型变异性相关。Nature 490, 267-272 (2012)。
  75. Loh, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284-290 (2015).
    罗、P.-R. 等人。在大型队列中,有效的贝叶斯混合模型分析提高了相关性分析的统计功力。《自然遗传学》47, 284-290 (2015)。
  76. Loh, P.-R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixedmodel association for biobank-scale datasets. Nature Genetics 50, 906-908 (2018).
    罗伯特-P 一洛、基恰耶夫-G、加撒-S、斯科赫-A-P 和普赖斯-A-L。大型生物样本库数据集的混合模型关联分析。自然·遗传学,50, 906-908 (2018)。
  77. Risch, N. & Merikangas, K. The future of genetic studies of complex human diseases. Science 273, 1516-1517 (1996).
    利施, N. 和 Merikangas, K. 复杂人类疾病遗传研究的未来。Science 273, 1516-1517 (1996)。
  78. Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
    赫曼尼, G. 等. MR-Base 平台支持人类表型域全面的因果推断。 eLife 7, e34408 (2018)。
  79. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512-525 (2015).
    鲍登, J., 戴维-史密斯, G. & 伯格斯, S. 使用无效工具的孟德尔随机化:通过 Egger 回归进行效果估计和偏差检测. Int. J. Epidemiol. 44, 512-525 (2015).
  80. Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693-698 (2018).
    范伯克,M., 陈,C.-Y., 尼尔,B. & 杜,R. 从曼德尔随机化分析复杂性状和疾病间因果关系推断中检测广泛水平多效性. Nat. Genet. 50, 693-698 (2018).
  81. Cochran, J. D. et al. Clonal hematopoiesis in clinical and experimental heart failure with preserved ejection fraction. Circulation 148, 1165-1178 (2023).
    科克兰, J.D.等 在保留射血分数的临床和实验性心力衰竭中发现克隆性造血. Circulation 148, 1165-1178 (2023).
  82. Burgess, S., Foley, C. N., Allara, E., Staley, J. R. & Howson, J. M. M. A robust and efficient method for Mendelian randomization with hundreds of genetic variants. Nat. Commun. 11, 376 (2020).
    伯吉斯,S.,福里,C. N.,阿拉拉,E.,斯泰利,J. R.&豪森,J. M. M.用数百种遗传变异进行孟德尔随机化的强大高效方法。Nat. Commun.11,376(2020)。
  83. Therneau, T. M. & Grambsch, P. M. Modeling survival data: extending the Cox model. (Springer-Verlag, New York). https://doi. org/10.1007/978-1-4757-3294-8 (2000).
    提尔努, T. M. 和格兰布施, P. M. 生存数据建模:扩展科克斯模型。(纽约,斯普林格-弗雷格)。https://doi.org/10.1007/978-1-4757-3294-8 (2000 年)。
  84. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575 (2007).
    普瑞尔、S.等。PLINK:一种用于全基因组关联和种群连锁分析的工具集。美国人类遗传学杂志。81,559-575(2007)。
  85. Venner, E. et al. Whole-genome sequencing as an investigational device for return of hereditary disease risk and pharmacogenomic results as part of the All of Us research program. Genome Med. 14, 34 (2O22).
    文纳,E.等人。全基因组测序作为一种调查设备,用于返回遗传性疾病风险和药物基因组学结果,作为"全民计划"研究计划的一部分。Genome Med. 14, 34 (2022)。
  86. Bick, A. G. et al. Genomic data in the All of Us research program. Nature 1-7 https://doi.org/10.1038/s41586-023-06957-x (2024).
    比克,A. G.等人。"All of Us"研究计划中的基因组数据。《自然》1-7 页 https://doi.org/10.1038/s41586-023-06957-x (2024)。
  87. Hinrichs, A. S. et al. The UCSC genome browser database: update 2006. Nucleic Acids Res. 34, D590-D598 (2006).
    赫因里克斯等。UCSC 基因组浏览器数据库:2006 年更新。核酸研究。34, D590-D598 (2006)。
  88. Pirruccello, J. Left atrial polygenic scores for ‘deep learning of left atrial structure and function provides link to atrial fibrillation risk’. Zenodo https://doi.org/10.5281/zenodo.10814404 (2024).
    皮鲁切洛, J. 用于'深度学习左心房结构和功能'的左心房多基因评分为心房纤维性传导提供了联系. Zenodo https://doi.org/10.5281/zenodo.10814404 (2024).
  89. Di Achille, P. LA GWAS checkpoint of Poisson surface reconstruction with mri_la_poisson.py. Zenodo https://doi.org/10.5281/ zenodo. 10811233 (2024).
    根据 Di Achille, P. 通过 mri_la_poisson.py 进行泊松面重建的 GWAS 检查点。Zenodo https://doi.org/10.5281/ zenodo. 10811233 (2024)。

Acknowledgements

This work was supported by the Fondation Leducq (14CVD01) and by grants from the National Institutes of Health to Dr. Ellinor (1R01HL092577, K24HL105780) and Dr. Ho (R01HL134893, R01HL140224, K24HL153669). This work was supported by a John S LaDue Memorial Fellowship, the Sarnoff Cardiovascular Research Foundation Scholar Award, and NIH K08HL159346 to Dr. Pirruccello. Dr. Kany was supported by the Walter Benjamin Fellowship from the Deutsche Forschungsgemeinschaft (521832260). Dr. Jurgens was supported by the Junior Clinical Scientist Fellowship from the Dutch Heart Foundation (grant no. 03-007-2022-0035). Dr. Nauffal is supported by NIH grant 5T32HL007604-35. Dr. Khurshid is supported by NIH grant K23HL169839 and American Heart Association 23CDA1050571. Dr. Lubitz was supported by NIH grants R01HL139731, R01HL157635, and American Heart Association 18SFRN34250007. This work was supported by a grant from the American Heart Association Strategically Focused Research Networks to Dr. Ellinor. This work was funded by a collaboration between the Broad Institute and IBM Research. We would like to thank Mary O'Reilly from the Broad Institute PATTERN Team for contributing to the graphical overview in Fig. 1.

We want to acknowledge the participants and investigators of the FinnGen study. The FinnGen project is funded by two grants from Business Finland (HUS 4685/31/2016 and UH 4386/31/2016) and the following industry partners: AbbVie Inc., AstraZeneca UK Ltd, Biogen MA Inc., Bristol Myers Squibb (and Celgene Corporation & Celgene International II Sàrl), Genentech Inc., Merck Sharp & Dohme LCC, Pfizer Inc., GlaxoSmithKline Intellectual Property Development Ltd., Sanofi US Services Inc., Maze Therapeutics Inc., Janssen Biotech Inc., Novartis AG, and Boehringer Ingelheim International GmbH. The following biobanks are acknowledged for delivering biobank samples to FinnGen: Arctic Biobank (https://www.oulu.fi/medicine/node/207208), Auria Biobank (www.auria.fi/biopankki), THL Biobank (www.thl.fi/biobank), Helsinki Biobank (www.helsinginbiopankki.fi), Biobank Borealis of Northern Finland (https://www.ppshp.fi/Tutkimus-ja-opetus/Biopankki/Pages/Biobank-Borealis-briefly-in-English.aspx), Finnish Clinical Biobank Tampere (www.tays.fi/en-US/Research_and_development/Finnish_Clinical_Biobank_Tampere), Biobank of Eastern Finland (www.ita-suomenbiopankki.fi/en), Central Finland Biobank (www.ksshp.fi/fi-FI/Potilaalle/Biopankki), Finnish Red Cross Blood Service Biobank (www.veripalvelu.fi/verenluovutus/biopankkitoiminta), Terveystalo Biobank (www.terveystalo.com/fi/Yritystietoa/Terveystalo-Biopankki/Biopankki/) and The Finnish Hematology Registry and Clinical Biobank (https://www.fhrb.fi/). All Finnish Biobanks are members of the BBMRI.fi infrastructure (www.bbmri.fi). Finnish Biobank Cooperative (FINBB, https://finbb.fi/) is the coordinator of BBMRI-ERIC operations in Finland. The Finnish biobank data can be accessed through the Fingenious® services (https://site.fingenious.fi/en/) managed by FINBB.

The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers: 1 OT2 OD026549; 1 OT2 OD026554; 1 OT2 OD026557; 1 OT2 OD026556; 1 OT2 OD026550; 1 OT2 OD 026552; 1 OT2 OD026553; 1 OT2 OD026548; 1 OT2 OD026551; 1 OT2 OD026555; IAA #: AOD 16037; Federally Qualified Health Centers: HHSN 263201600085U; Data and Research Center: 5 U2C OD023196; Biobank: 1 U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: 1 U24 OD023163; Communications and Engagement: 3 OT2 OD023205; 3 OT2 OD023206; and Community Partners: 1 OT2 OD025277; 3 OT2 OD025315; 1 OT2 OD025337; 1 OT2 OD025276. The All of Us Research Program would not be possible without the partnership of its participants.

Author contributions

P.T.E. and J.P.P. conceived of the study. S. Khurshid, K.L.L. and S.A.L. provided input into the analysis plan. J.P.P. annotated images. J.P.P. trained the deep learning models. P.D. performed surface reconstruction. J.P.P., P.D., S.J. and S.H.C. conducted bioinformatic analyses for UK Biobank data. J.P.P. conducted bioinformatic analyses for All of Us data. FinnGen and A.P. facilitated the FinnGen analyses, and J.T.R. conducted them. J.P.P., S.H.C., J.T.R. and P.T.E. wrote the paper. M.N., S. Kany, V.N., K.N., S.F.F., P.B., A.A.P. and J.E.H. provided critical revisions.

Competing interests

Dr. Pirruccello has served as a consultant for Maze Therapeutics. Dr. Lubitz is an employee of Novartis as of July 2022. Dr. Lubitz received sponsored research support from Bristol Myers Squibb, Pfizer, Boehringer Ingelheim, Fitbit, Medtronic, Premier, and IBM, and has consulted for Bristol Myers Squibb, Pfizer, Blackstone Life Sciences, and Invitae. Dr. Ng is employed by IBM Research. Dr. Ho is supported by a grant from Bayer AG focused on machine learning and cardiovascular disease and a research grant from Gilead Sciences. Dr. Ho has received research supplies from EcoNugenics. Dr. Philippakis is employed as a Venture Partner at GV; he is also supported by a grant from Bayer AG to the Broad Institute focused on machine learning for clinical trial design. Dr. Ellinor is supported by a grant from Bayer AG to the Broad Institute focused on the genetics and therapeutics of cardiovascular diseases. Dr. Ellinor has also served on advisory boards or consulted for Bayer AG, Quest Diagnostics, MyoKardia and Novartis. The remaining authors report no disclosures.

Additional information

Supplementary information: The online version contains supplementary material available at https://doi.org/10.1038/s41467-024-48229-w.

Correspondence and requests for materials should be addressed to James P. Pirruccello.

Peer review information: Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Reprints and permissions information is available at http://www.nature.com/reprints

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2024

FinnGen

Joel T. Rämö$^{5,8}$ & Aarno Palotie$^{8,19,20}$

A full list of members and their affiliations appears in the Supplementary Information.

A full list of affiliations appears at the end of the paper. *A list of authors and their affiliations appears at the end of the paper.

  1. Division of Cardiology, University of California San Francisco, San Francisco, CA, USA.
  2. Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
  3. Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA.
  4. Cardiovascular Genetics Center, University of California San Francisco, San Francisco, CA, USA.
  5. Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  6. Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
  7. Cardiovascular Disease Initiative, Broad Institute, Cambridge, MA, USA.
  8. Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland.
  9. Cardiology Division, Massachusetts General Hospital, Boston, MA, USA.
  10. Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA.
  11. Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA.
  12. Harvard Medical School, Boston, MA, USA.
  13. Department of Experimental Cardiology, Amsterdam UMC, University of Amsterdam, Amsterdam, NL, Netherlands.
  14. Amsterdam Cardiovascular Sciences, Heart Failure & Arrhythmias, University of Amsterdam, Amsterdam, NL, Netherlands.
  15. Division of Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA, USA.
  16. Department of Cardiology, University Heart and Vascular Center Hamburg-Eppendorf, Hamburg, Germany.
  17. IBM Research, Cambridge, MA, USA.
  18. Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA.
  19. Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
  20. Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Boston, MA, USA.
  21. CardioVascular Institute, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.

e-mail: james.pirruccello@ucsf.edu