- 期刊列表
- mSystems
- v.8(6);2023 年 11 月至 12 月
- PMC10734486
知道更多: PMC 免责声明 | PMC 版权告示
2023 年 12 月 5 日在线发布。doi: 10.1128/msystems.00697-23
比较基因组学揭示了应激反应基因和噬菌体在腐生葡萄球菌抗生素耐药性发展中的相关性
Kailun Zhang
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
2The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
Robert F. Potter
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
Jamie Marino
3Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, USA
Carol E. Muenks
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
Matthew G. Lammers
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
Jennifer Dien Bard
4Department of Pathology and Laboratory Medicine, Children’s Hospital Los Angeles, Los Angeles, California, USA
5Keck School of Medicine, University of Southern California, Los Angeles, California, USA
Tanis C. Dingle
6Department of Pathology and Laboratory Medicine, University of Calgary, Calgary, Alberta, Canada
Romney Humphries
7Department of Pathology, Microbiology, and Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
Lars F. Westblade
3Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, USA
Carey-Ann D. Burnham
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
8Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
9Department of Molecular Microbiology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
10Department of Pediatrics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
Gautam Dantas
1Department of Pathology and Immunology, Division of Laboratory and Genomic Medicine, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
2The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
9Department of Molecular Microbiology, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
10Department of Pediatrics, Washington University School of Medicine in St. Louis, St. Louis, Missouri, USA
11Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
关联数据
- 补充材料
- msystems.00697-23-s0001.pdf (1.8M)DOI: 10.1128/msystems.00697-23.SuF1DOI: 10.1128/msystems.00697-23.SuF2DOI: 10.1128/msystems.00697-23.SuF3msystems.00697-23-s0004.xlsx (5.1M)DOI: 10.1128/msystems.00697-23.SuF4msystems.00697-23-s0005.xlsx (5.3M)DOI: 10.1128/msystems.00697-23.SuF5DOI: 10.1128/msystems.00697-23.SuF6DOI: 10.1128/msystems.00697-23.SuF7DOI: 10.1128/msystems.00697-23.SuF8DOI: 10.1128/msystems.00697-23.SuF9
- 数据可用性声明
The adapter removed Illumina reads, and scaffolds of all samples generated in this study have been submitted to the NCBI BioProject database under accession number PRJNA944649.
抽象
腐生葡萄球菌是单纯性尿路感染的主要革兰氏阳性原因。最近关于腐生链球菌抗菌素耐药性 (AMR) 增加的报道需要调查其未被充分研究的耐药模式。在这里,我们使用比较全基因组测序表征了腐生链球菌 (n = 275) 的多样化集合。我们对核心基因 (1,646) 进行了系统发育分析,对我们的腐生链球菌进行了分组,并研究了抗生素耐药基因 (ARG) 的分布。S. saprophyticus 分离株属于两个先前表征的谱系,14.91% (41/275) 表现出多药耐药性。我们将 S. saprophyticus 的抗菌药物敏感性表型与不同 ARGs 和基因等位基因的存在进行了比较。29.8% (82/275) 携带葡萄球菌盒染色体移动元件,其中 25.6% (21/82) 携带 mecA。青霉素耐药与 mecA 或 blaZ 的存在有关。mecA 基因可以作为推断腐生链球菌头孢西丁和苯唑西林耐药性的标志物,但该基因的缺失并不能预测易感性。利用计算建模,我们发现几个基因与头孢西丁和苯唑西林耐药性有关+麦卡−分离株,其中一些在应激反应和细胞壁合成中具有预测功能。此外,表型关联分析表明,在其他葡萄球菌中报道的针对非 β-内酰胺类的 ARG 可能作为腐生链球菌的耐药决定因素。最后,我们观察到噬菌体携带的两个 ARG [erm 和 erm (44)v] 与对红霉素 (11/11 和 10/10) 和克林霉素 (11/11 和 10/10) 的高表型不敏感相关。这项工作中鉴定的 AMR 相关遗传元件有助于改进抗生素治疗期间腐生链球菌的耐药性预测。
重要
性腐生葡萄球菌是与女性尿路感染 (UTI) 相关的第二大常见细菌。单纯性 UTI 的抗菌治疗方案通常是呋喃妥因、甲氧苄啶-磺胺甲噁唑 (TMP-SMX) 或氟喹诺酮类药物,而无需对从尿液标本中回收的腐生链球菌进行常规药敏试验。然而,最近在 UTI 患者以及我们的队列中检测到了 TMP-SMX 耐药的腐生链球菌。在此,我们通过将基因组抗生素抗性基因 (ARG) 含量与易感表型联系起来,研究了该致病物种研究不足的耐药模式。我们描述了 ARG 与已知和新的 SCCmec 构型以及 S. saprophyticus 中的噬菌体元件的关联,这些可能作为限制耐药性传递的干预或诊断靶点。我们的分析产生了一个与临床腐生链球菌分离株中 ARG 序列相关的表型数据的综合数据库,这对于耐药监测和预测至关重要,以实现腐生链球菌 UTI 的精确诊断和有效治疗。
介绍
在 1960 年代,从患有急性尿路感染 (UTI) 的女性的尿液中分离出一些凝固酶阴性葡萄球菌 (CoNS) 菌株 (1),这些菌株后来被归类为腐生葡萄球菌。迄今为止,据报道,腐生链球菌是女性单纯性 UTI 的第二大最常见病原体 (2)。在少数情况下,它还会导致多种并发症,包括急性肾盂肾炎 (3, 4)、血流感染 (5)、心内膜炎 (6) 和肾结石 (7)。在身体栖息地方面,腐生链球菌存在于人类体内,是会阴、直肠、尿道、宫颈和胃肠道皮肤和粘膜表面正常微生物群的一部分 (2, 8)。研究表明,腐生链球菌的直肠、尿道和阴道定植与该微生物引起的 UTI 有关 (9, 10)。该物种在环境中也广泛分布 (11, 12)。正如当前的临床实验室标准研究所 (CLSI) 指南所建议的那样,用于治疗无并发症的腐生链球菌 UTI 患者的抗菌药物选择通常在没有体外抗生素敏感性测试 (AST) 数据的情况下进行:该物种的分离株通常被认为对常用于治疗 UTI 的抗菌剂敏感;因此,不建议对尿路分离株进行常规 AST;首选抗生素通常是呋喃妥因和甲氧苄啶-磺胺甲噁唑 (TMP-SMX)。然而,最近的两项研究指出,来自巴西和伊朗的 UTI S. saprophyticus 分离株中分别有 17.9% 和 9.0% 对 TMP-SMX 具有耐药性 (14, 15)。这突出了在 UTI 治疗期间选择合适的抗生素覆盖率时考虑特定局部耐药模式的重要性。
细菌菌株的全基因组测序 (WGS) 已成为分析各种病原体的耐药决定因素(称为“基因型”)的理想方法。将基因组抗生素耐药基因 (ARG) 含量与抗生素耐药表型联系起来对于耐药监测至关重要。最近对大肠杆菌 (16,–18)、肺炎克雷伯菌 (16, 19)、结核分枝杆菌 (20, 21)、淋病奈瑟菌 (22, 23)、铜绿假单胞菌 (24, 25)、志贺氏菌 (26) 和金黄色葡萄球菌 (27) 进行了评估,–29)。最近对来自人类 UTI 和猪肉加工链的腐生链藻的 WGS 分析报告称,它们属于 G 和 S 两个主要谱系,并确定了 ARGs、噬菌体、血小板结合蛋白 (PBP) 与基因组重组率增加之间的强烈关联 (30)。这些研究和越来越多的关于腐生链球菌表型抗性增加的报道促使对腐生链球菌的抗性组基因型与表型关联的全面调查。
在这里,我们对全球 275 种临床腐生链球菌分离株进行了比较 WGS,并比较了它们的抵抗组基因型谱与表型易感性。与之前的报道 (30) 一致,我们的 S. saprophyticus 菌株也根据其核心基因系统发育特性分为两个主要谱系。我们观察到谱系之间不同的 ARG 模式和分布。为了确定耐药性的基因组决定因素,我们首先检查了易感表型与 ARG 之间的相关性,这些耐药性与葡萄球菌的耐药性有众所周知的关系,例如,β-内酰胺耐药的 mecA (31) 和多西环素耐药的 tet(K) (32, 33)。然后,我们利用计算模型来识别与腐生链球菌的关键抗性表型显着相关的新基因或突变。最后,鉴于噬菌体在 ARG 传播中的潜在作用 (34,–37),我们在队列的噬菌体元件中检测到 ARG,发现它们与对红霉素/克林霉素抗生素的高表型耐药性有关。
结果
Lineages of S. saprophyticus isolates exhibit different resistance patterns
S. saprophyticus 分离株的谱系表现出不同的抗性模式
We compared the annotated WGS of our S. saprophyticus isolates (n = 275) and found that the total pan-genome included 9,584 genes. Among these, 1,646 were considered core genes (>99% prevalence), 1,421 were considered shell genes (15%–99% prevalence), and 6,517 were cloud genes (<15% prevalence). We used core gene alignments to build a maximum-likelihood phylogenetic tree (
我们比较了我们的腐生链球菌分离株 (n = 275) 的注释 WGS,发现总泛基因组包括 9,584 个基因。其中,1,646 个被认为是核心基因(>99% 的患病率),1,421 个被认为是壳基因(15%-99% 的患病率),6,517 个是云基因(<15% 的患病率)。我们使用核心基因比对构建了最大似然系统发育树 (Fig. 1A 图 1A; Table S1). The “water striders” shape (38) of our S. saprophyticus tree exhibited a long internal branch separating two very distinct sub-populations (Fig. S1A). This result was consistent with a prior report of S. saprophyticus, though their genomic phylogeny was built on whole-genome single nucleotide polymorphisms (SNPs) mapped to a single reference S. saprophyticus strain ATCC 15305, and they designated the two subpopulations as lineage G and S (30). To confirm that lineage definitions were not biased by sampling, as well as to assign the lineage group of our isolates, we generated a combined core gene alignment of all published S. saprophyticus genomes and the genomes from our work (Fig. S1B). We confirmed that these genomes were separated into two major groups, and all G and S isolates from Lawal et al. belonged to different groups. Thus, we proceeded with utilizing G and S as the lineage names in our study. Among our cohort, 76% (209/275) of S. saprophyticus isolates were from lineage G, which differed by between 16 and 4,429 core gene single nucleotide polymorphisms (cgSNPs) with a whole-genome average nucleotide identity (ANI) of 99.2116%–99.9971%. Our isolates from lineage S (n = 66) had cgSNPs of 0–5,182 with an ANI of 99.2776%–99.9997% (Fig. S1C). The cgSNP distances and whole genome ANI compared between G and S isolates were 8,720–10,967 and 98.5267%–99.1449%, respectively (Fig. S1C). Isolates from lineage G were mostly from North America (160/209), while lineage S had more isolates collected from South America (30/66;
;表 S1)。我们的 S. saprophyticus 树的 “水黾 ”形状 (38) 表现出一个长的内部分支,将两个非常不同的亚群分开(图 S1A)。这一结果与之前关于腐生链球菌的报道一致,尽管它们的基因组系统发育建立在映射到单个参考腐生链球菌菌株 ATCC 15305 的全基因组单核苷酸多态性 (SNP) 之上,并且他们将这两个亚群命名为谱系 G 和 S (30)。为了确认谱系定义没有因采样而产生偏差,以及分配我们分离株的谱系组,我们生成了所有已发表的腐生链球菌基因组和我们工作中的基因组的组合核心基因比对(图 S1B)。我们证实这些基因组分为两大类,来自 Lawal 等人的所有 G 和 S 分离株都属于不同的组。因此,我们继续在我们的研究中使用 G 和 S 作为谱系名称。在我们的队列中,76% (209/275) 的腐生链球菌分离株来自 G 谱系,其差异在 16 到 4,429 个核心基因单核苷酸多态性 (cgSNP) 之间,全基因组平均核苷酸同一性 (ANI) 为 99.2116%–99.9971%。我们来自谱系 S (n = 66) 的分离株的 cgSNP 为 0–5,182,ANI 为 99.2776%–99.9997%(图 S1C)。G 和 S 分离株之间的 cgSNP 距离和全基因组 ANI 分别为 8,720-10,967 和 98.5267%-99.1449%(图 S1C)。来自谱系 G 的分离株主要来自北美 (160/209),而谱系 S 则有更多的分离株来自南美洲 (30/66;Fig. 1B 图 1B). Intriguingly, the tree shapes of the two lineages are dissimilar, indicating potentially distinct evolutionary patterns. To confirm this assumption, we utilized rhierBAPS (39, 40) to hierarchically cluster the core genes of S. saprophyticus. Four clusters were detected at level 1, among which three were from lineage G (cluster 1, n = 139; cluster 2, n = 53; cluster 4, n = 17), and all S isolates were characterized as one cluster (cluster 3;
).有趣的是,这两个谱系的树形不同,表明可能存在不同的进化模式。为了证实这一假设,我们利用 rhierBAPS (39, 40) 对 S. saprophyticus 的核心基因进行分层聚类。在1级检测到4个簇,其中3个来自谱系G(簇1,n = 139;簇2,n = 53;簇4,n = 17),所有S分离株均被表征为一个簇(簇3;Fig. 1A 图 1A). Furthermore, using principle coordinates analysis (PCoA) on the presence-absence matrix representing all accessary (i.e., non-core) genes, we found that different lineages or clusters mixed within the plot, indicating that S. saprophyticus accessory gene content does not recapitulate the core gene structure (Fig. S1D).
).此外,在代表所有附属(即非核心)基因的存在-不存在矩阵上使用主坐标分析 (PCoA),我们发现不同的谱系或簇在图中混合,表明 S. saprophyticus 附属基因内容并不能概括核心基因结构(图 S1D)。
Next, we identified ARGs encoded by S. saprophyticus and compared their distributions between lineages or clusters. We detected 29 ARGs of 9 antimicrobial categories in our cohort. The antimicrobial categories were used to describe the acquired resistance profile in S. aureus (41). All S. saprophyticus isolates carried at least two ARGs, and one isolate (UA-007) had up to 12 ARGs (Fig. 1A). The numbers of ARGs of G isolates (range: 2–12) were significantly larger than those in S isolates (range: 2–10) determined by Wilcoxon rank-sum test (P-value is 0.0016), although both contained a median of four ARGs (Fig. 1C). When comparing among clusters, we only observed differences in ARG numbers carried by isolates from cluster 1 and 3 (P-value is 0.0012 by Wilcoxon rank-sum test; Fig. 1C).我们分别通过纸片扩散和头孢蛋白酶测定确定了我们的 S. saprophyticus 分离株的抗菌素耐药性 (AMR) 表型 (AMR) 和 β-内酰胺酶活性。14.91% (41/275) 南。腐生菌分离株表现出多药耐药性 (MDR),定义为分离株对三种以上抗菌类别中的至少一种药物不敏感 (41),包括 β-内酰胺类、叶酸途径抑制剂、林可酰胺类、大环内酯类和四环素类。具体来说,7/41 分离株是 MDR,因为它们对 β-内酰胺类、大环内酯类和四环素类不敏感;24/41 分离株为 MDR,因为它们对 β-内酰胺类、林可酰胺类和大环内酯类不敏感;8/41 分离株为 MDR,因为它们对 β-内酰胺、叶酸途径抑制剂和大环内酯类不敏感;1/41 分离株为 MDR,因为它对 β-内酰胺类、叶酸途径抑制剂、大环内酯类和四环素类不敏感;1/41 分离株为 MDR,因为它对 β-内酰胺、叶酸途径抑制剂、林可酰胺和大环内酯类药物不敏感。我们观察到谱系或簇之间的 MDR 率没有差异 (χ2检验,P 值分别为 0.74 和 0.54;图 1D).此外,两个谱系之间的 β-内酰胺酶活性和对头孢西丁和苯唑西林的耐药率 (对非耐药菌株的耐药菌株数量) 差异显著 (χ2检验,P 值为 0.0005、0.0435 和 0.0015;图 1E);β-内酰胺酶活性和对克林霉素、多西环素、TMP-SMX、头孢西丁和苯唑西林的耐药率在簇之间显著差异(图 S1E)。为了可视化每个分离株的 AMR 模式,我们在核心基因系统发育树 (图 1F;表 S2)。总之,我们全球多样化的人类致病性腐生链球菌收藏包括两个主要谱系,它们表现出不同的 AMR 负担。基因型和表型数据都表明我们队列中 MDR 的发展。
在与 AMR 相关的腐生链球菌中鉴定出不可分型的 SCC 元件
葡萄球菌盒染色体 mec (SCCmec) 是一种遗传移动元件,它传达由 mecA 基因编码的广谱 β-内酰胺耐药的核心决定因素 (42)。此外,SCCmec 元件通常携带位点特异性重组酶,这些重组酶被指定为盒式染色体重组酶 (ccr) (43, 44)。迄今为止,已经根据金黄色葡萄球菌的 mec 和 ccr 基因组成对 11 种 SCCmec 类型进行了表征 (45),并且还观察到了不携带 mec 基因的 SCC 元件 (46)。鉴于 SCC 元件在传递甲氧西林耐药性中的作用,我们评估了 SCC 元件在 S. saprophyticus 中的分布。在我们的分离株中,29.8% (82/275) 携带 ccr 基因,mecA 基因的发生率为 7.6% (21/275;图 2A).谱系之间的 SCCmec 患病率没有差异(图 S2A)。在 SCC 分离株中,rlmH 位于 SCC 元件的 5' 端(图 S2B 和 C),表明 SCC 元件的插入位点与 rlmH 重叠,与其他葡萄球菌属的组织相似 (47)。编码 rRNA 大亚基甲基转移酶 H 的基因 rlmH 在所有 (275/275) S. 腐生菌中通过相互 BLAST 检测到,表明它们作为 SCC 转移的受体发挥作用。此外,腐生链球菌中的 SCC 元件包含其他特征基因,这些基因已在耐甲氧西林金黄色葡萄球菌 (MRSA) 中报道 (48),例如胶囊基因簇 (cap)、铜抗性 (cop)、镉抗性 (cad) 或砷抗性操纵子 (ars;无花果。S2B 和 C)。+
我们试图根据 ccr 和 mec 复合物的基因结构 (图 2B 和 C).我们发现,在我们的队列中发现的 mec 复合物中有 3/21 属于 B 类 [由 mecA 组成,mecA 上游插入序列 IS1272 产生的截短 mecR1,以及 mecA 下游的 IS431 (45)] 和 16/21 属于 A 类 [包含 mecA,mecA 上游的完整 mecR1 和 mecI 调节基因,以及 mecA 下游的 IS431 (45 )].在大多数 SCCmec 阳性分离株中检测到 IS431 序列,但在具有 mec 或 ccr 基因的分离重叠群上检测到。除此之外,两种分离株 (1809848 和 CHLA-009) 的基因组中显示 mec 基因和 IS256 共存 (图 S2B)。另一方面,与 MRSA 相比,S. saprophyticus 的 ccr 复合物的组成更加多样化和新颖。具体来说,8/21 mecA SCCmec 和 2/61+麦卡−SCC 元件被鉴定为携带两个 ccr 基因复合物。最常见的 ccr 组合是 ccrA1/ccrB3 (n = 48),其他包括 ccrA1/ccrB1 (n = 4)、ccrA1/ccrB2 (n = 1)、ccrA1/ccrB4 (n = 1)、ccrA2/ccrB3 (n = 1)、ccrA3/ccrB3 (n = 3) 和 ccrA4/ccrB4 (n = 4)。其中,MRSA 中仅报道了 ccrA1/ccrB1、ccrA3/ccrB3 和 ccrA4/ccrB4。此外,在 28 株分离株中检测到基因 ccrC1。两个分离株 (1612274 和 sap-wu-046) 携带 ccrA1 和 ccrB 的一个新等位基因联合 ccrB1 (1-875nt) 和 ccrB3 (940-1,626 nt;无花果。S2B 和 C)。总之,根据金黄色葡萄球菌的当前 SCCmec 分类,腐生链球菌中的新型 SCC 元件,例如携带新 ccr 成分的元件,是不可分型的 (45)。这突出了葡萄球菌种类之间的关键差异,并激发了对 SCC 元件传播的进一步研究。
In terms of AMR, we found that the mecA gene was able to serve as a marker to infer β-lactam resistance of S. saprophyticus, especially for cefoxitin and oxacillin (Fig. 2D). The resistance rates of mecA
S. saprophyticus were 66.7% (14/21), 100.0% (21/21), and 95.2% (20/21) against penicillin, cefoxitin, and oxacillin, respectively (+Fig. 2B), vs 2.8% (7/254), 6.7% (17/254), and 73.2% (186/254) in mecA− isolates (Fig. 2C). Of note, the presence of the mecA gene was also correlated with higher ARG numbers and higher phenotypic resistance against both β-lactam and non-β-lactam antibiotics which we detected in this work (Fig. S2D and E).
在 AMR 方面,我们发现 mecA 基因能够作为推断腐生链球菌对 β-内酰胺耐药性的标志物,尤其是对头孢西丁和苯唑西林的耐药性(图 2D)。mecA S. saprophyticus 对青霉素、头孢西丁和苯唑西林的耐药率分别为 66.7% (14/21)、100.0% (21/21) 和 95.2% (20/21) (+图 2B),而 mecA− 分离株为 2.8% (7/254)、6.7% (17/254) 和 73.2% (186/254) (图 2C)。值得注意的是,mecA 基因的存在也与我们在这项工作中检测到的较高的 ARG 值和对 β-内酰胺和非 β-内酰胺类抗生素的较高表型耐药性相关(图 S2D 和 E)。
S. saprophyticus mutants demonstrate variable resistance patterns to β-lactam antibiotics
Bacteria have developed various mechanisms to combat β-lactam antibiotics. One of the major resistance mechanisms relies on the production of β-lactamase enzymes which hydrolyze the β-lactam ring, thereby inactivating the drug (49). All our S. saprophyticus isolates carried a bla gene (class A β-lactamase, 873 bp), and six of them also encoded blaZ (penicillin-hydrolyzing class A β-lactamase, 846 bp; Fig. 3A). Interestingly, 83.3% (5/6) of blaZ isolates were from lineage S. Furthermore, penicillin resistance was higher for the isolates carrying mecA or blaZ genes (χ+2 test, both P-values are 0.0005; Fig. 3A and B). Inference of penicillin resistance using mecA or blaZ as the markers showed good performances with an accuracy (true susceptible and true resistant) of 96.73% (266/275; Fig. S3A). The prediction errors (major error and very major error are 7/265 and 2/265, respectively) were from the isolates only carrying mecA but not blaZ, although no relationship was found between mecA variants and penicillin phenotypes (Fig. S3B).
Next, we generated a hierarchical tree based on the amino acid sequence identities of bla (Fig. 3A) and tested the associations of different mutations with β-lactamase activities and β-lactam resistance. We defined different bla alleles if they carried a single amino acid substitution; alleles were numbered and analyzed if they were present in at least 10 isolates, otherwise were labeled as “Others”; T183, T109, and T54 represented truncated bla genes based on their putative peptide length. The distribution of bla alleles was highly correlated with S. saprophyticus lineages and clusters (χ2 test, both P-values are 0.0005; Fig. 3A). Isolates with allele 6 and the truncated bla (T54) did not show β-lactamase activity. The alignment of bla alleles highlights the unique mutations in allele 6, A120T and T215D, located in the catalytic domain (47–263) referring to the features of β-lactamase (UniProt-Q49V79_STAS1) protein of S. saprophyticus type strain ATCC 15305 (E-value is 8.01e-186). The Delta Delta G (DDG) values of these amino acid substitutions were −0.48 to −0.61 and −1.30 to −1.03 at normal urine pH [5.5–7.54 (50)], predicted by I-Mutant (51), suggesting their potential of decreasing protein stability and influencing β-lactamase productions (Fig. 3D; Fig. S3C). We also observed that 10 out of 21 mecA genes in our cohort were identified in isolates carrying T54 bla (n = 38), which were clustered together in the core gene phylogeny (Fig. S3D), indicating a different evolutionary history of these isolates. Furthermore, given the function of mecA against cefoxitin and oxacillin (Fig. 2D), we compared the resistances of bla alleles in mecA− isolates. The resistance rate against cefoxitin or oxacillin varied by the presence of different bla alleles (Fig. 3E and F), suggesting their roles in resisting these two β-lactam agents in S. saprophyticus. However, we could not only rely on bla alleles or β-lactamase production to predict susceptibility phenotypes of cefoxitin or oxacillin (Fig. S3E), and more genomic determinants needed to be discovered to explain and predict the AMR.
Putative antibiotic resistance determinants were detected in S. saprophyticus against β-lactams
To further address the knowledge gap in genomic determinants of cefoxitin and oxacillin resistance, particularly among mecA− S. saprophyticus (n = 254), we utilized computational models to identify genomic correlates of susceptibility phenotypes. All accessory genes present in at least 10 isolates, and 131 unique amino acid substitutions (referred to as “gene alleles” later) across 36 core genes served as candidates for the correlation analysis (Table S3). Analogs of these 36 genes are reported to be essential for β-lactam resistance in other staphylococci, especially S. aureus (52, 53), and are involved in encoding PBPs, cell envelope synthesis, stress responses, nucleotide metabolism, or metal homeostasis (Table S4). 105 genes and 15 gene alleles were significantly correlated with cefoxitin resistance (Table S5) detected by MaAsLin2 (54) with a q-value threshold for significance as 0.25. The top features anticorrelated or correlated with cefoxitin, respectively, were gene group_1086 (6/198 isolates with this gene resistant to cefoxitin vs 11/56 isolates absent of this gene resistant to cefoxitin, the coefficient is −0.079), mgrA_2 (6/198 vs 11/56, −0.079), pdhD_2 (6/198 vs 11/56, −0.079), and gene alleles of prkC (8/32 vs 9/222, 0.064), gdpP (8/37 vs 9/217, 0.061), and murF (8/38 vs 9/216, 0.060). On the other hand, 528 genes and 94 gene alleles exhibited significant correlations with oxacillin resistance (Table S6), and 70.0% of these genes encoded hypothetical proteins. The top correlated features with oxacillin resistance were gene alleles of pbpH (41/87 vs 145/167, −0.140), pyrB (61/111 vs 125/143, −0.125), ahpF (50/94 vs 136/160, −0.116), glmU (58/105 vs 128/149, −0.117), glmS (31/64 vs 155/190, −0.098), and mrcA (59/105 vs 127/149, −0.110), and two genes with unknown functions (group_1734: 0.147 and group_2559: −0.132). These novel candidate gene associations with resistance phenotypes necessitate future functional validation studies in S. saprophyticus.
Subsequently, we evaluated the ability to infer susceptibility phenotypes from genotype using a Random Forest Classifier (RFC), trained on the presence or absence of all accessory genes (Fig. 4A), gene alleles (Fig. 4B), or the genes with significant correlations tested above (Fig. 4C). The model with associated genes and gene alleles showed the best prediction performance for both cefoxitin and oxacillin (Fig. 4D): for cefoxitin resistance prediction, its area under the receiver operating characteristic (ROC) curve (AUC; 0.7657 ± 0.1151) was significantly higher than AUCs from the models with all accessory genes (0.6886 ± 0.1245) and gene alleles (0.6049 ± 0.1188) by Wilcoxon rank-sum test (P-value is 1.80e-05 and 2.22e-16, respectively); for oxacillin resistance prediction, its AUC (0.7846 ± 0.0511) was significantly higher than the AUC from the model with all accessory genes (0.7291 ± 0.0458) by Wilcoxon rank-sum test (P-value is 1.70e-11) but similar with the AUC from the model using gene alleles (0.7729 ± 0.0461, P-value is 0.3600).
Non-β-lactams ARGs previously described in other staphylococci explain most non-β-lactam resistance phenotypes in S. saprophyticus
S. saprophyticus isolates in our cohort showed non-susceptibility against four non-β-lactam antibiotics, doxycycline (22/275), TMP-SMX (14/275), erythromycin (129/275), and clindamycin (31/275). The 104 isolates that were erythromycin-resistant and clindamycin-susceptible were tested for inducible clindamycin resistance (ICR) via the disk-diffusion induction test (D-test) (13), and 20/104 isolates showed ICR. We observed significant differences in doxycycline susceptibility phenotypes between isolates with (n = 25) and without the tet(K) gene (Fig. 5A). Among the tet(K) isolates, three showed susceptible phenotypes against doxycycline. S. saprophyticus isolates with dfrG (n = 14), dfrC (n = 12), or folA_2 (n = 6) genes showed 35.7% (5/14), 41.7% (5/12), or 50.0% (3/6) non-susceptibility to TMP-SMX (+Fig. 5B). For erythromycin AST, isolates with erm (n = 18), erm (44)v (n = 14), erm(C) (n = 6), msr(A) (n = 20), or msr(A)/mph(C) (n = 72) genes had 94.4% (17/18), 100.0% (14/14), 100.0% (6/6), 100.0% (20/20), or 98.6% (71/72) non-susceptibility rates (Fig. 5C). Interestingly, mph(C) always coexisted with msr(A) in a S. saprophyticus isolate, a situation that has been described in other CoNS and S. aureus (55, 56), and gene sequences of msr(A) were different when mph(C) was present or absent (Fig. S4A). Lastly, isolates with abc-f (n = 10), erm (44)v (n = 14), and erm(C) (n = 6) genes showed 100.0% (10/10), 92.9% (13/14), or 50.0% (3/6) constitutive resistance rates against clindamycin, which were significantly different from isolates lacking these genes (Fig. 5D). For the four isolates carried erm (44)v or erm(C), they were not resistant to clindamycin with routine AST but showed ICR by D-test (Fig. 5D red dots). In addition, 17 S. saprophyticus isolates with erm gene did not have constitutive resistance to clindamycin were ICR (57), suggesting the need to detect such resistance by a simple D-test on a routine basis. Accordingly, we observe that typical non-β-lactam ARGs previously reported in staphylococci (Table 1) generally correlated with non-β-lactam phenotypic resistance in S. saprophyticus.
TABLE 1
ARG | Annotation | Isolate numbera | Non-susceptible rateb | Associated AMR | Reference of the ARG reported in other CoNS |
---|---|---|---|---|---|
tet(K) | Tetracycline efflux MFSd transporter | 25 | 88.0% | Doxycycline | Reference (58) |
dfrG | Trimethoprim-resistant dihydrofolate reductase | 14 | 35.7% | TMP-SMX | Reference (59) |
dfrC | Trimethoprim-resistant dihydrofolate reductase | 12 | 41.7% | TMP-SMX | References (60, 61) |
folA_2 | Dihydrofolate reductase | 6 | 50.0% | TMP-SMX | References (62, 63) |
msr(A) | ABC-F type ribosomal protection protein | 20 | 100.0% | Erythromycin | References (64, 65) |
mph(C)c | Mph(C) family macrolide 2'-phosphotransferase | 72 | 98.6% | Erythromycin | References (55, 56) |
erm | 23S ribosomal RNA methyltransferase | 18 | 94.4% | Erythromycin | Reference (66) |
erm (44)v | 23S rRNA (adenine(2058)-N (6))-methyltransferase | 14 | 100.0%, 92.9%, and 100.0% | Erythromycin, Clindamycin, ICR | Reference (67) |
erm(C) | 23S rRNA (adenine(2058)-N (6))-methyltransferase | 6 | 100.0%, 50.0%, and 100.0% | Erythromycin, Clindamycin, ICR | References (56, 68) |
abc-f | ABC-F type ribosomal protection protein | 10 | 100.0% | Clindamycin | Reference (69) |
Erythromycin/clindamycin ARGs are possibly transferred by phage elements
Bacteriophages can act as ARG carriers in various environments (35, 36, 70,–72). Phage-encoded ARGs are considered a substantial dissemination threat due to their prolonged persistence, fast replication rate, and potential board host range. Therefore, we assessed the prevalence of phage signatures in our cohort and analyzed their association with phenotypic non-susceptibility among S. saprophyticus. By analyzing WGS data, we identified 520 prophage sequences in 91.3% (251/275) isolates (Fig. 5E; Table S7). The average phage number, as well as the proportion of phage-containing isolates, was similar for S. saprophyticus from different lineages or clusters (Fig. S5B). After assigning phage taxonomy, we found that 210 of the phages belonged to the Siphoviridae family, and 17 were from the Myoviridae. The distributions of these two types of phages were different among lineages: 84.8% (178/210) of siphoviruses and 94.1% (16/17) myoviruses were detected in lineage G, whereas 15.2% (32/210) of siphoviruses and 5.9% (1/17) myoviruses were from lineage S (Fig. 5E). We grouped phage sequences with a 95% ANI cutoff and defined each such group as a phage “population.” Isolates from different lineages contained distinctive phage populations (Fig. 5F). This implies either a strain-level specificity of phage infections in S. saprophyticus or differential phage environments for G and S isolates.
Next, we detected ARGs in all phage sequences using AMRFinder (73). It found that 20.6% (107/520) of S. saprophyticus phages from our cohort carried ARGs, ranging from 1 to 3 (Table S8). Some phage populations contained more ARGs than others (Fig. 5F). These ARGs included fosD (n = 92), fosB (n = 7), erm (44)v (n = 10), erm (n = 11), and dfrG (n = 1; Fig. 5E). Since we had performed AST for erythromycin and clindamycin, we then characterized the resistance associations of correlated ARGs within phage elements [i.e., erm and erm (44)v], which were not found in other parts of the genome. We found erm gene was more abundant within S isolates (Fig. 5G; Fig. S5D) and showed high phenotypic non-susceptibility against clindamycin (11/11 ICR) and erythromycin (11/11). Gene erm (44)v showed 100.0% constitutive resistance to clindamycin (10/10) and non-susceptibility to erythromycin (10/10; Fig. 5G). In sum, our data suggest an important role for phage elements in encoding and spreading ARGs against erythromycin/clindamycin antibiotics across S. saprophyticus.
DISCUSSION
Here, we present a genomic comparison of 275 human pathogenic S. saprophyticus isolates, collected from multicenter healthcare networks. Building from our global phylogenomic characterization of S. saprophyticus lineages, we focused on profiling the S. saprophyticus antibiotic resistome, including analysis and prediction of genotype-phenotype associations. To test the cefoxitin and oxacillin resistance of our S. saprophyticus, we used the disk diffusion method following the procedural guidelines of Staphylococcus epidermidis outlined by the CLSI (M100 31st, 2021) (13), given that CLSI is still in the process of defining an optimal surrogate method for S. saprophyticus. When comparing susceptibility phenotypes between lineages, G isolates displayed lower phenotypic resistance against oxacillin (146/209 compared to 60/66 among S isolates, P-value is 0.0015 by χ2 test) but higher resistance against cefoxitin (34/209 compared to 4/66 among S isolates, P-value is 0.0435 by χ2 test; Fig. 1E). In some Staphylococcus species, β-lactam agents, such as oxacillin and cefoxitin, are used as surrogate markers to predict mecA-mediated methicillin resistance (74,–76). Accordingly, the discrepant burdens of oxacillin and cefoxitin resistance in our cohort prompted us to analyze the occurrence of mecA and SCCmec elements that transfer mecA. Approximately 7.6% (21/275) of our isolates encode mecA, which is similar to a previous report from Japan (77). However, the incidence of mecA gene in S. saprophyticus can be varied across cohorts differing in clinical significance, size, and area (15, 78, 79). We observed that the presence of mecA is predictive of oxacillin and cefoxitin resistance for S. saprophyticus, but its absence is not predictive of susceptibility. Indeed, 90.3% (186/206) and 44.7% (17/38) of S. saprophyticus isolates resistant to oxacillin and cefoxitin, respectively, did not carry mec genes. Interestingly, whereas most clinical S. epidermidis strains carry mecA (80,–82), the prevalence of mecA in S. saprophyticus is much lower. This may indicate that the SCCmec mobilization rate or the types of SCCmec elements may be different among different staphylococci. A previous study used PCR to characterize the SCCmec composition of eight mecA S. saprophyticus isolates (83) and found they were all non-typeable according to the current SCCmec schemes of S. aureus (45) due to the absence of amplification products for hitherto known ccr genes. Indeed, we also observed several mecA+− SCC elements and novel configurations of the ccr gene complex in our cohort. One of these non-typable ccr complexes, ccrA1/ccrB3, was previously reported in S. saprophyticus (83) but not other configurations such as ccrA1/ccrB2, ccrA1/ccrB4, and ccrA2/ccrB3 (Fig. 2B and C). In contrast, most mec complexes found in S. saprophyticus were conserved with those in MRSA, implying a similar evolutionary origin. Additionally, two S. saprophyticus isolates contained mec genes adjacent to the mobilization element IS256. This composition has been described in other CoNS, such as S. epidermidis, Staphylococcus haemolyticus, Staphylococcus hominis, Staphylococcus sciuri, and Staphylococcus cohnii (84, 85) but not in MRSA. IS256 is widespread in the genomes of multiresistant enterococci and staphylococci (86,–88). In S. epidermidis, IS256 has been recognized as a marker of hospital-acquired MDR and biofilm-forming strains causing opportunistic infections in immunocompromised patients (89,–92). In these two S. saprophyticus strains, we detected the biofilm operon (ica) next to the IS265 SCCmec that was inverted in the genome compared to other SCCmec (Fig. S2B), indicating a unique evolutionary history. Altogether, our findings highlight the diversity of SCC elements among S. saprophyticus and motivate further studies on their classification, transmission, and clinical relevance.
Since the mecA gene cannot explain all β-lactam resistances in S. saprophyticus, we used computational models to identify other potential genomic correlates of the resistance phenotypes, which would be beneficial for future resistance prediction. The results show that cefoxitin and oxacillin-correlated gene features are different, suggesting potentially distinctive resistance mechanisms against these two drugs in S. saprophyticus. Among mecA− isolates, gene group_1086, mgrA, and pdhD had positive correlations, and certain gene alleles of prkC, gdpP, and murF had negative correlations with cefoxitin resistance (Table S5). For oxacillin, gene alleles of pbpH, pyrB, ahpF, glmSU, and mrcA (Table S6) exhibited negative correlations to the resistance phenotype. We observed that several of these genes were involved in cell envelope synthesis and stress response and had been reported to correlate with AMR in other bacterial species. Gene group_1086 is annotated as an alcohol dehydrogenase (ADH) that is widely present among bacteria (93) and mitigates alcohol toxicity (94). A recent study noted that E. coli adhE was able to bind with ampicillin and exhibited higher expression levels under ampicillin stress and low intracellular alcohol conditions (95). Next, mgrA, a global regulator, has been found to affect several efflux pumps in S. aureus, such as norA and norB for MDR, and tet38 for tetracycline resistance (96, 97). Gene pdhD encodes a membrane protein, dihydrolipoyl dehydrogenase (DLD), which belongs to the oxidoreductase family and is essential for energy metabolism (98). In Vibrio parahaemolyticus, DLD levels were upregulated in antimicrobial peptide-resistant clones at both translational and transcriptional levels (99). GdpP is a phosphodiesterase that catalyzes the hydrolysis of intracellular secondary messenger c-di-AMP. S. aureus gdpP deletion mutants have been shown to elevate resistance to β-lactams and other cell wall-targeting antimicrobials (100, 101). It also has been shown that mutations of pyrB, encoding aspartate carbamoyltransferase, can alter the susceptibility of P. aeruginosa to β-lactams (102). Moreover, five types of PBP were detected in our cohort, including PBP2a (mecA), PBP 1A (mrcA), PBP 2B (pbpB), PBP H (pbpH), and PBP 1A/1B (ponA). Two alleles of mrcA and three alleles of phpH have been shown to influence oxacillin susceptibilities in S. saprophyticus (Table S6). Lastly, in general, single-pass transmembrane proteins with extracellular PBP and serine/threonine kinase-associated (PASTA) domains are important to the cell wall stress response (103, 104). For example, the PASTA kinases of S. aureus are essential for β-lactam resistance. However, resistance patterns vary amongst strains, and the mechanism is still understudied (105,–107). Genes in the mur operon are probably the substrates of S. aureus PASTA kinases (108), and murF is essential for the optimal expression of methicillin resistance (109). In S. saprophyticus, the serine/threonine-protein kinase PrkC is a potential element of the PASTA system. One prkC variant and one murF variant were correlated with cefoxitin resistance (Tabls S5). In addition, ahpF and glmU are also identified as potential substrates of PASTA kinases (110, 111). Alkyl hydroperoxide reductase AhpF, protecting cells against reactive oxygen species (112), is important for the tolerance of E. coli cells against antibiotics causing DNA damage (113). The GlmSMU pathway responds to produce the Uridine diphosphate-N-acetylglucosamine, an essential peptidoglycan and cell wall teichoic acid precursor (110). We identified alleles of ahpF and glmSU of S. saprophyticus that were anti-correlated with oxacillin resistance (Table S6). In summary, the diverse functions of the above genes indicate the complexity of putative mechanisms of resistance against cefoxitin and oxacillin, which might involve numerous catalytic and metabolic pathways in S. saprophyticus. One hypothesis is that genes affecting cell wall construction and stress response can increase bacterial tolerance to survive antibiotic assault. Better resistance prediction performance may be achieved by refining clinical AST breakpoints for S. saprophyticus and by including gene expression data from the different observed mutants.
Finally, we characterized phage signatures in S. saprophyticus, motivated by the understanding that phages could promote host genetic diversity and niche adaptation by horizontal gene transfer (114,–119). In our cohort, the majority of S. saprophyticus isolates (91.2%) contained at least one phage element. This high phage genomic prevalence may indicate that they contribute to S. saprophyticus fitness in certain environments (37). Earlier work has suggested that phage-carrying ARGs are enriched in the genomes of antibiotic-treated communities (37, 120, 121). Accordingly, we annotated ARGs within phage backbones and evaluated their correlation with phenotypic non-susceptibility since prior reports have suggested they may be non-functional (122). We found that 20.6% (107/520) of S. saprophyticus phage sequences contained ARGs, belonging to fosfomycin (fosB and fosD), macrolide/lincosamide [erm and erm (44)v], or trimethoprim (dfrG) resistance classes. The gene erm (44), having around 84% ANI with erm in S. saprophyticus, was previously reported in a prophage of Staphylococcus xylosus and exhibited resistance to erythromycin, together with inducible resistance to clindamycin (66). In contrast, gene erm (44)v was originally described in an S. saprophyticus isolate and conferred resistance to macrolides and lincosamides (67). We observed that phage-encoding erm and erm (44)v in our S. saprophyticus showed high non-susceptibility against erythromycin/clindamycin. Although neither lincosamides nor macrolides antibiotics were used clinically in the treatment of UTI due to the limited excretion in the urine, we tried to detect their clinical relevance for the S. saprophyticus isolates outside the urinary tract. While S. saprophyticus is primarily an uropathogen, our cohort included 24.5% (67/275) of non-urinary S. saprophyticus isolates; specifically, they were recovered from blood culture (n = 46), wounds/tissues/bone (n = 15), the respiratory system (n = 3), and sterile body fluid (n = 3). Surprisingly, the genomic dissimilarity between strains did not correlate with the body site of isolation (Fig. S1F). This may imply that S. saprophyticus can transmit and survive in diverse physiological conditions. We then compared the distribution of macrolide/lincosamide phenotypic non-susceptibility and ARGs between isolates from blood, urine, and wound/tissue/bone, and we observed similar distribution in these body sites, besides msr(A)/mph(C) gene associated with erythromycin resistance (Fig. S4B and C). Due to the unbalanced sampling, we assumed that we did not have enough statistical power to detect differences across the distribution based on body sites. A focused study of S. saprophyticus transmission will require a more balanced sample set in terms of body site of isolation. Further comparative genomics studies should also consider fecal samples from UTI patients, given prior reports of correlations between uropathogen bladder colonization and gastrointestinal colonization (123, 124).
Our study has limitations. Our AST method, disk diffusion, does not generate an minimal inhibitory concentration (MIC) value. However, disk diffusion is a reproducible and standardized CLSI standard method that is widely used in clinical testing and has been well investigated in the setting of Staphylococcus spp. relevant to the genotype-phenotype correlation. However, this method relied on manual measurement of the zone of clearance. In addition, only one brand of disk for each of the antibiotics and one brand of Mueller-Hinton agar were evaluated in our study. It is possible that variations of the cation concentration between manufacturers may influence AST results (125).
In summary, we performed a comparative phylogenomic and resistome analysis of a globally diverse collection of 275 human pathogenic S. saprophyticus isolates. We compared phenotypic antibiotic susceptibility with potential resistance determinants inferred from current ARG databases and staphylococcal literature. We found that a few documented ARGs [e.g., tet(K), dfrCG, erm, erm (44)v, erm(C), abc-f, msr(A), and msr(A)/mph(C)] from other staphylococci are associated with phenotypic resistance to doxycycline, TMP-SMX, erythromycin, or clindamycin in S. saprophyticus detected in our cohort. In contrast, the genetic antecedents of β-lactam resistance in S. saprophyticus are more complicated. Penicillin susceptibility is correlated with mecA or blaZ. For oxacillin and cefoxitin, the presence of mecA is indicative of a resistance phenotype, but the absence of this gene is not predictive of susceptibility to β-lactam antibiotics using current CLSI interpretive criteria. We also identified several genes involved in stress response and cell wall synthesis to be correlated with resistance to these two drugs. Finally, we describe ARG associations with known and novel SCCmec configurations as well as phage elements in S. saprophyticus, which may serve as intervention or diagnostic targets to limit resistance transmission.
MATERIALS AND METHODS
Study cohort
A total of 275 S. saprophyticus isolates were collected from five medical centers including Washington University School of Medicine in St. Louis (WUSM, n = 101), Children’s Hospital Los Angeles (CHLA, n = 12), Weill Cornell Medical College (WCMC, n = 10), Vanderbilt University Medical Center (VUMC, n = 21), University of Alberta Hospital (UA, n = 14), and the International Health Management Associates (IHMA, n = 117), spanning five continents (South America, n = 34; North America, n = 172; Europe, n = 46; Asia, n = 18; Africa, n = 5) during 2012–2021 (Table S1). Isolates were recovered from human urine specimens (n = 208), blood cultures (n = 46), wounds/tissues/bone (n = 15), the respiratory system (n = 3), and sterile body fluid (n = 3). Their purity was evaluated by streaking on blood agar plates (BAPs, Hardy Diagnostics). Microbial identification was confirmed as S. saprophyticus using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry with the VITEK MS system (bioMérieux).
Resistance characterization and Cefinase assay
Susceptibility testing was performed for TMP-SMX, doxycycline, erythromycin, clindamycin, penicillin, and cefoxitin using Hardy Kirby-Bauer Disks (Hardy Diagnostics) and oxacillin using BD BBL disks (Becton, Dickinson and Company). Methods followed the procedural guidelines outlined by the CLSI (documents M02 and M100) (13, 126, 127). Isolates were grown from the frozen stock onto BAPs and subcultured on BAPs and then 3–5 colonies of pure growth were suspended in 0.85% sterile saline at 0.5 McFarland standard. The suspension was used to inoculate a lawn on Mueller-Hinton Agar (MHA, Hardy Diagnostics). After 16–18 hours (24 hours for cefoxitin) incubation, the zone of clearance around the disks was manually measured with a metric ruler. S. aureus ATCC 25923 was used as a quality control strain. Detection of β-lactamase production was assessed by nitrocefin-based Cefinase disk test without induced (Hardy Diagnostics).
Illumina WGS and de novo genome assembly
Isolate DNA was extracted manually using the Bacteremia kit (Qiagen) as described previously (128) and was quantified with the Quant-iT PicoGreen double-stranded DNA assay (Thermo Fisher Scientific). 0.5 ng of purified isolate DNA was used as the input to prepare Illumina sequencing libraries with the Nextera kit (Illumina) (129). Libraries were pooled at equal concentrations and sequenced on the NovaSeq 6,000 platform (Illumina) to a minimum depth of 2 million reads per sample (2 × 150 bp). Illumina adapters were removed from demultiplexed reads using Trimmomatic (v0.38) with the following parameters: leading, 10; trailing, 10; sliding window, 4:20; and minimum length, 60 (130). Potential human read contamination was removed using DeconSeq (v0.4.3) (131), and the reads were repaired by BBtools (https://sourceforge.net/projects/bbmap/) with default parameters. Processed reads were de novo assembled using Unicycler (v0.4.7) with default settings. Assembly quality was evaluated using BBMap (https://sourceforge.net/projects/bbmap/), QUAST (v4.5) (132), and CheckM (v1.0.13) (133). Assemblies were included for further analysis if (i) they had an average coverage (read depth) ≥40×, (ii) they had a total length within 20.0% range of the reference S. saprophyticus strain ASM781411v1 genome size (2.35–3.13 Mbp), (iii) the total number of contigs ≤ 100 and N50 ≥ 10,000, and (iv) with completeness >95.0% and contamination <5.0%.
Comparative analysis
High-quality assemblies were annotated using Prokka (v1.14.5) with a minimum contig length of 200 bp to identify open reading frames (134). The general feature format (.gff) files outputted by Prokka were used for core gene alignment through Roary (v3.12.0), with default parameters (135). The alignment, composed of 1,646 genes shared by all isolates at a minimum 95.0% identity, was used to generate the maximum likelihood tree with RAxML (v8.2.11) (136). The resulting newick file was visualized in interactive Tree Of Life (iTOL) (137). RhierBAPS (39, 40) was used to identify core gene hierarchical clusters. SNP-sites (v2.4.0) were used to call isolate-specific SNPs against the core gene alignment file created by Roary (138). Whole genome ANI values were determined by FastANI (v1.32) (139) with assembly.fasta files from Unicycler. Next, accessory (non-core) genes identified from Roary are used to calculate the Jaccard distance between isolates through the vegdist function (R vegan package) (140). PCoA was performed on the Jaccard distances using the pcoa function (R ape package) (141).
ARG and SCC element identification
ARGs were identified by AMRFinder (v3.8.4) (73) using results from Prokka as inputs, including assembled genomes (.fna), predicted genes (.faa), and master annotations (.gff). A presence-absence matrix of all ARGs was generated using MATLAB, with associated metadata displayed as color strips to represent isolate lineage, cluster, and corresponding resistance phenotypes in Fig. 1F. Phage-carrying ARGs were also tested by AMRFinder with 75.0% identity and 50.0% coverage as the threshold for the purpose of identifying the functional ARGs. Gene alignment was performed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) on extracted sequences at nucleotide or amino acid level. The hierarchical tree based on gene sequence alignment was generated using Jalview (https://www.jalview.org) and visualized with iTOL. All genes of SCC elements were identified and annotated with the online tool SCCmecFinder (45) using Roary pan-genome sequence as the input. Hierarchical clustering of mec and ccr gene contents was performed by the pheatmap function (R pheatmap packages) (142) and labeled with isolate lineage, cluster, and resistance phenotypes by color strips. The alignment of SCC elements was presented by Easyfig (143).
Prediction accuracy
To evaluate the prediction accuracy links between genotype and phenotype, we applied specific rules related to the presence of ARGs and antibiotic resistance. We assumed that all the ARGs found in the strains were expressed. All S. saprophyticus isolates carrying mecA were predicted to be resistant to β-lactams, including cefoxitin, oxacillin, and penicillin (Fig. 2D). Isolates carrying mecA or blaZ were also used to predict penicillin resistance (Fig. S3A). A very major error was defined as inferring susceptibility from genomic data, while the strain was resistant to AST. A major error was defined as inferring resistance from genomic data, while the strain was susceptible by AST. True resistant and true susceptible indicated that the prediction was identical to the AST result.
Determining phenotype-associated genes and RFC modeling
A presence-absence matrix was built for all accessory genes and 131 unique amino acid substitutions across 36 core genes related to β-lactams reported in other staphylococci (Table S3). Then, this matrix was analyzed by MaAsLin2 (54) to determine the features that were correlated with cefoxitin and oxacillin resistance using the following options: min_prevalence, 0.039 (i.e., present in ≥10 isolates); analysis_method, “LM”; normalization, “CLR”; transform, “None"; others were using the default. RefSeq and UniProtKB accession numbers of the top 8 correlates and the 36 core genes were detected by UniProtBLAST (https://www.uniprot.org/blast) using sequences in pan_genome_reference.fa from Roary (Tables S4 to 6) to check for their annotations and functions. To evaluate the prediction performance from genomic data to resistance phenotype, we conducted a custom machine-learning process employing random forest analysis using the randomForest function (R randomForest package) (144) with default parameters and the following adjustments: ntree = 5,000, proximity = FALSE, importance = TRUE, and mtry = 3. Genes or gene alleles with a prevalence >3.9% (i.e., ≥10 isolates) were included in the analysis. The model was run over 100 iterations of the 75/25 training/testing data set splits. The model performance was measured through the AUC estimator with the prediction and performance functions (R ROCR package) (145). The mean AUC value was reported with 95% confidence intervals. The ROC plot was generated using the predict and roc functions (R pROC package) (146).
Phage sequence identification and validation
Assemblies from Unicycler were piped through Cenote-Taker 2 to identify putative phage contigs (147) with end features as direct terminal repeats indicating circularity and inverted linear repeats (ITRs) or no features for linear sequences. The linear viral contigs were then binned by VAMB (148) due to the highly fragmented assemblies from short reads, resulting in 1,200 clusters. Contigs in each cluster were concatenated and filtered by length and completeness to remove false positives. Specifically, the length limits were 1,000 nt for the detection of circularity, 4,000 nt for ITRs, and 5,000 nt for other linear sequences. The completeness was computed as a ratio between the length of our phage sequence and the length of matched reference genomes by CheckV (149), and the threshold was set to 10.0%. Phage contigs passed these two filters were then run through VIBRANT with “virome” flag to further remove obvious non-viral sequences (150). As a result, 520 putative viral sequences were identified (Table S7).
Phage taxonomy and population
Protein sequences created by CheckV were used as input for vConTACT2 with “DIAMOND” and database “ProkaryoticViralRefSeq207-Merged” to assign taxonomy (151). For the “unsigned” ones from vConTACT2, we used the tentative taxonomy from Cenote-Taker 2 inferred using BLASTP against a custom database containing Refseq virus and plasmid sequences from GenBank (147). The final viral taxonomy was determined at the family level and used for further analysis (Table S7). Based on MIUViG recommended parameters (152), phages were grouped into populations if they shared ≥95% nucleotide identity across ≥85% of the genome using BLASTN and a CheckV supporting code, anicalc.py (https://bitbucket.org/berkeleylab/checkv/src/master/). The result was visualized using Cytoscape (https://cytoscape.org).
ACKNOWLEDGMENTS
We thank The Edison Family Center for Genome Sciences and Systems Biology staff, Eric Martin, Brian Koebbe, MariaLynn Crosby, and Jessica Hoisington-López for their expertise and technical support in high-throughput computing and sequencing. We thank the Barnes-Jewish Hospital Clinical Microbiology laboratory technicians for providing clinical isolates and assistance with collection and culturing. We also thank members of the Dantas lab for their helpful comments and discussion of the manuscript.
This work was supported in part by awards to G.D. through the National Institute of Allergy and Infectious Diseases of the NIH (grant numbers U01AI123394 and R01AI155893) and the Agency for Healthcare Research and Quality (grant number R01HS027621). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
K.Z., R.F.P., C.A.B., and G.D. conceived the idea, designed the study, and analyzed and interpreted the data. R.F.P., J.M., C.E.M., M.L., J.D.B., T.C.D., R.H., and L.F.W. contributed to obtaining all isolates. R.F.P. and C.E.M. performed ASTs and Cefinase assays. C.E.M. and M.L. extracted all DNA, and K.Z. was responsive for DNA sequencing and generating assemblies. K.Z. and R.F.P. wrote the initial draft of the paper, and C.A.B. and G.D. provided critical revisions. All authors contributed to data interpretation and the final draft of the paper and approved the final version of the manuscript.
DATA AVAILABILITY
The adapter removed Illumina reads, and scaffolds of all samples generated in this study have been submitted to the NCBI BioProject database under accession number PRJNA944649.
ETHICS APPROVAL
This study used de-identified patient isolates, and the only pertinent clinical metadata used was the specimen source and year of isolation. Thus, IRB approval was not required for this investigation.
SUPPLEMENTAL MATERIAL
The following material is available online at https://doi.org/10.1128/msystems.00697-23.
Supplemental figures
msystems.00697-23-s0001.pdf:Fig. S1 to S5.
Table S1
msystems.00697-23-s0002.xlsx:Metadata and susceptibility phenotypes of all S. saprophyticus isolates in our cohort.
Table S2
msystems.00697-23-s0003.xlsx:Matrix of presence and absence of ARGs in each S. saprophyticus isolate.
Table S3
msystems.00697-23-s0004.xlsx:Matrix of presence and absence of accessory genes and genes alleles for testing cefoxitin and oxacillin resistance determinant candidates.
Table S4
msystems.00697-23-s0005.xlsx:Information of 36 core genes whose alleles are used for resistance correlation tests in Table S3.
Table S5
msystems.00697-23-s0006.xlsx:Cefoxitin resistance correlated gene and gene alleles in S. saprophyticus.
Table S6
msystems.00697-23-s0007.xlsx:Oxacillin resistance correlated gene and gene alleles in S. saprophyticus.
Table S7
msystems.00697-23-s0008.xlsx:Phage sequences identified in our cohort.
Table S8
msystems.00697-23-s0009.xlsx:ARGs carried by phages of S. saprophyticus.
ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.