
FN Clarivate Analytics Web of Science VR 1.0 AU Zhou, Yanfeng; Li, Lingrui; Wang, Chenlong; Song, Le; Yang, Ge

GobletNet: Wavelet-Based High-Frequency Fusion Network for Semantic Segmentation of Electron Microscopy Images.

Semantic segmentation of electron microscopy (EM) images is crucial for nanoscale analysis. With the development of deep neural networks (DNNs), semantic segmentation of EM images has achieved remarkable success. However, current EM image segmentation models are usually extensions or adaptations of models designed for natural or biomedical images; they do not fully explore or exploit the intrinsic characteristics of EM images. Furthermore, they are often designed for only a few specific segmentation targets and lack versatility. In this study, we quantitatively analyze the characteristics of EM images compared with those of natural and other biomedical images via the wavelet transform. To better utilize these characteristics, we design a high-frequency (HF) fusion network, GobletNet, which outperforms state-of-the-art models by a large margin in the semantic segmentation of EM images. We use the wavelet transform to generate HF images as extra inputs and use an extra encoding branch to extract HF information. Furthermore, we introduce a fusion-attention module (FAM) into GobletNet to facilitate better absorption and fusion of information from raw images and HF images. Extensive benchmarking on seven public EM datasets (EPFL, CREMI, SNEMI3D, UroCell, MitoEM, Nanowire and BetaSeg) demonstrates the effectiveness of our model. The code is available at https://github.com/Yanfeng-Zhou/GobletNet.
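
To make the HF-input idea concrete, here is a minimal sketch of generating a wavelet-based high-frequency companion image with PyWavelets (pywt); fusing the three detail subbands by a per-pixel absolute maximum is an illustrative choice, not necessarily the paper's exact rule.

import numpy as np
import pywt

def high_frequency_image(img: np.ndarray, wavelet: str = "haar") -> np.ndarray:
    """Build a high-frequency (HF) companion image for a 2D EM slice.

    A single-level 2D DWT splits the image into one approximation band
    and three detail bands (horizontal, vertical, diagonal); the detail
    bands carry the HF content fed to the extra encoding branch.
    """
    _, (cH, cV, cD) = pywt.dwt2(img.astype(np.float32), wavelet)
    # Fuse the three detail bands (per-pixel max of absolute coefficients).
    hf = np.maximum(np.abs(cH), np.maximum(np.abs(cV), np.abs(cD)))
    # The DWT halves the resolution; upsample back to the input size
    # before using the HF image as a second network input.
    hf = hf.repeat(2, axis=0).repeat(2, axis=1)[: img.shape[0], : img.shape[1]]
    return hf / (hf.max() + 1e-8)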

EI 1558-254X DA 2024-10-06 UT MEDLINE:39365717 PM 39365717 ER

AU Zhang, Runshi; Mo, Hao; Wang, Junchen; Jie, Bimeng; He, Yang; Jin, Nenghao; Zhu, Liang

UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration.

Complicated image registration is a key issue in medical image analysis, and deep learning-based methods have achieved better results than traditional methods. These methods include ConvNet-based and Transformer-based approaches. Although ConvNets can effectively utilize local information to reduce redundancy via small-neighborhood convolution, their limited receptive field makes them unable to capture global dependencies. Transformers can establish long-distance dependencies via a self-attention mechanism; however, the intense calculation of the relationships among all tokens leads to high redundancy. We propose a novel unsupervised image registration method named the unified Transformer and superresolution (UTSRMorph) network, which can enhance feature representation learning in the encoder and generate detailed displacement fields in the decoder to overcome these problems. We first propose a fusion attention block to integrate the advantages of ConvNets and Transformers, which inserts a ConvNet-based channel attention module into a multihead self-attention module. The overlapping attention block, a novel cross-attention method, uses overlapping windows to obtain abundant correlations with match information of a pair of images. Then, the blocks are flexibly stacked into a new, powerful encoder. The decoder generation process of a high-resolution deformation displacement field from low-resolution features is considered a superresolution process. Specifically, a superresolution module is employed to replace interpolation upsampling, which overcomes feature degradation. UTSRMorph was compared to state-of-the-art registration methods on 3D brain MR (OASIS, IXI) and MR-CT (abdomen, craniomaxillofacial) datasets. The qualitative and quantitative results indicate that UTSRMorph achieves relatively better performance. The code and datasets used are publicly available at https://github.com/Runshi-Zhang/UTSRMorph.
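
The "superresolution instead of interpolation upsampling" step in the decoder can be read as a sub-pixel convolution block. A minimal PyTorch sketch under that assumption (shown in 2D for brevity; a 3D analogue would expand out_ch * 8 channels and rearrange them across three axes; layer sizes are illustrative):

import torch.nn as nn

class SubPixelUp(nn.Module):
    """Upsample a feature map 2x by sub-pixel convolution (PixelShuffle)
    rather than by bilinear/trilinear interpolation."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # The convolution expands channels by 4 = 2*2; PixelShuffle then
        # rearranges them into a map with twice the spatial resolution.
        self.conv = nn.Conv2d(in_ch, out_ch * 4, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(upscale_factor=2)
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x):
        return self.act(self.shuffle(self.conv(x)))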

AU Siebert, Hanna; Grossbrohmer, Christoph; Hansen, Lasse; Heinrich, Mattias P

ConvexAdam: Self-Configuring Dual-Optimisation-Based 3D Multitask Medical Image Registration.

Registration of medical image data requires methods that can align anatomical structures precisely while applying smooth and plausible transformations. Ideally, these methods should furthermore operate quickly and apply to a wide variety of tasks. Deep learning-based image registration methods usually entail an elaborate learning procedure with the need for extensive training data. However, they often struggle with versatility when aiming to apply the same approach across various anatomical regions and different imaging modalities. In this work, we present a method that extracts semantic or hand-crafted image features and uses a coupled convex optimisation followed by Adam-based instance optimisation for multitask medical image registration. We make use of pre-trained semantic feature extraction models for the individual datasets and combine them with our fast dual optimisation procedure for deformation field computation. Furthermore, we propose a very fast automatic hyperparameter selection procedure that explores many settings and ranks them on validation data to provide a self-configuring image registration framework. With our approach, we can align image data for various tasks with little learning. We conduct experiments on all available Learn2Reg challenge datasets and obtain results that rank in the upper tier of the challenge leaderboards. The code is available at github.com/multimodallearning/convexAdam.
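
The Adam-based instance optimisation stage can be sketched directly in PyTorch: a dense displacement field is treated as the only trainable parameter and refined per image pair. A minimal 2D version with an MSE similarity term and a diffusion-style smoothness penalty (the actual method couples this with convex optimisation over semantic or hand-crafted features; weights and step counts are illustrative):

import torch
import torch.nn.functional as F

def instance_optimise(fixed, moving, steps=50, lam=0.5, lr=0.1):
    """Adam-based instance optimisation of a dense 2D displacement field.
    fixed, moving: (1, 1, H, W) tensors."""
    _, _, H, W = fixed.shape
    disp = torch.zeros(1, H, W, 2, requires_grad=True)  # in grid units
    # Identity sampling grid in [-1, 1] normalised coordinates.
    base = F.affine_grid(torch.eye(2, 3).unsqueeze(0), fixed.shape,
                         align_corners=False)
    opt = torch.optim.Adam([disp], lr=lr)
    for _ in range(steps):
        warped = F.grid_sample(moving, base + disp, align_corners=False)
        sim = F.mse_loss(warped, fixed)
        # first-order finite-difference smoothness of the displacement
        smooth = (disp[:, 1:] - disp[:, :-1]).pow(2).mean() + \
                 (disp[:, :, 1:] - disp[:, :, :-1]).pow(2).mean()
        loss = sim + lam * smooth
        opt.zero_grad()
        loss.backward()
        opt.step()
    return disp.detach()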

AU Hu, Yan; Wang, Jun; Zhu, Hao; Li, Juncheng; Shi, Jun

Cost-Sensitive Weighted Contrastive Learning Based on Graph Convolutional Networks for Imbalanced Alzheimer's Disease Staging

Identifying the progression stages of Alzheimer's disease (AD) can be considered an imbalanced multi-class classification problem in machine learning. It is challenging due to the class imbalance issue and the heterogeneity of the disease. Recently, graph convolutional networks (GCNs) have been successfully applied in AD classification. However, these works did not handle the class imbalance issue in classification, and they ignore the heterogeneity of the disease. To this end, we propose a novel cost-sensitive weighted contrastive learning method based on graph convolutional networks (CSWCL-GCN) for imbalanced AD staging using resting-state functional magnetic resonance imaging (rs-fMRI). The proposed method is developed on a multi-view graph constructed from the functional connectivity (FC) and high-order functional connectivity (HOFC) features of the subjects. A novel cost-sensitive weighted contrastive learning procedure is proposed to capture discriminative information from the minority classes, encouraging the samples in the minority classes to provide adequate supervision. Considering the heterogeneity of the disease, the weights of the negative pairs are introduced into contrastive learning; they are computed based on the distance to class prototypes, which are automatically learned from the training data. Meanwhile, a cost-sensitive mechanism is further introduced into contrastive learning to handle the class imbalance issue. The proposed CSWCL-GCN is evaluated on 720 subjects (including 184 NCs, 40 SMC patients, 208 EMCI patients, 172 LMCI patients and 116 AD patients) from the ADNI (Alzheimer's Disease Neuroimaging Initiative). Experimental results show that the proposed CSWCL-GCN outperforms state-of-the-art methods on the ADNI database.
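
A PyTorch sketch of the loss's two ingredients; both the cost weighting (inverse class frequency) and the prototype-distance weighting of negative pairs are written as illustrative stand-ins for the paper's exact formulas:

import torch

def cswcl_loss(z, labels, prototypes, class_counts, tau=0.5):
    """Sketch of a cost-sensitive weighted contrastive loss.
    z: (N, D) L2-normalised embeddings; labels: (N,) class ids;
    prototypes: (C, D) L2-normalised class prototypes;
    class_counts: (C,) class frequencies in the training set."""
    logits = (z @ z.t()) / tau
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    # Negative-pair weights from distance to class prototypes (illustrative):
    # pair (i, j) is weighted by how far anchor i sits from j's prototype.
    d = 1.0 - z @ prototypes.t()            # (N, C) cosine distances
    w_neg = d[:, labels]                    # (N, N)
    weights = torch.where(pos | eye, torch.ones_like(w_neg), w_neg)
    exp = torch.exp(logits.masked_fill(eye, float("-inf"))) * weights
    log_prob = logits - torch.log(exp.sum(dim=1, keepdim=True))
    mean_pos = (log_prob.masked_fill(~pos, 0.0).sum(1)
                / pos.sum(1).clamp(min=1))
    # Cost-sensitive: anchors from minority classes are up-weighted.
    cost = (class_counts.max() / class_counts.float())[labels]
    return -(cost * mean_pos).mean()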

AU Qiu, Zifeng; Yang, Peng; Xiao, Chunlun; Wang, Shuqiang; Xiao, Xiaohua; Qin, Jing; Liu, Chuan-Ming; Wang, Tianfu; Lei, Baiying

3D Multimodal Fusion Network With Disease-Induced Joint Learning for Early Alzheimer's Disease Diagnosis

Multimodal neuroimaging provides complementary information critical for accurate early diagnosis of Alzheimer's disease (AD). However, the inherent variability between multimodal neuroimages hinders the effective fusion of multimodal features. Moreover, achieving reliable and interpretable diagnoses in the field of multimodal fusion remains challenging. To address these issues, we propose a novel multimodal diagnosis network based on multi-fusion and disease-induced learning (MDL-Net) to enhance early AD diagnosis by efficiently fusing multimodal data. Specifically, MDL-Net proposes a multi-fusion joint learning (MJL) module, which effectively fuses multimodal features and enhances the feature representation from global, local, and latent learning perspectives. MJL consists of three modules: global-aware learning (GAL), local-aware learning (LAL), and outer latent-space learning (LSL). GAL learns the global relationships among the modalities via a self-adaptive Transformer (SAT). LAL constructs local-aware convolution to learn the local associations. The LSL module introduces latent information through an outer-product operation to further enhance feature representation. MDL-Net integrates a disease-induced region-aware learning (DRL) module via gradient weights to enhance interpretability, iteratively learning weight matrices to identify AD-related brain regions. We conduct extensive experiments on public datasets, and the results confirm the superiority of our proposed method. Our code will be available at: https://github.com/qzf0320/MDL-Net.

AU Lu, Ziru; Zhang, Yizhe; Zhou, Yi; Wu, Ye; Zhou, Tao

Domain-interactive Contrastive Learning and Prototype-guided Self-training for Cross-domain Polyp Segmentation.

Accurate polyp segmentation from colonoscopy images plays a critical role in the diagnosis and treatment of colorectal cancer. While deep learning-based polyp segmentation models have made significant progress, they often suffer from performance degradation when applied to unseen target-domain datasets collected from different imaging devices. To address this challenge, unsupervised domain adaptation (UDA) methods have gained attention by leveraging labeled source data and unlabeled target data to reduce the domain gap. However, existing UDA methods primarily focus on capturing class-wise representations, neglecting domain-wise representations. Additionally, uncertainty in pseudo-labels could hinder the segmentation performance. To tackle these issues, we propose a novel Domain-interactive Contrastive Learning and Prototype-guided Self-training (DCL-PS) framework for cross-domain polyp segmentation. Specifically, domain-interactive contrastive learning (DCL) with a domain-mixed prototype updating strategy is proposed to discriminate class-wise feature representations across domains. Then, to enhance the feature extraction ability of the encoder, we present a contrastive learning-based cross-consistency training (CL-CCT) strategy, which is imposed on both the prototypes obtained from the outputs of the main decoder and the perturbed auxiliary outputs. Furthermore, we propose a prototype-guided self-training (PS) strategy, which dynamically assigns a weight to each pixel during self-training, filtering out unreliable pixels and improving the quality of pseudo-labels. Experimental results demonstrate the superiority of DCL-PS in improving polyp segmentation performance in the target domain. The code will be released at https://github.com/taozh2017/DCLPS.
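
The prototype-guided weighting of pseudo-labels can be sketched as follows in PyTorch; using the softmax confidence of the prototype assignment as the per-pixel weight is one simple reading of the idea, with the temperature as an illustrative knob:

import torch
import torch.nn.functional as F

def pseudo_label_weights(feats, prototypes, tau=0.1):
    """Per-pixel weights for self-training: pixels whose features sit
    close to the prototype of their pseudo-class get high weight,
    ambiguous ones are down-weighted (soft filtering of unreliable pixels).
    feats:      (B, D, H, W) target-domain features, L2-normalised on D
    prototypes: (C, D) class prototypes, L2-normalised"""
    sim = torch.einsum("bdhw,cd->bchw", feats, prototypes) / tau
    prob = sim.softmax(dim=1)               # prototype assignment
    conf, pseudo = prob.max(dim=1)          # (B, H, W) each
    return pseudo, conf                     # conf in [0, 1] acts as weight

# usage in the self-training loss:
# loss = (weights * F.cross_entropy(logits, pseudo, reduction="none")).mean()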

AU Tian, Xiang; Ye, Jian'an; Zhang, Tao; Zhang, Liangliang; Liu, Xuechao; Fu, Feng; Shi, Xuetao; Xu, Canhua

Multi-Path Fusion in SFCF-Net for Enhanced Multi-Frequency Electrical Impedance Tomography

Multi-frequency electrical impedance tomography (mfEIT) offers a nondestructive imaging technology that reconstructs the distribution of electrical characteristics within a subject based on the impedance spectral differences among biological tissues. However, the technology faces challenges in imaging multi-class lesion targets when the conductivity of background tissues is frequency-dependent. To address these issues, we propose a spatial-frequency cross-fusion network (SFCF-Net) imaging algorithm, built on a multi-path fusion structure. This algorithm uses multi-path structures and hyper-dense connections to capture both spatial and frequency correlations between multi-frequency conductivity images, which achieves differential imaging for lesion targets of multiple categories through cross-fusion of information. According to both simulation and physical experiment results, the proposed SFCF-Net algorithm shows an excellent performance in terms of lesion imaging and category discrimination compared to the weighted frequency-difference, U-Net, and MMV-Net algorithms. The proposed algorithm enhances the ability of mfEIT to simultaneously obtain both structural and spectral information from the tissue being examined and improves the accuracy and reliability of mfEIT, opening new avenues for its application in clinical diagnostics and treatment monitoring.

AU Wen, Chi; Ye, Mang; Li, He; Chen, Ting; Xiao, Xuan

Concept-based Lesion Aware Transformer for Interpretable Retinal Disease Diagnosis.

Existing deep learning methods have achieved remarkable results in diagnosing retinal diseases, showcasing the potential of advanced AI in ophthalmology. However, the black-box nature of these methods obscures the decision-making process, compromising their trustworthiness and acceptability. Inspired by concept-based approaches and recognizing the intrinsic correlation between retinal lesions and diseases, we regard retinal lesions as concepts and propose an inherently interpretable framework designed to enhance both the performance and explainability of diagnostic models. Leveraging the transformer architecture, known for its proficiency in capturing long-range dependencies, our model can effectively identify lesion features. By integrating with image-level annotations, it achieves the alignment of lesion concepts with human cognition under the guidance of a retinal foundation model. Furthermore, to attain interpretability without losing lesion-specific information, our method employs a classifier built on a cross-attention mechanism for disease diagnosis and explanation, where explanations are grounded in the contributions of human-understandable lesion concepts and their visual localization. Notably, due to the structure and inherent interpretability of our model, clinicians can implement concept-level interventions to correct diagnostic errors by simply adjusting erroneous lesion predictions. Experiments conducted on four fundus image datasets demonstrate that our method achieves favorable performance against state-of-the-art methods while providing faithful explanations and enabling concept-level interventions. Our code is publicly available at https://github.com/Sorades/CLAT.
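
The "classifier built on a cross-attention mechanism" over lesion concepts admits a compact sketch: learnable concept tokens query the image features, and the diagnosis is a linear read-out of per-concept activations. Dimensions, head counts, and names below are illustrative, not the paper's:

import torch
import torch.nn as nn

class ConceptHead(nn.Module):
    """Concept-based diagnosis head: concept tokens cross-attend image
    features; the prediction decomposes into per-concept contributions."""
    def __init__(self, dim=256, n_concepts=8, n_classes=4):
        super().__init__()
        self.concepts = nn.Parameter(torch.randn(n_concepts, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.score = nn.Linear(dim, 1)           # one activation per concept
        self.cls = nn.Linear(n_concepts, n_classes, bias=False)

    def forward(self, feats):                    # feats: (B, N_tokens, dim)
        q = self.concepts.unsqueeze(0).expand(feats.size(0), -1, -1)
        ctx, attn_map = self.attn(q, feats, feats)  # attn_map localises concepts
        act = self.score(ctx).squeeze(-1)        # (B, n_concepts)
        return self.cls(act), act, attn_map

A concept-level intervention then amounts to overwriting entries of act (the per-concept activations) before the final linear layer is applied.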

AU Zhang, Ke; Yang, Yan; Yu, Jun; Fan, Jianping; Jiang, Hanliang; Huang, Qingming; Han, Weidong

Attribute Prototype-guided Iterative Scene Graph for Explainable Radiology Report Generation.

The potential of automated radiology report generation in alleviating the time-consuming tasks of radiologists is increasingly being recognized in medical practice. Existing report generation methods have evolved from using image-level features to the latest approach of utilizing anatomical regions, significantly enhancing interpretability. However, directly and simplistically using region features for report generation compromises the capability of relation reasoning and overlooks the common attributes potentially shared across regions. To address these limitations, we propose a novel region-based Attribute Prototype-guided Iterative Scene Graph generation framework (AP-ISG) for report generation, utilizing scene graph generation as an auxiliary task to further enhance interpretability and relational reasoning capability. The core components of AP-ISG are the Iterative Scene Graph Generation (ISGG) module and the Attribute Prototype-guided Learning (APL) module. Specifically, ISGG employs an autoregressive scheme for structural edge reasoning and a contextualization mechanism for relational reasoning. APL enhances intra-prototype matching and reduces inter-prototype semantic overlap in the visual space to fully model the potential attribute commonalities among regions. Extensive experiments on MIMIC-CXR with Chest ImaGenome annotations demonstrate the superiority of AP-ISG across multiple metrics.

AU Huang, Zhili; Sun, Jingyi; Shao, Yifan; Wang, Zixuan; Wang, Su; Li, Qiyong; Li, Jinsong; Yu, Qian

PolarFormer: A Transformer-based Method for Multi-lesion Segmentation in Intravascular OCT.

Several deep learning-based methods have been proposed to extract vulnerable plaques of a single class from intravascular optical coherence tomography (OCT) images. However, further research is limited by the lack of publicly available large-scale intravascular OCT datasets with multi-class vulnerable plaque annotations. Additionally, multi-class vulnerable plaque segmentation is extremely challenging due to the irregular distribution of plaques, their unique geometric shapes, and fuzzy boundaries. Existing methods have not adequately addressed the geometric features and spatial prior information of vulnerable plaques. To address these issues, we collected a dataset containing 70 pullback data and developed a multi-class vulnerable plaque segmentation model, called PolarFormer, that incorporates the prior knowledge of vulnerable plaques in spatial distribution. The key module of our proposed model is Polar Attention, which models the spatial relationship of vulnerable plaques in the radial direction. Extensive experiments conducted on the new dataset demonstrate that our proposed method outperforms other baseline methods. Code and data can be accessed via this link: https://github.com/sunjingyi0415/IVOCT-segementaion.
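
The radial prior behind Polar Attention can be sketched by restricting self-attention to the radial axis of a polar-resampled IVOCT frame; this is a simplified stand-in for the paper's module, assuming a (radius, angle) feature layout:

import torch
import torch.nn as nn

class RadialAttention(nn.Module):
    """Attention restricted to the radial direction: each angular column
    attends only along its radius, mirroring the prior that plaque
    appearance is organised radially around the catheter."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                   # x: (B, C, R, A) polar features
        B, C, R, A = x.shape
        # fold the angle axis into the batch; tokens run along the radius
        t = x.permute(0, 3, 2, 1).reshape(B * A, R, C)
        t, _ = self.attn(t, t, t)
        return t.reshape(B, A, R, C).permute(0, 3, 2, 1)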

AU Yang, Yanwu; Ye, Chenfei; Su, Guinan; Zhang, Ziyao; Chang, Zhikai; Chen, Hairui; Chan, Piu; Yu, Yue; Ma, Ting

BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning.

Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, comprising 70,781 samples from 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module is utilized to regularize augmented brain networks of the same participant with similar topological properties to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few/zero-shot learning abilities and exhibits meaningful interpretation of various diseases, showcasing its potential for clinical applications.
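
The pFC augmentation is simple enough to state in a few lines of NumPy: drop a random subset of BOLD timepoints and recompute the region-by-region correlation matrix (the keep ratio here is an illustrative parameter):

import numpy as np

def pseudo_functional_connectivity(bold, keep_ratio=0.9, rng=None):
    """Randomly drop timepoints of the BOLD signal and recompute the
    correlation matrix, yielding a new 'pseudo' brain network for the
    same subject.  bold: (T, R) array, T timepoints x R regions."""
    if rng is None:
        rng = np.random.default_rng()
    T = bold.shape[0]
    idx = np.sort(rng.choice(T, size=int(T * keep_ratio), replace=False))
    return np.corrcoef(bold[idx].T)      # (R, R) pFC matrix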

AU Jang, Se-In; Pan, Tinsu; Li, Ye; Heidari, Pedram; Chen, Junyu; Li, Quanzheng; Gong, Kuang

Spach Transformer: Spatial and Channel-Wise Transformer Based on Local and Global Self-Attentions for PET Image Denoising

Positron emission tomography (PET) is widely used in clinics and research due to its quantitative merits and high sensitivity, but it suffers from a low signal-to-noise ratio (SNR). Recently, convolutional neural networks (CNNs) have been widely used to improve PET image quality. Though successful and efficient in local feature extraction, CNNs cannot capture long-range dependencies well due to their limited receptive field. Global multi-head self-attention (MSA) is a popular approach to capture long-range information. However, the calculation of global MSA for 3D images has high computational costs. In this work, we proposed an efficient spatial and channel-wise encoder-decoder transformer, Spach Transformer, that can leverage spatial and channel information based on local and global MSAs. Experiments based on datasets of different PET tracers, i.e., F-18-FDG, F-18-ACBC, F-18-DCFPyL, and Ga-68-DOTATATE, were conducted to evaluate the proposed framework. Quantitative results show that the proposed Spach Transformer framework outperforms state-of-the-art deep learning architectures.

AU Penso, Coby; Frenkel, Lior; Goldberger, Jacob

Confidence Calibration of a Medical Imaging Classification System That is Robust to Label Noise

A classification model is calibrated if its predicted probabilities of outcomes reflect their accuracy. Calibrating neural networks is critical in medical analysis applications where clinical decisions rely upon the predicted probabilities. Most calibration procedures, such as temperature scaling, operate as a post-processing step that uses holdout validation data. In practice, it is difficult to collect medical image data with correct labels due to the complexity of the medical data and the considerable variability across experts. This study presents a network calibration procedure that is robust to label noise. We draw on the fact that the confusion matrix of the noisy labels can be expressed as the matrix product of the confusion matrix of the clean labels and the label noise. The method estimates the noise level as part of a noise-robust training scheme. The noise level is then used to estimate the network accuracy required by the calibration procedure. We show that despite the unreliable labels, we can still achieve calibration results on a par with those of a calibration procedure using data with reliable labels.
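
For reference, the post-processing step being made noise-robust is plain temperature scaling; a minimal PyTorch sketch follows. The paper's contribution sits on top of this: estimate the label-noise level during noise-robust training and calibrate against the implied clean accuracy rather than the accuracy measured on noisy validation labels.

import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=50):
    """Standard post-hoc temperature scaling on held-out data.
    logits: (N, C) validation logits; labels: (N,) validation labels."""
    log_t = torch.zeros(1, requires_grad=True)   # T = exp(log_t), starts at 1
    opt = torch.optim.LBFGS([log_t], lr=0.1, max_iter=steps)

    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()   # temperature T; use logits / T at test time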

AU Chen, Qianqian; Zhang, Jiadong; Meng, Runqi; Zhou, Lei; Li, Zhenhui; Feng, Qianjin; Shen, Dinggang

Modality-Specific Information Disentanglement From Multi-Parametric MRI for Breast Tumor Segmentation and Computer-Aided Diagnosis

Breast cancer is becoming a significant global health challenge, with millions of fatalities annually. Magnetic Resonance Imaging (MRI) can provide various sequences for characterizing tumor morphology and internal patterns, and has become an effective tool for the detection and diagnosis of breast tumors. However, previous deep-learning-based tumor segmentation methods for multi-parametric MRI still have limitations in exploring inter-modality information and in focusing on task-informative modalities. To address these shortcomings, we propose a Modality-Specific Information Disentanglement (MoSID) framework to extract both inter- and intra-modality attention maps as prior knowledge for guiding tumor segmentation. Specifically, by disentangling modality-specific information, the MoSID framework provides complementary clues for the segmentation task, generating modality-specific attention maps to guide modality selection and inter-modality evaluation. Our experiments on two 3D breast datasets and one 2D prostate dataset demonstrate that the MoSID framework outperforms other state-of-the-art multi-modality segmentation methods, even in the case of missing modalities. Based on the segmented lesions, we further train a classifier to predict the patients' response to radiotherapy. The prediction accuracy is comparable to that of using manually segmented tumors for treatment outcome prediction, indicating the robustness and effectiveness of the proposed segmentation method. The code is available at https://github.com/Qianqian-Chen/MoSID.

AU Sengupta, Sourya; Anastasio, Mark A.

A Test Statistic Estimation-Based Approach for Establishing Self-Interpretable CNN-Based Binary Classifiers

Interpretability is highly desired for deep neural network-based classifiers, especially when addressing high-stakes decisions in medical imaging. Commonly used post-hoc interpretability methods have the limitation that they can produce plausible but different interpretations of a given model, leading to ambiguity about which one to choose. To address this problem, a novel decision-theory-inspired approach is investigated to establish a self-interpretable model, given a pre-trained deep binary black-box medical image classifier. This approach involves utilizing a self-interpretable encoder-decoder model in conjunction with a single-layer fully connected network with unity weights. The model is trained to estimate the test statistic of the given trained black-box deep binary classifier so as to maintain similar accuracy. The decoder output image, referred to as an equivalency map, represents a transformed version of the to-be-classified image that, when processed by the fixed fully connected layer, produces the same test statistic value as the original classifier. The equivalency map provides a visualization of the transformed image features that directly contribute to the test statistic value and, moreover, permits quantification of their relative contributions. Unlike traditional post-hoc interpretability methods, the proposed method is self-interpretable and quantitative. Detailed quantitative and qualitative analyses have been performed on three different medical image binary classification tasks.

C1 Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA C1 Univ Illinois, Dept Bioengn, Urbana, IL 61801 USA SN 0278-0062 EI 1558-254X DA 2024-05-23 UT WOS:001214547800003 PM 38163307 ER

AU Beuret, Samuel; Heriard-Dubreuil, Baptiste; Martiartu, Naiara Korta; Jaeger, Michael; Thiran, Jean-Philippe

Windowed Radon Transform for Robust Speed-of-Sound Imaging With Pulse-Echo Ultrasound

In recent years, methods estimating the spatial distribution of tissue speed of sound with pulse-echo ultrasound have been gaining considerable traction. They can address limitations of B-mode imaging, for instance in diagnosing fatty liver diseases. Current state-of-the-art methods relate the tissue speed of sound to local echo shifts computed between images that are beamformed using restricted transmit and receive apertures. However, the aperture limitation affects the robustness of phase-shift estimations and, consequently, the accuracy of reconstructed speed-of-sound maps. Here, we propose a method based on the Radon transform of image patches that is able to estimate local phase shifts from full-aperture images. We validate our technique on simulated, phantom and in-vivo liver data and compare it with a state-of-the-art method. We show that the proposed method improves robustness to changes in the beamforming speed of sound and to a reduced number of insonifications. In particular, the reduced number of insonifications allowed by the proposed method can ease the deployment of pulse-echo speed-of-sound estimation onto portable ultrasound devices.

AU Ortiz-Gonzalez, Antonio; Kobler, Erich; Simon, Stefan; Bischoff, Leon; Nowak, Sebastian; Isaak, Alexander; Block, Wolfgang; Sprinkart, Alois M.; Attenberger, Ulrike; Luetkens, Julian A.; Bayro-Corrochano, Eduardo; Effland, Alexander

Optical Flow-Guided Cine MRI Segmentation With Learned Corrections

In cardiac cine magnetic resonance imaging (MRI), the heart is repeatedly imaged at numerous time points during the cardiac cycle. Frequently, the temporal evolution of a certain region of interest such as the ventricles or the atria is highly relevant for clinical diagnosis. In this paper, we devise a novel approach that allows for an automatized propagation of an arbitrary region of interest (ROI) along the cardiac cycle from respective annotated ROIs provided by medical experts at two different points in time, most frequently at the end-systolic (ES) and the end-diastolic (ED) cardiac phases. At its core, a 3D TV-$L^1$-based optical flow algorithm computes the apparent motion of consecutive MRI images in forward and backward directions. Subsequently, the given terminal annotated masks are propagated by this bidirectional optical flow in 3D, which results, however, in improper initial estimates of the segmentation masks due to numerical inaccuracies. These initially propagated segmentation masks are then refined by a 3D U-Net-based convolutional neural network (CNN), which was trained to enforce consistency with the forward and backward warped masks using a novel loss function. Moreover, a penalization term in the loss function controls large deviations from the initial segmentation masks. This method is benchmarked both on a new dataset with annotated single ventricles containing patients with severe heart diseases and on a publicly available dataset with different annotated ROIs. We emphasize that our novel loss function enables fine-tuning the CNN on a single patient, thereby yielding state-of-the-art results along the complete cardiac cycle.
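
A 2D sketch of the propagation step using OpenCV's TV-L1 optical flow (the paper's variant is 3D); it assumes opencv-contrib-python for cv2.optflow and 8-bit grayscale frames:

import cv2
import numpy as np

def propagate_mask(frame_a, frame_b, mask_a):
    """Warp an ROI mask annotated on frame_a into the geometry of frame_b.
    frame_a, frame_b: (H, W) uint8 grayscale; mask_a: (H, W) label mask."""
    tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
    # Flow from frame_b back to frame_a: frame_b(p) ~ frame_a(p + flow(p)),
    # which is the direction needed for backward warping with cv2.remap.
    flow = tvl1.calc(frame_b, frame_a, None)        # (H, W, 2), float32
    H, W = frame_a.shape
    gx, gy = np.meshgrid(np.arange(W), np.arange(H))
    map_x = (gx + flow[..., 0]).astype(np.float32)
    map_y = (gy + flow[..., 1]).astype(np.float32)
    return cv2.remap(mask_a.astype(np.float32), map_x, map_y,
                     interpolation=cv2.INTER_NEAREST)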

AU Qiao, Mengyun; Wang, Shuo; Qiu, Huaqi; de Marvao, Antonio; O'Regan, Declan P.; Rueckert, Daniel; Bai, Wenjia

CHeart: A Conditional Spatio-Temporal Generative Model for Cardiac Anatomy

Two key questions in cardiac image analysis are to assess the anatomy and motion of the heart from images, and to understand how they are associated with non-imaging clinical factors such as gender, age and diseases. While the first question can often be addressed by image segmentation and motion tracking algorithms, our capability to model and answer the second question is still limited. In this work, we propose a novel conditional generative model to describe the 4D spatio-temporal anatomy of the heart and its interaction with non-imaging clinical factors. The clinical factors are integrated as the conditions of the generative modelling, which allows us to investigate how these factors influence the cardiac anatomy. We evaluate the model performance on two main tasks: anatomical sequence completion and sequence generation. The model achieves high performance in anatomical sequence completion, comparable to or outperforming other state-of-the-art generative models. In terms of sequence generation, given clinical conditions, the model can generate realistic synthetic 4D sequential anatomies that share similar distributions with the real data.

AU Gao, Qi; Li, Zilong; Zhang, Junping; Zhang, Yi; Shan, Hongming

CoreDiff: Contextual Error-Modulated Generalized Diffusion Model for Low-Dose CT Denoising and Generalization

Low-dose computed tomography (CT) images suffer from noise and artifacts due to photon starvation and electronic noise. Recently, some works have attempted to use diffusion models to address the over-smoothness and training instability encountered by previous deep-learning-based denoising models. However, diffusion models suffer from long inference times due to the large number of sampling steps involved. Very recently, the cold diffusion model has generalized classical diffusion models and offers greater flexibility. Inspired by cold diffusion, this paper presents a novel COntextual eRror-modulated gEneralized Diffusion model for low-dose CT (LDCT) denoising, termed CoreDiff. First, CoreDiff utilizes LDCT images to displace the random Gaussian noise and employs a novel mean-preserving degradation operator to mimic the physical process of CT degradation, significantly reducing the number of sampling steps thanks to the informative LDCT images serving as the starting point of the sampling process. Second, to alleviate the error accumulation problem caused by the imperfect restoration operator in the sampling process, we propose a novel ContextuaL Error-modulAted Restoration Network (CLEAR-Net), which can leverage contextual information to constrain the sampling process against structural distortion and modulate time-step embedding features for better alignment with the input at the next time step. Third, to rapidly generalize the trained model to a new, unseen dose level with as few resources as possible, we devise a one-shot learning framework to make CoreDiff generalize faster and better using only a single LDCT image (un)paired with normal-dose CT (NDCT). Extensive experimental results on four datasets demonstrate that our CoreDiff outperforms competing methods in denoising and generalization performance, with clinically acceptable inference time. Source code is made available at https://github.com/qgao21/CoreDiff.
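
The mean-preserving degradation operator can be illustrated with a linear schedule that replaces Gaussian noise with the observed LDCT image as the endpoint of the forward process; the schedule below is a sketch, not the paper's exact operator:

import torch

def degrade(x0, y_ldct, t, T):
    """Cold-diffusion-style forward process: interpolate the clean image
    x0 towards the observed LDCT image y_ldct, so the endpoint of the
    forward process is the informative LDCT image rather than pure noise.
    A linear mixing schedule preserves the mean for any alpha in [0, 1]."""
    alpha = t / T                    # alpha_0 = 0 (clean), alpha_T = 1 (LDCT)
    return (1.0 - alpha) * x0 + alpha * y_ldct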

AU Zhang, Ruipeng; Qin, Binjie; Zhao, Jun; Zhu, Yueqi; Lv, Yisong; Ding, Song

Locating X-Ray Coronary Angiogram Keyframes via Long Short-Term Spatiotemporal Attention With Image-to-Patch Contrastive Learning

Locating the start, apex and end keyframes of moving contrast agents for keyframe counting in X-ray coronary angiography (XCA) is very important for the diagnosis and treatment of cardiovascular diseases. To locate these keyframes from the class-imbalanced and boundary-agnostic foreground vessel actions that overlap complex backgrounds, we propose long short-term spatiotemporal attention by integrating a convolutional long short-term memory (CLSTM) network into a multiscale Transformer to learn the segment- and sequence-level dependencies in the consecutive-frame-based deep features. Image-to-patch contrastive learning is further embedded between the CLSTM-based long-term spatiotemporal attention and Transformer-based short-term attention modules. The imagewise contrastive module reuses the long-term attention to contrast image-level foreground/background of the XCA sequence, while patchwise contrastive projection selects random background patches as convolution kernels to project foreground/background frames into different latent spaces. A new XCA video dataset is collected to evaluate the proposed method. The experimental results show that the proposed method achieves a mean average precision (mAP) of 72.45% and an F-score of 0.8296, considerably outperforming the state-of-the-art methods. The source code is available at https://github.com/Binjie-Qin/STA-IPCon.

AU Wu, Junde; Zhang, Yu; Fang, Huihui; Duan, Lixin; Tan, Mingkui; Yang, Weihua; Wang, Chunhui; Liu, Huiying; Jin, Yueming; Xu, Yanwu

Calibrate the Inter-Observer Segmentation Uncertainty via Diagnosis-First Principle

Many of the tissues/lesions in medical images may be ambiguous. Therefore, medical segmentation is typically annotated by a group of clinical experts to mitigate personal bias. A common solution to fuse different annotations is majority vote, e.g., taking the average of multiple labels. However, such a strategy ignores the differences in grader expertise. Inspired by the observation that medical image segmentation is usually used to assist disease diagnosis in clinical practice, we propose the diagnosis-first principle, which takes disease diagnosis as the criterion to calibrate the inter-observer segmentation uncertainty. Following this idea, a framework named the Diagnosis-First segmentation Framework (DiFF) is proposed. Specifically, DiFF first learns to fuse the multi-rater segmentation labels into a single ground-truth that maximizes disease diagnosis performance. We dub the fused ground-truth Diagnosis-First Ground-truth (DF-GT). Then, a Take and Give Model (T&G Model) is proposed to segment the DF-GT from the raw image. With the T&G Model, DiFF can learn the segmentation with calibrated uncertainty that facilitates disease diagnosis. We verify the effectiveness of DiFF on three different medical segmentation tasks: optic-disc/optic-cup (OD/OC) segmentation on fundus images, thyroid nodule segmentation on ultrasound images, and skin lesion segmentation on dermoscopic images. Experimental results show that the proposed DiFF can effectively calibrate the segmentation uncertainty and thus significantly facilitate the corresponding disease diagnosis, outperforming previous state-of-the-art multi-rater learning methods.

Technol, Hefei 230037, Peoples R China C1 Pazhou Lab, Guangzhou 510005, Peoples R China C1 Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Chengdu 611731, Sichuan, Peoples R China C1 South China Univ Technol, Sch Software Engn, Guangzhou 518055, Guangdong, Peoples R China C1 Jinan Univ, Shenzhen Eye Hosp, Big Data & Artificial Intelligence Inst, Shenzhen 518040, Peoples R China C1 Harbin Inst Technol, Dept Elect Sci & Technol, Harbin 150001, Peoples R China C1 ASTAR, Inst Infocomm Res, Singapore 138632, Singapore SN 0278-0062 EI 1558-254X DA 2024-09-18 UT WOS:001307429600009 PM 38669168 ER

AU Xiao, Chunlun; Zhu, Anqi; Xia, Chunmei; Qiu, Zifeng; Liu, Yuanlin; Zhao, Cheng; Ren, Weiwei; Wang, Lifan; Dong, Lei; Wang, Tianfu; Guo, Lehang; Lei, Baiying

Attention-Guided Learning with Feature Reconstruction for Skin Lesion Diagnosis using Clinical and Ultrasound Images.

Skin lesion is one of the most common diseases, and most categories are highly similar in morphology and appearance. Deep learning models can effectively reduce the variability between and within classes and improve diagnostic accuracy. However, existing multi-modal methods are limited to the surface information of lesions in skin clinical and dermatoscopic modalities, which hinders further improvement of skin lesion diagnostic accuracy; this motivates us to further study the depth information of lesions in skin ultrasound. In this paper, we propose a novel skin lesion diagnosis network that combines clinical and ultrasound modalities to fuse the surface and depth information of the lesion and improve diagnostic accuracy. Specifically, we propose an attention-guided learning (AL) module that fuses clinical and ultrasound modalities from both local and global perspectives to enhance feature representation. The AL module consists of two parts: attention-guided local learning (ALL), which computes the intra-modality and inter-modality correlations to fuse multi-scale information and makes the network focus on the local information of each modality, and attention-guided global learning (AGL), which fuses global information to further enhance the feature representation. In addition, we propose a feature reconstruction learning (FRL) strategy that encourages the network to extract more discriminative features and corrects the focus of the network to enhance the model's robustness and certainty. We conduct extensive experiments, and the results confirm the superiority of our proposed method. Our code is available at: https://github.com/XCL-hub/AGFnet.

EI 1558-254X DA 2024-09-04 UT MEDLINE:39208042 PM 39208042 ER

AU Chaudhary, Muhammad F. A.; Gerard, Sarah E.; Christensen, Gary E.; Cooper, Christopher B.; Schroeder, Joyce D.; Hoffman, Eric A.; Reinhardt, Joseph M.

LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation

Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease. Additionally, co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan-time considerations, or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities to corresponding expiratory CT intensities. LungViT addresses several limitations of traditional generative models, including slicewise discontinuities, the limited size of generated volumes, and their inability to model texture transfer at the volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features. We also propose a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size $320\times320\times320$. We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity. To assess model generalizability beyond the development-set biases, we evaluate our model on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that synthetic volumes could be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.

AU Luo, Yan; Tian, Yu; Shi, Min; Pasquale, Louis R.; Shen, Lucy Q.; Zebardast, Nazlee; Elze, Tobias; Wang, Mengyu

Harvard Glaucoma Fairness: A Retinal Nerve Disease Dataset for Fairness Learning and Fair Identity Normalization

Fairness (used interchangeably with equity) in machine learning is important for societal well-being, but limited public datasets hinder its progress. Currently, no dedicated public medical datasets with imaging data for fairness learning are available, even though underrepresented groups suffer from more health issues. To address this gap, we introduce Harvard Glaucoma Fairness (Harvard-GF), a retinal nerve disease dataset including 3,300 subjects with both 2D and 3D imaging data and balanced racial groups for glaucoma detection. Glaucoma is the leading cause of irreversible blindness globally, with Black individuals having twice the glaucoma prevalence of other races. We also propose a fair identity normalization (FIN) approach to equalize the feature importance between different identity groups. Our FIN approach is compared with various state-of-the-art fairness learning methods and shows superior performance in the racial, gender, and ethnicity fairness tasks with 2D and 3D imaging data, demonstrating the utility of our Harvard-GF dataset for fairness learning. To facilitate fairness comparisons between different models, we propose an equity-scaled performance measure, which can be flexibly used to compare all kinds of performance metrics in the context of fairness. The dataset and code are publicly accessible via https://ophai.hms.harvard.edu/datasets/harvard-gf3300/.
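
A sketch of an equity-scaled metric in the spirit described: the overall score is discounted by the summed absolute deviations of the demographic-group scores from it. The exact formula is an assumption here; consult the paper for the authoritative definition.

def equity_scaled(metric_overall, metric_by_group):
    """Discount an overall metric (e.g. AUC) by group deviations:
    overall / (1 + sum |group - overall|).  One plausible reading of
    the abstract, not a verified reproduction of the paper's formula."""
    penalty = sum(abs(m - metric_overall) for m in metric_by_group.values())
    return metric_overall / (1.0 + penalty)

# e.g. equity_scaled(0.85, {"Asian": 0.83, "Black": 0.80, "White": 0.88})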

AU Zhao, Zihao; Wang, Sheng; Gu, Jinchen; Zhu, Yitao; Mei, Lanzhuju; Zhuang, Zixu; Cui, Zhiming; Wang, Qian; Shen, Dinggang

ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs.

The integration of Computer-Aided Diagnosis (CAD) with Large Language Models (LLMs) presents a promising frontier in clinical applications, notably in automating diagnostic processes akin to those performed by radiologists and providing consultations similar to a virtual family doctor. Despite the promising potential of this integration, current works face at least two limitations: (1) From the perspective of a radiologist, existing studies typically have a restricted scope of applicable imaging domains, failing to meet the diagnostic needs of different patients. Also, the insufficient diagnostic capability of LLMs further undermines the quality and reliability of the generated medical reports. (2) Current LLMs lack the requisite depth in medical expertise, rendering them less effective as virtual family doctors due to the potential unreliability of the advice provided during patient consultations. To address these limitations, we introduce ChatCAD+, designed to be universal and reliable. Specifically, it features two main modules: (1) Reliable Report Generation and (2) Reliable Interaction. The Reliable Report Generation module is capable of interpreting medical images from diverse domains and generating high-quality medical reports via our proposed hierarchical in-context learning. Concurrently, the interaction module leverages up-to-date information from reputable medical websites to provide reliable medical advice. Together, these designed modules synergize to closely align with the expertise of human medical professionals, offering enhanced consistency and reliability for interpretation and advice. The source code is available on GitHub.

AU He, Along Li, Tao Yan, Juncheng Wang, Kai Fu, Huazhu

Bilateral Supervision Network for Semi-Supervised Medical Image Segmentation

Fully-supervised learning requires massive amounts of high-quality annotated data, which are difficult to obtain for image segmentation since pixel-level annotation is expensive, especially for medical image segmentation tasks that need domain knowledge. As an alternative solution, semi-supervised learning (SSL) can effectively alleviate the dependence on annotated samples by leveraging abundant unlabeled samples. Among SSL methods, mean-teacher (MT) is the most popular one. However, in MT, the teacher model's weights are completely determined by the student model's weights, which leads to a training bottleneck at the late training stages. Besides, only pixel-wise consistency is applied to unlabeled data, which ignores category information and is susceptible to noise. In this paper, we propose a bilateral supervision network with a bilateral exponential moving average (bilateral-EMA), named BSNet, to overcome these issues. On the one hand, both the student and teacher models are trained on labeled data, and then their weights are updated with the bilateral-EMA, so the two models can learn from each other. On the other hand, pseudo labels are used to perform bilateral supervision for unlabeled data. Moreover, to strengthen this supervision, we adopt adversarial learning to force the network to generate more reliable pseudo labels for unlabeled data. We conduct extensive experiments on three datasets to evaluate the proposed BSNet, and the results show that BSNet improves semi-supervised segmentation performance by a large margin and surpasses other state-of-the-art SSL methods.
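
A minimal sketch of the bilateral-EMA idea as the abstract states it: both weight sets are smoothed toward each other, unlike the one-way mean-teacher update. The momentum value and in-place update style are illustrative assumptions.

import torch

@torch.no_grad()
def bilateral_ema_update(student: torch.nn.Module, teacher: torch.nn.Module,
                         momentum: float = 0.99) -> None:
    for p_s, p_t in zip(student.parameters(), teacher.parameters()):
        # Compute both targets from the old values before overwriting either.
        new_t = momentum * p_t.data + (1.0 - momentum) * p_s.data
        new_s = momentum * p_s.data + (1.0 - momentum) * p_t.data
        p_t.data.copy_(new_t)
        p_s.data.copy_(new_s)  # the update is mutual, hence "bilateral"

student = torch.nn.Linear(4, 2)
teacher = torch.nn.Linear(4, 2)
bilateral_ema_update(student, teacher)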

AU Li, Yinsheng Feng, Juan Xiang, Jun Li, Zixiao Liang, Dong

AIRPORT: A Data Consistency Constrained Deep Temporal Extrapolation Method To Improve Temporal Resolution In Contrast Enhanced CT Imaging

Typical tomographic image reconstruction methods require that the imaged object remains stationary during the time window needed to acquire a minimally complete data set. The violation of this requirement leads to temporal-averaging errors in the reconstructed images. For a fixed gantry rotation speed, to reduce these errors, it is desirable to reconstruct images using data acquired over a narrower angular range, i.e., with a higher temporal resolution. However, image reconstruction with a narrower angular range violates the data sufficiency condition, resulting in severe data-insufficiency-induced errors. The purpose of this work is to decouple the trade-off between these two types of errors in contrast-enhanced computed tomography (CT) imaging. We demonstrated that using the developed data consistency constrained deep temporal extrapolation method (AIRPORT), the entire time-varying imaged object can be accurately reconstructed with 40 frames-per-second temporal resolution, the time window needed to acquire the data for a single projection view using a typical C-arm cone-beam CT system. AIRPORT is applicable to general non-sparse imaging tasks using a single short-scan data acquisition.

AU Elbatel, Marawan Marti, Robert Li, Xiaomeng

FoPro-KD: Fourier Prompted Effective Knowledge Distillation for Long-Tailed Medical Image Recognition

Representational transfer from publicly available models is a promising technique for improving medical image classification, especially in long-tailed datasets with rare diseases. However, existing methods often overlook the frequency-dependent behavior of these models, thereby limiting their effectiveness in transferring representations and generalizations to rare diseases. In this paper, we propose FoPro-KD, a novel framework that leverages the power of frequency patterns learned from frozen pre-trained models to enhance their transferability and compression, presenting a few unique insights: 1) We demonstrate that leveraging representations from publicly available pre-trained models can substantially improve performance, specifically for rare classes, even when utilizing representations from a smaller pre-trained model. 2) We observe that pre-trained models exhibit frequency preferences, which we explore using our proposed Fourier Prompt Generator (FPG), allowing us to manipulate specific frequencies in the input image, enhancing the discriminative representational transfer. 3) By amplifying or diminishing these frequencies in the input image, we enable Effective Knowledge Distillation (EKD). EKD facilitates the transfer of knowledge from pre-trained models to smaller models. Through extensive experiments in long-tailed gastrointestinal image recognition and skin lesion classification, where rare diseases are prevalent, our FoPro-KD framework outperforms existing methods, enabling more accessible medical models for rare disease classification.
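
A brief sketch of the kind of frequency-domain input manipulation the abstract describes, amplifying or diminishing a band of the input's Fourier spectrum. FoPro-KD learns this manipulation with its Fourier Prompt Generator; the fixed band radius and gain below are illustrative stand-ins.

import torch

def scale_frequency_band(img: torch.Tensor, radius: int = 8,
                         gain: float = 1.5) -> torch.Tensor:
    # img: (C, H, W), values in [0, 1]
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    _, h, w = img.shape
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    dist = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2).float().sqrt()
    mask = torch.ones(h, w)
    mask[dist <= radius] = gain  # amplify (or, with gain < 1, diminish) low frequencies
    out = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1)))
    return out.real.clamp(0, 1)

x = torch.rand(3, 224, 224)
print(scale_frequency_band(x).shape)  # torch.Size([3, 224, 224])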

AU Fu, Li-Wei Liu, Chih-Hao Jain, Manu Chen, Chih-Shan Jason Wu, Yu-Hung Huang, Sheng-Lung Chen, Homer H.

Training With Uncertain Annotations for Semantic Segmentation of Basal Cell Carcinoma From Full-Field OCT Images

Semantic segmentation of basal cell carcinoma (BCC) from full-field optical coherence tomography (FF-OCT) images of human skin has received considerable attention in medical imaging. However, it is challenging for dermatopathologists to annotate the training data due to OCT's lack of color specificity. Very often, they are uncertain about the correctness of the annotations they made. In practice, annotations fraught with uncertainty profoundly impact the effectiveness of model training and hence the performance of BCC segmentation. To address this issue, we propose an approach to model training with uncertain annotations. The proposed approach includes a data selection strategy to mitigate the uncertainty of training data, a class expansion to consider sebaceous gland and hair follicle as additional classes to enhance the performance of BCC segmentation, and a self-supervised pre-training procedure to improve the initial weights of the segmentation model parameters. Furthermore, we develop three post-processing techniques to reduce the impact of speckle noise and image discontinuities on BCC segmentation. The mean Dice score of BCC of our model reaches 0.503 +/- 0.003, which, to the best of our knowledge, is the best performance to date for semantic segmentation of BCC from FF-OCT images.

AU Zhang, Jianjia Mao, Haiyang Wang, Xinran Guo, Yuan Wu, Weiwen

Wavelet-Inspired Multi-channel Score-based Model for Limited-angle CT Reconstruction.

Score-based generative models (SGMs) have demonstrated great potential in the challenging task of limited-angle CT (LA-CT) reconstruction. An SGM essentially models the probability density of the ground truth data and generates reconstruction results by sampling from it. Nevertheless, direct application of existing SGM methods to LA-CT suffers multiple limitations. Firstly, the directional distribution of the artifacts attributable to the missing angles is ignored. Secondly, the different distribution properties of the artifacts in different frequency components have not been fully explored. These drawbacks would inevitably degrade the estimation of the probability density and the reconstruction results. After an in-depth analysis of these factors, this paper proposes a Wavelet-Inspired Score-based Model (WISM) for LA-CT reconstruction. Specifically, besides training a typical SGM with the original images, the proposed method additionally performs the wavelet transform and models the probability density in each wavelet component with an extra SGM. The wavelet components preserve the spatial correspondence with the original image while performing frequency decomposition, thereby keeping the directional property of the artifacts for further analysis. On the other hand, different wavelet components possess more specific contents of the original image in different frequency ranges, simplifying the probability density modeling by decomposing the overall density into component-wise ones. The resulting two SGMs in the image domain and wavelet domain are integrated into a unified sampling process under the guidance of the observation data, jointly generating high-quality and consistent LA-CT reconstructions. Experimental evaluation on various datasets consistently verifies the superior performance of the proposed method over competing methods.
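
The decomposition WISM builds on can be shown in a few lines: a single-level 2D discrete wavelet transform splits an image into one low-frequency and three direction-sensitive high-frequency sub-bands, each of which would receive its own SGM in the paper. A minimal PyWavelets sketch; the Haar wavelet is an illustrative choice.

import numpy as np
import pywt

img = np.random.rand(256, 256).astype(np.float32)  # stand-in for a CT slice
low, (horiz, vert, diag) = pywt.dwt2(img, "haar")

# The direction-sensitive detail bands are what make limited-angle streaks separable:
for name, band in [("approx", low), ("horizontal", horiz),
                   ("vertical", vert), ("diagonal", diag)]:
    print(name, band.shape, f"energy={np.square(band).sum():.1f}")

recon = pywt.idwt2((low, (horiz, vert, diag)), "haar")  # exact inverse
assert np.allclose(recon, img, atol=1e-5)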

AU Zhang, Xiao Sun, Kaicong Wu, Dijia Xiong, Xiaosong Liu, Jiameng Yao, Linlin Li, Shufang Wang, Yining Feng, Jun Shen, Dinggang

An Anatomy- and Topology-Preserving Framework for Coronary Artery Segmentation

Coronary artery segmentation is critical for coronary artery disease diagnosis but challenging due to its tortuous course with numerous small branches and inter-subject variations. Most existing studies ignore important anatomical information and vascular topologies, leading to less desirable segmentation performance that usually cannot satisfy clinical demands. To deal with these challenges, in this paper we propose an anatomy- and topology-preserving two-stage framework for coronary artery segmentation. The proposed framework consists of an anatomical dependency encoding (ADE) module and a hierarchical topology learning (HTL) module for coarse-to-fine segmentation, respectively. Specifically, the ADE module segments four heart chambers and aorta, and thus five distance field maps are obtained to encode distance between chamber surfaces and coarsely segmented coronary artery. Meanwhile, ADE also performs coronary artery detection to crop region-of-interest and eliminate foreground-background imbalance. The follow-up HTL module performs fine segmentation by exploiting three hierarchical vascular topologies, i.e., key points, centerlines, and neighbor connectivity using a multi-task learning scheme. In addition, we adopt a bottom-up attention interaction (BAI) module to integrate the feature representations extracted across hierarchical topologies. Extensive experiments on public and in-house datasets show that the proposed framework achieves state-of-the-art performance for coronary artery segmentation.

AU Tang, Yuqi Wang, Nanchao Dong, Zhijie Lowerison, Matthew Del Aguila, Angela Johnston, Natalie Vu, Tri Ma, Chenshuo Xu, Yirui Yang, Wei Song, Pengfei Yao, Junjie

Non-invasive Deep-Brain Imaging with 3D Integrated Photoacoustic Tomography and Ultrasound Localization Microscopy (3D-PAULM).

Photoacoustic computed tomography (PACT) is a proven technology for imaging hemodynamics in the deep brain of small animal models. PACT is inherently compatible with ultrasound (US) imaging, providing complementary contrast mechanisms. While PACT can quantify the brain's oxygen saturation of hemoglobin (sO2), US imaging can probe the blood flow based on the Doppler effect. Further, by tracking gas-filled microbubbles, ultrasound localization microscopy (ULM) can map the blood flow velocity with sub-diffraction spatial resolution. In this work, we present a 3D deep-brain imaging system that seamlessly integrates PACT and ULM into a single device, 3D-PAULM. Using a low ultrasound frequency of 4 MHz, 3D-PAULM is capable of imaging the brain's hemodynamic functions through intact scalp and skull in a totally non-invasive manner. Using 3D-PAULM, we studied mouse brain function in ischemic stroke. Multi-spectral PACT, US B-mode imaging, microbubble-enhanced power Doppler (PD), and ULM were performed on the same mouse brain with intrinsic image co-registration. From the multi-modality measurements, we further quantified blood perfusion, sO2, vessel density, and flow velocity of the mouse brain, showing stroke-induced ischemia, hypoxia, and reduced blood flow. We expect that 3D-PAULM can find broad applications in studying deep brain functions on small animal models.

EI 1558-254X DA 2024-10-11 UT MEDLINE:39383084 PM 39383084 ER

AU Park, Jinil Shin, Taehoon Park, Jang-Yeon

Three-Dimensional Variable Slab-Selective Projection Acquisition Imaging.

Three-dimensional (3D) projection acquisition (PA) imaging has recently gained attention because of its advantages, such as achievability of very short echo time, less sensitivity to motion, and undersampled acquisition of projections without sacrificing spatial resolution. However, larger subjects require a stronger Nyquist criterion and are more likely to be affected by outer-volume signals outside the field of view (FOV), which significantly degrades the image quality. Here, we proposed a variable slab-selective projection acquisition (VSS-PA) method to mitigate the Nyquist criterion and effectively suppress aliasing streak artifacts in 3D PA imaging. The proposed method involves maintaining the vertical orientation of the slab-selective gradient for frequency-selective spin excitation and the readout gradient for data acquisition. As VSS-PA can selectively excite spins only in the width of the desired FOV in the projection direction during data acquisition, the effective size of the scanned object that determines the Nyquist criterion can be reduced. Additionally, unwanted signals originating from outside the FOV (e.g., aliasing streak artifacts) can be effectively avoided. The mitigation of the Nyquist criterion owing to VSS-PA was theoretically described and confirmed through numerical simulations and phantom and human lung experiments. These experiments further showed that the aliasing streak artifacts were nearly suppressed.

EI 1558-254X DA 2024-10-02 UT MEDLINE:39348262 PM 39348262 ER

AU Purma, Vishnuvardhan Srinath, Suhas Srirangarajan, Seshan Kakkar, Aanchal Prathosh, A P

GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for Histopathological Image Segmentation.

Histopathological image segmentation is a laborious and time-intensive task, often requiring analysis from experienced pathologists for accurate examinations. To reduce this burden, supervised machine-learning approaches have been adopted using large-scale annotated datasets for histopathological image analysis. However, in several scenarios, the availability of large-scale annotated data is a bottleneck while training such models. Self-supervised learning (SSL) is an alternative paradigm that provides some respite by constructing models utilizing only the unannotated data which is often abundant. The basic idea of SSL is to train a network to perform one or many pseudo or pretext tasks on unannotated data and use it subsequently as the basis for a variety of downstream tasks. It is seen that the success of SSL depends critically on the considered pretext task. While there have been many efforts in designing pretext tasks for classification problems, there have not been many attempts on SSL for histopathological image segmentation. Motivated by this, we propose an SSL approach for segmenting histopathological images via generative diffusion models. Our method is based on the observation that diffusion models effectively solve an image-to-image translation task akin to a segmentation task. Hence, we propose generative diffusion as the pretext task for histopathological image segmentation. We also utilize a multi-loss function-based fine-tuning for the downstream task. We validate our method using several metrics on two publicly available datasets along with a newly proposed head and neck (HN) cancer dataset containing Hematoxylin and Eosin (H&E) stained images along with annotations.

EI 1558-254X DA 2024-09-04 UT MEDLINE:39222449 PM 39222449 ER

AU Liu, Qiang Chao, Weian Wen, Ruyi Gong, Yubin Xi, Lei

Optimized Excitation in Microwave-induced Thermoacoustic Imaging for Artifact Suppression.

Microwave-induced thermoacoustic imaging (M-TAI) allows the visualization of macroscopic and microscopic structures of bio-tissues. However, it suffers from severe inherent artifacts that might mislead the subsequent diagnosis and treatment of diseases. To overcome this limitation, we propose an optimized excitation strategy. Specifically, the strategy integrates a dynamically compounded specific absorption rate (SAR) with a co-planar configuration of the polarization state, incident wave vector, and imaging plane. Starting from a theoretical analysis, we interpret the underlying mechanism behind the superiority of the optimized excitation strategy, which achieves an effect equivalent to homogenizing the electromagnetic energy deposited in bio-tissues. The subsequent numerical simulations demonstrate that the strategy enables better preservation of the conductivity weighting of samples while increasing the Pearson correlation coefficient. Furthermore, the in vitro and in vivo M-TAI experiments validate the effectiveness and robustness of this optimized excitation strategy in artifact suppression, allowing the simultaneous identification of both boundary and interior fine structures within bio-tissues. All the results suggest that the optimized excitation strategy can be extended to diverse scenarios, inspiring more suitable strategies that markedly suppress the inherent artifacts in M-TAI.

AU Sogancioglu, Ecem van Ginneken, Bram Behrendt, Finn Bengs, Marcel Schlaefer, Alexander Radu, Miron Xu, Di Sheng, Ke Scalzo, Fabien Marcus, Eric Papa, Samuele Teuwen, Jonas Scholten, Ernst Th. Schalekamp, Steven Hendrix, Nils Jacobs, Colin Hendrix, Ward Sanchez, Clara I. Murphy, Keelin

Nodule Detection and Generation on Chest X-Rays: NODE21 Challenge

Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of methods for this task. To address this, we organized a public research challenge, NODE21, aimed at the detection and generation of lung nodules in chest X-rays. While the detection track assesses state-of-the-art nodule detection systems, the generation track determines the utility of nodule generation algorithms to augment training data and hence improve the performance of the detection systems. This paper summarizes the results of the NODE21 challenge and performs extensive additional experiments to examine the impact of the synthetically generated nodule training images on the detection algorithm performance.

AU Wu, Ruoyou Li, Cheng Zou, Juan Liu, Xinfeng Zheng, Hairong Wang, Shanshan

Generalizable Reconstruction for Accelerating MR Imaging via Federated Learning with Neural Architecture Search.

Heterogeneous data captured by different scanning devices and imaging protocols can affect the generalization performance of the deep learning magnetic resonance (MR) reconstruction model. While a centralized training model is effective in mitigating this problem, it raises concerns about privacy protection. Federated learning is a distributed training paradigm that can utilize multi-institutional data for collaborative training without sharing data. However, existing federated learning MR image reconstruction methods rely on models designed manually by experts, which are complex and computationally expensive, suffering from performance degradation when facing heterogeneous data distributions. In addition, these methods give inadequate consideration to fairness issues, namely ensuring that the model's training does not introduce bias towards any specific dataset's distribution. To this end, this paper proposes a generalizable federated neural architecture search framework for accelerating MR imaging (GAutoMRI). Specifically, automatic neural architecture search is investigated for effective and efficient neural network representation learning of MR images from different centers. Furthermore, we design a fairness adjustment approach that can enable the model to learn features fairly from inconsistent distributions of different devices and centers, and thus facilitate the model to generalize well to the unseen center. Extensive experiments show that our proposed GAutoMRI has better performances and generalization ability compared with seven state-of-the-art federated learning methods. Moreover, the GAutoMRI model is significantly more lightweight, making it an efficient choice for MR image reconstruction tasks. The code will be made available at https://github.com/ternencewu123/GAutoMRI.
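
A minimal sketch of the federated-averaging step that a framework like this builds on: each center trains locally and a server aggregates parameters weighted by local data size. The fairness adjustment and neural architecture search of GAutoMRI are not reproduced here; fedavg and the toy models are illustrative.

from collections import OrderedDict
import torch

def fedavg(states: list, sizes: list) -> OrderedDict:
    # states: per-center state_dicts; sizes: per-center sample counts.
    total = float(sum(sizes))
    out = OrderedDict()
    for key in states[0]:
        out[key] = sum(sd[key].float() * (n / total)
                       for sd, n in zip(states, sizes))
    return out

centers = [torch.nn.Linear(4, 2).state_dict() for _ in range(3)]
global_state = fedavg(centers, sizes=[120, 80, 50])
model = torch.nn.Linear(4, 2)
model.load_state_dict(global_state)  # broadcast the aggregate back to every center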

AU Zhou, Chengfeng Wang, Jun Xiang, Suncheng Liu, Feng Huang, Hefeng Qian, Dahong

A Simple Normalization Technique Using Window Statistics to Improve the Out-of-Distribution Generalization on Medical Images

Since data scarcity and data heterogeneity are prevalent in medical images, well-trained Convolutional Neural Networks (CNNs) using previous normalization methods may perform poorly when deployed to a new site. However, a reliable model for real-world clinical applications should generalize well both on in-distribution (IND) and out-of-distribution (OOD) data (e.g., data from a new site). In this study, we present a novel normalization technique called window normalization (WIN) to improve model generalization on heterogeneous medical images, which offers a simple yet effective alternative to existing normalization methods. Specifically, WIN perturbs the normalizing statistics with the local statistics computed within a window. This feature-level augmentation technique regularizes the models well and improves their OOD generalization significantly. Leveraging its advantage, we propose a novel self-distillation method called WIN-WIN. WIN-WIN can be easily implemented with two forward passes and a consistency constraint, serving as a simple extension to existing methods. Extensive experimental results on various tasks (6 tasks) and datasets (24 datasets) demonstrate the generality and effectiveness of our methods.
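
A minimal sketch of window normalization as the abstract describes it: features are normalized with statistics computed inside a local window (here via average pooling), blended with global instance statistics as the perturbation. The window size and blend ratio are illustrative assumptions.

import torch
import torch.nn.functional as F

def window_norm(x: torch.Tensor, window: int = 7, mix: float = 0.5,
                eps: float = 1e-5) -> torch.Tensor:
    # x: (N, C, H, W). Local moments from a sliding window (odd window keeps size).
    pad = window // 2
    local_mean = F.avg_pool2d(x, window, stride=1, padding=pad)
    local_var = F.avg_pool2d(x * x, window, stride=1, padding=pad) - local_mean ** 2
    # Perturb the global (instance) statistics with the local ones.
    g_mean = x.mean(dim=(2, 3), keepdim=True)
    g_var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
    mean = mix * local_mean + (1 - mix) * g_mean
    var = mix * local_var + (1 - mix) * g_var
    return (x - mean) / torch.sqrt(var.clamp_min(0) + eps)

print(window_norm(torch.randn(2, 8, 32, 32)).shape)  # torch.Size([2, 8, 32, 32])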

AU Martinez-Sanchez, Antonio Lamm, Lorenz Jasnin, Marion Phelippeau, Harold

Simulating the cellular context in synthetic datasets for cryo-electron tomography.

Cryo-electron tomography (cryo-ET) allows the visualization of the cellular context at the macromolecular level. To date, the impossibility of obtaining a reliable ground truth has limited the application of deep learning-based image processing algorithms in this field. As a consequence, there is a growing demand for realistic synthetic datasets for training deep learning algorithms. In addition, besides assisting the acquisition and interpretation of experimental data, synthetic tomograms are used as reference models for cellular organization analysis from cellular tomograms. Current simulators in cryo-ET focus on reproducing distortions from image acquisition and tomogram reconstruction; however, they cannot generate many of the low-order features present in cellular tomograms. Here we propose several geometric and organization models to simulate low-order cellular structures imaged by cryo-ET: specifically, clusters of any known cytosolic or membrane-bound macromolecules, membranes with different geometries, and different filamentous structures such as microtubules or actin-like networks. Moreover, we use parametrizable stochastic models to generate a high diversity of geometries and organizations to simulate representative and generalized datasets, including very crowded environments like those observed in native cells. These models have been implemented in a multiplatform open-source Python package, including scripts to generate cryo-tomograms with adjustable sizes and resolutions. In addition to the ground truth, these scripts also provide distortion-free density maps in different file formats for efficient access and advanced visualization. We show that such a realistic synthetic dataset can be readily used to train generalizable deep learning algorithms.

AU Li, Zilong Gao, Qi Wu, Yaping Niu, Chuang Zhang, Junping Wang, Meiyun Wang, Ge Shan, Hongming

Quad-Net: Quad-Domain Network for CT Metal Artifact Reduction

Metal implants and other high-density objects in patients introduce severe streaking artifacts in CT images, compromising image quality and diagnostic performance. Although various methods were developed for CT metal artifact reduction over the past decades, including the latest dual-domain deep networks, remaining metal artifacts are still clinically challenging in many cases. Here we extend the state-of-the-art dual-domain deep network approach into a quad-domain counterpart so that all the features in the sinogram, image, and their corresponding Fourier domains are synergized to eliminate metal artifacts optimally without compromising structural subtleties. Our proposed quad-domain network for MAR, referred to as Quad-Net, takes little additional computational cost since the Fourier transform is highly efficient, and works across the four receptive fields to learn both global and local features as well as their relations. Specifically, we first design a Sinogram-Fourier Restoration Network (SFR-Net) in the sinogram domain and its Fourier space to faithfully inpaint metal-corrupted traces. Then, we couple SFR-Net with an Image-Fourier Refinement Network (IFR-Net) which takes both an image and its Fourier spectrum to improve a CT image reconstructed from the SFR-Net output using cross-domain contextual information. Quad-Net is trained on clinical datasets to minimize a composite loss function. Quad-Net does not require precise metal masks, which is of great importance in clinical practice. Our experimental results demonstrate the superiority of Quad-Net over the state-of-the-art MAR methods quantitatively, visually, and statistically. The Quad-Net code is publicly available at https://github.com/longzilicart/Quad-Net.

AU Vray, Guillaume Tomar, Devavrat Bozorgtabar, Behzad Thiran, Jean-Philippe

Distill-SODA: Distilling Self-Supervised Vision Transformer for Source-Free Open-Set Domain Adaptation in Computational Pathology

Developing computational pathology models is essential for reducing manual tissue typing from whole slide images, transferring knowledge from the source domain to an unlabeled, shifted target domain, and identifying unseen categories. We propose a practical setting by addressing the above-mentioned challenges in one fell swoop, i.e., source-free open-set domain adaptation. Our methodology focuses on adapting a pre-trained source model to an unlabeled target dataset and encompasses both closed-set and open-set classes. Beyond addressing the semantic shift of unknown classes, our framework also deals with a covariate shift, which manifests as variations in color appearance between source and target tissue samples. Our method hinges on distilling knowledge from a self-supervised vision transformer (ViT), drawing guidance from either robustly pre-trained transformer models or histopathology datasets, including those from the target domain. In pursuit of this, we introduce a novel style-based adversarial data augmentation, serving as hard positives for self-training a ViT, resulting in highly contextualized embeddings. Following this, we cluster semantically akin target images, with the source model offering weak pseudo-labels, albeit with uncertain confidence. To enhance this process, we present the closed-set affinity score (CSAS), aiming to correct the confidence levels of these pseudo-labels and to calculate weighted class prototypes within the contextualized embedding space. Our approach establishes itself as state-of-the-art across three public histopathological datasets for colorectal cancer assessment. Notably, our self-training method seamlessly integrates with open-set detection methods, resulting in enhanced performance in both closed-set and open-set recognition tasks.

AU Gui, Shuangchun Wang, Zhenkun Chen, Jixiang Zhou, Xun Zhang, Chen Cao, Yi

MT4MTL-KD: A Multi-Teacher Knowledge Distillation Framework for Triplet Recognition

The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split.
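
A minimal sketch of a multi-teacher distillation objective in the spirit of MT4MTL-KD: each sub-task head of the student is softened toward the teacher trained on that sub-task, alongside the usual supervised loss. The temperature, weighting, and head names are illustrative assumptions.

import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
            temperature: float = 2.0) -> torch.Tensor:
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

def multi_task_kd(student_out: dict, teacher_out: dict, labels: dict,
                  alpha: float = 0.5) -> torch.Tensor:
    total = torch.zeros(())
    for task in ("instrument", "verb", "target"):   # the triplet sub-tasks
        ce = F.cross_entropy(student_out[task], labels[task])
        total = total + (1 - alpha) * ce + alpha * kd_loss(
            student_out[task], teacher_out[task])   # per-task teacher guidance
    return total

s = {k: torch.randn(4, 10, requires_grad=True) for k in ("instrument", "verb", "target")}
t = {k: torch.randn(4, 10) for k in s}
y = {k: torch.randint(0, 10, (4,)) for k in s}
print(multi_task_kd(s, t, y).item())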

AU Ma, Kai Wen, Xuyun Zhu, Qi Zhang, Daoqiang

Ordinal Pattern Tree: A New Representation Method for Brain Network Analysis

Brain networks, describing the functional or structural interactions of brain with graph theory, have been widely used for brain imaging analysis. Currently, several network representation methods have been developed for describing and analyzing brain networks. However, most of these methods ignored the valuable weighted information of the edges in brain networks. In this paper, we propose a new representation method (i.e., ordinal pattern tree) for brain network analysis. Compared with the existing network representation methods, the proposed ordinal pattern tree (OPT) can not only leverage the weighted information of the edges but also express the hierarchical relationships of nodes in brain networks. On OPT, nodes are connected by ordinal edges which are constructed by using the ordinal pattern relationships of weighted edges. We represent brain networks as OPTs and further develop a new graph kernel called optimal transport (OT) based ordinal pattern tree (OT-OPT) kernel to measure the similarity between paired brain networks. In OT-OPT kernel, the OT distances are used to calculate the transport costs between the nodes on the OPTs. Based on these OT distances, we use exponential function to calculate OT-OPT kernel which is proved to be positive definite. To evaluate the effectiveness of the proposed method, we perform classification and regression experiments on ADHD-200, ABIDE and ADNI datasets. The experimental results demonstrate that our proposed method outperforms the state-of-the-art graph methods in the classification and regression tasks.
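
The kernel construction can be sketched compactly: an optimal transport distance is turned into a similarity via an exponential, k(x, y) = exp(-gamma * d_OT(x, y)). For brevity the sketch below uses scipy's 1D Wasserstein distance on edge-weight distributions rather than the paper's tree-structured transport between OPT nodes; it is an illustration, not the authors' kernel.

import numpy as np
from scipy.stats import wasserstein_distance

def ot_kernel(weights_a: np.ndarray, weights_b: np.ndarray,
              gamma: float = 1.0) -> float:
    d = wasserstein_distance(weights_a, weights_b)  # 1D OT distance
    return float(np.exp(-gamma * d))

# Toy brain networks summarized by their edge-weight distributions:
net_a = np.random.default_rng(1).random(100)
net_b = np.random.default_rng(2).random(100) * 0.8
print(ot_kernel(net_a, net_b))  # close to 1 when the networks are similar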

AU Li, Zihan Li, Yunxiang Li, Qingde Wang, Puyang Guo, Dazhou Lu, Le Jin, Dakai Zhang, You Hong, Qingqi

LViT: Language Meets Vision Transformer in Medical Image Segmentation

Deep learning has been widely used in medical image segmentation and other aspects. However, the performance of existing medical image segmentation models has been limited by the challenge of obtaining sufficient high-quality labeled data due to the prohibitive data annotation cost. To alleviate this limitation, we propose a new text-augmented medical image segmentation model LViT (Language meets Vision Transformer). In our LViT model, medical text annotation is incorporated to compensate for the quality deficiency in image data. In addition, the text information can guide to generate pseudo labels of improved quality in the semi-supervised learning. We also propose an Exponential Pseudo label Iteration mechanism (EPI) to help the Pixel-Level Attention Module (PLAM) preserve local image features in semi-supervised LViT setting. In our model, LV (Language-Vision) loss is designed to supervise the training of unlabeled images using text information directly. For evaluation, we construct three multimodal medical segmentation datasets (image + text) containing X-rays and CT images. Experimental results show that our proposed LViT has superior segmentation performance in both fully-supervised and semi-supervised setting. The code and datasets are available at https://github.com/HUANGLIZI/LViT.
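
A hedged sketch of what an exponential pseudo-label iteration could look like, inferred from the mechanism's name only: pseudo labels for unlabeled images are updated as an exponential moving average of successive predictions, which damps single-epoch noise. The momentum value and thresholding-free usage are illustrative assumptions, not the LViT implementation.

import torch

def epi_update(prev_pseudo: torch.Tensor, new_pred: torch.Tensor,
               momentum: float = 0.9) -> torch.Tensor:
    # prev_pseudo, new_pred: (N, C, H, W) class probabilities
    return momentum * prev_pseudo + (1 - momentum) * new_pred

pseudo = torch.full((1, 2, 8, 8), 0.5)    # initially uncertain pseudo label
for _ in range(5):
    pred = torch.rand(1, 2, 8, 8).softmax(dim=1)
    pseudo = epi_update(pseudo, pred)     # smooth across training iterations
hard_label = pseudo.argmax(dim=1)         # would supervise the unlabeled data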

AU Zhang, Yinglin Xi, Ruiling Zeng, Lingxi Towey, Dave Bai, Ruibin Higashita, Risa Liu, Jiang

Structural Priors Guided Network for the Corneal Endothelial Cell Segmentation

The segmentation of blurred cell boundaries in corneal endothelium microscope images is challenging, which affects the accuracy of clinical parameter estimation. Existing deep learning methods only consider pixel-wise classification accuracy and make little use of knowledge about cell structure. As a result, segmentation of blurred cell boundaries is discontinuous. This paper proposes a structural prior guided network (SPG-Net) for corneal endothelial cell segmentation. We first employ a hybrid transformer-convolution backbone to capture more global context. Then, we use a Feature Enhancement (FE) module to improve the representation ability of features and a Local Affinity-based Feature Fusion (LAFF) module to propagate structural information among hierarchical features. Finally, we introduce a joint loss based on cross entropy and the structural similarity index measure (SSIM) to supervise the training process at both the pixel and structure levels. We compare SPG-Net with various state-of-the-art methods on four corneal endothelial datasets. The experimental results suggest that SPG-Net can alleviate the problem of discontinuous cell boundary segmentation and balance pixel-wise accuracy and structure preservation. We also evaluate the agreement of parameter estimation between the ground truth and the prediction of SPG-Net. The statistical analysis results show good agreement and correlation.
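
A minimal sketch of the joint pixel- and structure-level objective the abstract describes: cross entropy plus (1 - SSIM) on the predicted foreground probability. The uniform-window SSIM and the loss weighting below are simplifications of the usual Gaussian-window formulation.

import torch
import torch.nn.functional as F

def ssim(x: torch.Tensor, y: torch.Tensor, window: int = 11,
         c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, 1, pad)
    mu_y = F.avg_pool2d(y, window, 1, pad)
    var_x = F.avg_pool2d(x * x, window, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, 1, pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, window, 1, pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def joint_loss(logits: torch.Tensor, target: torch.Tensor,
               lam: float = 0.5) -> torch.Tensor:
    ce = F.cross_entropy(logits, target)             # pixel-level term
    prob_fg = logits.softmax(dim=1)[:, 1:2]          # foreground probability
    structural = 1 - ssim(prob_fg, target.unsqueeze(1).float())
    return ce + lam * structural                     # structure-level term

logits = torch.randn(2, 2, 64, 64, requires_grad=True)
target = torch.randint(0, 2, (2, 64, 64))
print(joint_loss(logits, target).item())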

AU Liu, Zhentao Fang, Yu Li, Changjian Wu, Han Liu, Yuan Shen, Dinggang Cui, Zhiming

Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction.

Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image, leading to considerable radiation exposure. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. While recent advances, including deep learning and neural rendering algorithms, have made strides in this area, these methods either produce unsatisfactory results or suffer from time inefficiency of individual optimization. In this paper, we introduce a novel geometry-aware encoder-decoder framework to solve this problem. Our framework starts by encoding multi-view 2D features from various 2D X-ray projections with a 2D CNN encoder. Leveraging the geometry of CBCT scanning, it then back-projects the multi-view 2D features into the 3D space to formulate a comprehensive volumetric feature map, followed by a 3D CNN decoder to recover 3D CBCT image. Importantly, our approach respects the geometric relationship between 3D CBCT image and its 2D X-ray projections during feature back projection stage, and enjoys the prior knowledge learned from the data population. This ensures its adaptability in dealing with extremely sparse view inputs without individual training, such as scenarios with only 5 or 10 X-ray projections. Extensive evaluations on two simulated datasets and one real-world dataset demonstrate exceptional reconstruction quality and time efficiency of our method.

AU Xu, Shicheng Li, Wei Li, Zuoyong Zhao, Tiesong Zhang, Bob

Facing Differences of Similarity: Intra- and Inter-Correlation Unsupervised Learning for Chest X-Ray Anomaly Detection.

Anomaly detection can significantly aid doctors in interpreting chest X-rays. The commonly used strategy involves utilizing a pre-trained network to extract features from normal data to establish feature representations. However, when a pre-trained network is applied to more detailed X-rays, differences of similarity can limit the robustness of these feature representations. Therefore, we propose an intra- and inter-correlation learning framework for chest X-ray anomaly detection. Firstly, to better leverage the similar anatomical structure information in chest X-rays, we introduce the Anatomical-Feature Pyramid Fusion Module for feature fusion. This module aims to obtain fusion features with both local details and global contextual information. These fusion features are initialized by a trainable feature mapper and stored in a feature bank to serve as centers for learning. Furthermore, to address the Facing Differences of Similarity (FDS) problem introduced by the pre-trained network, we propose an intra- and inter-correlation learning strategy: (1) We use intra-correlation learning to establish intra-correlation between mapped features of individual images and semantic centers, thereby initially discovering lesions; (2) We employ inter-correlation learning to establish inter-correlation between mapped features of different images, further mitigating the differences of similarity introduced by the pre-trained network, and achieving effective detection results even in diverse chest disease environments. Finally, a comparison with 18 state-of-the-art methods on three datasets demonstrates the superiority and effectiveness of the proposed method across various scenarios.

AU Song, Haofei Mao, Xintian Yu, Jing Li, Qingli Wang, Yan

I(3)Net: Inter-Intra-Slice Interpolation Network for Medical Slice Synthesis

Medical imaging is limited by acquisition time and scanning equipment. CT and MR volumes reconstructed with thicker slices are anisotropic, with high in-plane resolution and low through-plane resolution. We reveal an intriguing phenomenon: owing to this nature of the data, performing slice-wise interpolation from the axial view can yield greater benefits than performing super-resolution from other views. Based on this observation, we propose an Inter-Intra-slice Interpolation Network (I(3)Net), which fully explores information from the high in-plane resolution and compensates for the low through-plane resolution. The through-plane branch supplements the limited information contained in the low through-plane resolution from the high in-plane resolution and enables continual and diverse feature learning. The in-plane branch transforms features to the frequency domain and enforces an equal learning opportunity for all frequency bands in a global context learning paradigm. We further propose a cross-view block to take advantage of the information from all three views online. Extensive experiments on two public datasets demonstrate the effectiveness of I(3)Net, which outperforms state-of-the-art super-resolution, video frame interpolation and slice interpolation methods by a large margin. We achieve 43.90dB in PSNR, with at least a 1.14dB improvement under an upscale factor of x2 on the MSD dataset, with faster inference. Code is available at https://github.com/DeepMedLab-ECNU/Medical-Image-Reconstruction.

AU Miller, David A. Grannonico, Marta Liu, Mingna Savier, Elise McHaney, Kara Erisir, Alev Netland, Peter A. Cang, Jianhua Liu, Xiaorong Zhang, Hao F.

Visible-Light Optical Coherence Tomography Fibergraphy of the Tree Shrew Retinal Ganglion Cell Axon Bundles

We seek to develop techniques for high-resolution imaging of the tree shrew retina for visualizing and parameterizing retinal ganglion cell (RGC) axon bundles in vivo. We applied visible-light optical coherence tomography fibergraphy (vis-OCTF) and temporal speckle averaging (TSA) to visualize individual RGC axon bundles in the tree shrew retina. For the first time, we quantified individual RGC bundle width, height, and cross-sectional area and applied vis-OCT angiography (vis-OCTA) to visualize the retinal microvasculature in tree shrews. Throughout the retina, as the distance from the optic nerve head (ONH) increased from 0.5 mm to 2.5 mm, bundle width increased by 30%, height decreased by 67%, and cross-sectional area decreased by 36%. We also showed that axon bundles become vertically elongated as they converge toward the ONH. Ex vivo confocal microscopy of retinal flat-mounts immunostained with Tuj1 confirmed our in vivo vis-OCTF findings.

AU Zhang, Yuhan Ma, Xiao Huang, Kun Li, Mingchao Heng, Pheng-Ann

Semantic-Oriented Visual Prompt Learning for Diabetic Retinopathy Grading on Fundus Images

Diabetic retinopathy (DR) is a serious ocular condition that requires effective monitoring and treatment by ophthalmologists. However, constructing a reliable DR grading model remains a challenging and costly task, heavily reliant on high-quality training sets and adequate hardware resources. In this paper, we investigate the knowledge transferability of large-scale pre-trained models (LPMs) to fundus images based on prompt learning to construct a DR grading model efficiently. Unlike full-tuning which fine-tunes all parameters of LPMs, prompt learning only involves a minimal number of additional learnable parameters while achieving a competitive effect as full-tuning. Inspired by visual prompt tuning, we propose Semantic-oriented Visual Prompt Learning (SVPL) to enhance the semantic perception ability for better extracting task-specific knowledge from LPMs, without any additional annotations. Specifically, SVPL assigns a group of learnable prompts for each DR level to fit the complex pathological manifestations and then aligns each prompt group to task-specific semantic space via a contrastive group alignment (CGA) module. We also propose a plug-and-play adapter module, Hierarchical Semantic Delivery (HSD), which allows the semantic transition of prompt groups from shallow to deep layers to facilitate efficient knowledge mining and model convergence. Our extensive experiments on three public DR grading datasets demonstrate that SVPL achieves superior results compared to other transfer tuning and DR grading methods. Further analysis suggests that the generalized knowledge from LPMs is advantageous for constructing the DR grading model on fundus images.

AU Liu, Aohan Guo, Yuchen Yong, Jun-Hai Xu, Feng

Multi-Grained Radiology Report Generation With Sentence-Level Image-Language Contrastive Learning

The automatic generation of accurate radiology reports is of great clinical importance and has drawn growing research interest. However, it is still a challenging task due to the imbalance between normal and abnormal descriptions and the multi-sentence and multi-topic nature of radiology reports. These features result in significant challenges to generating accurate descriptions for medical images, especially the important abnormal findings. Previous methods to tackle these problems rely heavily on extra manual annotations, which are expensive to acquire. We propose a multi-grained report generation framework incorporating sentence-level image-sentence contrastive learning, which does not require any extra labeling but effectively learns knowledge from the image-report pairs. We first introduce contrastive learning as an auxiliary task for image feature learning. Different from previous contrastive methods, we exploit the multi-topic nature of imaging reports and perform fine-grained contrastive learning by extracting sentence topics and contents and contrasting between sentence contents and refined image contents guided by sentence topics. This forces the model to learn distinct abnormal image features for each specific topic. During generation, we use two decoders to first generate coarse sentence topics and then the fine-grained text of each sentence. We directly supervise the intermediate topics using sentence topics learned by our contrastive objective. This strengthens the generation constraint and enables independent fine-tuning of the decoders using reinforcement learning, which further boosts model performance. Experiments on two large-scale datasets MIMIC-CXR and IU-Xray demonstrate that our approach outperforms existing state-of-the-art methods, evaluated by both language generation metrics and clinical accuracy.
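
To make the sentence-level contrast concrete, here is a minimal, hedged sketch of an in-batch InfoNCE objective between matched sentence-content and topic-refined image-content embeddings; the symmetric form and the function name sentence_image_infonce are illustrative assumptions, not the paper's exact loss.

```python
# Minimal sketch of a symmetric in-batch InfoNCE loss between matched
# sentence-content and topic-refined image-content embeddings (illustrative).
import torch
import torch.nn.functional as F

def sentence_image_infonce(sent_emb, img_emb, tau=0.1):
    # sent_emb, img_emb: [N, D]; row i of each side forms a positive pair.
    s = F.normalize(sent_emb, dim=-1)
    v = F.normalize(img_emb, dim=-1)
    logits = s @ v.t() / tau                      # [N, N] similarity matrix
    targets = torch.arange(len(s))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = sentence_image_infonce(torch.randn(16, 256), torch.randn(16, 256))
```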

AU Shi, Yongyi Xia, Wenjun Wang, Ge Mou, Xuanqin

Blind CT Image Quality Assessment Using DDPM-derived Content and Transformer-based Evaluator.

Lowering radiation dose per view and utilizing sparse views per scan are two common CT scan modes, albeit often leading to distorted images characterized by noise and streak artifacts. Blind image quality assessment (BIQA) strives to evaluate perceptual quality in alignment with what radiologists perceive, which plays an important role in advancing low-dose CT reconstruction techniques. An intriguing direction involves developing BIQA methods that mimic the operational characteristic of the human visual system (HVS). The internal generative mechanism (IGM) theory reveals that the HVS actively deduces primary content to enhance comprehension. In this study, we introduce an innovative BIQA metric that emulates the active inference process of IGM. Initially, an active inference module, implemented as a denoising diffusion probabilistic model (DDPM), is constructed to anticipate the primary content. Then, the dissimilarity map is derived by assessing the interrelation between the distorted image and its primary content. Subsequently, the distorted image and dissimilarity map are combined into a multi-channel image, which is inputted into a transformer-based image quality evaluator. By leveraging the DDPM-derived primary content, our approach achieves competitive performance on a low-dose CT dataset.
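
The following toy sketch illustrates the pipeline step described above: derive a dissimilarity map between the distorted image and its DDPM-predicted primary content, then stack the two into a multi-channel input for the evaluator. The gradient-magnitude similarity deviation used here is one plausible choice of interrelation measure, not necessarily the paper's.

```python
# Toy dissimilarity map between a distorted CT slice and its DDPM-predicted
# primary content, stacked into a multi-channel input for the evaluator.
import numpy as np

def dissimilarity_map(distorted, primary, eps=1e-6):
    # Gradient-magnitude similarity deviation (one plausible choice).
    def grad_mag(img):
        gy, gx = np.gradient(img)
        return np.hypot(gx, gy)
    g1, g2 = grad_mag(distorted), grad_mag(primary)
    sim = (2.0 * g1 * g2 + eps) / (g1 ** 2 + g2 ** 2 + eps)  # in (0, 1]
    return 1.0 - sim                                          # high = dissimilar

distorted = np.random.rand(64, 64)
primary = distorted + 0.05 * np.random.randn(64, 64)
multi_channel = np.stack([distorted, dissimilarity_map(distorted, primary)])
print(multi_channel.shape)  # (2, 64, 64), fed to the transformer-based evaluator
```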

AU Xia, Jinqiu Zhou, Yiwen Deng, Wenxin Kang, Jing Wu, Wangjiang Qi, Mengke Zhou, Linghong Ma, Jianhui Xu, Yuan

PND-Net: Physics-Inspired Non-Local Dual-Domain Network for Metal Artifact Reduction

Metal artifacts caused by the presence of metallic implants tremendously degrade the quality of reconstructed computed tomography (CT) images and therefore affect the clinical diagnosis or reduce the accuracy of organ delineation and dose calculation in radiotherapy. Although various deep learning methods have been proposed for metal artifact reduction (MAR), most of them aim to restore the corrupted sinogram within the metal trace, which removes beam hardening artifacts but ignores other components of metal artifacts. In this paper, based on the physical property of metal artifacts which is verified via Monte Carlo (MC) simulation, we propose a novel physics-inspired non-local dual-domain network (PND-Net) for MAR in CT imaging. Specifically, we design a novel non-local sinogram decomposition network (NSD-Net) to acquire the weighted artifact component and develop an image restoration network (IR-Net) to reduce the residual and secondary artifacts in the image domain. To facilitate the generalization and robustness of our method on clinical CT images, we employ a trainable fusion network (F-Net) in the artifact synthesis path to achieve unpaired learning. Furthermore, we design an internal consistency loss to ensure the data fidelity of anatomical structures in the image domain and introduce the linear interpolation sinogram as prior knowledge to guide sinogram decomposition. NSD-Net, IR-Net, and F-Net are jointly trained so that they can benefit from one another. Extensive experiments on simulation and clinical data demonstrate that our method outperforms state-of-the-art MAR methods.
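
For readers unfamiliar with the linear-interpolation prior mentioned above, a minimal NumPy sketch of the classical LI sinogram completion (interpolating corrupted detector bins inside the metal trace, row by row) follows; array shapes are illustrative.

```python
# Classical linear-interpolation (LI) sinogram completion used as prior
# knowledge: corrupted bins inside the metal trace are re-estimated row by row.
import numpy as np

def linear_interp_sinogram(sinogram, metal_trace):
    # sinogram: [n_angles, n_bins]; metal_trace: boolean mask of corrupted bins
    out = sinogram.copy()
    bins = np.arange(sinogram.shape[1])
    for a in range(sinogram.shape[0]):
        bad = metal_trace[a]
        if bad.any() and (~bad).any():
            out[a, bad] = np.interp(bins[bad], bins[~bad], sinogram[a, ~bad])
    return out

sino = np.random.rand(180, 256)
trace = np.zeros_like(sino, dtype=bool)
trace[:, 120:136] = True                 # toy metal trace
sino_li = linear_interp_sinogram(sino, trace)
```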

AU Lin, Jingyin Xie, Wende Kang, Li Wu, Huisi

Dynamic-guided Spatiotemporal Attention for Echocardiography Video Segmentation.

Left ventricle (LV) endocardium segmentation in echocardiography video has received much attention as an important step in quantifying LV ejection fraction. Most existing methods are dedicated to exploiting temporal information on top of 2D convolutional networks. In addition to single appearance semantic learning, some research attempted to introduce motion cues through the optical flow estimation (OFE) task to enhance temporal consistency modeling. However, OFE in these methods is tightly coupled to LV endocardium segmentation, resulting in noisy inter-frame flow prediction, and post-optimization based on these flows accumulates errors. To address these drawbacks, we propose dynamic-guided spatiotemporal attention (DSA) for semi-supervised echocardiography video segmentation. We first fine-tune the off-the-shelf OFE network RAFT on echocardiography data to provide dynamic information. Taking inter-frame flows as additional input, we use a dual-encoder structure to extract motion and appearance features separately. Based on the connection between dynamic continuity and semantic consistency, we propose a bilateral feature calibration module to enhance both features. For temporal consistency modeling, DSA is proposed to aggregate neighboring frame context using deformable attention realized by offset grid attention. Dynamic information is introduced into DSA through a bilateral offset estimation module to effectively combine with appearance semantics and predict attention offsets, thereby guiding semantic-based spatiotemporal attention. We evaluated our method on two popular echocardiography datasets, CAMUS and EchoNet-Dynamic, and achieved state-of-the-art performance.

AU Mandot, Shubham Zannoni, Elena M. Cai, Ling Nie, Xingchen La Riviere, Patrick J. Wilson, Matthew D. Meng, Ling Jian

A High-Sensitivity Benchtop X-Ray Fluorescence Emission Tomography (XFET) System With a Full-Ring of X-Ray Imaging-Spectrometers and a Compound-Eye Collimation Aperture

The advent of metal-based drugs and metal nanoparticles as therapeutic agents in anti-tumor treatment has motivated the advancement of X-ray fluorescence computed tomography (XFCT) techniques. An XFCT imaging modality can detect, quantify, and image the biodistribution of metal elements using the X-ray fluorescence signal emitted upon X-ray irradiation. However, the majority of XFCT imaging systems and instrumentation developed so far rely on a single or a small number of detectors. This work introduces the first full-ring benchtop X-ray fluorescence emission tomography (XFET) system, equipped with 24 solid-state detectors arranged in a hexagonal geometry and a 96-pinhole compound-eye collimator. We experimentally demonstrate the system's sensitivity and its capability for multi-element detection and quantification by performing imaging studies on an animal-sized phantom. In our preliminary studies, the phantom was irradiated with a pencil beam of X-rays produced using a low-powered polychromatic X-ray source (90 kVp, 60 W maximum power). This investigation shows a significant improvement in the detection limit for gadolinium, down to a concentration as low as 0.1 mg/mL. The results also illustrate the unique capability of the XFET system to simultaneously determine the spatial distribution and accurately quantify the concentrations of multiple metal elements.

AU Pang, Yan Liang, Jiaming Huang, Teng Chen, Hao Li, Yunhao Li, Dan Huang, Lin Wang, Qiong

Slim UNETR: Scale Hybrid Transformers to Efficient 3D Medical Image Segmentation Under Limited Computational Resources

Hybrid transformer-based segmentation approaches have shown great promise in medical image analysis. However, they typically require considerable computational power and resources during both training and inference stages, posing a challenge for resource-limited medical applications common in the field. To address this issue, we present an innovative framework called Slim UNETR, designed to achieve a balance between accuracy and efficiency by leveraging the advantages of both convolutional neural networks and transformers. Our method features the Slim UNETR Block as a core component, which effectively enables information exchange through self-attention mechanism decomposition and cost-effective representation aggregation. Additionally, we utilize the throughput metric as an efficiency indicator to provide feedback on model resource consumption. Our experiments demonstrate that Slim UNETR outperforms state-of-the-art models in terms of accuracy, model size, and efficiency when deployed on resource-constrained devices. Remarkably, Slim UNETR achieves 92.44% dice accuracy on BraTS2021 while being 34.6x smaller and 13.4x faster during inference compared to Swin UNETR.
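
As a hedged illustration of the throughput metric used as an efficiency indicator, a minimal PyTorch timing routine might look as follows; the input shape and the stand-in model are assumptions, not Slim UNETR itself.

```python
# Minimal throughput measurement (images per second) for a 3D model.
import time
import torch

@torch.no_grad()
def throughput(model, input_shape=(1, 4, 64, 64, 64), iters=10):
    x = torch.randn(*input_shape)
    model.eval()
    for _ in range(3):                     # warm-up passes
        model(x)
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    return iters * input_shape[0] / (time.perf_counter() - start)

# Stand-in model; Slim UNETR itself would be plugged in here instead.
print(throughput(torch.nn.Conv3d(4, 3, kernel_size=3, padding=1)))
```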

AU Park, Mi-Ae Zaha, Vlad G. Badawi, Ramsey D. Bowen, Spencer L.

Supplemental Transmission Aided Attenuation Correction for Quantitative Cardiac PET

Quantitative PET attenuation correction (AC) for cardiac PET/CT and PET/MR is a challenging problem. We propose and evaluate an AC approach that uses coincidences from a relatively weak and physically fixed sparse external source, in combination with those from the patient, to reconstruct μ-maps based on physics principles alone. The low 30 cm³ volume of the source makes it easy to fill and place, and the method does not use prior image data or attenuation map assumptions. Our supplemental transmission aided maximum likelihood reconstruction of attenuation and activity (sTX-MLAA) algorithm contains an attenuation map update that maximizes the likelihood of terms representing coincidences originating from tracer in the patient and a weighted expression of counts segmented from the external source alone. Both external source and patient scatter and randoms are fully corrected. We evaluated the performance of sTX-MLAA against reference-standard CT-based AC with FDG PET/CT phantom studies, including modeling a patient with myocardial inflammation. Through an ROI analysis we measured ≤5% bias in activity concentrations for PET images generated with sTX-MLAA and a TX source strength ≥12.7 MBq, relative to CT-AC. PET background variability (from noise and sparse sampling) was substantially reduced with sTX-MLAA compared to using counts segmented from the transmission source alone for AC. Results suggest that sTX-MLAA will enable quantitative PET during cardiac PET/CT and PET/MR of human patients.
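
As a schematic aside (not the authors' exact formulation), an MLAA-style objective with a supplemental transmission term can be written as

$$
\Phi(\lambda,\mu)=\sum_i\Big[y_i\log\bar{y}_i(\lambda,\mu)-\bar{y}_i(\lambda,\mu)\Big]+w\sum_i\Big[t_i\log\bar{t}_i(\mu)-\bar{t}_i(\mu)\Big],
\qquad
\bar{y}_i(\lambda,\mu)=e^{-[A\mu]_i}\,[P\lambda]_i+s_i+r_i,
$$

where $y_i$ are emission coincidences, $t_i$ are counts segmented from the external transmission source, $P$ and $A$ are forward projectors, $s_i$ and $r_i$ are scatter and randoms estimates, and $w$ weights the transmission term; alternating updates maximize $\Phi$ over the activity $\lambda$ and attenuation $\mu$. All symbols here are illustrative, not the paper's notation.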

AU Zhang, Yi Li, Jiayue Li, Xinyang Xie, Min Islam, Md. Tauhidul Zhang, Haixian

FAOT-Net: A 1.5-Stage Framework for 3D Pelvic Lymph Node Detection With Online Candidate Tuning

Accurate and automatic detection of pelvic lymph nodes in computed tomography (CT) scans is critical for diagnosing lymph node metastasis in colorectal cancer, which in turn plays a crucial role in staging, treatment planning, surgical guidance, and postoperative follow-up. However, achieving high detection sensitivity and specificity poses a challenge due to the small and variable sizes of these nodes, as well as the presence of numerous similar signals within the complex pelvic CT image. To tackle these issues, we propose a 3D feature-aware online-tuning network (FAOT-Net) that introduces a novel 1.5-stage structure to seamlessly integrate detection and refinement via our online candidate tuning process and takes advantage of multi-level information through the tailored feature flow. Furthermore, we redesign the anchor fitting and anchor matching strategies to further improve detection performance in a nearly hyperparameter-free manner. Our framework achieves an FROC score of 52.8 and a sensitivity of 91.7% at 16 false positives per scan on the PLNDataset.

AU Bian, Chenyuan Xia, Nan Xie, Anmu Cong, Shan Dong, Qian

Adversarially Trained Persistent Homology Based Graph Convolutional Network for Disease Identification Using Brain Connectivity

Brain disease propagation is associated with characteristic alterations in the structural and functional connectivity networks of the brain. To identify disease-specific network representations, graph convolutional networks (GCNs) have been used because of their powerful graph embedding ability to characterize the non-Euclidean structure of brain networks. However, existing GCNs generally focus on learning the discriminative region of interest (ROI) features, often ignoring important topological information that enables the integration of connectome patterns of brain activity. In addition, most methods fail to consider the vulnerability of GCNs to perturbations in network properties of the brain, which considerably degrades the reliability of diagnosis results. In this study, we propose an adversarially trained persistent homology-based graph convolutional network (ATPGCN) to capture disease-specific brain connectome patterns and classify brain diseases. First, the brain functional/structural connectivity is constructed using different neuroimaging modalities. Then, we develop a novel strategy that concatenates the persistent homology features from a brain algebraic topology analysis with readout features of the global pooling layer of a GCN model to collaboratively learn the individual-level representation. Finally, we simulate the adversarial perturbations by targeting the risk ROIs from clinical prior, and incorporate them into a training loop to evaluate the robustness of the model. The experimental results on three independent datasets demonstrate that ATPGCN outperforms existing classification methods in disease identification and is robust to minor perturbations in network architecture. Our code is available at https://github.com/CYB08/ATPGCN.
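
A minimal sketch (assumed shapes and names) of the concatenation strategy described above: a global readout of GCN node features joined with a precomputed persistent-homology feature vector before classification.

```python
# Sketch: concatenate a global GCN readout with a precomputed
# persistent-homology feature vector before classification (shapes assumed).
import torch
import torch.nn as nn

class PHReadoutClassifier(nn.Module):
    def __init__(self, node_dim=64, ph_dim=32, n_classes=2):
        super().__init__()
        self.head = nn.Linear(node_dim + ph_dim, n_classes)

    def forward(self, node_feats, ph_feats):
        # node_feats: [B, N, node_dim] GCN outputs over N ROIs
        # ph_feats:   [B, ph_dim] vectorized persistence features
        readout = node_feats.mean(dim=1)          # global mean pooling
        return self.head(torch.cat([readout, ph_feats], dim=-1))

model = PHReadoutClassifier()
logits = model(torch.randn(4, 90, 64), torch.randn(4, 32))  # e.g. 90-ROI atlas
```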

AU Ju, Lie Yu, Zhen Wang, Lin Zhao, Xin Wang, Xin Bonnington, Paul Ge, Zongyuan

Hierarchical Knowledge Guided Learning for Real-World Retinal Disease Recognition

In the real world, medical datasets often exhibit a long-tailed data distribution (i.e., a few classes occupy the majority of the data, while most classes have only a limited number of samples), which results in a challenging long-tailed learning scenario. Some recently published datasets in ophthalmology AI consist of more than 40 kinds of retinal diseases with complex abnormalities and variable morbidity. Nevertheless, more than 30 conditions are rarely seen in global patient cohorts. From a modeling perspective, most deep learning models trained on these datasets may lack the ability to generalize to rare diseases where only a few available samples are presented for training. In addition, there may be more than one disease for the presence of the retina, resulting in a challenging label co-occurrence scenario, also known as multi-label, which can cause problems when some re-sampling strategies are applied during training. To address the above two major challenges, this paper presents a novel method that enables the deep neural network to learn from a long-tailed fundus database for various retinal disease recognition. Firstly, we exploit the prior knowledge in ophthalmology to improve the feature representation using a hierarchy-aware pre-training. Secondly, we adopt an instance-wise class-balanced sampling strategy to address the label co-occurrence issue under the long-tailed medical dataset scenario. Thirdly, we introduce a novel hybrid knowledge distillation to train a less biased representation and classifier. We conducted extensive experiments on four databases, including two public datasets and two in-house databases with more than one million fundus images. The experimental results demonstrate the superiority of our proposed methods with recognition accuracy outperforming the state-of-the-art competitors, especially for these rare diseases.
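
A small NumPy sketch of one plausible instance-wise class-balanced sampler for multi-label data follows (the paper's exact weighting may differ, and all-negative rows get zero weight in this simplification): each image is weighted by the mean inverse frequency of its positive labels, so rare conditions are drawn more often.

```python
# One plausible instance-wise class-balanced sampler for multi-label data:
# weight each image by the mean inverse frequency of its positive labels.
import numpy as np

def instance_weights(label_matrix):
    # label_matrix: [N, C] binary multi-label matrix
    freq = label_matrix.sum(axis=0).clip(min=1)            # per-class counts
    per_label = label_matrix * (1.0 / freq)                # inverse frequency
    n_pos = label_matrix.sum(axis=1).clip(min=1)
    w = per_label.sum(axis=1) / n_pos                      # mean over positives
    return w / w.sum()                                     # sampling distribution

Y = (np.random.rand(1000, 40) < 0.05).astype(float)        # toy long-tailed labels
p = instance_weights(Y)
batch_idx = np.random.choice(len(Y), size=32, p=p)         # rare labels drawn more
```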

AU Lagogiannis, Ioannis Meissen, Felix Kaissis, Georgios Rueckert, Daniel

Unsupervised Pathology Detection: A Deep Dive Into the State of the Art

Deep unsupervised approaches are gathering increased attention for applications such as pathology detection and segmentation in medical images since they promise to alleviate the need for large labeled datasets and are more generalizable than their supervised counterparts in detecting any kind of rare pathology. As the Unsupervised Anomaly Detection (UAD) literature continuously grows and new paradigms emerge, it is vital to continuously evaluate and benchmark new methods in a common framework, in order to reassess the state-of-the-art (SOTA) and identify promising research directions. To this end, we evaluate a diverse selection of cutting-edge UAD methods on multiple medical datasets, comparing them against the established SOTA in UAD for brain MRI. Our experiments demonstrate that newly developed feature-modeling methods from the industrial and medical literature achieve increased performance compared to previous work and set the new SOTA in a variety of modalities and datasets. Additionally, we show that such methods are capable of benefiting from recently developed self-supervised pre-training algorithms, further increasing their performance. Finally, we perform a series of experiments in order to gain further insights into some unique characteristics of selected models and datasets. Our code can be found under https://github.com/iolag/UPD_study/.

AU Li, Jiajia Zhang, Pingping Wang, Teng Zhu, Lei Liu, Ruhan Yang, Xia Wang, Kaixuan Shen, Dinggang Sheng, Bin

DSMT-Net: Dual Self-Supervised Multi-Operator Transformation for Multi-Source Endoscopic Ultrasound Diagnosis

Pancreatic cancer has the worst prognosis of all cancers. The clinical application of endoscopic ultrasound (EUS) for the assessment of pancreatic cancer risk and of deep learning for the classification of EUS images have been hindered by inter-grader variability and labeling capability. One of the key reasons for these difficulties is that EUS images are obtained from multiple sources with varying resolutions, effective regions, and interference signals, making the distribution of the data highly variable and negatively impacting the performance of deep learning models. Additionally, manual labeling of images is time-consuming and requires significant effort, leading to the desire to effectively utilize a large amount of unlabeled data for network training. To address these challenges, this study proposes the Dual Self-supervised Multi-Operator Transformation Network (DSMT-Net) for multi-source EUS diagnosis. The DSMT-Net includes a multi-operator transformation approach to standardize the extraction of regions of interest in EUS images and eliminate irrelevant pixels. Furthermore, a transformer-based dual self-supervised network is designed to integrate unlabeled EUS images for pre-training the representation model, which can be transferred to supervised tasks such as classification, detection, and segmentation. A large-scale EUS-based pancreas image dataset (LEPset) has been collected, including 3,500 pathologically proven labeled EUS images (from pancreatic and non-pancreatic cancers) and 8,000 unlabeled EUS images for model development. The self-supervised method has also been applied to breast cancer diagnosis and was compared to state-of-the-art deep learning models on both datasets. The results demonstrate that the DSMT-Net significantly improves the accuracy of pancreatic and breast cancer diagnosis.

AU Wang, Chong Chen, Yuanhong Liu, Fengbei Elliott, Michael Kwok, Chun Fung Pena-Solorzano, Carlos Frazer, Helen McCarthy, Davis James Carneiro, Gustavo

An Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled With Fully and Semi-Supervised Reciprocal Learning

The deployment of automated deep-learning classifiers in clinical practice has the potential to streamline the diagnosis process and improve the diagnosis accuracy, but the acceptance of those classifiers relies on both their accuracy and interpretability. In general, accurate deep-learning classifiers provide little model interpretability, while interpretable models do not have competitive classification accuracy. In this paper, we introduce a new deep-learning diagnosis framework, called InterNRL, that is designed to be highly accurate and interpretable. InterNRL consists of a student-teacher framework, where the student model is an interpretable prototype-based classifier (ProtoPNet) and the teacher is an accurate global image classifier (GlobalNet). The two classifiers are mutually optimised with a novel reciprocal learning paradigm in which the student ProtoPNet learns from optimal pseudo labels produced by the teacher GlobalNet, while GlobalNet learns from ProtoPNet's classification performance and pseudo labels. This reciprocal learning paradigm enables InterNRL to be flexibly optimised under both fully- and semi-supervised learning scenarios, reaching state-of-the-art classification performance in both scenarios for the tasks of breast cancer and retinal disease diagnosis. Moreover, relying on weakly-labelled training images, InterNRL also achieves superior breast cancer localisation and brain tumour segmentation results than other competing methods.
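
The reciprocal loop can be sketched as below, a simplification with hard argmax pseudo labels and plain cross-entropy; the model names and the stand-in linear classifiers are illustrative, not InterNRL's actual networks.

```python
# Simplified reciprocal update: each model also learns from hard pseudo labels
# produced by the other (stand-in linear classifiers over 128-d features).
import torch
import torch.nn.functional as F

def reciprocal_step(student, teacher, opt_s, opt_t, x_lab, y, x_unlab):
    with torch.no_grad():
        pseudo_s = teacher(x_unlab).argmax(1)     # teacher labels for student
    loss_s = F.cross_entropy(student(x_lab), y) + \
             F.cross_entropy(student(x_unlab), pseudo_s)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    with torch.no_grad():
        pseudo_t = student(x_unlab).argmax(1)     # student labels for teacher
    loss_t = F.cross_entropy(teacher(x_lab), y) + \
             F.cross_entropy(teacher(x_unlab), pseudo_t)
    opt_t.zero_grad(); loss_t.backward(); opt_t.step()

student, teacher = torch.nn.Linear(128, 2), torch.nn.Linear(128, 2)
reciprocal_step(student, teacher,
                torch.optim.SGD(student.parameters(), lr=0.1),
                torch.optim.SGD(teacher.parameters(), lr=0.1),
                torch.randn(8, 128), torch.randint(0, 2, (8,)), torch.randn(8, 128))
```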

AU Zhi, Shaohua Wang, Yinghui Xiao, Haonan Bai, Ti Li, Bing Tang, Yunsong Liu, Chenyang Li, Wen Li, Tian Ge, Hong Cai, Jing

Coarse-Super-Resolution-Fine Network (CoSF-Net): A Unified End-to-End Neural Network for 4D-MRI With Simultaneous Motion Estimation and Super-Resolution

Four-dimensional magnetic resonance imaging (4D-MRI) is an emerging technique for tumor motion management in image-guided radiation therapy (IGRT). However, current 4D-MRI suffers from low spatial resolution and strong motion artifacts owing to the long acquisition time and patients' respiratory variations. If not managed properly, these limitations can adversely affect treatment planning and delivery in IGRT. In this study, we developed a novel deep learning framework called the coarse-super-resolution-fine network (CoSF-Net) to achieve simultaneous motion estimation and super-resolution within a unified model. We designed CoSF-Net by fully excavating the inherent properties of 4D-MRI, with consideration of limited and imperfectly matched training datasets. We conducted extensive experiments on multiple real patient datasets to assess the feasibility and robustness of the developed network. Compared with existing networks and three state-of-the-art conventional algorithms, CoSF-Net not only accurately estimated the deformable vector fields between the respiratory phases of 4D-MRI but also simultaneously improved the spatial resolution of 4D-MRI, enhancing anatomical features and producing 4D-MR images with high spatiotemporal resolution.

AU Zhang, Qixiang Li, Yi Xue, Cheng Wang, Haonan Li, Xiaomeng

GlandSAM: Injecting Morphology Knowledge into Segment Anything Model for Label-free Gland Segmentation.

This paper presents a label-free gland segmentation method, GlandSAM, which achieves performance comparable to supervised methods while requiring no labels during its training or inference phase. We observe that the Segment Anything model produces sub-optimal results on gland datasets: owing to the complex morphology of glands and the lack of sufficient labels, it either over-segments a gland into many fragments or under-segments gland regions by confusing many of them with the background. To address this challenge, our GlandSAM innovatively injects two clues about gland morphology into SAM to guide the segmentation process: (1) heterogeneity within glands and (2) similarity with the background. Initially, we leverage the clues to decompose the intricate glands by selectively extracting a proposal for each gland sub-region of heterogeneous appearance. Then, we inject the morphology clues into SAM in a fine-tuning manner with a novel morphology-aware semantic grouping module that explicitly groups the high-level semantics of gland sub-regions. In this way, our GlandSAM could capture comprehensive knowledge about gland morphology and produce well-delineated and complete segmentation results. Extensive experiments conducted on the GlaS dataset and the CRAG dataset reveal that GlandSAM outperforms state-of-the-art label-free methods by a significant margin. Notably, our GlandSAM even surpasses several fully-supervised methods that require pixel-wise labels for training, which highlights the remarkable performance and potential of GlandSAM in the realm of gland segmentation.

AU Huang, Yanyan Zhao, Weiqin Fu, Yu Zhu, Lingting Yu, Lequan

Unleash the Power of State Space Model for Whole Slide Image with Local Aware Scanning and Importance Resampling.

Whole slide image (WSI) analysis is gaining prominence within the medical imaging field. However, previous methods often fall short of efficiently processing entire WSIs due to their gigapixel size. Inspired by recent developments in state space models, this paper introduces a new Pathology Mamba (PAM) for more accurate and robust WSI analysis. PAM includes three carefully designed components to tackle the challenges of enormous image size, the utilization of local and hierarchical information, and the mismatch between the feature distributions of training and testing during WSI analysis. Specifically, we design a Bi-directional Mamba Encoder to process the extensive patches present in WSIs effectively and efficiently, which can handle large-scale pathological images while achieving high performance and accuracy. To further harness the local information and inherent hierarchical structure of WSI, we introduce a novel Local-aware Scanning module, which employs a local-aware mechanism alongside hierarchical scanning to adeptly capture both the local information and the overarching structure within WSIs. Moreover, to alleviate the patch feature distribution misalignment between training and testing, we propose a Test-time Importance Resampling module to conduct testing patch resampling to ensure consistency of feature distribution between the training and testing phases, and thus enhance model prediction. Extensive evaluation on nine WSI datasets with cancer subtyping and survival prediction tasks demonstrates that PAM outperforms current state-of-the-art methods and highlights its enhanced capability in modeling discriminative areas within WSIs. The source code is available at https://github.com/HKU-MedAI/PAM.

AU Thies, Mareike Wagner, Fabian Maul, Noah Yu, Haijun Meier, Manuela Goldmann Schneider, Linda-Sophie Gu, Mingxuan Mei, Siyuan Folle, Lukas Preuhs, Alexander Manhart, Michael Maier, Andreas

A gradient-based approach to fast and accurate head motion compensation in cone-beam CT.

Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to long scan duration resulting in patient motion during scanning and leading to image quality degradation in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of networks used for quality metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It achieves a reduction in reprojection error from an initial average of 3 mm to 0.61 mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the advancement of CBCT integration into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.

AU Weng, Li Zhu, Zhoule Dai, Kaixin Zheng, Zhe Zhu, Junming Wu, Hemmings

Reduced-Reference Learning for Target Localization in Deep Brain Stimulation

This work proposes a supervised machine learning method for target localization in deep brain stimulation (DBS). DBS is a recognized treatment for essential tremor. The effects of DBS significantly depend on the precise implantation of electrodes. Recent research on diffusion tensor imaging shows that the optimal target for essential tremor is related to the dentato-rubro-thalamic tract (DRTT), so DRTT targeting has become a promising direction. Tractography-based targeting is more accurate than conventional approaches but still too complicated for clinical scenarios, where only structural magnetic resonance imaging (sMRI) data is available. In order to improve efficiency and utility, we consider target localization as a non-linear regression problem in a reduced-reference learning framework and solve it with convolutional neural networks (CNNs). The proposed method is an efficient two-step framework consisting of two image-based networks: one for classification and the other for localization. We model the basic workflow as an image retrieval process and define relevant performance metrics. Using DRTT as pseudo ground truths, we show that individualized tractography-based optimal targets can be inferred from sMRI data with high accuracy. For two datasets of 280×220 / 272×227 (0.7/0.8 mm slice thickness) sMRI input, our model achieves an average posterior localization error of 2.3/1.2 mm and a median of 1.7/1.02 mm. The proposed framework is a novel application of reduced-reference learning and a first attempt to localize the DRTT from sMRI. It significantly outperforms existing methods based on 3D CNNs and anatomical/DRTT atlases, and may serve as a new baseline for general target localization problems.

AU Xu, Chi Xu, Haozheng Giannarou, Stamatia

Distance Regression Enhanced with Temporal Information Fusion and Adversarial Training for Robot-Assisted Endomicroscopy.

Probe-based confocal laser endomicroscopy (pCLE) has a role in characterising tissue intraoperatively to guide tumour resection during surgery. To capture good-quality pCLE data, which is important for diagnosis, probe-tissue contact needs to be maintained within a working range on the micrometre scale. This can be achieved through micro-surgical robotic manipulation, which requires automatic estimation of the probe-tissue distance. In this paper, we propose a novel deep regression framework composed of the Deep Regression Generative Adversarial Network (DR-GAN) and a Sequence Attention (SA) module. The aim of DR-GAN is to train the network using an enhanced image-based supervision approach. It extends the standard generator by using a well-defined function for image generation instead of a learnable decoder. Also, DR-GAN uses a novel learnable neural perceptual loss that, for the first time, combines spatial- and frequency-domain features. This effectively suppresses the adverse effects of noise in the pCLE data. To incorporate temporal information, we design the SA module, a cross-attention module enhanced with Radial Basis Function based encoding (SA-RBF). Furthermore, to train the regression framework, we design a multi-step training mechanism. During inference, the trained network is used to generate data representations which are fused along time in the SA-RBF module to boost regression stability. Our proposed network advances SOTA networks by addressing the challenge of excessive noise in pCLE data and enhancing regression stability. It outperforms SOTA networks applied on the pCLE Regression dataset (PRD) in terms of accuracy, data quality and stability.
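
A hedged, fixed (non-learnable) variant of the combined spatial/frequency-domain loss idea can be sketched in a few lines; the paper's loss is learnable, so this is only a structural illustration.

```python
# Fixed (non-learnable) structural stand-in for a spatial + frequency loss.
import torch

def spatial_frequency_loss(pred, target, alpha=0.5):
    spatial = (pred - target).abs().mean()
    freq = (torch.fft.fft2(pred).abs() - torch.fft.fft2(target).abs()).abs().mean()
    return alpha * spatial + (1.0 - alpha) * freq

x, y = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
print(spatial_frequency_loss(x, y))
```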

AU Tan, Zuopeng Zhang, Lihe Lv, Yanan Ma, Yili Lu, Huchuan

GroupMorph: Medical Image Registration via Grouping Network with Contextual Fusion.

Pyramid-based deformation decomposition is a promising registration framework, which gradually decomposes the deformation field into multi-resolution subfields for precise registration. However, most pyramid-based methods directly produce one subfield per resolution level, which does not fully depict the spatial deformation. In this paper, we propose a novel registration model, called GroupMorph. Different from typical pyramid-based methods, we adopt the grouping-combination strategy to predict deformation field at each resolution. Specifically, we perform group-wise correlation calculation to measure the similarities of grouped features. After that, n groups of deformation subfields with different receptive fields are predicted in parallel. By composing these subfields, a deformation field with multi-receptive field ranges is formed, which can effectively identify both large and small deformations. Meanwhile, a contextual fusion module is designed to fuse the contextual features and provide the inter-group information for the field estimator of the next level. By leveraging the inter-group correspondence, the synergy among deformation subfields is enhanced. Extensive experiments on four public datasets demonstrate the effectiveness of GroupMorph. Code is available at https://github.com/TVayne/GroupMorph.
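
To make the group-wise correlation step concrete, here is a minimal PyTorch sketch that splits channels into groups and computes one similarity map per group; cosine similarity and the 2D shapes are illustrative assumptions.

```python
# Group-wise correlation between fixed and moving feature maps: split channels
# into groups and return one cosine-similarity map per group (2D for brevity).
import torch
import torch.nn.functional as F

def groupwise_correlation(fixed, moving, groups=4):
    B, C, H, W = fixed.shape
    f = fixed.view(B, groups, C // groups, H, W)
    m = moving.view(B, groups, C // groups, H, W)
    return F.cosine_similarity(f, m, dim=2)      # [B, groups, H, W]

sim = groupwise_correlation(torch.randn(2, 32, 48, 48), torch.randn(2, 32, 48, 48))
print(sim.shape)  # torch.Size([2, 4, 48, 48])
```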

EI 1558-254X DA 2024-05-16 UT MEDLINE:38739510 PM 38739510 ER

AU Ambrosanio, Michele Bevacqua, Martina Teresa Lovetri, Joe Pascazio, Vito Isernia, Tommaso

In-Vivo Electrical Properties Estimation of Biological Tissues by Means of a Multi-Step Microwave Tomography Approach

The accurate quantitative estimation of the electromagnetic properties of tissues can serve important diagnostic and therapeutic medical purposes. Quantitative microwave tomography is an imaging modality that can provide maps of the in-vivo electromagnetic properties of the imaged tissues, i.e. both the permittivity and the electric conductivity. A multi-step microwave tomography approach is proposed for the accurate retrieval of such spatial maps of biological tissues. The underlying idea behind the new imaging approach is to progressively add details to the maps in a step-wise fashion starting from single-frequency qualitative reconstructions. Multi-frequency microwave data is utilized strategically in the final stage. The approach results in improved accuracy of the reconstructions compared to inversion of the data in a single step. As a case study, the proposed workflow was tested on an experimental microwave data set collected for the imaging of the human forearm. The human forearm is a good test case as it contains several soft tissues as well as bone, exhibiting a wide range of values for the electrical properties.

AU Guo, Rui Lin, Zhichao Xin, Jingyu Li, Maokun Yang, Fan Xu, Shenheng Abubakar, Aria

Three Dimensional Microwave Data Inversion in Feature Space for Stroke Imaging

Microwave imaging is a promising method for early diagnosing and monitoring brain strokes. It is portable, non-invasive, and safe to the human body. Conventional techniques solve for unknown electrical properties represented as pixels or voxels, but often result in inadequate structural information and high computational costs. We propose to reconstruct the three dimensional (3D) electrical properties of the human brain in a feature space, where the unknowns are latent codes of a variational autoencoder (VAE). The decoder of the VAE, with prior knowledge of the brain, acts as a module of data inversion. The codes in the feature space are optimized by minimizing the misfit between measured and simulated data. A dataset of 3D heads characterized by permittivity and conductivity is constructed to train the VAE. Numerical examples show that our method increases structural similarity by 14% and speeds up the solution process by over 3 orders of magnitude using only 4.8% number of the unknowns compared to the voxel-based method. This high-resolution imaging of electrical properties leads to more accurate stroke diagnosis and offers new insights into the study of the human brain.
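
A toy sketch of inversion in a VAE feature space: freeze a decoder and a forward operator (both untrained stand-ins here), then optimize the latent code by gradient descent on the data misfit. The dimensions and the quadratic misfit are assumptions for illustration.

```python
# Toy feature-space inversion: freeze a decoder and a forward operator
# (untrained stand-ins here) and optimize the latent code on the data misfit.
import torch
import torch.nn as nn

decoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 1024))
forward_op = nn.Linear(1024, 200)       # property map -> simulated measurements
for p in list(decoder.parameters()) + list(forward_op.parameters()):
    p.requires_grad_(False)

measured = torch.randn(200)
z = torch.zeros(16, requires_grad=True)           # latent code to optimize
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    props = decoder(z)                            # decoded property map
    misfit = (forward_op(props) - measured).pow(2).mean()
    misfit.backward()
    opt.step()
```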

AU Wu, Weiwen Pan, Jiayi Wang, Yanyang Wang, Shaoyu Zhang, Jianjia

Multi-channel Optimization Generative Model for Stable Ultra-Sparse-View CT Reconstruction.

The score-based generative model (SGM) has risen to prominence in sparse-view CT reconstruction due to its impressive generation capability. The consistency of data is crucial in guiding the reconstruction process in SGM-based reconstruction methods. However, the existing data consistency policy exhibits certain limitations. Firstly, it employs partial data from the image reconstructed during the iteration process for image updates, which introduces secondary artifacts and compromises image quality. Moreover, the updates to the SGM and data consistency are treated as distinct stages, disregarding their interdependent relationship. Additionally, the reference image used to compute gradients in the reconstruction process is derived from an intermediate result rather than the ground truth. Motivated by the fact that a typical SGM yields distinct outcomes with different random noise inputs, we propose a Multi-channel Optimization Generative Model (MOGM) for stable ultra-sparse-view CT reconstruction by integrating a novel data consistency term into the stochastic differential equation model. Notably, the unique aspect of this data consistency component is its exclusive reliance on original data for effectively confining generation outcomes. Furthermore, we pioneer an inference strategy that traces back from the current iteration result to the ground truth, enhancing reconstruction stability through foundational theoretical support. We also establish a multi-channel optimization reconstruction framework, where conventional iterative techniques are employed to seek the reconstruction solution. Quantitative and qualitative assessments on 23-view datasets from numerical simulation, clinical cardiac scans, and sheep lung scans underscore the superiority of MOGM over alternative methods. Reconstructing from just 10 and 7 views, our method consistently demonstrates exceptional performance.

AU Liu, Jingyu Cui, Weigang Chen, Yipeng Ma, Yulan Dong, Qunxi Cai, Ran Li, Yang Hu, Bin

Deep Fusion of Multi-Template Using Spatio-Temporal Weighted Multi-Hypergraph Convolutional Networks for Brain Disease Analysis

Conventional functional connectivity network (FCN) based on resting-state fMRI (rs-fMRI) can only reflect the relationship between pairwise brain regions. Thus, the hyper-connectivity network (HCN) has been widely used to reveal high-order interactions among multiple brain regions. However, existing HCN models are essentially spatial HCN, which reflect the spatial relevance of multiple brain regions, but ignore the temporal correlation among multiple time points. Furthermore, the majority of HCN construction and learning frameworks are limited to using a single template, while the multi-template carries richer information. To address these issues, we first employ multiple templates to parcellate the rs-fMRI into different brain regions. Then, based on the multi-template data, we propose a spatio-temporal weighted HCN (STW-HCN) to capture more comprehensive high-order temporal and spatial properties of brain activity. Next, a novel deep fusion model of multi-template called spatio-temporal weighted multi-hypergraph convolutional network (STW-MHGCN) is proposed to fuse the STW-HCN of multiple templates, which extracts the deep interrelation information between different templates. Finally, we evaluate our method on the ADNI-2 and ABIDE-I datasets for mild cognitive impairment (MCI) and autism spectrum disorder (ASD) analysis. Experimental results demonstrate that the proposed method is superior to the state-of-the-art approaches in MCI and ASD classification, and the abnormal spatio-temporal hyper-edges discovered by our method have significant significance for the brain abnormalities analysis of MCI and ASD.

AU Razavi, Raha Plonka, Gerlind Rabbani, Hossein

X-Let's Atom Combinations for Modeling and Denoising of OCT Images by Modified Morphological Component Analysis

An improved analysis of Optical Coherence Tomography (OCT) images of the retina is of essential importance for the correct diagnosis of retinal abnormalities. Unfortunately, OCT images suffer from noise arising from different sources. In particular, speckle noise caused by the scattering of light waves strongly degrades the quality of OCT image acquisitions. In this paper, we employ a Modified Morphological Component Analysis (MMCA) to provide a new method that separates the image into components containing different features, such as texture, piecewise smooth parts, and singularities along curves. Each image component is computed as a sparse representation in a suitable dictionary. To create these dictionaries, we use non-data-adaptive multi-scale (X-let) transforms, which have been shown to be well suited to extracting the characteristic features of OCT images. In this way, we reach two goals at once. On the one hand, we achieve strongly improved denoising results by applying adaptive local thresholding techniques separately to each image component. The denoising performance outperforms other state-of-the-art denoising algorithms in terms of PSNR as well as no-reference image quality assessments. On the other hand, we obtain a decomposition of the OCT images into well-interpretable image components that can be exploited for further image processing tasks, such as classification.
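
As a hedged 1D toy of morphological component analysis (using DCT and Dirac dictionaries rather than the paper's X-let transforms), alternating hard thresholding with a decreasing threshold separates a smooth oscillation from isolated spikes:

```python
# 1D toy of morphological component analysis with DCT and Dirac dictionaries:
# alternating hard thresholding separates a smooth wave from sparse spikes.
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
n = 256
smooth = np.cos(2 * np.pi * 5 * np.arange(n) / n)        # sparse in DCT
spikes = np.zeros(n)
spikes[rng.choice(n, 5, replace=False)] = 3.0            # sparse in Dirac basis
x = smooth + spikes

s_smooth, s_spikes = np.zeros(n), np.zeros(n)
for t in np.linspace(2.5, 0.1, 30):                      # decreasing threshold
    c = dct(x - s_spikes, norm='ortho')
    c[np.abs(c) < t] = 0.0                               # threshold DCT atoms
    s_smooth = idct(c, norm='ortho')
    r = x - s_smooth
    s_spikes = r * (np.abs(r) > t)                       # threshold Dirac atoms

print(np.abs(s_smooth - smooth).max(), np.abs(s_spikes - spikes).max())
```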

AU Li, Ziyu Miller, Karla L Chen, Xi Chiew, Mark Wu, Wenchuan

Self-navigated 3D diffusion MRI using an optimized CAIPI sampling and structured low-rank reconstruction estimated navigator.

3D multi-slab acquisitions are an appealing approach for diffusion MRI because they are compatible with the imaging regime delivering optimal SNR efficiency. In conventional 3D multi-slab imaging, shot-to-shot phase variations caused by motion pose challenges due to the use of multi-shot k-space acquisition. Navigator acquisition after each imaging echo is typically employed to correct phase variations, which prolongs scan time and increases the specific absorption rate (SAR). The aim of this study is to develop a highly efficient, self-navigated method to correct for phase variations in 3D multi-slab diffusion MRI without explicitly acquiring navigators. The sampling of each shot is carefully designed to intersect with the central kz=0 plane of each slab, and the multi-shot sampling is optimized for self-navigation performance while retaining decent reconstruction quality. The kz=0 intersections from all shots are jointly used to reconstruct a 2D phase map for each shot using a structured low-rank constrained reconstruction that leverages the redundancy in shot and coil dimensions. The phase maps are used to eliminate the shot-to-shot phase inconsistency in the final 3D multi-shot reconstruction. We demonstrate the method's efficacy using retrospective simulations and prospectively acquired in-vivo experiments at 1.22 mm and 1.09 mm isotropic resolutions. Compared to conventional navigated 3D multi-slab imaging, the proposed self-navigated method achieves comparable image quality while shortening the scan time by 31.7% and improving the SNR efficiency by 15.5%. The proposed method produces comparable quality of DTI and white matter tractography to conventional navigated 3D multi-slab acquisition with a much shorter scan time.
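
The final correction step can be sketched in a 2D simplification as below, assuming the per-shot phase maps have already been estimated (in the paper, from the kz=0 intersections via the structured low-rank reconstruction); the shapes and the naive shot combination are illustrative only:

    import numpy as np

    def remove_shot_phase(shot_kspace, phase_map):
        # shot_kspace: (nx, ny) k-space of one shot; phase_map: (nx, ny) radians.
        img = np.fft.ifft2(np.fft.ifftshift(shot_kspace))
        img_corr = img * np.exp(-1j * phase_map)      # cancel motion-induced phase
        return np.fft.fftshift(np.fft.fft2(img_corr))

    # Naive combination after correction (the paper uses a joint 3D reconstruction):
    # k_combined = np.mean([remove_shot_phase(k, p) for k, p in zip(shots, phases)], axis=0)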

AU Emre, Taha Chakravarty, Arunava Rivail, Antoine Lachinov, Dmitrii Leingang, Oliver Riedl, Sophie Mai, Julia Scholl, Hendrik P. N. Sivaprasad, Sobha Rueckert, Daniel Lotery, Andrew Schmidt-Erfurth, Ursula Bogunovic, Hrvoje CA PINNACLE Consortium

3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression From Longitudinal OCTs

Self-supervised learning (SSL) has emerged as a powerful technique for improving the efficiency and effectiveness of deep learning models. Contrastive methods are a prominent family of SSL that extract similar representations of two augmented views of an image while pushing away others in the representation space as negatives. However, the state-of-the-art contrastive methods require large batch sizes and augmentations designed for natural images that are impractical for 3D medical images. To address these limitations, we propose a new longitudinal SSL method, 3DTINC, based on non-contrastive learning. It is designed to learn perturbation-invariant features for 3D optical coherence tomography (OCT) volumes, using augmentations specifically designed for OCT. We introduce a new non-contrastive similarity loss term that learns temporal information implicitly from intra-patient scans acquired at different times. Our experiments show that this temporal information is crucial for predicting progression of retinal diseases, such as age-related macular degeneration (AMD). After pretraining with 3DTINC, we evaluated the learned representations and the prognostic models on two large-scale longitudinal datasets of retinal OCTs where we predict the conversion to wet-AMD within a six-month interval. Our results demonstrate that each component of our contributions is crucial for learning meaningful representations useful in predicting disease progression from longitudinal volumetric scans.
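
A toy PyTorch sketch of the two ingredients, a negative-free similarity loss and an implicit temporal term, is shown below; the interpolation-style temporal loss is an illustrative simplification, not the exact 3DTINC objective:

    import torch.nn.functional as F

    def noncontrastive_loss(p_online, z_target):
        # BYOL-style: maximize cosine similarity between the online prediction and
        # a stop-gradient target; no negative pairs or large batches are needed.
        p = F.normalize(p_online, dim=-1)
        z = F.normalize(z_target.detach(), dim=-1)
        return 2.0 - 2.0 * (p * z).sum(dim=-1).mean()

    def temporal_loss(z_t0, z_t1, z_t2):
        # Toy temporal term: embeddings of intra-patient scans acquired at
        # successive times should evolve smoothly, so the middle embedding is
        # predicted by interpolating its neighbors.
        z_mid = 0.5 * (F.normalize(z_t0, dim=-1) + F.normalize(z_t2, dim=-1))
        return F.mse_loss(z_mid, F.normalize(z_t1, dim=-1))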

AU Kong, Youyong Zhang, Xiaotong Wang, Wenhan Zhou, Yue Li, Yueying Yuan, Yonggui

Multi-Scale Spatial-Temporal Attention Networks for Functional Connectome Classification.

Many neuropsychiatric disorders are considered to be associated with abnormalities in the functional connectivity networks of the brain. Research on the classification of functional connectivity can therefore provide new perspectives for understanding the pathology of disorders and contribute to early diagnosis and treatment. Functional connectivity dynamically changes over time; however, the majority of existing methods are unable to jointly reveal the spatial topology and the time-varying characteristics. Furthermore, despite the efforts of limited spatial-temporal studies to capture rich information across different spatial scales, they have not delved into the temporal characteristics among different scales. To address the above issues, we propose novel Multi-Scale Spatial-Temporal Attention Networks (MSSTAN) to exploit the multi-scale spatial-temporal information provided by the functional connectome for classification. To fully extract the spatial features of brain regions, we propose a Topology Enhanced Graph Transformer module to guide the attention calculations in the learning of spatial features by incorporating topology priors. A Multi-Scale Pooling Strategy is introduced to obtain representations of the brain connectome at various scales. Considering the temporal dynamic characteristics between dynamic functional connectomes, we employ Locality Sensitive Hashing attention to further capture long-term dependencies in the time dynamics across multiple scales and reduce the computational complexity of the original attention mechanism. Experiments on three brain fMRI datasets of MDD and ASD demonstrate the superiority of our proposed approach. In addition, benefiting from the attention mechanism in the Transformer, our results are interpretable, which can contribute to the discovery of biomarkers. The code is available at https://github.com/LIST-KONG/MSSTAN.

EI 1558-254X DA 2024-08-24 UT MEDLINE:39172603 PM 39172603 ER

AU Chen, Ruifeng Zhang, Zhongliang Quan, Guotao Du, Yanfeng Chen, Yang Li, Yinsheng

PRECISION: A Physics-Constrained and Noise-Controlled Diffusion Model for Photon Counting Computed Tomography.

Recently, the use of photon counting detectors in computed tomography (PCCT) has attracted extensive attention. It is highly desired to improve the quality of material basis images and the quantitative accuracy of elemental composition, particularly when PCCT data is acquired at lower radiation dose levels. In this work, we develop a physics-constrained and noise-controlled diffusion model, PRECISION in short, to address the degraded quality of material basis images and the inaccurate quantification of elemental composition caused mainly by the imperfect noise models and/or hand-crafted regularizers of material basis images (such as local smoothness and/or sparsity) leveraged in existing direct material basis image reconstruction approaches. In stark contrast, PRECISION learns distribution-level regularization to describe the features of ideal material basis images via training a noise-controlled spatial-spectral diffusion model. The optimal material basis images of each individual subject are sampled from this learned distribution under the constraint of the physical model of a given PCCT system and the measured data obtained from the subject. PRECISION exhibits the potential to improve the quality of material basis images and the quantitative accuracy of elemental composition for PCCT.
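
The general pattern of constraining a learned diffusion prior with a physical measurement model can be sketched as one alternating step; denoiser, A, and At below are hypothetical stand-ins for the trained spatial-spectral diffusion model and the PCCT forward operator and its adjoint:

    def guided_sampling_step(x_t, t, y, A, At, denoiser, step=0.1):
        # One reverse-diffusion step followed by a data-consistency update.
        x = denoiser(x_t, t)              # prior step from the learned model
        residual = A(x) - y               # mismatch with the measured PCCT data
        return x - step * At(residual)    # gradient step toward physical consistency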

AU Majumder, Sharmin Islam, Md. Tauhidul Taraballi, Francesca Righetti, Raffaella

Non-Invasive Imaging of Mechanical Properties of Cancers In Vivo Based on Transformations of the Eshelby's Tensor Using Compression Elastography

Knowledge of the mechanical properties is of great clinical significance for diagnosis, prognosis and treatment of cancers. Recently, a new method based on Eshelby's theory to simultaneously assess Young's modulus (YM) and Poisson's ratio (PR) in tissues has been proposed. A significant limitation of this method is that accuracy of the reconstructed YM and PR is affected by the orientation/alignment of the tumor with the applied stress. In this paper, we propose a new method to reconstruct YM and PR in cancers that is invariant to the 3D orientation of the tumor with respect to the axis of applied stress. The novelty of the proposed method resides on the use of a tensor transformation to improve the robustness of Eshelby's theory and reconstruct YM and PR of tumors with high accuracy and in realistic experimental conditions. The method is validated using finite element simulations and controlled experiments using phantoms with known mechanical properties. The in vivo feasibility of the developed method is demonstrated in an orthotopic mouse model of breast cancer. Our results show that the proposed technique can estimate the YM and PR with overall accuracy of (97.06 +/- 2.42) % under all tested tumor orientations. Animal experimental data demonstrate the potential of the proposed methodology in vivo. The proposed method can significantly expand the range of applicability of the Eshelby's theory to tumors and provide new means to accurately image and quantify mechanical parameters of cancers in clinical conditions.
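
The orientation invariance rests on the standard transformation law for second-order tensors; a minimal numpy sketch (illustrative only, not the full Eshelby-tensor pipeline) is:

    import numpy as np

    def rotate_tensor2(sigma, R):
        # Express a second-order tensor (e.g. stress) in a rotated frame:
        # sigma' = R sigma R^T, with R a 3x3 rotation matrix. Eigenvalues
        # (the tensor invariants) are unchanged by this transformation.
        return R @ sigma @ R.T

    def rotation_z(theta):
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])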

AU Xie, Zhiying Zeinstra, Nicole Kirby, Mitchell A. Le, Nhan Minh Murry, Charles E. Zheng, Ying Wang, Ruikang K.

Quantifying Microvascular Structure in Healthy and Infarcted Rat Hearts Using Optical Coherence Tomography Angiography

Myocardial infarction (MI) is a life-threatening medical emergency resulting in coronary microvascular dysregulation and heart muscle damage. One of the primary characteristics of MI is capillary loss, which plays a significant role in the progression of this cardiovascular condition. In this study, we utilized optical coherence tomography angiography (OCTA) to image coronary microcirculation in fixed rat hearts, aiming to analyze coronary microvascular impairment post-infarction. Various angiographic metrics are presented to quantify vascular features, including the vessel area density, vessel complexity index, vessel tortuosity index, and flow impairment. Pathological differences identified from OCTA analysis are corroborated with histological analysis. The quantitative assessments reveal a significant decrease in microvascular density in the capillary-sized vessels and an enlargement for the arteriole/venule-sized vessels. Further, microvascular tortuosity and complexity exhibit an increase after myocardial infarction. The results underscore the feasibility of using OCTA to offer qualitative microvascular details and quantitative metrics, providing insights into coronary vascular network remodeling during disease progression and response to therapy.
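
Two of the reported metrics have simple, widely used definitions that can be sketched directly (the paper's exact formulations may differ):

    import numpy as np

    def vessel_area_density(mask):
        # Fraction of pixels classified as vessel in a binary en-face angiogram.
        return mask.sum() / mask.size

    def tortuosity_index(path_xy):
        # Arc length of a vessel centerline divided by its end-to-end chord
        # length; straight vessels give 1, tortuous vessels give values > 1.
        seg = np.diff(path_xy, axis=0)
        arc = np.linalg.norm(seg, axis=1).sum()
        chord = np.linalg.norm(path_xy[-1] - path_xy[0])
        return arc / max(chord, 1e-9)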

AU Deng, Shiyu Chen, Yinda Huang, Wei Zhang, Ruobing Xiong, Zhiwei

Unsupervised Domain Adaptation for EM Image Denoising with Invertible Networks.

Electron microscopy (EM) image denoising is critical for visualization and subsequent analysis. Despite the remarkable achievements of deep learning-based non-blind denoising methods, their performance drops significantly when domain shifts exist between the training and testing data. To address this issue, unpaired blind denoising methods have been proposed. However, these methods heavily rely on image-to-image translation and neglect the inherent characteristics of EM images, limiting their overall denoising performance. In this paper, we propose the first unsupervised domain adaptive EM image denoising method, which is grounded in the observation that EM images from similar samples share common content characteristics. Specifically, we first disentangle the content representations and the noise components from noisy images and establish a shared domain-agnostic content space via domain alignment to bridge the synthetic images (source domain) and the real images (target domain). To ensure precise domain alignment, we further incorporate domain regularization by enforcing that: the pseudo-noisy images, reconstructed using both content representations and noise components, accurately capture the characteristics of the noisy images from which the noise components originate, all while maintaining semantic consistency with the noisy images from which the content representations originate. To guarantee lossless representation decomposition and image reconstruction, we introduce disentanglement-reconstruction invertible networks. Finally, the reconstructed pseudo-noisy images, paired with their corresponding clean counterparts, serve as valuable training data for the denoising network. Extensive experiments on synthetic and real EM datasets demonstrate the superiority of our method in terms of image restoration quality and downstream neuron segmentation accuracy. Our code is publicly available at https://github.com/sydeng99/DADn.

AU Wang, Zhenguo Wu, Yaping Xia, Zeheng Chen, Xinyi Li, Xiaochen Bai, Yan Zhou, Yun Liang, Dong Zheng, Hairong Yang, Yongfeng Wang, Shanshan Wang, Meiyun Sun, Tao

Non-Invasive Quantification of the Brain [18F]FDG-PET Using Inferred Blood Input Function Learned From Total-Body Data With Physical Constraint

Full quantification of brain PET requires the blood input function (IF), which is traditionally achieved through an invasive and time-consuming arterial catheter procedure, making it unfeasible for clinical routine. This study presents a deep learning based method to estimate the input function (DLIF) for a dynamic brain FDG scan. A long short-term memory network combined with a fully connected network was used. The dataset for training was generated from 85 total-body dynamic scans obtained on a uEXPLORER scanner. Time-activity curves from 8 brain regions and the carotid served as the input of the model, and the labelled IF was generated from the ascending aorta defined on the CT image. We emphasize the goodness-of-fit of kinetic modeling as an additional physical loss to reduce the bias and the need for large training samples. DLIF was evaluated together with existing methods in terms of RMSE, area under the curve, and regional and parametric image quantifications. The results revealed that the proposed model can generate IFs that are closer to the reference ones in terms of shape and amplitude compared with the IFs generated using existing methods. All regional kinetic parameters calculated using DLIF agreed with reference values, with the correlation coefficient being 0.961 (0.913) and the relative bias being 1.68 +/- 8.74% (0.37 +/- 4.93%) for Ki (K1). In terms of visual appearance and quantification, parametric images were also highly identical to the reference images. In conclusion, our experiments indicate that a trained model can infer an image-derived IF from dynamic brain PET data, which enables subsequent reliable kinetic modeling.
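
A minimal PyTorch sketch of such an LSTM-plus-fully-connected regressor is shown below; the 9-channel input (8 brain regions plus the carotid) follows the abstract, while the layer sizes are illustrative and the kinetic-fit physical loss is omitted:

    import torch.nn as nn

    class InputFunctionNet(nn.Module):
        # Maps regional time-activity curves to an estimated blood input function.
        def __init__(self, n_regions=9, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(input_size=n_regions, hidden_size=hidden,
                                batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, tacs):              # tacs: (batch, n_frames, n_regions)
            h, _ = self.lstm(tacs)
            return self.head(h).squeeze(-1)   # (batch, n_frames) estimated IF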

C1 Chinese Acad Sci, Paul C Lauterbur Res Ctr Biomed Imaging, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China C1 Zhengzhou Univ, Henan Prov Peoples Hosp, Zhengzhou Peoples Hosp, Zhengzhou 450001, Peoples R China C1 United Imaging Healthcare Grp Co Ltd, Cent Res Inst, Shanghai 201815, Peoples R China C3 United Imaging Healthcare Grp Co Ltd SN 0278-0062 EI 1558-254X DA 2024-07-22 UT WOS:001263692100016 PM 38386580 ER

AU Fan, Jiansong Lv, Tianxu Wang, Pei Hong, Xiaoyan Liu, Yuan Jiang, Chunjuan Ni, Jianming Li, Lihua Pan, Xiang

DCDiff: Dual-Granularity Cooperative Diffusion Models for Pathology Image Analysis.

Whole Slide Images (WSIs) are paramount in the medical field, with extensive applications in disease diagnosis and treatment. Recently, many deep-learning methods have been used to classify WSIs. However, these methods are inadequate for accurately analyzing WSIs as they treat regions in WSIs as isolated entities and ignore contextual information. To address this challenge, we propose a novel Dual-Granularity Cooperative Diffusion Model (DCDiff) for the precise classification of WSIs. Specifically, we first design a cooperative forward and reverse diffusion strategy, utilizing fine-granularity and coarse-granularity to regulate each diffusion step and gradually improve context awareness. To exchange information between granularities, we propose a coupled U-Net for dual-granularity denoising, which efficiently integrates dual-granularity consistency information using the designed Fine- and Coarse-granularity Cooperative Aware (FCCA) model. Ultimately, the cooperative diffusion features extracted by DCDiff can achieve cross-sample perception from the reconstructed distribution of training samples. Experiments on three public WSI datasets show that the proposed method can achieve superior performance over state-of-the-art methods. The code is available at https://github.com/hemo0826/DCDiff.

AU Azampour, Mohammad Farid Mach, Kristina Fatemizadeh, Emad Demiray, Beatrice Westenfelder, Kay Steiger, Katja Eiber, Matthias Wendler, Thomas Kainz, Bernhard Navab, Nassir

Multitask Weakly Supervised Generative Network for MR-US Registration.

Registering pre-operative modalities, such as magnetic resonance imaging or computed tomography, to ultrasound images is crucial for guiding clinicians during surgeries and biopsies. Recently, deep-learning approaches have been proposed to increase the speed and accuracy of this registration problem. However, all of these approaches need expensive supervision from the ultrasound domain. In this work, we propose a multitask generative framework that needs weak supervision only from the pre-operative imaging domain during training. To perform a deformable registration, the proposed framework translates a magnetic resonance image to the ultrasound domain while preserving the structural content. To demonstrate the efficacy of the proposed method, we tackle the registration problem of pre-operative 3D MR to transrectal ultrasonography images as necessary for targeted prostate biopsies. We use an in-house dataset of 600 patients, divided into 540 for training, 30 for validation, and the remaining for testing. An expert manually segmented the prostate in both modalities for validation and test sets to assess the performance of our framework. The proposed framework achieves a 3.58 mm target registration error on the expert-selected landmarks, 89.2% in the Dice score, and 1.81 mm 95th percentile Hausdorff distance on the prostate masks in the test set. Our experiments demonstrate that the proposed generative model successfully translates magnetic resonance images into the ultrasound domain. The translated image contains the structural content and fine details due to an ultrasound-specific two-path design of the generative model. The proposed framework enables training learning-based registration methods while only weak supervision from the pre-operative domain is available.

AU Xu, Mengya Islam, Mobarakol Bai, Long Ren, Hongliang

Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery

Deep Neural Networks (DNNs) based semantic segmentation of the robotic instruments and tissues can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learners, DNNs cannot learn incremental tasks over time and exhibit catastrophic forgetting, which refers to the sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data scarcity is the issue, the model shows a rapid drop in performance on previously learned instruments after learning new data with new instruments. The problem becomes worse when privacy concerns restrict releasing the dataset of the old instruments used by the old model, and when data for the new or updated versions of the instruments are unavailable to the continual learning model. For this purpose, we develop a privacy-preserving synthetic continual semantic segmentation framework by blending and harmonizing (i) open-source old instrument foregrounds with synthesized backgrounds, without revealing real patient data in public, and (ii) new instrument foregrounds with extensively augmented real backgrounds. To boost the balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT) by controlling model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among the semantic objects, where conventional short-range spatial features with limited information reduce the power of feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets with a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
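
As a rough illustration of temperature-normalized logit distillation with class-dependent temperatures, consider the sketch below; the per-class temperature vector is a simplified stand-in for the paper's overlapping CAT design:

    import torch.nn.functional as F

    def class_aware_distillation(student_logits, teacher_logits, temps):
        # student_logits, teacher_logits: (n, n_classes); for segmentation,
        # flatten pixels into the first dimension. temps: (n_classes,) per-class
        # temperatures controlling how sharply each class is distilled.
        log_p_s = F.log_softmax(student_logits / temps, dim=1)
        p_t = F.softmax(teacher_logits.detach() / temps, dim=1)
        return F.kl_div(log_p_s, p_t, reduction="batchmean")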

AU Liu, Pei Ji, Luping Zhang, Xinyu Ye, Feng

Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole Slide Image Classification

Given the special situation of modeling gigapixel images, multiple instance learning (MIL) has become one of the most important frameworks for Whole Slide Image (WSI) classification. In current practice, most MIL networks often face two unavoidable problems in training: i) insufficient WSI data, and ii) the sample memorization inclination inherent in neural networks. These problems may hinder MIL models from adequate and efficient training, suppressing the continuous performance promotion of classification models on WSIs. Inspired by the basic idea of Mixup, this paper proposes a new Pseudo-bag Mixup (PseMix) data augmentation scheme to improve the training of MIL models. This scheme generalizes the Mixup strategy for general images to special WSIs via pseudo-bags so as to be applied in MIL-based WSI classification. Supported by pseudo-bags, our PseMix fulfills the critical size alignment and semantic alignment of the Mixup strategy. Moreover, it is designed as an efficient and decoupled method, neither involving time-consuming operations nor relying on MIL model predictions. Comparative experiments and ablation studies are specially designed to evaluate the effectiveness and advantages of our PseMix. Experimental results show that PseMix could often assist state-of-the-art MIL networks to refresh their classification performance on WSIs. Besides, it could also boost the generalization performance of MIL models in special test scenarios, and promote their robustness to patch occlusion and label noise. Our source code is available at https://github.com/liupei101/PseMix.
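
The bag-level Mixup idea can be sketched in a few lines; the pseudo-bag count, the Beta-sampled mixing ratio, and the random splitting used here are illustrative choices rather than the paper's exact PseMix recipe:

    import numpy as np

    def psemix(bag_a, bag_b, y_a, y_b, n_pseudo=8, alpha=1.0, rng=np.random):
        # bag_a, bag_b: (n_instances, d) instance features of two WSIs;
        # y_a, y_b: their slide-level labels.
        lam = rng.beta(alpha, alpha)
        k = int(round(lam * n_pseudo))              # pseudo-bags taken from bag_a
        pa = np.array_split(rng.permutation(bag_a), n_pseudo)
        pb = np.array_split(rng.permutation(bag_b), n_pseudo)
        mixed_bag = np.concatenate(pa[:k] + pb[k:])
        mixed_y = (k / n_pseudo) * y_a + (1 - k / n_pseudo) * y_b
        return mixed_bag, mixed_y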

AU Ma, Yuxi Wang, Jiacheng Yang, Jing Wang, Liansheng

Model-Heterogeneous Semi-Supervised Federated Learning for Medical Image Segmentation

Medical image segmentation is crucial in clinical diagnosis, helping physicians identify and analyze medical conditions. However, this task is often accompanied by challenges like sensitive data, privacy concerns, and expensive annotations. Current research focuses on personalized collaborative training of medical segmentation systems, ignoring that obtaining segmentation annotations is time-consuming and laborious. Achieving a perfect balance between annotation cost and segmentation performance while ensuring local model personalization has become a valuable direction. Therefore, this study introduces a novel Model-Heterogeneous Semi-Supervised Federated (HSSF) Learning framework. It proposes Regularity Condensation and Regularity Fusion to transfer autonomously selected knowledge and ensure personalization between sites. In addition, to efficiently utilize unlabeled data and reduce the annotation burden, it proposes a Self-Assessment (SA) module and a Reliable Pseudo-Label Generation (RPG) module. The SA module generates self-assessment confidence in real time based on model performance, and the RPG module generates reliable pseudo-labels based on SA confidence. We evaluate our model separately on the Skin Lesion and Polyp Lesion datasets. The results show that our model performs better than other model-heterogeneous methods. Moreover, it exhibits highly commendable performance even in homogeneous settings, most notably on region-based metrics. All resources are readily accessible through the HSSF repository on GitHub (github.com).

AU Bontempo, Gianpaolo Bolelli, Federico Porrello, Angelo Calderara, Simone Ficarra, Elisa

A Graph-Based Multi-Scale Approach With Knowledge Distillation for WSI Classification

The usage of Multiple Instance Learning (MIL) for classifying Whole Slide Images (WSIs) has recently increased. Due to their gigapixel size, the pixel-level annotation of such data is extremely expensive and time-consuming, practically infeasible. For this reason, multiple automatic approaches have been raised in the last years to support clinical practice and diagnosis. Unfortunately, most state-of-the-art proposals apply attention mechanisms without considering the spatial instance correlation and usually work at a single-scale resolution. To leverage the full potential of pyramidal structured WSIs, we propose a graph-based multi-scale MIL approach, DAS-MIL. Our model comprises three modules: i) a self-supervised feature extractor; ii) a graph-based architecture that precedes the MIL mechanism and aims at creating a more contextualized representation of the WSI structure by considering the mutual (spatial) instance correlation both inter- and intra-scale; and iii) a (self-)distillation loss between resolutions, introduced to compensate for their informative gap and significantly improve the final prediction. The effectiveness of the proposed framework is demonstrated on two well-known datasets, where we outperform SOTA on WSI classification, gaining a +2.7% AUC and +3.7% accuracy on the popular Camelyon16 benchmark.
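
A minimal sketch of a cross-scale (self-)distillation term between the two resolution branches follows; the teacher/student assignment and the temperature are illustrative assumptions:

    import torch.nn.functional as F

    def cross_scale_distillation(logits_low, logits_high, T=2.0):
        # Align bag-level predictions across magnifications; here the
        # higher-resolution branch is (arbitrarily) treated as the teacher.
        p_teacher = F.softmax(logits_high.detach() / T, dim=-1)
        log_p_student = F.log_softmax(logits_low / T, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T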

AU Li, Jiawen Cheng, Junru Meng, Lingqin Yan, Hui He, Yonghong Shi, Huijuan Guan, Tian Han, Anjia

DeepTree: Pathological Image Classification Through Imitating Tree-Like Strategies of Pathologists

Digitization of pathological slides has promoted the research of computer-aided diagnosis, in which artificial intelligence analysis of pathological images deserves attention. Appropriate deep learning techniques in natural images have been extended to computational pathology. Still, they seldom take into account prior knowledge in pathology, especially the analysis process of lesion morphology by pathologists. Inspired by the diagnosis decision of pathologists, we design a novel deep learning architecture based on tree-like strategies called DeepTree. It imitates pathological diagnosis methods, designed as a binary tree structure, to conditionally learn the correlation between tissue morphology, and optimizes branches to finetune the performance further. To validate and benchmark DeepTree, we build a dataset of frozen lung cancer tissues and design experiments on a public dataset of breast tumor subtypes and our dataset. Results show that the deep learning architecture based on tree-like strategies makes the pathological image classification more accurate, transparent, and convincing. Simultaneously, prior knowledge based on diagnostic strategies yields superior representation ability compared to alternative methods. Our proposed methodology helps improve the trust of pathologists in artificial intelligence analysis and promotes the practical clinical application of pathology-assisted diagnosis.

AU Zhang, Huimin Ren, Mingyang Wang, Yu Jin, Zhiyuan Zhang, Shanxiang Liu, Jiaqian Fu, Jia Qin, Huan

In Vivo Microwave-Induced Thermoacoustic Endoscopy for Colorectal Tumor Detection in Deep Tissue

Optical endoscopy, as one of the common clinical diagnostic modalities, provides irreplaceable advantages in the diagnosis and treatment of internal organs. However, the approach is limited to the characterization of superficial tissues due to the strong optical scattering properties of tissue. In this work, a microwave-induced thermoacoustic (TA) endoscope (MTAE) was developed and evaluated. The MTAE system integrated a homemade monopole sleeve antenna (diameter = 7 mm) for providing homogenized pulsed microwave irradiation to induce a TA signal in the colorectal cavity and a side-viewing focused ultrasonic transducer (diameter = 3 mm) for detecting the TA signal in the ultrasonic spectrum to construct the image. Our MTAE system combined microwave excitation and acoustic detection; it produced images with dielectric contrast and high spatial resolution several centimeters deep in soft tissues, overcame the current limitations of the imaging depth of optical endoscopy and the mechanical-wave-based imaging contrast of ultrasound endoscopy, and was able to extract complete features of deeply located tumors that could be infiltrating and invading adjacent structures. The practical feasibility of the MTAE system was evaluated in vivo with rabbits having colorectal tumors. The results demonstrated that colorectal tumor progression could be visualized from the changes in the electromagnetic parameters of the tissue via MTAE, showing its potential clinical application.

C1 South China Normal Univ, Coll Biophoton, MOE Key Lab Laser Life Sci, Guangzhou Key Lab Spectral Anal & Funct Probes,Gua, Guangzhou 510631, Peoples R China C1 South China Normal Univ, Inst Laser Life Sci, Guangzhou 510631, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-07-02 UT WOS:001196733400008 PM 38113149 ER

AU Wu, Huisi Zhang, Baiming Chen, Cheng Qin, Jing

Federated Semi-Supervised Medical Image Segmentation via Prototype-Based Pseudo-Labeling and Contrastive Learning

Existing federated learning works mainly focus on the fully supervised training setting. In realistic scenarios, however, most clinical sites can only provide data without annotations due to the lack of resources or expertise. In this work, we are concerned with the practical yet challenging federated semi-supervised segmentation (FSSS), where labeled data are available at only a few clients and the other clients can only provide unlabeled data. We take an early attempt to tackle this problem and propose a novel FSSS method with prototype-based pseudo-labeling and contrastive learning. First, we transmit a labeled-aggregated model, which is obtained based on prototype similarity, to each unlabeled client, to work together with the global model for debiased pseudo-label generation via a consistency- and entropy-aware selection strategy. Second, we transfer image-level prototypes from the labeled datasets to the unlabeled clients and conduct prototypical contrastive learning on the unlabeled models to enhance their discriminative power. Finally, we perform dynamic model aggregation with a designed consistency-aware aggregation strategy to dynamically adjust the aggregation weights of each local model. We evaluate our method on COVID-19 X-ray infected region segmentation, COVID-19 CT infected region segmentation and colorectal polyp segmentation, and the experimental results consistently demonstrate the effectiveness of our proposed method. Codes are available at https://github.com/zhangbaiming/FedSemiSeg.
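
A simplified sketch of prototype-based pseudo-labeling with a consistency-style confidence- and entropy-aware selection rule follows; the thresholds and similarity temperature are illustrative:

    import torch
    import torch.nn.functional as F

    def prototype_pseudo_labels(feats, prototypes, tau=0.9):
        # feats: (n, d) features of unlabeled images; prototypes: (n_classes, d)
        # class prototypes aggregated from the labeled clients.
        sims = F.normalize(feats, dim=1) @ F.normalize(prototypes, dim=1).T
        probs = F.softmax(sims / 0.1, dim=1)
        conf, labels = probs.max(dim=1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
        keep = (conf > tau) & (entropy < entropy.median())  # debiased selection
        return labels[keep], keep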

AU Park, Jungkyu Chledowski, Jakub Jastrzebski, Stanislaw Witowski, Jan Xu, Yanqi Du, Linda Gaddam, Sushma Kim, Eric Lewin, Alana Parikh, Ujas Plaunova, Anastasia Chen, Sardius Millet, Alexandra Park, James Pysarenko, Kristine Patel, Shalin Goldberg, Julia Wegener, Melanie Moy, Linda Heacock, Laura Reig, Beatriu Geras, Krzysztof J.

An Efficient Deep Neural Network to Classify Large 3D Images With Small Objects

3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10x or 100x more pixels than their 2D counterparts. To be trained with high-resolution 3D images, convolutional neural networks resort to downsampling them or projecting them to 2D. We propose an effective alternative, a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of GMIC on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), which demonstrates that 3D-GMIC successfully classified large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
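
The key computational trick, restricting full-resolution processing to a few regions flagged by a cheap low-resolution pass, can be sketched as coordinate bookkeeping (the networks themselves are omitted):

    import torch

    def topk_roi_centers(saliency, k=8):
        # saliency: (D, H, W) map from a low-resolution global pass; returns the
        # k most salient voxel coordinates for cropping full-resolution patches.
        d, h, w = saliency.shape
        idx = saliency.flatten().topk(k).indices
        return torch.stack([idx // (h * w), (idx // w) % h, idx % w], dim=1)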

AU Zhu, Meilu Liao, Jing Liu, Jun Yuan, Yixuan

FedOSS: Federated Open Set Recognition via Inter-Client Discrepancy and Collaboration

Open set recognition (OSR) aims to accurately classify known diseases and recognize unseen diseases as the unknown class in medical scenarios. However, in existing OSR approaches, gathering data from distributed sites to construct large-scale centralized training datasets usually leads to high privacy and security risk, which could be alleviated elegantly via the popular cross-site training paradigm, federated learning (FL). To this end, we present the first effort to formulate federated open set recognition (FedOSR), and meanwhile propose a novel Federated Open Set Synthesis (FedOSS) framework to address the core challenge of FedOSR: the unavailability of unknown samples for all anticipated clients during the training phase. The proposed FedOSS framework mainly leverages two modules, i.e., Discrete Unknown Sample Synthesis (DUSS) and Federated Open Space Sampling (FOSS), to generate virtual unknown samples for learning decision boundaries between known and unknown classes. Specifically, DUSS exploits inter-client knowledge inconsistency to recognize known samples near decision boundaries and then pushes them beyond the decision boundaries to synthesize discrete virtual unknown samples. FOSS unites these generated unknown samples from different clients to estimate the class-conditional distributions of the open data space near decision boundaries and further samples open data, thereby improving the diversity of virtual unknown samples. Additionally, we conduct comprehensive ablation experiments to verify the effectiveness of DUSS and FOSS. FedOSS shows superior performance on public medical datasets in comparison with state-of-the-art approaches.
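
The flavor of discrete unknown-sample synthesis, pushing a boundary-near known feature across the decision boundary by gradient ascent, can be sketched as below; the step size and number of steps are illustrative, not the exact DUSS procedure:

    import torch
    import torch.nn.functional as F

    def synthesize_unknown(feat, classifier, y, steps=5, lr=0.5):
        # feat: (batch, d) features of known samples near decision boundaries;
        # y: (batch,) their class labels; classifier: maps features to logits.
        z = feat.clone().detach().requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(classifier(z), y)
            (grad,) = torch.autograd.grad(loss, z)
            z = (z + lr * grad).detach().requires_grad_(True)  # ascend the loss
        return z.detach()   # virtual unknown samples beyond the boundary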

AU Xu, Chenchu Zhang, Tong Zhang, Dong Zhang, Dingwen Han, Junwei

Deep Generative Adversarial Reinforcement Learning for Semi-Supervised Segmentation of Low-Contrast and Small Objects in Medical Images

Deep reinforcement learning (DRL) has demonstrated impressive performance in medical image segmentation, particularly for low-contrast and small medical objects. However, current DRL-based segmentation methods face limitations due to the optimization of error propagation in two separate stages and the need for a significant amount of labeled data. In this paper, we propose a novel deep generative adversarial reinforcement learning (DGARL) approach that, for the first time, enables end-to-end semi-supervised medical image segmentation in the DRL domain. DGARL ingeniously establishes a pipeline that integrates DRL and generative adversarial networks (GANs) to optimize both detection and segmentation tasks holistically while mutually enhancing each other. Specifically, DGARL introduces two innovative components to facilitate this integration in semi-supervised settings. First, a task-joint GAN with two discriminators links the detection results to the GAN's segmentation performance evaluation, allowing simultaneous joint evaluation and feedback. This ensures that DRL and GAN can be directly optimized based on each other's results. Second, a bidirectional exploration DRL integrates backward exploration and forward exploration to ensure the DRL agent explores the correct direction when forward exploration is disabled due to lack of explicit rewards. This mitigates the issue of unlabeled data being unable to provide rewards and rendering DRL unexplorable. Comprehensive experiments on three generalization datasets, comprising a total of 640 patients, demonstrate that our novel DGARL achieves 85.02% Dice and improves at least 1.91% for brain tumors, achieves 73.18% Dice and improves at least 4.28% for liver tumors, and achieves 70.85% Dice and improves at least 2.73% for the pancreas compared to the ten most recent advanced methods; these results attest to the superiority of DGARL. Code is available at GitHub.

AU Liu, Mingxin Liu, Yunzan Xu, Pengbo Cui, Hui Ke, Jing Ma, Jiquan

Exploiting Geometric Features via Hierarchical Graph Pyramid Transformer for Cancer Diagnosis Using Histopathological Images

Cancer is widely recognized as the primary cause of mortality worldwide, and pathology analysis plays a pivotal role in achieving accurate cancer diagnosis. The intricate representation of features in histopathological images encompasses abundant information crucial for disease diagnosis, regarding cell appearance, tumor microenvironment, and geometric characteristics. However, recent deep learning methods have not adequately exploited geometric features for pathological image classification due to the absence of effective descriptors that can capture both cell distribution and gathering patterns, which often serve as potent indicators. In this paper, inspired by clinical practice, a Hierarchical Graph Pyramid Transformer (HGPT) is proposed to guide pathological image classification by effectively exploiting a geometric representation of tissue distribution that has been ignored by existing state-of-the-art methods. First, a graph representation is constructed according to the morphological features of the input pathological image, and a geometric representation is learned through the proposed multi-head graph aggregator. Then, the image and its graph representation are fed into the transformer encoder layer to model long-range dependency. Finally, a locality feature enhancement block is designed to enhance the 2D local representation of the feature embedding, which is not well explored in existing vision transformers. An extensive experimental study is conducted on Kather-5K, MHIST, NCT-CRC-HE, and GasHisSDB for binary or multi-category classification of multiple cancer types. Results demonstrated that our method is capable of consistently reaching superior classification outcomes for histopathological images, providing an effective diagnostic tool for malignant tumors in clinical practice.
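
The graph-construction step, turning detected nuclei into a k-nearest-neighbour graph that encodes cell distribution and gathering patterns, can be sketched as follows; the choice of k and the Euclidean metric are illustrative:

    import numpy as np

    def knn_cell_graph(centroids, k=5):
        # centroids: (n_cells, 2) nuclei positions from a cell detector; connect
        # each cell to its k nearest neighbours.
        d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        nbrs = np.argsort(d, axis=1)[:, :k]
        edges = [(i, j) for i in range(len(centroids)) for j in nbrs[i]]
        return np.array(edges).T    # (2, n_edges) edge index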

AU Sun, Jiarui Li, Qiuxuan Liu, Yuhao Liu, Yichuan Coatrieux, Gouenou Coatrieux, Jean-Louis Chen, Yang Lu, Jie

Pathological Asymmetry-Guided Progressive Learning for Acute Ischemic Stroke Infarct Segmentation.

Quantitative infarct estimation is crucial for diagnosis, treatment and prognosis in acute ischemic stroke (AIS) patients. As the early changes of ischemic tissue are subtle and easily confounded by normal brain tissue, it remains a very challenging task. However, existing methods often ignore or confuse the contribution of different types of anatomical asymmetry, caused by intrinsic and pathological changes, to segmentation. Further, inefficient domain knowledge utilization leads to mis-segmentation of AIS infarcts. Motivated by these observations, we propose a pathological asymmetry-guided progressive learning (PAPL) method for AIS infarct segmentation. PAPL mimics the step-by-step learning patterns observed in humans, comprising three progressive stages: a knowledge preparation stage, a formal learning stage, and an examination improvement stage. First, the knowledge preparation stage accumulates the preparatory domain knowledge of the infarct segmentation task, helping to learn domain-specific knowledge representations that enhance the discriminative ability for pathological asymmetries through a constructed contrastive learning task. Then, the formal learning stage efficiently performs end-to-end training guided by the learned knowledge representations, in which the designed feature compensation module (FCM) leverages the anatomical similarity between adjacent slices of the volumetric medical image to help aggregate rich anatomical context information. Finally, the examination improvement stage encourages improving the infarct prediction from the previous stage, where the proposed perception refinement strategy (RPRS) further exploits the bilateral difference comparison to correct mis-segmented infarct regions by adaptive regional shrinking and expansion. Extensive experiments on public and in-house NCCT datasets demonstrated the superiority of the proposed PAPL, which holds promise for better stroke evaluation and treatment.

AU Wang, Hongqiu Chen, Jian Zhang, Shichen He, Yuan Xu, Jinfeng Wu, Mengwan He, Jinlan Liao, Wenjun Luo, Xiangde

Dual-Reference Source-Free Active Domain Adaptation for Nasopharyngeal Carcinoma Tumor Segmentation across Multiple Hospitals.

Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant malignancy that predominantly impacts the head and neck area. Precise delineation of the Gross Tumor Volume (GTV) plays a pivotal role in ensuring effective radiotherapy for NPC. Although recent methods have achieved promising results on GTV segmentation, they are still limited by the lack of carefully annotated data and by hard-to-access data from multiple hospitals in clinical practice. Although some unsupervised domain adaptation (UDA) methods have been proposed to alleviate this problem, unconditionally mapping the distribution distorts the underlying structural information, leading to inferior performance. To address this challenge, we devise a novel Source-Free Active Domain Adaptation framework to facilitate domain adaptation for the GTV segmentation task. Specifically, we design a dual-reference strategy to select domain-invariant and domain-specific representative samples from a specific target domain for annotation and model fine-tuning, without relying on source-domain data. Our approach not only ensures data privacy but also reduces the workload for oncologists, as it only requires annotating a few representative samples from the target domain and does not need access to the source data. We collect a large-scale clinical dataset comprising 1057 NPC patients from five hospitals to validate our approach. Experimental results show that our method outperforms previous active learning (e.g., AADA and MHPL) and UDA (e.g., Tent and CPR) methods, and achieves comparable results to the fully supervised upper bound, even with few annotations, highlighting the significant medical utility of our approach. In addition, as there is no public dataset on multi-center NPC segmentation, we will release our code and dataset for future research (Git).
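As an illustration of what a dual-reference selection could look like, the sketch below ranks unlabeled target samples using only the source-trained model: cluster-centroid-nearest samples serve as domain-specific representatives and low-entropy samples as domain-invariant ones. The criteria, the `dual_reference_select` helper, and the budget sizes are assumptions for illustration; the authors' exact strategy may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

def dual_reference_select(feats, probs, n_spec=8, n_inv=8):
    """Select an annotation budget from unlabeled target data using only the
    source-trained model: samples nearest to feature-cluster centroids act as
    domain-specific representatives (coverage), and the lowest-entropy samples
    act as domain-invariant ones (already well handled by the source model)."""
    ent = -(probs * np.log(probs + 1e-8)).sum(axis=1)
    km = KMeans(n_clusters=n_spec, n_init=10, random_state=0).fit(feats)
    dist = np.linalg.norm(feats - km.cluster_centers_[km.labels_], axis=1)
    spec = [np.where(km.labels_ == c)[0][np.argmin(dist[km.labels_ == c])]
            for c in range(n_spec)]
    inv = np.argsort(ent)[:n_inv]
    return np.unique(np.concatenate([np.asarray(spec), inv]))

rng = np.random.default_rng(0)
feats = rng.standard_normal((200, 32))
logits = rng.standard_normal((200, 2))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
picked = dual_reference_select(feats, probs)   # indices to send for annotation
```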

AU Yang, Yan Yu, Jun Fu, Zhenqi Zhang, Ke Yu, Ting Wang, Xianyun Jiang, Hanliang Lv, Junhui Huang, Qingming Han, Weidong

Token-Mixer: Bind Image and Text in One Embedding Space for Medical Image Reporting.

Medical image reporting, which focuses on automatically generating diagnostic reports from medical images, has garnered growing research attention. In this task, learning cross-modal alignment between images and reports is crucial. However, the exposure bias problem in autoregressive text generation poses a notable challenge, as the model is optimized by a word-level loss function using the teacher-forcing strategy. To this end, we propose a novel Token-Mixer framework that learns to bind image and text in one embedding space for medical image reporting. Concretely, Token-Mixer enhances the cross-modal alignment by matching image-to-text generation with text-to-text generation, which suffers less from exposure bias. The framework contains an image encoder, a text encoder and a text decoder. In training, images and paired reports are first encoded into image tokens and text tokens, and these tokens are randomly mixed to form mixed tokens. Then, the text decoder accepts image tokens, text tokens or mixed tokens as prompt tokens and conducts text generation for network optimization. Furthermore, we introduce a tailored text decoder and an alternative training strategy that integrate well with our Token-Mixer framework. Extensive experiments across three publicly available datasets demonstrate that Token-Mixer successfully enhances the image-text alignment and thereby attains state-of-the-art performance. Related codes are available at https://github.com/yangyan22/Token-Mixer.
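The random token-mixing step can be sketched in a few lines. Assuming aligned image and text token sequences of equal length and dimension (an assumption made for brevity; `mix_tokens` and the mixing ratio are illustrative), a Bernoulli mask decides per position whether the mixed prompt takes the image token or the text token:

```python
import torch

def mix_tokens(img_tokens, txt_tokens, mix_ratio=0.5):
    """Randomly replace a fraction of text tokens with image tokens to form
    mixed prompt tokens. img_tokens, txt_tokens: (B, L, D)."""
    B, L, _ = txt_tokens.shape
    take_img = torch.rand(B, L, device=txt_tokens.device) < mix_ratio  # (B, L)
    return torch.where(take_img.unsqueeze(-1), img_tokens, txt_tokens)

img = torch.randn(2, 16, 64)
txt = torch.randn(2, 16, 64)
mixed = mix_tokens(img, txt)   # decoder can be prompted with img, txt, or mixed
```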

AU Zhang, Yumin Li, Hongliu Gao, Yajun Duan, Haoran Huang, Yawen Zheng, Yefeng

Prototype Correlation Matching and Class-Relation Reasoning for Few-Shot Medical Image Segmentation.

Few-shot medical image segmentation has achieved great progress in improving the accuracy and efficiency of medical analysis in the biomedical imaging field. However, most existing methods cannot explore inter-class relations between base and novel medical classes to reason about unseen novel classes. Moreover, the same kind of medical class exhibits large intra-class variations brought by diverse appearances, shapes and scales, causing ambiguous visual characterization that degrades the generalization performance of these existing methods on unseen novel classes. To address the above challenges, in this paper, we propose a Prototype correlation Matching and Class-relation Reasoning (i.e., PMCR) model. The proposed model can effectively mitigate false pixel correlation matches caused by large intra-class variations while reasoning about inter-class relations among different medical classes. Specifically, to address the false pixel correlation matches brought by large intra-class variations, we propose a prototype correlation matching module to mine representative prototypes that can characterize the diverse visual information of different appearances well. We explore prototype-level rather than pixel-level correlation matching between support and query features via an optimal transport algorithm to tackle false matches caused by intra-class variations. Meanwhile, to explore inter-class relations, we design a class-relation reasoning module to segment unseen novel medical objects by reasoning about inter-class relations between base and novel classes. Such inter-class relations can be well propagated into the semantic encoding of local query features to improve few-shot segmentation performance. Quantitative comparisons illustrate the large performance improvement of our model over other baseline methods.
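A standard entropic-regularized Sinkhorn iteration illustrates the kind of prototype-level optimal-transport matching described above; the uniform marginals, cosine cost, and regularization weight are assumptions rather than the paper's exact formulation:

```python
import numpy as np

def sinkhorn(cost, reg=0.05, iters=200):
    """Entropic-regularized optimal transport with uniform marginals
    (standard Sinkhorn-Knopp iterations); returns the transport plan."""
    n, m = cost.shape
    K = np.exp(-cost / reg)
    a, b = np.ones(n) / n, np.ones(m) / m
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]          # plan of shape (n, m)

rng = np.random.default_rng(0)
support_protos = rng.standard_normal((5, 32))   # prototypes mined from support
query_protos = rng.standard_normal((7, 32))     # prototypes mined from query
cos = (support_protos @ query_protos.T) / (
    np.linalg.norm(support_protos, axis=1, keepdims=True)
    * np.linalg.norm(query_protos, axis=1)[None, :])
plan = sinkhorn(1.0 - cos)                      # soft prototype correspondences
```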

AU Li, Kang Zhu, Yu Yu, Lequan Heng, Pheng-Ann

A Dual Enrichment Synergistic Strategy to Handle Data Heterogeneity for Domain Incremental Cardiac Segmentation

Following remarkable progress in cardiac image segmentation, contemporary studies are dedicated to further upgrading model functionality by progressively exploring sequentially delivered datasets over time through domain incremental learning. Existing works mainly concentrate on addressing heterogeneous style variations, but overlook the critical shape variations across domains hidden behind sub-disease composition discrepancies. To keep the updated model from catastrophically forgetting sub-diseases that were learned in past domains but are no longer present in subsequent domains, we propose a dual enrichment synergistic strategy to incrementally broaden model competence for a growing number of sub-diseases. The data-enriched scheme diversifies the shape composition of the current training data via displacement-aware shape encoding and decoding, gradually building robustness against cross-domain shape variations. Meanwhile, the model-enriched scheme strengthens model capabilities by progressively appending and consolidating the latest expertise into a dynamically expanded multi-expert network, gradually cultivating generalization ability over style-varied domains. The above two schemes work in synergy to collaboratively upgrade model capabilities in a two-pronged manner. We have extensively evaluated our network on the ACDC and M&Ms datasets in single-domain and compound-domain incremental learning settings. Our approach outperformed other competing methods and achieved results comparable to the upper bound.

AU Ren, Zhimei Sidky, Emil Y. Barber, Rina Foygel Kao, Chien-Min Pan, Xiaochuan

Simultaneous Activity and Attenuation Estimation in TOF-PET With TV-Constrained Nonconvex Optimization

An alternating direction method of multipliers (ADMM) framework is developed for nonsmooth biconvex optimization for inverse problems in imaging. In particular, the simultaneous estimation of activity and attenuation (SAA) problem in time-of-flight positron emission tomography (TOF-PET) has such a structure when maximum likelihood estimation (MLE) is employed. The ADMM framework is applied to MLE for SAA in TOF-PET, resulting in the ADMM-SAA algorithm. This algorithm is extended by imposing total variation (TV) constraints on both the activity and attenuation map, resulting in the ADMM-TVSAA algorithm. The performance of this algorithm is illustrated using the penalized maximum likelihood activity and attenuation estimation (P-MLAA) algorithm as a reference.
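For readers unfamiliar with the machinery, the generic scaled-form ADMM updates for a split objective min f(x) + g(z) subject to Ax + Bz = c read as below. In ADMM-TVSAA, f and g would correspond to the TOF-PET likelihood terms in activity and attenuation, with the TV constraints absorbed as indicator functions; the authors' exact splitting is not reproduced here.

```latex
% Scaled-form ADMM with penalty parameter rho > 0 and scaled dual variable u:
\begin{aligned}
x^{k+1} &= \arg\min_{x}\; f(x) + \tfrac{\rho}{2}\,\lVert Ax + Bz^{k} - c + u^{k}\rVert_2^2,\\
z^{k+1} &= \arg\min_{z}\; g(z) + \tfrac{\rho}{2}\,\lVert Ax^{k+1} + Bz - c + u^{k}\rVert_2^2,\\
u^{k+1} &= u^{k} + Ax^{k+1} + Bz^{k+1} - c.
\end{aligned}
```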

AU Tuccio, Giulia Afrakhteh, Sajjad Iacca, Giovanni Demi, Libertario

Time Efficient Ultrasound Localization Microscopy Based on A Novel Radial Basis Function 2D Interpolation

Ultrasound localization microscopy (ULM) allows for the generation of super-resolved (SR) images of the vasculature by precisely localizing intravenously injected microbubbles. Although SR images may be useful for diagnosing and treating patients, their use in the clinical context is limited by the need for prolonged acquisition times and high frame rates. The primary goal of our study is to relax the requirement of high frame rates to obtain SR images. To this end, we propose a new time-efficient ULM (TEULM) pipeline built on a cutting-edge interpolation method. More specifically, we suggest employing Radial Basis Functions (RBFs) as interpolators to estimate the missing values in the 2-dimensional (2D) spatio-temporal structures. To evaluate this strategy, we first mimic data acquisition at a reduced frame rate by applying a down-sampling factor (DS = 2, 4, 8, and 10) to high frame rate ULM data. Then, we up-sample the data to the original frame rate using the suggested interpolation to reconstruct the missing frames. Finally, using both the original high frame rate data and the interpolated data, we reconstruct SR images using the ULM framework steps. We evaluate the proposed TEULM on four in vivo datasets, a rat brain (dataset A), a rat kidney (dataset B), a rat tumor (dataset C) and a rat brain bolus (dataset D), interpolating at the in-phase and quadrature (IQ) level. Results demonstrate the effectiveness of TEULM in recovering vascular structures, even at a DS rate of 10 (corresponding to a sub-100 Hz frame rate). In conclusion, the proposed technique successfully reconstructs accurate SR images while requiring frame rates one order of magnitude lower than standard ULM.
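The core interpolation step maps directly onto scipy.interpolate.RBFInterpolator. The sketch below drops frames from a toy 2D (depth x time) signal at DS = 4 and recovers them with a thin-plate-spline RBF; the toy signal and the neighborhood size are assumptions for illustration:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

depth, frames, ds = 64, 100, 4                  # DS = 4 down-sampling factor
d_ax = np.arange(depth)[:, None]
t_ax = np.arange(frames)[None, :]
full = np.sin(d_ax / 6.0) * np.cos(2 * np.pi * t_ax / 25.0)  # toy IQ-like signal

kept_t = np.arange(0, frames, ds)               # frames available at the low rate
dd, tt = np.meshgrid(np.arange(depth), kept_t, indexing="ij")
pts = np.column_stack([dd.ravel(), tt.ravel()]) # known (depth, time) samples
vals = full[:, kept_t].ravel()

rbf = RBFInterpolator(pts, vals, neighbors=50, kernel="thin_plate_spline")

# Query the full-rate grid to reconstruct the missing frames.
qd, qt = np.meshgrid(np.arange(depth), np.arange(frames), indexing="ij")
recon = rbf(np.column_stack([qd.ravel(), qt.ravel()])).reshape(depth, frames)
print("relative error:", np.linalg.norm(recon - full) / np.linalg.norm(full))
```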

AU Wu, Weiwen Wang, Yanyang Liu, Qiegen Wang, Ge Zhang, Jianjia

Wavelet-Improved Score-Based Generative Model for Medical Imaging

The score-based generative model (SGM) has demonstrated remarkable performance in addressing challenging under-determined inverse problems in medical imaging. However, acquiring high-quality training datasets for these models remains a formidable task, especially in medical image reconstruction. Prevalent noise perturbations or artifacts in low-dose Computed Tomography (CT) or under-sampled Magnetic Resonance Imaging (MRI) hinder the accurate estimation of data distribution gradients, thereby compromising the overall performance of SGMs when trained with these data. To alleviate this issue, we propose a wavelet-improved denoising technique to cooperate with the SGMs, ensuring effective and stable training. Specifically, the proposed method integrates a wavelet sub-network and the standard SGM sub-network into a unified framework, effectively alleviating inaccurate estimation of the data distribution gradient and enhancing the overall stability. The mutual feedback mechanism between the wavelet sub-network and the SGM sub-network empowers the neural network to learn accurate scores even when handling noisy samples. This combination results in a framework that exhibits superior stability during the learning process, leading to the generation of more precise and reliable reconstructed images. During the reconstruction process, we further enhance the robustness and quality of the reconstructed images by incorporating a regularization constraint. Our experiments, which encompass various scenarios of low-dose and sparse-view CT, as well as MRI with varying under-sampling rates and masks, demonstrate the effectiveness of the proposed method by significantly enhancing the quality of the reconstructed images. Notably, our method with noisy training samples achieves results comparable to those obtained using clean data.
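As a rough proxy for the role of the wavelet sub-network, the sketch below soft-thresholds the detail sub-bands of a noisy image with PyWavelets before any score estimation would take place; the MAD noise estimate and universal threshold are standard textbook choices, not the paper's learned sub-network:

```python
import numpy as np
import pywt

def wavelet_soft_denoise(img, wavelet="db4", level=3):
    """Soft-threshold all detail sub-bands with the universal threshold,
    using a MAD-based noise estimate from the finest diagonal band."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(img.size))
    kept = [coeffs[0]] + [
        tuple(pywt.threshold(band, thr, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(kept, wavelet)

noisy = np.random.default_rng(1).normal(size=(128, 128))
denoised = wavelet_soft_denoise(noisy)  # feed this (not `noisy`) to the score model
```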

AU Yang, Yuxuan Wang, Hao Wang, Jizhou Dong, Kai Ding, Shuai

Semantic-Preserving Surgical Video Retrieval With Phase and Behavior Coordinated Hashing

Medical professionals rely on surgical video retrieval to discover relevant content within large numbers of videos for surgical education and knowledge transfer. However, the existing retrieval techniques often fail to obtain user-expected results since they ignore valuable semantics in surgical videos. The incorporation of rich semantics into video retrieval is challenging in terms of the hierarchical relationship modeling and coordination between coarse- and fine-grained semantics. To address these issues, this paper proposes a novel semantic-preserving surgical video retrieval (SPSVR) framework, which incorporates surgical phase and behavior semantics using a dual-level hashing module to capture their hierarchical relationship. This module preserves the semantics in binary hash codes by transforming the phase and behavior similarities into high- and low-level similarities in a shared Hamming space. The binary codes are optimized by performing a reconstruction task, a high-level similarity preservation task, and a low-level similarity preservation task, using a coordinated optimization strategy for efficient learning. A self-supervised learning scheme is adopted to capture behavior semantics from video clips so that the indexing of behaviors is unencumbered by fine-grained annotation and recognition. Experiments on four surgical video datasets for two different disciplines demonstrate the robust performance of the proposed framework. In addition, the results of the clinical validation experiments indicate the ability of the proposed method to retrieve the results expected by surgeons. The code can be found at https://github.com/trigger26/SPSVR.
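A hedged sketch of a dual-level similarity-preserving hashing loss is shown below: relaxed tanh codes are driven so that their Hamming-space affinity matches a coarse phase-level similarity and a fine behavior-level similarity at once. The specific relaxation, MSE form, and weights are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def code_affinity(codes):
    """Inner-product affinity of relaxed codes, scaled to [-1, 1];
    for binarized codes this equals 1 - 2 * HammingDistance / K."""
    return codes @ codes.t() / codes.shape[1]

def dual_level_hash_loss(raw_codes, phase_sim, behavior_sim, w_high=1.0, w_low=1.0):
    """Drive one shared Hamming space to respect both similarity levels.
    raw_codes: (B, K) pre-binarization outputs; *_sim: (B, B) in [0, 1]."""
    aff = code_affinity(torch.tanh(raw_codes))       # tanh relaxes sign() for training
    high = F.mse_loss(aff, 2.0 * phase_sim - 1.0)    # coarse: surgical phase
    low = F.mse_loss(aff, 2.0 * behavior_sim - 1.0)  # fine: surgical behavior
    return w_high * high + w_low * low

codes = torch.randn(6, 64, requires_grad=True)
phase, behav = torch.rand(6, 6), torch.rand(6, 6)
loss = dual_level_hash_loss(codes, phase, behav)
loss.backward()
```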

AU Lu, Xu Cui, Zengzhen Sun, Yihua Khor, Hee Guan Sun, Ao Ma, Longfei Chen, Fang Gao, Shan Tian, Yun Zhou, Fang Lv, Yang Liao, Hongen

Better Rough Than Scarce: Proximal Femur Fracture Segmentation With Rough Annotations

Proximal femoral fracture segmentation in computed tomography (CT) is essential for the preoperative planning of orthopedic surgeons. Recently, numerous deep learning-based approaches have been proposed for segmenting various structures within CT scans. Nevertheless, distinguishing the various attributes of fracture fragments and soft tissue regions in CT scans frequently poses challenges, which have received comparatively limited research attention. Moreover, the cornerstone of contemporary deep learning methodologies is the availability of annotated data, while detailed CT annotations remain scarce. To address this challenge, we propose a novel weakly-supervised framework, namely Rough Turbo Net (RT-Net), for the segmentation of proximal femoral fractures. We emphasize the utilization of human resources to produce rough annotations at a substantial scale, as opposed to relying on limited fine-grained annotations that demand substantial time to create. In RT-Net, rough annotations impose fractured-region constraints, which have demonstrated significant efficacy in enhancing the accuracy of the network. Conversely, fine annotations can provide more details for recognizing edges and soft tissues. In addition, we design a spatial adaptive attention module (SAAM) that adapts to the spatial distribution of the fracture regions and aligns features in each decoder. Moreover, we propose a fine-edge loss, applied through an edge discrimination network, to penalize absent or imprecise edge features. Extensive quantitative and qualitative experiments demonstrate the superiority of RT-Net over state-of-the-art approaches. Furthermore, additional experiments show that RT-Net can produce pseudo labels for raw CT images that further improve fracture segmentation performance, and has the potential to improve segmentation performance on public datasets.

C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China C1 Tsinghua Univ, Grad Sch Shenzhen, Shenzhen 518055, Peoples R China C1 Peking Univ Third Hosp, Dept Orthoped, Beijing 100191, Peoples R China C1 Tsinghua Univ, Sch Biomed Engn, Beijing 100084, Peoples R China C1 Shanghai Jiao Tong Univ, Sch Biomed Engn, Shanghai 200240, Peoples R China C1 Shanghai Jiao Tong Univ, Inst Med Robot, Shanghai 200240, Peoples R China SN 0278-0062 EI 1558-254X DA 2024-09-18 UT WOS:001307429600012 PM 38652607 ER

AU Ruan, Guohui Wang, Zhaonian Liu, Chunyi Xia, Ling Wang, Huafeng Qi, Li Chen, Wufan

Magnetic Resonance Electrical Properties Tomography Based on Modified Physics-Informed Neural Network and Multiconstraints

This paper presents a novel method that leverages physics-informed neural networks for magnetic resonance electrical properties tomography (MREPT). MREPT is a noninvasive technique that can retrieve the spatial distribution of the electrical properties (EPs) of scanned tissues from the transmit radiofrequency (RF) field measured in magnetic resonance imaging (MRI) systems. The reconstruction of EP values in MREPT is achieved by solving a partial differential equation, derived from Maxwell's equations, that lacks a direct solution. Most conventional MREPT methods suffer from artifacts caused by the invalidation of the assumptions applied to simplify the problem, and from numerical errors caused by numerical differentiation. Existing deep learning-based (DL-based) MREPT methods comprise data-driven methods that need massive datasets for training, or model-driven methods that are only validated in trivial cases. Hence, we propose a model-driven method that learns a mapping from a measured RF field, its spatial gradient and its Laplacian to EPs using fully connected networks (FCNNs). The spatial gradient of the EPs can be computed through the automatic differentiation of the FCNNs and the chain rule. The FCNNs are optimized using the residual of the central physical equation of convection-reaction MREPT as the loss function $\mathcal{L}$. To alleviate the ill-conditioning of the problem, we added multiple constraints, including a similarity constraint between permittivity and conductivity and the $\ell_1$ norm of the spatial gradients of permittivity and conductivity, to $\mathcal{L}$. We demonstrate the proposed method on a three-dimensional realistic head model, a digital phantom simulation, and a practical phantom experiment on a 9.4 T animal MRI system.
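The mechanics of such a loss can be sketched with a small fully connected network and PyTorch autograd: spatial gradients of the predicted EPs come from automatic differentiation, and $\ell_1$ gradient and similarity terms are added to a (placeholder) PDE residual. The network width, the constraint weights, and the zero residual stand-in are assumptions; the true residual requires the measured RF field:

```python
import torch
import torch.nn as nn

class EPNet(nn.Module):
    """FCNN mapping 2D coordinates to EPs (conductivity, permittivity)."""
    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, 2),
        )

    def forward(self, xy):
        return self.net(xy)

model = EPNet()
xy = torch.rand(256, 2, requires_grad=True)        # collocation points
sigma, eps_r = model(xy).unbind(dim=1)

# Spatial gradients of the predicted EPs via automatic differentiation.
g_sigma = torch.autograd.grad(sigma.sum(), xy, create_graph=True)[0]
g_eps = torch.autograd.grad(eps_r.sum(), xy, create_graph=True)[0]

pde_residual = torch.zeros_like(sigma)  # placeholder: needs the measured RF field
loss = (pde_residual ** 2).mean() \
     + 1e-3 * (g_sigma.abs().mean() + g_eps.abs().mean()) \
     + 1e-3 * ((sigma - eps_r) ** 2).mean()        # toy similarity constraint
loss.backward()
```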

AU Xu, Jiaxing Bian, Qingtian Li, Xinhang Zhang, Aihu Ke, Yiping Qiao, Miao Zhang, Wei Sim, Wei Khang Jeremy Gulyas, Balazs CA Alzheimers Dis Neuroimaging Initiative

Contrastive Graph Pooling for Explainable Classification of Brain Networks

Functional magnetic resonance imaging (fMRI) is a commonly used technique to measure neural activation. Its application has been particularly important in identifying underlying neurodegenerative conditions such as Parkinson's, Alzheimer's, and Autism. Recent analysis of fMRI data models the brain as a graph and extracts features by graph neural networks (GNNs). However, the unique characteristics of fMRI data require a special design of GNN. Tailoring GNN to generate effective and domain-explainable features remains challenging. In this paper, we propose a contrastive dual-attention block and a differentiable graph pooling method called ContrastPool to better utilize GNN for brain networks, meeting fMRI-specific requirements. We apply our method to 5 resting-state fMRI brain network datasets of 3 diseases and demonstrate its superiority over state-of-the-art baselines. Our case study confirms that the patterns extracted by our method match the domain knowledge in neuroscience literature, and disclose direct and interesting insights. Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions. The source code is available at https://github.com/AngusMonroe/ContrastPool.

AU Kim, Boah Zhuang, Yan Mathai, Tejas Sudharshan Summers, Ronald M

OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport.

Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned with different modalities or different imaging protocols are often used. However, they are not aligned, due to scanning times, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real time with high performance, multi-domain abdominal image registration using deep learning is still challenging, since the images in different domains have different characteristics such as image contrast and intensity ranges. To address this, this paper proposes a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. When moving and fixed volumes are given as input, a transport module of our proposed model learns the optimal transport plan to map data distributions from the moving to the fixed volumes and estimates a domain-transported volume. Subsequently, a registration module that takes the transported volume can effectively estimate the deformation field, leading to improved deformation performance. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image, which alleviates the domain gap between the input images. We also attain improvements on out-of-distribution data, which indicates the superior generalizability of our model for the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.

AU Ali, Rehman Mitcham, Trevor M. Brevett, Thurston Agudo, Oscar Calderon Martinez, Cristina Duran Li, Cuiping Doyley, Marvin M. Duric, Nebojsa

2-D Slicewise Waveform Inversion of Sound Speed and Acoustic Attenuation for Ring Array Ultrasound Tomography Based on a Block LU Solver

Ultrasound tomography is an emerging imaging modality that uses the transmission of ultrasound through tissue to reconstruct images of its mechanical properties. Initially, ray-based methods were used to reconstruct these images, but their inability to account for diffraction often resulted in poor resolution. Waveform inversion overcame this limitation, providing high-resolution images of the tissue. Most clinical implementations, often directed at breast cancer imaging, currently rely on a frequency-domain waveform inversion to reduce computation time. For ring arrays, ray tomography was long considered a necessary step prior to waveform inversion in order to avoid cycle skipping. However, in this paper, we demonstrate that frequency-domain waveform inversion can reliably reconstruct high-resolution images of sound speed and attenuation without relying on ray tomography to provide an initial model. We provide a detailed description of our frequency-domain waveform inversion algorithm with open-source code and data that we make publicly available.

AU Lin, Wenjun Hu, Yan Fu, Huazhu Yang, Mingming Chng, Chin-Boon Kawasaki, Ryo Chui, Cheekong Liu, Jiang

Instrument-Tissue Interaction Detection Framework for Surgical Video Understanding

Instrument-tissue interaction detection, which helps in understanding surgical activities, is vital for constructing computer-assisted surgery systems but comes with many challenges. Firstly, most models represent instrument-tissue interaction in a coarse-grained way that only focuses on classification and lacks the ability to automatically detect instruments and tissues. Secondly, existing works do not fully consider intra- and inter-frame relations between instruments and tissues. In this paper, we propose to represent instrument-tissue interaction as an (instrument class, instrument bounding box, tissue class, tissue bounding box, action class) quintuple, and present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect this quintuple for surgical video understanding. Specifically, we propose a Snippet Consecutive Feature (SCF) layer to enhance features by modeling relationships among proposals in the current frame using global context information in the video snippet. We also propose a Spatial Corresponding Attention (SCA) layer to incorporate features of proposals between adjacent frames through spatial encoding. To reason about relationships between instruments and tissues, a Temporal Graph (TG) layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model the temporal information of the same instance. For evaluation, we build a cataract surgery video (PhacoQ) dataset and a cholecystectomy surgery video (CholecQ) dataset. Experimental results demonstrate the promising performance of our model, which outperforms other state-of-the-art models on both datasets.

AU Wang, Kang Zheng, Feiyang Cheng, Lan Dai, Hong-Ning Dou, Qi Qin, Jing

Breast Cancer Classification From Digital Pathology Images via Connectivity-Aware Graph Transformer

Automated classification of breast cancer subtypes from digital pathology images has been an extremely challenging task due to the complicated spatial patterns of cells in the tissue micro-environment. While newly proposed graph transformers are able to capture more long-range dependencies to enhance accuracy, they largely ignore the topological connectivity between graph nodes, which is nevertheless critical to extract more representative features to address this difficult task. In this paper, we propose a novel connectivity-aware graph transformer (CGT) for phenotyping the topology connectivity of the tissue graph constructed from digital pathology images for breast cancer classification. Our CGT seamlessly integrates connectivity embedding to node feature at every graph transformer layer by using local connectivity aggregation, in order to yield more comprehensive graph representations to distinguish different breast cancer subtypes. In light of the realistic intercellular communication mode, we then encode the spatial distance between two arbitrary nodes as connectivity bias in self-attention calculation, thereby allowing the CGT to distinctively harness the connectivity embedding based on the distance of two nodes. We extensively evaluate the proposed CGT on a large cohort of breast carcinoma digital pathology images stained by Haematoxylin & Eosin. Experimental results demonstrate the effectiveness of our CGT, which outperforms state-of-the-art methods by a large margin. Codes are released on https://github.com/wang-kang-6/CGT.
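The distance-as-attention-bias idea can be illustrated compactly: a single-head self-attention in which pairwise spatial distance is subtracted from the attention logits, so nearer (better-connected) nodes receive more weight. The additive form and the scale alpha are assumptions, not the paper's exact connectivity embedding:

```python
import torch

def distance_biased_attention(x, dist, alpha=0.5):
    """Single-head self-attention whose logits are penalized by pairwise
    spatial distance, so nearby (well-connected) nodes attend more strongly.
    x: (N, D) node features; dist: (N, N) pairwise distances."""
    scale = x.shape[1] ** 0.5
    scores = (x @ x.t()) / scale - alpha * dist    # additive connectivity bias
    attn = torch.softmax(scores, dim=-1)
    return attn @ x

x = torch.randn(16, 32)                 # 16 cell/tissue nodes
coords = torch.rand(16, 2)              # their spatial positions
out = distance_biased_attention(x, torch.cdist(coords, coords))
```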

AU Zhou, Lei Zhang, Yuzhong Zhang, Jiadong Qian, Xuejun Gong, Chen Sun, Kun Ding, Zhongxiang Wang, Xing Li, Zhenhui Liu, Zaiyi Shen, Dinggang

Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI.

Automated breast tumor segmentation based on dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has shown great promise in clinical practice, particularly for identifying the presence of breast disease. However, accurate segmentation of breast tumors is a challenging task, often necessitating the development of complex networks. To strike an optimal trade-off between computational costs and segmentation performance, we propose a hybrid network combining convolution neural network (CNN) and transformer layers. Specifically, the hybrid network consists of an encoder-decoder architecture built by stacking convolution and deconvolution layers. Effective 3D transformer layers are then implemented after the encoder subnetworks to capture global dependencies between the bottleneck features. To improve the efficiency of the hybrid network, two parallel encoder subnetworks are designed for the decoder and the transformer layers, respectively. To further enhance the discriminative capability of the hybrid network, a prototype learning guided prediction module is proposed, in which category-specific prototypical features are calculated through online clustering. All learned prototypical features are finally combined with the features from the decoder for tumor mask prediction. Experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance over state-of-the-art (SOTA) methods, while maintaining a balance between segmentation accuracy and computation cost. Moreover, we demonstrate that the automatically generated tumor masks can be effectively applied to identify the HER2-positive subtype versus the HER2-negative subtype, with accuracy similar to analyses based on manual tumor segmentation. The source code is available at https://github.com/ZhouL-lab/PLHN.

AU Cai, Zhiyuan Lin, Li He, Huaqing Cheng, Pujin Tang, Xiaoying

Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation.

A large-scale labeled dataset is a key factor for the success of supervised deep learning in most ophthalmic image analysis scenarios. However, limited annotated data is very common in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not require massive annotations. To utilize as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier, simultaneously making use of both 2D and 3D images as well as alleviating the issue of catastrophic forgetting. In this paper, we propose a universal self-supervised Transformer framework named Uni4Eye++ to discover intrinsic image characteristics and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye++ can serve as a global feature extractor, which builds on a Masked Image Modeling task with a Vision Transformer architecture. On the basis of our previous work Uni4Eye, we further employ an image entropy guided masking strategy to reconstruct more-informative patches, and a dynamic head generator module to alleviate modality confusion. We evaluate the performance of our pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream ophthalmic image classification and segmentation tasks. The superiority of Uni4Eye++ is successfully established through comparisons to other state-of-the-art SSL pre-training methods. Our code is available on GitHub.
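An entropy-guided masking step might look like the following: per-patch histogram entropy is computed and the highest-entropy (most informative) patches are selected for masking, so the MIM task must reconstruct them. The patch size, bin count, and 50% mask ratio are illustrative assumptions:

```python
import numpy as np

def patch_entropy(img, patch=16, bins=32):
    """Shannon entropy of the intensity histogram of each non-overlapping patch."""
    h, w = img.shape
    ent = np.zeros((h // patch, w // patch))
    for i in range(ent.shape[0]):
        for j in range(ent.shape[1]):
            block = img[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
            p = hist / hist.sum()
            p = p[p > 0]
            ent[i, j] = -(p * np.log2(p)).sum()
    return ent

img = np.random.default_rng(0).random((224, 224)).astype(np.float32)
ent = patch_entropy(img)
k = ent.size // 2                                # 50% mask ratio (assumption)
mask = np.zeros(ent.size, dtype=bool)
mask[np.argsort(ent.ravel())[-k:]] = True        # mask the most informative patches
```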

AU De Marco, Fabio Andrejewski, Jana Urban, Theresa Willer, Konstantin Gromann, Lukas Koehler, Thomas Maack, Hanns-Ingo Herzen, Julia Pfeiffer, Franz

X-Ray Dark-Field Signal Reduction Due to Hardening of the Visibility Spectrum

X-ray dark-field imaging enables a spatially-resolved visualization of ultra-small-angle X-ray scattering. Using phantom measurements, we demonstrate that a material's effective dark-field signal may be reduced by modification of the visibility spectrum by other dark-field-active objects in the beam. This is the dark-field equivalent of conventional beam-hardening, and is distinct from related, known effects, where the dark-field signal is modified by attenuation or phase shifts. We present a theoretical model for this group of effects and verify it by comparison to the measurements. These findings have significant implications for the interpretation of dark-field signal strength in polychromatic measurements.

AU Rong, Dingyi Zhao, Zhongyin Wu, Yue Ke, Bilian Ni, Binging

Prediction of Myopia Eye Axial Elongation With Orthokeratology Treatment via Dense I2I Based Corneal Topography Change Analysis

While orthokeratology (OK) has been shown to be effective in slowing the progression of myopia, it remains unknown how spatially distributed structural stress/tension applied to different regions affects the change of corneal geometry and, consequently, the outcome of myopia control at a fine-grained level. Acknowledging that the underlying working mechanism of the OK lens is essentially mechanically induced reshaping of refractive parameters, in this study we develop a novel mechanics-rule-guided deep image-to-image learning framework, which densely predicts the patient's corneal topography change according to treatment parameters (lens geometry, wearing time, physiological parameters, etc.), and subsequently predicts the influence on eye axial length change after OK treatment. Encapsulated in a U-shaped multi-resolution map-to-map architecture, the proposed model features two major components. First, the geometric and wearing parameters of the OK lens are spatially encoded with convolutions to form a multi-channel input volume/tensor for latent encodings of the external stress/tension applied to different regions of the cornea. Second, these external latent force maps are progressively down-sampled and injected into this multi-scale architecture for predicting the change of the corneal topography map. At each feature learning layer, we formally derive a mathematical framework that simulates the physical process of corneal deformation induced by lens-to-cornea interaction and corneal internal tension, which is reformulated into parameter-learnable cross-attention/self-attention modules in the context of the transformer architecture. A total of 1854 eyes of myopia patients are included in the study, and the results show that the proposed model precisely predicts corneal topography change with a PSNR as high as 28.45 dB, as well as a significant accuracy gain for axial elongation prediction (i.e., 0.0276 in MSE). It is also demonstrated that our method provides interpretable associations between various OK treatment parameters and the final control effect.

AU Cheung, Chim-Lee Wu, Mengjie Fang, Ge Ho, Justin D. L. Liang, Liyuan Tan, Kel Vin Lin, Fa-Hsuan Chang, Hing-Chiu Kwok, Ka-Wai

Omnidirectional Monolithic Marker for Intra-Operative MR-Based Positional Sensing in Closed MRI

We present a design of an inductively coupled radio frequency (ICRF) marker for magnetic resonance (MR)-based positional tracking, enabling a robust increase of the tracking signal at all scanning orientations in quadrature-excited closed MR imaging (MRI). The marker employs three curved resonant circuits fully covering a cylindrical surface that encloses the signal source. Each resonant circuit is a planar spiral inductor with parallel-plate capacitors fabricated monolithically on flexible printed circuit board (FPC) and bent to achieve the curved structure. The constructed marker measures Φ3 mm × 5 mm with a quality factor > 22, and its tracking performance was validated on a 1.5 T MRI scanner. As a result, the marker remains a high positive-contrast spot under 360° rotations about all 3 axes. The marker can be accurately localized with a maximum error of 0.56 mm at a displacement of 56 mm from the isocenter, along with an inherent standard deviation of 0.1 mm. Owing to the high image contrast, the presented marker enables automatic and real-time tracking in 3D without dependency on its orientation with respect to the MRI scanner receive coil. In combination with its small form factor, the presented marker would facilitate robust and wireless MR-based tracking for intervention and clinical diagnosis. This method targets applications that can involve rotational changes in all axes (X-Y-Z).

AU Yue, Guanghui Zhang, Lixin Du, Jingfeng Zhou, Tianwei Zhou, Wei Lin, Weisi

Subjective and Objective Quality Assessment of Colonoscopy Videos.

Captured colonoscopy videos usually suffer from multiple real-world distortions, such as motion blur, low brightness, abnormal exposure, and object occlusion, which impede visual interpretation. However, existing works mainly investigate the impacts of synthesized distortions, which differ greatly from real-world distortions. This research aims to carry out an in-depth study of colonoscopy Video Quality Assessment (VQA). In this study, we advance this topic by establishing both subjective and objective solutions. Firstly, we collect 1,000 colonoscopy videos with typical visual quality degradation conditions in practice and construct a multi-attribute VQA database. The quality of each video is annotated by subjective experiments on five distortion attributes (i.e., temporal-spatial visibility, brightness, specular reflection, stability, and utility), as well as from an overall perspective. Secondly, we propose a Distortion Attribute Reasoning Network (DARNet) for automatic VQA. DARNet includes two streams to extract features related to spatial and temporal distortions, respectively. It adaptively aggregates the attribute-related features through a multi-attribute association module to predict the quality score of each distortion attribute. Motivated by the observation that the rating behaviors for the attributes differ, a behavior guided reasoning module is further used to fuse the attribute-aware features, resulting in the overall quality score. Experimental results on the constructed database show that our DARNet correlates well with subjective ratings and is superior to nine state-of-the-art methods.

AU Mineo, Raffaele Salanitri, F. Proietto Bellitto, G. Kavasidis, I. De Filippo, O. Millesimo, M. De Ferrari, G. M. Aldinucci, M. Giordano, D. Palazzo, S. D'Ascenzo, F. Spampinato, C.

A Convolutional-Transformer Model for FFR and iFR Assessment From Coronary Angiography

The quantification of stenosis severity from X-ray catheter angiography is a challenging task. Indeed, it requires fully understanding the lesion's geometry by analyzing the dynamics of the contrast material, relying only on visual observation by clinicians. To support decision making in cardiac intervention, we propose a hybrid CNN-Transformer model for the assessment of angiography-based non-invasive fractional flow reserve (FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary stenosis. Our approach predicts whether a coronary artery stenosis is hemodynamically significant and provides direct FFR and iFR estimates. This is achieved through a combination of regression and classification branches that forces the model to focus on the cut-off region of FFR (around the 0.8 FFR value), which is highly critical for decision-making. We also propose a spatio-temporal factorization mechanism that redesigns the transformer's self-attention mechanism to capture both local spatial and temporal interactions between vessel geometry, blood flow dynamics, and lesion morphology. The proposed method achieves state-of-the-art performance on a dataset of 778 exams from 389 patients. Unlike existing methods, our approach employs a single angiography view and does not require knowledge of the key frame; supervision at training time is provided by a classification loss (based on a threshold of the FFR/iFR values) and a regression loss for direct estimation. Finally, the analysis of model interpretability and calibration shows that, in spite of the complexity of angiographic imaging data, our method can robustly identify the location of the stenosis and correlate prediction uncertainty with the provided output scores.
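The combined regression-plus-cutoff-classification objective can be sketched as below, with the classification branch supervised by thresholding the target FFR at the clinical 0.8 cut-off. The MSE/BCE pairing and the loss weight are assumptions about one reasonable instantiation:

```python
import torch
import torch.nn.functional as F

CUTOFF = 0.8   # clinical FFR threshold for hemodynamic significance

def ffr_joint_loss(pred_value, pred_logit, target_ffr, w_cls=1.0):
    """Regression of the FFR value plus classification of significance
    (FFR <= 0.8), so errors near the decision cut-off are penalized twice."""
    reg = F.mse_loss(pred_value, target_ffr)
    labels = (target_ffr <= CUTOFF).float()
    cls = F.binary_cross_entropy_with_logits(pred_logit, labels)
    return reg + w_cls * cls

pred_value = torch.rand(8)     # regression head output
pred_logit = torch.randn(8)    # classification head output (pre-sigmoid)
target_ffr = torch.rand(8)
loss = ffr_joint_loss(pred_value, pred_logit, target_ffr)
```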

AU Li, Xibao Ouyang, Xi Zhang, Jiadong Ding, Zhongxiang Zhang, Yuyao Xue, Zhong Shi, Feng Shen, Dinggang

Carotid Vessel Wall Segmentation Through Domain Aligner, Topological Learning, and Segment Anything Model for Sparse Annotation in MR Images.

Medical image analysis poses significant challenges due to limited availability of clinical data, which is crucial for training accurate models. This limitation is further compounded by the specialized and labor-intensive nature of the data annotation process. For example, despite the popularity of computed tomography angiography (CTA) in diagnosing atherosclerosis with an abundance of annotated datasets, magnetic resonance (MR) images stand out with better visualization for soft plaque and vessel wall characterization. However, the higher cost and limited accessibility of MR, as well as time-consuming nature of manual labeling, contribute to fewer annotated datasets. To address these issues, we formulate a multi-modal transfer learning network, named MT-Net, designed to learn from unpaired CTA and sparsely-annotated MR data. Additionally, we harness the Segment Anything Model (SAM) to synthesize additional MR annotations, enriching the training process. Specifically, our method first segments vessel lumen regions followed by precise characterization of carotid artery vessel walls, thereby ensuring both segmentation accuracy and clinical relevance. Validation of our method involved rigorous experimentation on publicly available datasets from COSMOS and CARE-II challenge, demonstrating its superior performance compared to existing state-of-the-art techniques.

AU Wang, Jian Qiao, Liang Zhou, Shichong Zhou, Jin Wang, Jun Li, Juncheng Ying, Shihui Chang, Cai Shi, Jun

Weakly Supervised Lesion Detection and Diagnosis for Breast Cancers With Partially Annotated Ultrasound Images

Deep learning (DL) has proven highly effective for ultrasound-based computer-aided diagnosis (CAD) of breast cancers. In an automatic CAD system, lesion detection is critical for the following diagnosis. However, existing DL-based methods generally require voluminous manually-annotated region of interest (ROI) labels and class labels to train both the lesion detection and diagnosis models. In clinical practice, the ROI labels, i.e. ground truths, may not always be optimal for the classification task due to individual experience of sonologists, resulting in the issue of coarse annotation to limit the diagnosis performance of a CAD model. To address this issue, a novel Two-Stage Detection and Diagnosis Network (TSDDNet) is proposed based on weakly supervised learning to improve diagnostic accuracy of the ultrasound-based CAD for breast cancers. In particular, all the initial ROI-level labels are considered as coarse annotations before model training. In the first training stage, a candidate selection mechanism is then designed to refine manual ROIs in the fully annotated images and generate accurate pseudo-ROIs for the partially annotated images under the guidance of class labels. The training set is updated with more accurate ROI labels for the second training stage. A fusion network is developed to integrate detection network and classification network into a unified end-to-end framework as the final CAD model in the second training stage. A self-distillation strategy is designed on this model for joint optimization to further improves its diagnosis performance. The proposed TSDDNet is evaluated on three B-mode ultrasound datasets, and the experimental results indicate that it achieves the best performance on both lesion detection and diagnosis tasks, suggesting promising application potential.

AU Liu, Yuedong Zhou, Xuan Wei, Cunfeng Xu, Qiong

Sparse-view Spectral CT Reconstruction and Material Decomposition based on Multi-channel SGM.

In medical applications, the diffusion of contrast agents in tissue can reflect the physiological function of organisms, so it is valuable to quantify the distribution and content of contrast agents in the body over a period of time. Spectral CT has the advantages of multi-energy projection acquisition and material decomposition, and can quantify K-edge contrast agents. However, multiple repeated spectral CT scans can cause excessive radiation doses. Sparse-view scanning is commonly used to reduce dose and scan time, but its reconstructed images are usually accompanied by streaking artifacts, which lead to inaccurate quantification of the contrast agents. To solve this problem, an unsupervised sparse-view spectral CT reconstruction and material decomposition algorithm based on the multi-channel score-based generative model (SGM) is proposed in this paper. First, multi-energy images and tissue images are used as multi-channel input data for SGM training. Second, the organism is scanned multiple times with sparse views, and the trained SGM is utilized to generate multi-energy images and tissue images driven by the sparse-view projections. After that, a material decomposition algorithm is established that uses the tissue images generated by the SGM as prior images for solving the contrast agent images. Finally, the distribution and content of the contrast agents are obtained. A comparison and evaluation of this method are given in this paper, and a series of mouse scanning experiments are carried out to verify its effectiveness.

EI 1558-254X DA 2024-06-18 UT MEDLINE:38865221 PM 38865221 ER

AU Billot, Benjamin Dey, Neel Moyer, Daniel Hoffmann, Malte Turk, Esra Abaci Gagoski, Borjan Ellen Grant, P Golland, Polina

SE(3)-Equivariant and Noise-Invariant 3D Rigid Motion Tracking in Brain MRI.

Rigid motion tracking is paramount in many medical imaging applications where movements need to be detected, corrected, or accounted for. Modern strategies rely on convolutional neural networks (CNN) and pose this problem as rigid registration. Yet, CNNs do not exploit natural symmetries in this task, as they are equivariant to translations (their outputs shift with their inputs) but not to rotations. Here we propose EquiTrack, the first method that uses recent steerable SE(3)-equivariant CNNs (E-CNN) for motion tracking. While steerable E-CNNs can extract corresponding features across different poses, testing them on noisy medical images reveals that they do not have enough learning capacity to learn noise invariance. Thus, we introduce a hybrid architecture that pairs a denoiser with an E-CNN to decouple the processing of anatomically irrelevant intensity features from the extraction of equivariant spatial features. Rigid transforms are then estimated in closed-form. EquiTrack outperforms state-of-the-art learning and optimisation methods for motion tracking in adult brain MRI and fetal MRI time series. Our code is available at https://github.com/BBillot/EquiTrack.
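
The closed-form rigid estimation mentioned above is classically done with the Kabsch/Procrustes solution; below is a minimal NumPy sketch under the assumption that corresponding 3D feature locations have already been extracted (the function name and inputs are illustrative, not EquiTrack's API).

import numpy as np

def rigid_transform_closed_form(P, Q):
    # P, Q: (N, 3) arrays of corresponding 3D points; returns (R, t) with Q ~ R @ P + t.
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # correct a possible reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t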

AU Chen, Fang Han, Haojie Wan, Peng Chen, Lingyu Kong, Wentao Liao, Hongen Wen, Baojie Liu, Chunrui Zhang, Daoqiang

Do as Sonographers Think: Contrast-enhanced Ultrasound for Thyroid Nodules Diagnosis via Microvascular Infiltrative Awareness.

Dynamic contrast-enhanced ultrasound (CEUS) imaging can reflect the microvascular distribution and blood flow perfusion, thereby holding clinical significance in distinguishing between malignant and benign thyroid nodules. Notably, CEUS offers a meticulous visualization of the microvascular distribution surrounding the nodule, leading to an apparent increase in tumor size compared to gray-scale ultrasound (US). In the dual images obtained, the lesion size is enlarged from gray-scale US to CEUS, as the microvasculature appears to continuously infiltrate the surrounding tissue. Although the infiltrative dilatation of the microvasculature remains ambiguous, sonographers believe it may promote the diagnosis of thyroid nodules. We propose a deep learning model designed to emulate the diagnostic reasoning process employed by sonographers. This model integrates the observation of microvascular infiltration on dynamic CEUS, leveraging the additional insights provided by gray-scale US for enhanced diagnostic support. Specifically, temporal projection attention is implemented on the time dimension of dynamic CEUS to represent the microvascular perfusion. Additionally, we employ a group of confidence maps with flexible Sigmoid Alpha Functions to perceive and describe the infiltrative dilatation process. Moreover, a self-adaptive integration mechanism is introduced to dynamically integrate the assisting gray-scale US and the confidence maps of CEUS for individual patients, ensuring a trustworthy diagnosis of thyroid nodules. In this retrospective study, we collected a thyroid nodule dataset of 282 CEUS videos. The method achieves a superior diagnostic accuracy and sensitivity of 89.52% and 93.75%, respectively. These results suggest that imitating the diagnostic thinking of sonographers, encompassing dynamic microvascular perfusion and infiltrative expansion, proves beneficial for CEUS-based thyroid nodule diagnosis.

EI 1558-254X DA 2024-05-31 UT MEDLINE:38801692 PM 38801692 ER

AU Khan, M Owais Seresti, Anahita A Menon, Karthik Marsden, Alison L Nieman, Koen

Quantification and Visualization of CT Myocardial Perfusion Imaging to Detect Ischemia-Causing Coronary Arteries.

Coronary computed tomography angiography (cCTA) has poor specificity in identifying coronary stenoses that limit blood flow to the myocardial tissue. Integration of dynamic CT myocardial perfusion imaging (CT-MPI) can potentially improve the diagnostic accuracy. We propose a method that integrates cCTA and CT-MPI to identify culprit coronary lesions that limit blood flow to the myocardium. Coronary arteries and left ventricle surfaces were segmented from cCTA and registered to CT-MPI. Myocardial blood flow (MBF) was derived from CT-MPI. A ray-casting approach was developed to project volumetric MBF onto the left ventricle surface. The MBF volume was divided into coronary-specific territories based on proximity to the nearest coronary artery. MBF and normalized MBF were computed for the myocardium and for each coronary artery. Projection of MBF onto cCTA allowed for direct visualization of perfusion defects. Normalized MBF had a higher correlation with ischemic myocardial territory than MBF (MBF: R2=0.81 and Index MBF: R2=0.90). There were 18 vessels that showed angiographic disease (stenosis >50%); however, normalized MBF demonstrated only 5 coronary territories to be ischemic. These findings demonstrate that cCTA and CT-MPI can be integrated to visualize myocardial defects and detect the culprit coronary arteries responsible for perfusion defects. These methods can allow for non-invasive detection of ischemia-causing coronary lesions and ultimately help guide clinicians to deliver more targeted coronary interventions.
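
As a rough sketch of the territory step (not the paper's implementation), each myocardial voxel can be labeled by the branch of the nearest coronary centerline point; the function names and the mean-based normalization below are assumptions.

import numpy as np
from scipy.spatial import cKDTree

def assign_coronary_territories(voxel_xyz, centerline_xyz, branch_labels):
    # voxel_xyz: (N, 3) myocardial voxel coordinates; centerline_xyz: (M, 3);
    # branch_labels: (M,) integer branch IDs. Each voxel inherits the label of
    # the closest centerline point, defining coronary-specific territories.
    _, nearest = cKDTree(centerline_xyz).query(voxel_xyz)
    return branch_labels[nearest]

def normalize_mbf(mbf):
    # One plausible normalization: divide by the global myocardial mean.
    return mbf / mbf.mean()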

AU Jin, Yifei Meng, Ling-Jian

Exploration of Coincidence Detection of Cascade Photons to Enhance Preclinical Multi-Radionuclide SPECT Imaging

We proposed a technique of coincidence detection of cascade photons (CDCP) to enhance preclinical SPECT imaging of therapeutic radionuclides emitting cascade photons, such as Lu-177, Ac-225, Ra-223, and In-111. We have carried out experimental studies to evaluate the proposed CDCP-SPECT imaging of low-activity radionuclides using a prototype coincidence detection system constructed with large-volume cadmium zinc telluride (CZT) imaging spectrometers and a pinhole collimator. With In-111 in experimental studies, the CDCP technique allows us to improve the signal-to-contamination ratio in the projection (Projection-SCR) by approximately 53 times and to remove approximately 98% of the normalized contamination. Compared to traditional scatter correction, which achieves a Projection-SCR of 1.00, our CDCP method boosts it to 15.91, showing enhanced efficacy in reducing down-scattered contamination, especially at lower activities. The reconstructed images of a line source demonstrated the dramatic enhancement of image quality with CDCP-SPECT compared to conventional and triple-energy-window-corrected SPECT data acquisition. We also introduced artificial energy blurring and Monte Carlo simulation to quantify the impact of detector performance, especially its energy resolution and timing resolution, on the enhancement achieved through the CDCP technique. We have further demonstrated the benefits of the CDCP technique with simulation studies, which show the potential of improving the signal-to-contamination ratio by 300 times with Ac-225, which emits cascade photons with a decay constant of approximately 0.1 ns. These results have demonstrated the potential of CDCP-enhanced SPECT for imaging a super-low level of therapeutic radionuclides in small animals.
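
Conceptually, CDCP amounts to accepting only event pairs that fall in the two cascade-photon energy windows and arrive within a short coincidence window. The sketch below shows that filtering step with illustrative names, units, and windows; for simplicity it only keeps pairs where the first-window photon arrives first.

import numpy as np

def cascade_coincidences(t_ns, e_kev, window1, window2, tau_ns):
    # t_ns, e_kev: per-event arrival times (ns) and energies (keV);
    # window1/window2: (lo, hi) energy windows for the two cascade photons.
    order = np.argsort(t_ns)
    t, e = t_ns[order], e_kev[order]
    in1 = (e >= window1[0]) & (e <= window1[1])
    in2 = (e >= window2[0]) & (e <= window2[1])
    pairs = []
    for i in np.nonzero(in1)[0]:
        k = i + 1
        while k < t.size and t[k] - t[i] <= tau_ns:   # coincidence window
            if in2[k]:
                pairs.append((int(order[i]), int(order[k])))
            k += 1
    return pairs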

AU Jung, Wonsik Jeon, Eunjin Kang, Eunsong Suk, Heung-Il

EAG-RS: A Novel Explainability-Guided ROI-Selection Framework for ASD Diagnosis via Inter-Regional Relation Learning

Deep learning models based on resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used to diagnose brain diseases, particularly autism spectrum disorder (ASD). Existing studies have leveraged the functional connectivity (FC) of rs-fMRI, achieving notable classification performance. However, they have significant limitations, including the limited information available when using linear low-order FC as model inputs, not considering individual characteristics (i.e., different symptoms or varying stages of severity) among patients with ASD, and the non-explainability of the decision process. To address these limitations, we propose a novel explainability-guided region of interest (ROI) selection (EAG-RS) framework that identifies non-linear high-order functional associations among brain regions by leveraging an explainable artificial intelligence technique and selects class-discriminative regions for brain disease identification. The proposed framework includes three steps: (i) inter-regional relation learning to estimate non-linear relations through random seed-based network masking, (ii) explainable connection-wise relevance score estimation to explore high-order relations between functional connections, and (iii) non-linear high-order FC-based diagnosis-informative ROI selection and classifier learning to identify ASD. We validated the effectiveness of our proposed method by conducting experiments using the Autism Brain Imaging Database Exchange (ABIDE) dataset, demonstrating that the proposed method outperforms other comparative methods in terms of various evaluation metrics. Furthermore, we qualitatively analyzed the selected ROIs and identified ASD subtypes linked to previous neuroscientific studies.

AU Du, Lei Zhao, Ying Zhang, Jianting Shang, Muheng Zhang, Jin Han, Junwei CA Alzheimers Dis Neuroimaging

Identification of Genetic Risk Factors Based on Disease Progression Derived From Longitudinal Brain Imaging Phenotypes

Neurodegenerative disorders usually progress stage by stage rather than overnight. Thus, cross-sectional brain imaging genetic methods could be insufficient to identify genetic risk factors. Repeatedly collecting imaging data over time appears to solve the problem. But most existing imaging genetic methods only use longitudinal imaging phenotypes straightforwardly, ignoring the disease progression trajectory, which might be a more stable disease signature. In this paper, we propose a novel sparse multi-task mixed-effects longitudinal imaging genetic method (SMMLING). In our model, disease progression fitting and genetic risk factor identification are conducted jointly. Specifically, SMMLING models the disease progression using longitudinal imaging phenotypes, and then associates the fitted disease progression with genetic variations. The baseline status and changing rate, i.e., the intercept and slope, of the progression trajectory are thus responsible for discovering loci of interest, yielding superior and stable performance. To facilitate interpretation and stability, we employ the $\ell_{2,1}$-norm and the fused group lasso (FGL) penalty to identify loci at both the individual level and the group level. SMMLING can be solved by an efficient optimization algorithm which is guaranteed to converge to the global optimum. We evaluate SMMLING on synthetic data and real longitudinal neuroimaging genetic data. Both results show that, compared to existing longitudinal methods, SMMLING can not only decrease the modeling error but also identify more accurate and relevant genetic factors. Most risk loci reported by SMMLING are missed by the comparison methods, implicating its superiority in genetic risk factor identification. Consequently, SMMLING could be a promising computational method for longitudinal imaging genetics.
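
Two of the building blocks above have compact standard forms: a per-subject linear trajectory fit (intercept and slope) and the proximal operator of the $\ell_{2,1}$ penalty (row-wise soft thresholding). The NumPy sketch below shows both under assumed array layouts; it is not the SMMLING solver itself.

import numpy as np

def fit_trajectory(times, values):
    # Least-squares linear fit of one phenotype over visits: returns (intercept, slope).
    slope, intercept = np.polyfit(times, values, deg=1)
    return intercept, slope

def prox_l21(W, lam):
    # Proximal operator of lam * ||W||_{2,1}: shrink each row (e.g., one SNP's
    # coefficients across tasks) toward zero, zeroing whole rows when small.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * W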

AU Tan, Zhiwei Shi, Fei Zhou, Yi Wang, Jingcheng Wang, Meng Peng, Yuanyuan Xu, Kai Liu, Ming Chen, Xinjian

A Multi-Scale Fusion and Transformer Based Registration Guided Speckle Noise Reduction for OCT Images

Optical coherence tomography (OCT) images are inevitably affected by speckle noise because OCT is based on low-coherence interference. Multi-frame averaging is one of the effective methods to reduce speckle noise. Before averaging, the misalignment between images must be calibrated. In this paper, in order to reduce the misalignment between images caused during acquisition, a novel multi-scale fusion and Transformer-based method (MsFTMorph) is proposed for deformable retinal OCT image registration. The proposed method captures global connectivity and locality with a convolutional vision transformer and also incorporates a multi-resolution fusion strategy for learning the global affine transformation. Comparative experiments with other state-of-the-art registration methods demonstrate that the proposed method achieves higher registration accuracy. Guided by the registration, subsequent multi-frame averaging shows better results in speckle noise reduction. The noise is suppressed while the edges are preserved. In addition, our proposed method has strong cross-domain generalization and can be directly applied to images acquired by different scanners with different modes.

AU Hooshangnejad, Hamed China, Debarghya Huang, Yixuan Zbijewski, Wojciech Uneri, Ali McNutt, Todd Lee, Junghoon Ding, Kai

XIOSIS: An X-Ray-Based Intra-Operative Image-Guided Platform for Oncology Smart Material Delivery

Image-guided interventional oncology procedures can greatly enhance the outcome of cancer treatment. As an enhancing procedure, oncology smart material delivery can increase cancer therapy's quality, effectiveness, and safety. However, the effectiveness of enhancing procedures highly depends on the accuracy of smart material placement procedures. Inaccurate placement of smart materials can lead to adverse side effects and health hazards. Image guidance can considerably improve the safety and robustness of smart material delivery. In this study, we developed a novel generative deep-learning platform that highly prioritizes clinical practicality and provides the most informative intra-operative feedback for image-guided smart material delivery. XIOSIS generates a patient-specific 3D volumetric computed tomography (CT) from three intraoperative radiographs (X-ray images) acquired by a mobile C-arm during the operation. As the first of its kind, XIOSIS (i) synthesizes the CT from small field-of-view radiographs; (ii) reconstructs the intra-operative spacer distribution; (iii) is robust; and (iv) is equipped with a novel soft-contrast cost function. To demonstrate the effectiveness of XIOSIS in providing intra-operative image guidance, we applied XIOSIS to the duodenal hydrogel spacer placement procedure. We evaluated XIOSIS performance on an image-guided virtual spacer placement and on actual spacer placements in two cadaver specimens. XIOSIS showed clinically acceptable performance, reconstructing the 3D intra-operative hydrogel spacer distribution with an average structural similarity of 0.88, a Dice coefficient of 0.63, and less than 1 cm difference in spacer location relative to the spinal cord.

AU Wei, Xingyue Ge, Lin Huang, Lijie Luo, Jianwen Xu, Yan

Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training.

Histological image registration is a fundamental task in histological image analysis. It is challenging because of substantial appearance differences due to multiple staining. Keypoint correspondences, i.e., matched keypoint pairs, have been introduced to guide unsupervised deep learning (DL) based registration methods to handle such a registration task. This paper proposes an iterative keypoint correspondence-guided (IKCG) unsupervised network for non-rigid histological image registration. Fixed deep features and learnable deep features are introduced as keypoint descriptors to automatically establish keypoint correspondences, the distance between which is used as a loss function to train the registration network. Fixed deep features extracted from DL networks that are pre-trained on natural image datasets are more discriminative than handcrafted ones, benefiting from the deep and hierarchical nature of DL networks. The intermediate layer outputs of the registration networks trained on histological image datasets are extracted as learnable deep features, which reveal unique information for histological images. An iterative training strategy is adopted to train the registration network and optimize learnable deep features jointly. Benefiting from the excellent matching ability of learnable deep features optimized with the iterative training strategy, the proposed method can solve the local non-rigid large displacement problem, an inevitable problem usually caused by misoperation, such as tears in producing tissue slices. The proposed method is evaluated on the Automatic Non-rigid Histology Image Registration (ANHIR) website and AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website. It ranked 1st on both websites as of August 6th, 2024.
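
The correspondence step can be illustrated with a mutual-nearest-neighbour match on descriptor distances; the sketch below assumes descriptors have already been extracted (fixed or learnable) at detected keypoints and is not the authors' code.

import numpy as np

def mutual_nearest_matches(desc_a, desc_b):
    # desc_a: (Na, D), desc_b: (Nb, D) keypoint descriptors.
    # Keep pairs (i, j) that are each other's nearest neighbour in feature space.
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    ab = d.argmin(axis=1)   # best match in b for each a
    ba = d.argmin(axis=0)   # best match in a for each b
    return [(i, int(j)) for i, j in enumerate(ab) if ba[j] == i]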

AU Zhang, Yue Peng, Chengtao Wang, Qiuli Song, Dan Li, Kaiyan Kevin Zhou, S

Unified Multi-Modal Image Synthesis for Missing Modality Imputation.

Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption and various imaging protocols often result in incomplete multi-modal images, thus limiting the usage of multi-modal data for clinical purposes. To address this issue, in this paper, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Overall, our method adopts a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both the modality-invariant and the modality-specific information contained in the input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. Besides, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which enables the network to be robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.

AU Xiao, Jiayin Li, Si Lin, Tongxu Zhu, Jian Yuan, Xiaochen Feng, David Dagan Sheng, Bin

Multi-Label Chest X-Ray Image Classification with Single Positive Labels.

Deep learning approaches for multi-label Chest X-ray (CXR) images classification usually require large-scale datasets. However, acquiring such datasets with full annotations is costly, time-consuming, and prone to noisy labels. Therefore, we introduce a weakly supervised learning problem called Single Positive Multi-label Learning (SPML) into CXR images classification (abbreviated as SPML-CXR), in which only one positive label is annotated per image. A simple solution to the SPML-CXR problem is to assume that all the unannotated pathological labels are negative; however, this might introduce false negative labels and decrease the model performance. To this end, we present a Multi-level Pseudo-label Consistency (MPC) framework for SPML-CXR. First, inspired by the pseudo-labeling and consistency regularization in semi-supervised learning, we construct a weak-to-strong consistency framework, where the model prediction on a weakly-augmented image is treated as the pseudo label for supervising the model prediction on a strongly-augmented version of the same image, and define an Image-level Perturbation-based Consistency (IPC) regularization to recover the potentially mislabeled positive labels. Besides, we incorporate Random Elastic Deformation (RED) as an additional strong augmentation to enhance the perturbation. Second, aiming to expand the perturbation space, we design a perturbation stream to the consistency framework at the feature level and introduce a Feature-level Perturbation-based Consistency (FPC) regularization as a supplement. Third, we design a Transformer-based encoder module to explore the sample relationship within each mini-batch by a Batch-level Transformer-based Correlation (BTC) regularization. Extensive experiments on the CheXpert and MIMIC-CXR datasets have shown the effectiveness of our MPC framework for solving the SPML-CXR problem.
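
The weak-to-strong IPC idea can be sketched for the multi-label case as follows (PyTorch; the threshold value and names are illustrative, not the authors' code): sigmoid probabilities on the weak view are binarized into pseudo-labels that supervise the strong view.

import torch
import torch.nn.functional as F

def ipc_consistency_loss(model, x_weak, x_strong, threshold=0.5):
    # Multi-label setting: one sigmoid output per pathology class.
    with torch.no_grad():
        pseudo = (torch.sigmoid(model(x_weak)) > threshold).float()
    return F.binary_cross_entropy_with_logits(model(x_strong), pseudo)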

AU Yang, Yuming Duan, Huilong Zheng, Yinfei

Improved Transcranial Plane-Wave Imaging With Learned Speed-of-Sound Maps

Although transcranial ultrasound plane-wave imaging (PWI) has promising clinical application prospects, studies have shown that a variable speed-of-sound (SoS) can seriously degrade the quality of ultrasound images. The mismatch between the conventional constant-velocity assumption and the actual SoS distribution leads to general blurring of ultrasound images. The optimization scheme for reconstructing transcranial ultrasound images is often solved using iterative methods like full-waveform inversion. These iterative methods are computationally expensive and rely on prior magnetic resonance imaging (MRI) or computed tomography (CT) information. In contrast, the multi-stencils fast marching (MSFM) method can produce accurate travel-time maps for the skull with heterogeneous acoustic speed. In this study, we first propose a convolutional neural network (CNN) to predict SoS maps of the skull from PWI channel data. Then, these maps are used to correct the travel times to reduce transcranial aberration. To validate the performance of the proposed method, numerical, phantom and intact human skull studies were conducted using a linear array transducer (L11-5v, 128 elements, pitch = 0.3 mm). Numerical simulations demonstrate that for point targets, the lateral resolution of MSFM-restored images increased by 65%, and the center position shift decreased by 89%. For the cyst targets, the eccentricity of the fitting ellipse decreased by 75%, and the center position shift decreased by 58%. In the phantom study, the lateral resolution of MSFM-restored images was increased by 49%, and the position shift was reduced by 1.72 mm. This pipeline, termed AutoSoS, thus shows the potential to correct distortions in real-time transcranial ultrasound imaging, as demonstrated by experiments on the intact human skull.

AU Zhang, Binyu Meng, Zhu Li, Hongyuan Zhao, Zhicheng Su, Fei

MTCSNet: One-stage learning and two-point labeling are sufficient for cell segmentation.

Deep convolutional neural networks have been widely used in medical image analysis, such as lesion identification in whole-slide images, cancer detection, and cell segmentation. However, researchers often have no choice but to painstakingly refine annotations to enhance model performance, especially for the cell segmentation task. Weakly supervised learning can greatly reduce the annotation workload, but a large performance gap remains between weakly and fully supervised learning approaches. In this work, we propose a weakly-supervised cell segmentation method, namely the Multi-Task Cell Segmentation Network (MTCSNet), for multi-modal medical images, including pathological, brightfield, fluorescent, phase-contrast and differential interference contrast images. MTCSNet is learnt in a single-stage training manner, where only two annotated points per cell provide supervision information: the first is the centroid, and the second is a point on its boundary. Additionally, five auxiliary tasks are elaborately designed to train the network, including two pixel-level classification tasks, a pixel-level regression task, local temperature scaling, and an instance-level distance regression task, which regresses the distances between the cell centroid and its boundary in eight orientations. The experimental results indicate that our method outperforms all state-of-the-art weakly-supervised cell segmentation approaches on public multi-modal medical image datasets. The promising performance also shows that single-stage learning with a two-point labeling approach is sufficient for cell segmentation, instead of fine contour delineation. The codes are available at: https://github.com/binging512/MTCSNet.
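
The instance-level regression target, distances from the centroid to the mask boundary in eight orientations, can be built for training by marching rays through a binary mask; the sketch below is an illustrative way to construct such targets, not the MTCSNet code.

import numpy as np

def boundary_distances_8dir(mask, cy, cx, step=0.5):
    # mask: binary (H, W) cell mask; (cy, cx): centroid. For each of the eight
    # compass directions, march until the ray exits the mask and record the distance.
    h, w = mask.shape
    dists = np.zeros(8)
    for k in range(8):
        theta = k * np.pi / 4.0
        d = 0.0
        while True:
            y = int(round(cy + (d + step) * np.sin(theta)))
            x = int(round(cx + (d + step) * np.cos(theta)))
            if y < 0 or y >= h or x < 0 or x >= w or mask[y, x] == 0:
                break
            d += step
        dists[k] = d
    return dists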

AU Zhu, Jianjun Wang, Cheng Zhang, Yi Zhan, Meixiao Zhao, Wei Teng, Sitong Lu, Ligong Teng, Gao-Jun

3D/2D Vessel Registration Based on Monte Carlo Tree Search and Manifold Regularization

The augmented intra-operative real-time imaging in vascular interventional surgery, which is generally performed by projecting preoperative computed tomography angiography images onto intraoperative digital subtraction angiography (DSA) images, can compensate for the deficiencies of DSA-based navigation, such as lack of depth information and excessive use of toxic contrast agents. 3D/2D vessel registration is the critical step in image augmentation. A 3D/2D registration method based on vessel graph matching is proposed in this study. For rigid registration, the matching of vessel graphs can be decomposed into continuous states, thus 3D/2D vascular registration is formulated as a search tree problem. The Monte Carlo tree search method is applied to find the optimal vessel matching associated with the highest rigid registration score. For nonrigid registration, we propose a novel vessel deformation model based on manifold regularization. This model incorporates the smoothness constraint of vessel topology into the objective function. Furthermore, we derive simplified gradient formulas that enable fast registration. The proposed technique undergoes evaluation against seven rigid and three nonrigid methods using a variety of data - simulated, algorithmically generated, and manually annotated - across three vascular anatomies: the hepatic artery, coronary artery, and aorta. Our findings show the proposed method's resistance to pose variations, noise, and deformations, outperforming existing methods in terms of registration accuracy and computational efficiency. The proposed method demonstrates average registration errors of 2.14 mm and 0.34 mm for rigid and nonrigid registration, and an average computation time of 0.51 s.

AU Guan, Yu Yu, Chuanming Cui, Zhuoxu Zhou, Huilin Liu, Qiegen

Correlated and Multi-frequency Diffusion Modeling for Highly Under-sampled MRI Reconstruction.

Given the obstacle in accentuating the reconstruction accuracy for diagnostically significant tissues, most existing MRI reconstruction methods perform targeted reconstruction of the entire MR image without considering fine details, especially when dealing with highly under-sampled images. Therefore, considerable effort has been directed towards surmounting this challenge, as evidenced by the emergence of numerous methods dedicated to preserving high-frequency content as well as fine textural details in the reconstructed image. In this case, exploring the merits of each method of mining high-frequency information and formulating a reasonable principle to maximize their joint utilization is a more effective route to accurate reconstruction. Specifically, this work constructs an innovative principle named the Correlated and Multi-frequency Diffusion Model (CM-DM) for highly under-sampled MRI reconstruction. In essence, the rationale underlying this principle lies not in assembling arbitrary models, but in pursuing effective combinations and replacement of components. It also means that the novel principle focuses on forming a correlated and multi-frequency prior through different high-frequency operators in the diffusion process. Moreover, the multi-frequency prior further constrains the noise term to lie closer to the target distribution in the frequency domain, thereby making the diffusion process converge faster. Experimental results verify that the proposed method achieves superior reconstruction accuracy, with a notable enhancement of approximately 2 dB in PSNR compared to state-of-the-art methods.

AU Huang, Peizhou Zhang, Chaoyi Zhang, Xiaoliang Li, Xiaojuan Dong, Liang Ying, Leslie

Self-Supervised Deep Unrolled Reconstruction Using Regularization by Denoising

Deep learning methods have been successfully used in various computer vision tasks. Inspired by that success, deep learning has been explored in magnetic resonance imaging (MRI) reconstruction. In particular, integrating deep learning and model-based optimization methods has shown considerable advantages. However, a large amount of labeled training data is typically needed for high reconstruction quality, which is challenging for some MRI applications. In this paper, we propose a novel reconstruction method, named DURED-Net, that enables interpretable self-supervised learning for MR image reconstruction by combining a self-supervised denoising network and a plug-and-play method. We aim to boost the reconstruction performance of Noise2Noise in MR reconstruction by adding an explicit prior that utilizes imaging physics. Specifically, the denoising network is leveraged for MRI reconstruction using Regularization by Denoising (RED). Experimental results demonstrate that the proposed method requires less training data than state-of-the-art approaches utilizing Noise2Noise to achieve high reconstruction quality.
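
RED itself has a compact form: under RED's assumptions on the denoiser D, the regularizer gradient is lam * (x - D(x)), so one gradient step looks like the sketch below (illustrative operator names; the paper's unrolled network is more involved).

import numpy as np

def red_gradient_step(x, y, forward, adjoint, denoise, mu=1.0, lam=0.1):
    # Objective: 0.5 * ||A x - y||^2 + 0.5 * lam * x^T (x - D(x)).
    data_grad = adjoint(forward(x) - y)   # gradient of the data-fidelity term
    reg_grad = lam * (x - denoise(x))     # RED regularizer gradient
    return x - mu * (data_grad + reg_grad)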

AU Pei, Yuchen Zhao, Fenqiang Zhong, Tao Ma, Laifa Liao, Lufan Wu, Zhengwang Wang, Li Zhang, He Wang, Lisheng Li, Gang

PETS-Nets: Joint Pose Estimation and Tissue Segmentation of Fetal Brains Using Anatomy-Guided Networks

Fetal Magnetic Resonance Imaging (MRI) is challenged by fetal movements and maternal breathing. Although fast MRI sequences allow artifact free acquisition of individual 2D slices, motion frequently occurs in the acquisition of spatially adjacent slices. Motion correction for each slice is thus critical for the reconstruction of 3D fetal brain MRI. In this paper, we propose a novel multi-task learning framework that adopts a coarse-to-fine strategy to jointly learn the pose estimation parameters for motion correction and tissue segmentation map of each slice in fetal MRI. Particularly, we design a regression-based segmentation loss as a deep supervision to learn anatomically more meaningful features for pose estimation and segmentation. In the coarse stage, a U-Net-like network learns the features shared for both tasks. In the refinement stage, to fully utilize the anatomical information, signed distance maps constructed from the coarse segmentation are introduced to guide the feature learning for both tasks. Finally, iterative incorporation of the signed distance maps further improves the performance of both regression and segmentation progressively. Experimental results of cross-validation across two different fetal datasets acquired with different scanners and imaging protocols demonstrate the effectiveness of the proposed method in reducing the pose estimation error and obtaining superior tissue segmentation results simultaneously, compared with state-of-the-art methods.
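
The signed distance maps used for guidance have a standard construction from a binary segmentation via two Euclidean distance transforms; a minimal SciPy sketch follows, with an assumed sign convention (negative inside, positive outside).

import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(seg):
    # seg: binary segmentation. Distance to the boundary, negative inside the
    # object and positive outside (approximately zero on the boundary).
    inside = distance_transform_edt(seg > 0)
    outside = distance_transform_edt(seg == 0)
    return outside - inside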

AU van Gogh, Stefano Mukherjee, Subhadip Rawlik, Michal Pereira, Alexandre Spindler, Simon Zdora, Marie-Christine Stauber, Martin Varga, Zsuzsanna Stampanoni, Marco

Data-Driven Gradient Regularization for Quasi-Newton Optimization in Iterative Grating Interferometry CT Reconstruction

Grating interferometry CT (GI-CT) is a promising technology that could play an important role in future breast cancer imaging. Thanks to its sensitivity to refraction and small-angle scattering, GI-CT could augment the diagnostic content of conventional absorption-based CT. However, reconstructing GI-CT tomographies is a complex task because of poor problem conditioning and high noise amplitudes. It has previously been shown that combining data-driven regularization with iterative reconstruction is promising for tackling challenging inverse problems in medical imaging. In this work, we present an algorithm that allows seamless combination of data-driven regularization with quasi-Newton solvers, which can better deal with ill-conditioned problems compared to gradient descent-based optimization algorithms. Contrary to most available algorithms, our method applies regularization in the gradient domain rather than in the image domain. This comes with a crucial advantage when applied in conjunction with quasi-Newton solvers: the Hessian is approximated solely based on denoised data. We apply the proposed method, which we call GradReg, to both conventional breast CT and GI-CT and show that both significantly benefit from our approach in terms of dose efficiency. Moreover, our results suggest that thanks to its sharper gradients that carry more high spatial-frequency content, GI-CT can benefit more from GradReg than conventional breast CT. Crucially, GradReg can be applied to any image reconstruction task that relies on gradient-based updates.

AU Guo, Pengfei Mei, Yiqun Zhou, Jinyuan Jiang, Shanshan Patel, Vishal M.

ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer

The accelerating magnetic resonance imaging (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent Transformer model, namely ReconFormer, for MRI reconstruction, which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data (e.g., up to 8x acceleration). In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTLs). The core design of the proposed method is Recurrent Scale-wise Attention (RSA), which jointly exploits intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, benefiting from its recurrent nature, ReconFormer is lightweight compared to other baselines and only contains 1.1 M trainable parameters. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. The implementation code and pre-trained weights are available at https://github.com/guopengf/ReconFormer.

AU Song, Zhiyun Du, Penghui Yan, Junpeng Li, Kailu Shou, Jianzhong Lai, Maode Fan, Yubo Xu, Yan

Nucleus-Aware Self-Supervised Pretraining Using Unpaired Image-to-Image Translation for Histopathology Images

Self-supervised pretraining attempts to enhance model performance by obtaining effective features from unlabeled data, and has demonstrated its effectiveness in the field of histopathology images. Despite its success, few works concentrate on the extraction of nucleus-level information, which is essential for pathologic analysis. In this work, we propose a novel nucleus-aware self-supervised pretraining framework for histopathology images. The framework aims to capture the nuclear morphology and distribution information through unpaired image-to-image translation between histopathology images and pseudo mask images. The generation process is modulated by both conditional and stochastic style representations, ensuring the reality and diversity of the generated histopathology images for pretraining. Further, an instance segmentation guided strategy is employed to capture instance-level information. The experiments on 7 datasets show that the proposed pretraining method outperforms supervised ones on Kather classification, multiple instance learning, and 5 dense-prediction tasks with the transfer learning protocol, and yields superior results than other self-supervised approaches on 8 semi-supervised tasks. Our project is publicly available at https://github.com/zhiyuns/UNITPathSSL.

AU Fontanella, Alessandro Mair, Grant Wardlaw, Joanna Trucco, Emanuele Storkey, Amos

Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images.

Segmentation masks of pathological areas are useful in many medical applications, such as brain tumour and stroke management. Moreover, healthy counterfactuals of diseased images can be used to enhance radiologists' training files and to improve the interpretability of segmentation models. In this work, we present a weakly supervised method to generate a healthy version of a diseased image and then use it to obtain a pixel-wise anomaly map. To do so, we start by considering a saliency map that approximately covers the pathological areas, obtained with ACAT. Then, we propose a technique that allows to perform targeted modifications to these regions, while preserving the rest of the image. In particular, we employ a diffusion model trained on healthy samples and combine Denoising Diffusion Probabilistic Model (DDPM) and Denoising Diffusion Implicit Model (DDIM) at each step of the sampling process. DDPM is used to modify the areas affected by a lesion within the saliency map, while DDIM guarantees reconstruction of the normal anatomy outside of it. The two parts are also fused at each timestep, to guarantee the generation of a sample with a coherent appearance and a seamless transition between edited and unedited parts. We verify that when our method is applied to healthy samples, the input images are reconstructed without significant modifications. We compare our approach with alternative weakly supervised methods on the task of brain lesion segmentation, achieving the highest mean Dice and IoU scores among the models considered.
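
The per-step DDPM/DDIM fusion can be written as a single masked blend inside the reverse loop; in the sketch below, ddpm_step and ddim_step are hypothetical callables standing in for the two reverse updates, so this shows the fusion logic only.

def masked_counterfactual(x_t, saliency_mask, ddpm_step, ddim_step, timesteps):
    # saliency_mask: 1 inside the (approximate) pathological region, 0 outside.
    for t in reversed(timesteps):
        x_edit = ddpm_step(x_t, t)   # stochastic edit toward healthy appearance
        x_keep = ddim_step(x_t, t)   # deterministic reconstruction of anatomy
        x_t = saliency_mask * x_edit + (1.0 - saliency_mask) * x_keep
    return x_t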

AU Wu, Jianghao Guo, Dong Wang, Guotai Yue, Qiang Yu, Huijun Li, Kang Zhang, Shaoting

FPL plus : Filtered Pseudo Label-Based Unsupervised Cross-Modality Adaptation for 3D Medical Image Segmentation

Adapting a medical image segmentation model to a new domain is important for improving its cross-domain transferability, and due to the expensive annotation process, Unsupervised Domain Adaptation (UDA) is appealing where only unlabeled images are needed for the adaptation. Existing UDA methods are mainly based on image or feature alignment with adversarial training for regularization, and they are limited by insufficient supervision in the target domain. In this paper, we propose an enhanced Filtered Pseudo Label (FPL+)-based UDA method for 3D medical image segmentation. It first uses cross-domain data augmentation to translate labeled images in the source domain to a dual-domain training set consisting of a pseudo source-domain set and a pseudo target-domain set. To leverage the dual-domain augmented images to train a pseudo label generator, domain-specific batch normalization layers are used to deal with the domain shift while learning the domain-invariant structure features, generating high-quality pseudo labels for target-domain images. We then combine labeled source-domain images and target-domain images with pseudo labels to train a final segmentor, where image-level weighting based on uncertainty estimation and pixel-level weighting based on dual-domain consensus are proposed to mitigate the adverse effect of noisy pseudo labels. Experiments on three public multi-modal datasets for Vestibular Schwannoma, brain tumor and whole heart segmentation show that our method surpassed ten state-of-the-art UDA methods, and it even achieved better results than fully supervised learning in the target domain in some cases.

AU Yang, Chen Wang, Kailing Wang, Yuehao Dou, Qi Yang, Xiaokang Shen, Wei

Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane

Intraoperative imaging techniques for reconstructing deformable tissues in vivo are pivotal for advanced surgical systems. Existing methods either compromise on rendering quality or are excessively computationally intensive, often demanding dozens of hours to perform, which significantly hinders their practical application. In this paper, we introduce Fast Orthogonal Plane (Forplane), a novel, efficient framework based on neural radiance fields (NeRF) for the reconstruction of deformable tissues. We conceptualize surgical procedures as 4D volumes, and break them down into static and dynamic fields comprised of orthogonal neural planes. This factorization discretizes the four-dimensional space, leading to a decreased memory usage and faster optimization. A spatiotemporal importance sampling scheme is introduced to improve performance in regions with tool occlusion as well as large motions and accelerate training. An efficient ray marching method is applied to skip sampling among empty regions, significantly improving inference speed. Forplane accommodates both binocular and monocular endoscopy videos, demonstrating its extensive applicability and flexibility. Our experiments, carried out on two in vivo datasets, the EndoNeRF and Hamlyn datasets, demonstrate the effectiveness of our framework. In all cases, Forplane substantially accelerates both the optimization process (by over 100 times) and the inference process (by over 15 times) while maintaining or even improving the quality across a variety of non-rigid deformations. This significant performance improvement promises to be a valuable asset for future intraoperative surgical applications. The code of our project is now available at https://github.com/Loping151/ForPlane.
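
Plane factorization can be illustrated by querying three axis-aligned feature planes and fusing the samples; the nearest-neighbour lookup and product fusion below are common choices in plane-factorized radiance fields and are assumptions here, not Forplane's exact design.

import numpy as np

def query_orthogonal_planes(planes, pts):
    # planes: dict of (R, R, C) feature arrays 'xy', 'xz', 'yz'; pts: (N, 3) in [0, 1).
    R = planes['xy'].shape[0]
    ij = np.clip((pts * R).astype(int), 0, R - 1)
    f_xy = planes['xy'][ij[:, 0], ij[:, 1]]
    f_xz = planes['xz'][ij[:, 0], ij[:, 2]]
    f_yz = planes['yz'][ij[:, 1], ij[:, 2]]
    return f_xy * f_xz * f_yz   # (N, C) fused features for a decoder MLP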

AU Xu, Zhenghua Liu, Yunxin Xu, Gang Lukasiewicz, Thomas

Self-Supervised Medical Image Segmentation Using Deep Reinforced Adaptive Masking.

Self-supervised learning aims to learn transferable representations from unlabeled data for downstream tasks. Inspired by masked language modeling in natural language processing, masked image modeling (MIM) has achieved certain success in the field of computer vision, but its effectiveness in medical images remains unsatisfactory. This is mainly due to the high redundancy and small discriminative regions in medical images compared to natural images. Therefore, this paper proposes an adaptive hard masking (AHM) approach based on deep reinforcement learning to expand the application of MIM in medical images. Unlike predefined random masks, AHM uses an asynchronous advantage actor-critic (A3C) model to predict reconstruction loss for each patch, enabling the model to learn where masking is valuable. By optimizing the non-differentiable sampling process using reinforcement learning, AHM enhances the understanding of key regions, thereby improving downstream task performance. Experimental results on two medical image datasets demonstrate that AHM outperforms state-of-the-art methods. Additional experiments under various settings validate the effectiveness of AHM in constructing masked images.

EI 1558-254X DA 2024-08-03 UT MEDLINE:39088493 PM 39088493 ER

AU Cai, Linqin Fang, Haodu Xu, Nuoying Ren, Bo

Counterfactual Causal-Effect Intervention for Interpretable Medical Visual Question Answering.

Medical Visual Question Answering (VQA-Med) is a challenging task that involves answering clinical questions related to medical images. However, most current VQA-Med methods ignore the causal correlation between specific lesion or abnormality features and answers, while also failing to provide accurate explanations for their decisions. To explore the interpretability of VQA-Med, this paper proposes a novel CCIS-MVQA model for VQA-Med based on a counterfactual causal-effect intervention strategy. This model consists of a modified ResNet for image feature extraction, a GloVe decoder for question feature extraction, a bilinear attention network for vision and language feature fusion, and an interpretability generator for producing the interpretability and prediction results. The proposed CCIS-MVQA introduces a layer-wise relevance propagation method to automatically generate counterfactual samples. Additionally, CCIS-MVQA applies counterfactual causal reasoning throughout the training phase to enhance interpretability and generalization. Extensive experiments on three benchmark datasets show that the proposed CCIS-MVQA model outperforms the state-of-the-art methods. Rich visualization results are provided to analyze the interpretability and performance of CCIS-MVQA.

AU Zhu, Enjun Feng, Haiyu Chen, Long Lai, Yongqiang Chai, Senchun

MP-Net: A Multi-Center Privacy-Preserving Network for Medical Image Segmentation

In this paper, we present the Multi-Center Privacy-Preserving Network (MP-Net), a novel framework designed for secure medical image segmentation in multi-center collaborations. Our methodology offers a new approach to multi-center collaborative learning, capable of reducing the volume of data transmission and enhancing data privacy protection. Unlike federated learning, which requires the transmission of model data between the central server and local servers in each round, our method only necessitates a single transfer of encrypted data. The proposed MP-Net comprises a three-layer model, consisting of encryption, segmentation, and decryption networks. We encrypt the image data into ciphertext using an encryption network and introduce an improved U-Net for image ciphertext segmentation. Finally, the segmentation mask is obtained through a decryption network. This architecture enables ciphertext-based image segmentation through computable image encryption. We evaluate the effectiveness of our approach on three datasets, including two cardiac MRI datasets and a CTPA dataset. Our results demonstrate that the MP-Net can securely utilize data from multiple centers to establish a more robust and information-rich segmentation model.
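
The three-stage ciphertext pipeline can be summarized in a few lines. In the sketch below, a center encrypts its images once, the shared segmentation network operates purely on ciphertext, and a decryption network recovers the plaintext mask; all three modules are placeholders standing in for MP-Net's learned encryption network and improved U-Net.

    # Placeholder sketch of the encryption -> segmentation -> decryption pipeline.
    import torch

    class MPNetPipeline(torch.nn.Module):
        def __init__(self, encryptor, segmentor, decryptor):
            super().__init__()
            self.encryptor, self.segmentor, self.decryptor = encryptor, segmentor, decryptor

        def forward(self, image):
            cipher = self.encryptor(image)        # center-side: image -> ciphertext (sent once)
            cipher_mask = self.segmentor(cipher)  # server-side: segmentation in ciphertext space
            return self.decryptor(cipher_mask)    # center-side: ciphertext mask -> plaintext mask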

AU Le, Tuan-Anh Bui, Minh Phu Hadadian, Yaser Gadelmowla, Khaled Mohamed Oh, Seungjun Im, Chaemin Hahn, Seungyong Yoon, Jungwon

Towards human-scale magnetic particle imaging: development of the first system with superconductor-based selection coils.

Magnetic Particle Imaging (MPI) is an emerging tomographic modality that allows for precise three-dimensional (3D) mapping of magnetic nanoparticle (MNP) concentration and distribution. Although significant progress has been made towards improving MPI since its introduction, scaling it up for human applications has proven challenging. High-quality images have been obtained in animal-scale MPI scanners with gradients up to 7 T/m/μ0; however, for MPI systems with bore diameters around 200 mm, the gradients generated by electromagnets drop significantly, to below 0.5 T/m/μ0. Given the current technological limitations in image reconstruction and the properties of available MNPs, these low gradients inherently limit the achievable MPI resolution for higher-precision medical imaging. Utilizing superconductors stands out as a promising approach for developing a human-scale MPI system. In this study, we introduce, for the first time, a human-scale amplitude-modulated (AM) MPI system with superconductor-based selection coils. The system achieves an unprecedented
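
For intuition about why the gradient drop matters, a standard back-of-envelope MPI approximation (not from the paper) puts spatial resolution at roughly the tracer point-spread FWHM divided by the selection-field gradient, so going from 7 T/m/μ0 to 0.5 T/m/μ0 coarsens resolution by about 14x. The 5 mT/μ0 FWHM below is a hypothetical tracer value.

    # Back-of-envelope resolution estimate: resolution ~ FWHM / gradient.
    fwhm_mT = 5.0                        # assumed tracer point-spread FWHM, mT/μ0
    for grad_T_per_m in (7.0, 0.5):
        res_mm = (fwhm_mT * 1e-3) / grad_T_per_m * 1e3   # (T / (T/m)) -> m -> mm
        print(f"gradient {grad_T_per_m} T/m/μ0 -> ~{res_mm:.2f} mm resolution")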