Published May 6, 2024 | Version v1.1
Software Open

DeepPT: A deep learning model for predicting transcriptomics from histopathology images

  • 1. Biological Data Science Institute, College of Science, Australian National University, Canberra, ACT, Australia
  • 2. Pangea Biomed Ltd., Tel Aviv, Israel
  • 3. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA.
  • 4. Department of Immunology, University of Pittsburgh, Pittsburgh, PA, USA; Tumor Microenvironment Center, UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
  • 5. Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, United Kingdom.
  • 6. The Royal Marsden Hospital NHS Foundation Trust, London, United Kingdom.
  • 7. Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 8. Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 9. Oncology Institute, Sheba Medical Center at Tel-Hashomer, Tel Aviv University, Tel Aviv, Israel
  • 10. Division of Medical Oncology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  • 11. Thoracic and GI Malignancies Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA.
  • 12. Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA.
  • 13. Surgical Oncology Program, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 14. Center for Immuno-Oncology, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 15. Laboratory of Genitourinary Cancer Pathogenesis, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA
  • 16. School of Clinical Medicine, University of Cambridge, Li Ka Shing Centre, Cambridge, UK
  • 17. Genitourinary Malignancy Branch, Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA

Description

DeepPT: A deep learning model for predicting transcriptomics from histopathology images

Code associated with “A deep-learning framework to predict cancer treatment response from histopathology images through imputed transcriptomics”, Nature Cancer 2024, by Danh-Tai Hoang et al. 

1. Introduction

DeepPT (Deep Pathology for Transcriptomics) is a deep learning framework that predicts gene expression from histopathology images. DeepPT consists of 4 main components:

(i) Image pre-processing: Split each whole slide image into tiles (patches), keeping only tiles that contain tissue and discarding background tiles. Color normalization is applied to minimize staining variation (heterogeneity and batch effects).

(ii) Feature extraction: Use the pre-trained ResNet50 CNN model to extract image features from the tiles. Through this process, each image tile is represented by a vector of 2,048 derived features (pre-trained ResNet features).

(iii) Feature compression: Compress the 2,048 pre-trained ResNet features to 512 features using an autoencoder (AE) network. This helps to exclude noise, avoid overfitting, and reduce computational demands.

(iv) Prediction: This component takes the autoencoder (AE) features as input and outputs predicted gene expression.
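
As an illustration of component (i), the following is a minimal sketch of tiling and tissue selection on a toy array. It is not the DeepPT implementation: the function names, the tile size, the near-white background threshold of 220, and the minimum tissue fraction of 0.5 are all hypothetical choices made for the example, and color normalization is not shown.

```python
import numpy as np


def split_into_tiles(image, tile_size):
    """Yield (top, left, tile) for each full, non-overlapping tile of an RGB image."""
    height, width = image.shape[:2]
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            yield top, left, image[top:top + tile_size, left:left + tile_size]


def tissue_fraction(tile, background_threshold=220):
    """Fraction of pixels darker than a near-white threshold (hypothetical value)."""
    gray = tile.mean(axis=2)  # crude grayscale: average of the RGB channels
    return float((gray < background_threshold).mean())


def select_tissue_tiles(image, tile_size=4, min_tissue=0.5):
    """Return (top, left) coordinates of tiles whose tissue fraction reaches min_tissue."""
    return [
        (top, left)
        for top, left, tile in split_into_tiles(image, tile_size)
        if tissue_fraction(tile) >= min_tissue
    ]


# Toy "slide": white background with one dark 4x4 square standing in for tissue.
slide = np.full((8, 8, 3), 255, dtype=np.uint8)
slide[0:4, 4:8] = 100
kept = select_tissue_tiles(slide)
print(kept)
```

On a real whole slide image the tiles would be read through a slide reader such as openslide rather than from an in-memory array, and the thresholds would need tuning for the staining at hand.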
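
The data flow through components (iii) and (iv) can be sketched at the level of array shapes. This sketch uses NumPy with untrained random weights purely to show the dimensions (2,048 to 512, then 512 to one value per gene); the single-layer encoder, decoder, and output head, the ReLU activation, the per-tile output, and the tile and gene counts are assumptions for illustration and are not taken from the DeepPT code.

```python
import numpy as np

rng = np.random.default_rng(0)

N_TILES = 6       # hypothetical number of tissue tiles in one slide
N_RESNET = 2048   # pre-trained ResNet feature size (from the text)
N_AE = 512        # autoencoder bottleneck size (from the text)
N_GENES = 10      # hypothetical number of genes to predict


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


# Untrained, randomly initialised weights: shapes only, no learned values.
w_enc = rng.normal(scale=0.01, size=(N_RESNET, N_AE))
w_dec = rng.normal(scale=0.01, size=(N_AE, N_RESNET))
w_out = rng.normal(scale=0.01, size=(N_AE, N_GENES))

# Stand-in for the output of component (ii): one 2,048-vector per tile.
resnet_features = rng.normal(size=(N_TILES, N_RESNET))

ae_features = relu(resnet_features @ w_enc)  # (iii) compress 2048 -> 512
reconstruction = ae_features @ w_dec         # decoder output, compared with the input during AE training
expression = ae_features @ w_out             # (iv) assumed here: one value per gene per tile

# Mean squared reconstruction error, the usual autoencoder training objective.
reconstruction_error = float(np.mean((reconstruction - resnet_features) ** 2))
print(ae_features.shape, reconstruction.shape, expression.shape)
```

In practice the weights would be learned, for example with torch, which appears in the requirements list.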

2. Installation:

To install DeepPT, please install the following requirements:

python 3.9.7

numpy 1.20.3

pandas 1.3.4

matplotlib 3.4.3

sklearn 1.1.1

openslide 1.1.2

opencv 4.5.4

torch 1.12.1
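
One way to check an environment against the list above is to compare installed versions with the listed ones. The sketch below uses only the standard library. The distribution names (for example "scikit-learn" for sklearn, "openslide-python" for openslide, "opencv-python" for opencv) are assumptions based on common PyPI naming; a conda environment may register different names. Versions are compared by prefix, so a local build tag such as "1.12.1+cu113" still counts as a match.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the requirements list. The distribution names are
# assumptions (PyPI-style); adjust them to match how the packages were installed.
REQUIRED = {
    "numpy": "1.20.3",
    "pandas": "1.3.4",
    "matplotlib": "3.4.3",
    "scikit-learn": "1.1.1",
    "openslide-python": "1.1.2",
    "opencv-python": "4.5.4",
    "torch": "1.12.1",
}


def check_requirements(required=REQUIRED):
    """Map each package name to (installed_version_or_None, matches_required)."""
    report = {}
    for name, wanted in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # not installed under this distribution name
        matches = installed is not None and installed.startswith(wanted)
        report[name] = (installed, matches)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:18s} wanted {REQUIRED[name]:8s} found {installed} [{status}]")
```

The Python interpreter version itself (3.9.7 in the list) is not checked here.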

3. DeepPT computational pipeline:

- Step 1: Run “11slide_processing/1main_processing.py” to perform image pre-processing and feature extraction. This code will run on each slide simultaneously.

- Step 2: Run “11slide_processing/collect_mask.py” to collect mask files into a single file “mask.pdf” that will be used to evaluate slide quality.

- Step 3: Run “11slide_processing/collect_features.py” to create a file that contains features of image tiles.

- Step 4: Run “12AE/1main_AE.py” to compress the 2,048 pre-trained features to 512 AE features.

- Step 5: Run “13DeepPT_train/1main_train.py” to train and predict gene expression from the AE features.
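
The five steps can be chained with a small driver script. This is a sketch under assumptions rather than part of DeepPT: it assumes each script runs without command-line arguments from the repository root, which the description does not state (Step 1 in particular may be intended to be launched once per slide). The script paths are copied verbatim from the steps above; `build_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess
import sys

# Script paths copied verbatim from the five steps, in order.
PIPELINE_SCRIPTS = [
    "11slide_processing/1main_processing.py",
    "11slide_processing/collect_mask.py",
    "11slide_processing/collect_features.py",
    "12AE/1main_AE.py",
    "13DeepPT_train/1main_train.py",
]


def build_commands(scripts=PIPELINE_SCRIPTS, python=sys.executable):
    """Return one argv list per step; no extra arguments are assumed."""
    return [[python, script] for script in scripts]


def run_pipeline(dry_run=True):
    """Print each command and, unless dry_run is set, run it, stopping at the first failure."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits with a non-zero status.
            subprocess.run(command, check=True)


run_pipeline(dry_run=True)  # only prints the commands; nothing is executed
```

Calling `run_pipeline(dry_run=False)` would execute the steps sequentially, so a failure in an early step prevents later steps from running on incomplete outputs.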

4. License and Terms of use

This model and its associated code have been filed for a US patent (application No. 63/349,829, United States, 2022) and are permitted solely for non-commercial, academic research purposes. Commercial use, sale, or any form of monetization of the DeepPT model is strictly prohibited without prior approval. Commercial entities interested in utilizing the model should contact the corresponding authors for authorization.

Files

10metadata.zip

Files (134.4 MB)

Checksum                                Size
md5:f47d6bd12bf352e22dce14deca8594d7    31.8 MB
md5:5289156ff7168d310dca292a1c3d0089    10.6 kB
md5:f3ad7477493a1d571c93d5f1dba99eb9    77.2 kB
md5:512c69caf5fd974532115746bbd63f8c    7.3 kB
md5:35af5f9a35d72d690142a57dc455991b    17.5 kB
md5:1160a97591960c585812c64efbd79de0    102.5 MB

Additional details

Software

Programming language
Python