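ICGC website: https://icgc.org/
Several years ago, the International Cancer Genome Consortium (ICGC) published a striking article in the well-known journal Nature Communications that laid the groundwork for the coming era of cancer genomics research. The project, led by the Centro Nacional de Analisis Genómico (CNAG-CRG) and the German Cancer Research Center (DKFZ), set out to establish reliable standards for detecting somatic mutations, which are a hallmark of cancer genomes. A somatic mutation is a genetic change that a cell acquires spontaneously and that can be passed on to that cell's descendants during cell division and tumor growth. Somatic mutations differ from germline variants, which are passed from parents to children.
Today we show how to download data from the ICGC database. First, let's look at which projects the ICGC database contains: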
[ALL-US] Acute Lymphoblastic Leukemia - TARGET, US
[AML-US] Acute Myeloid Leukemia - TARGET, US
[BLCA-CN] Bladder Cancer - CN
[BLCA-US] Bladder Urothelial Cancer - TCGA, US
[BOCA-FR] Soft Tissue cancer - Ewing sarcoma - FR
[BOCA-UK] Bone Cancer - UK
[BPLL-FR] B-Cell Prolymphocytic Leukemia
[BRCA-EU] Breast ER+ and HER2- Cancer - EU/UK
[BRCA-FR] Breast Cancer - FR
[BRCA-KR] Breast Cancer - Very young women
[BRCA-UK] Breast Triple Negative/Lobular Cancer - UK
[BRCA-US] Breast Cancer - TCGA, US
[BTCA-JP] Biliary Tract Cancer - JP
[BTCA-SG] Biliary Tract Cancer - SG
[CCSK-US] Clear Cell Sarcomas of the Kidney - TARGET, US
[CESC-US] Cervical Squamous Cell Carcinoma - TCGA, US
[CLLE-ES] Chronic Lymphocytic Leukemia - ES
[CMDI-UK] Chronic Myeloid Disorders - UK
[COAD-US] Colon Adenocarcinoma - TCGA, US
[COCA-CN] Colorectal Cancer - CN
[DLBC-US] Lymphoid Neoplasm Diffuse Large B-cell Lymphoma - TCGA, US
[EOPC-DE] Early Onset Prostate Cancer - DE
[ESAD-UK] Esophageal Adenocarcinoma - UK
[ESCA-CN] Esophageal Cancer - CN
[GACA-CN] Gastric Cancer - CN
[GACA-JP] Gastric Cancer - JP
[GBM-CN] Brain Cancer - CN
[GBM-US] Brain Glioblastoma Multiforme - TCGA, US
[HNSC-US] Head and Neck Squamous Cell Carcinoma - TCGA, US
[KICH-US] Kidney Chromophobe - TCGA, US
[KIRC-US] Kidney Renal Clear Cell Carcinoma - TCGA, US
[KIRP-US] Kidney Renal Papillary Cell Carcinoma - TCGA, US
[LAML-CN] Leukemia - CN
[LAML-KR] Acute Myeloid Leukemia - KR
[LAML-US] Acute Myeloid Leukemia - TCGA, US
[LGG-US] Brain Lower Grade Glioma - TCGA, US
[LIAD-FR] Benign Liver Tumour - FR
[LICA-CN] Liver Cancer - CN
[LICA-FR] Liver Cancer - FR
[LIHC-US] Liver Hepatocellular carcinoma - TCGA, US
[LIHM-FR] Liver Cancer - Hepatocellular macronodules
[LINC-JP] Liver Cancer - NCC, JP
[LIRI-JP] Liver Cancer - RIKEN, JP
[LMS-FR] Soft tissue cancer - Leiomyosarcoma
[LUAD-US] Lung Adenocarcinoma - TCGA, US
[LUSC-CN] Lung Cancer - CN
[LUSC-KR] Lung Cancer - KR
[LUSC-US] Lung Squamous Cell Carcinoma - TCGA, US
[MALY-DE] Malignant Lymphoma - DE
[MELA-AU] Skin Cancer - AU
[NACA-CN] Nasopharyngeal cancer - CN
[NBL-US] Neuroblastoma - TARGET, US
[NKTL-SG] Blood Cancer - T-cell and NK-cell lymphoma - SG
[ORCA-IN] Oral Cancer - IN
[OS-US] Osteosarcoma - TARGET, US
[OV-AU] Ovarian Cancer - AU
[OV-CN] Ovarian Cancer - CN
[OV-US] Ovarian Serous Cystadenocarcinoma - TCGA, US
[PAAD-US] Pancreatic Cancer - TCGA, US
[PACA-AU] Pancreatic Cancer - AU
[PACA-CA] Pancreatic Cancer - CA
[PACA-CN] Pancreatic Cancer - CN
[PAEN-AU] Pancreatic Cancer Endocrine neoplasms - AU
[PAEN-IT] Pancreatic Endocrine Neoplasms - IT
[PBCA-DE] Pediatric Brain Cancer - DE
[PBCA-US] Pediatric Brain Tumor - Multiple subtypes
[PEME-CA] Pediatric Medulloblastoma - CA
[PRAD-CA] Prostate Adenocarcinoma - CA
[PRAD-CN] Prostate Cancer - CN
[PRAD-FR] Prostate Cancer - Adenocarcinoma
[PRAD-UK] Prostate Adenocarcinoma - UK
[PRAD-US] Prostate Adenocarcinoma - TCGA, US
[READ-US] Rectum Adenocarcinoma - TCGA, US
[RECA-CN] Renal Cancer - CN
[RECA-EU] Renal Cell Cancer - EU/FR
[RT-US] Rhabdoid Tumors - TARGET, US
[SARC-US] Sarcoma - TCGA, US
[SKCA-BR] Skin Adenocarcinoma - BR
[SKCM-US] Skin Cutaneous melanoma - TCGA, US
[STAD-US] Gastric Adenocarcinoma - TCGA, US
[THCA-CN] Thyroid Cancer - CN
[THCA-SA] Thyroid Cancer - SA
[THCA-US] Head and Neck Thyroid Carcinoma - TCGA, US
[UCEC-US] Uterine Corpus Endometrial Carcinoma - TCGA, US
[UTCA-FR] Uterine Cancer - Carcinosarcoma
[WT-US] Wilms Tumor - TARGET, US
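1. Go to the ICGC website: https://icgc.org/
Once on the site, scroll down. At the lower left you can see the ICGC data types, for example the bladder cancer data from China.
2. In the navigation bar at the top of the site, open the data interface: Data Portal
You can also go there directly: https://dcc.icgc.org/
In the Data Portal, choose DCC Data Releases to see the data releases.
The releases page lists many data releases; choose the most recently updated one.
Click current to open the data selection page.
On the data page, click Projects.
This brings you to the download selection page. TCGA and TARGET data are also listed here, but if you need to analyze TCGA or TARGET data, download it directly from the TCGA and TARGET websites; there is no need to select it here. 生信自学网 also offers dedicated courses on mining the TCGA and TARGET databases.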
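If the project list above is saved as plain text, the ICGC-only projects can be separated from the TCGA and TARGET ones with a few lines of Python. This is a minimal sketch, not part of the ICGC portal: the function name `split_projects` is hypothetical, and it assumes every line has the `[CODE] Description` form shown in the list, with TCGA and TARGET projects named in the description.

```python
import re

# Matches lines such as "[LIRI-JP] Liver Cancer - RIKEN, JP".
PROJECT_LINE = re.compile(r"^\[(?P<code>[A-Z]+-[A-Z]+)\]\s*(?P<desc>.+)$")


def split_projects(lines):
    """Split project lines into (icgc_only, tcga_or_target) lists of (code, description)."""
    icgc_only, external = [], []
    for line in lines:
        match = PROJECT_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a project line
        entry = (match.group("code"), match.group("desc"))
        # Assumption: TCGA and TARGET projects carry the program name in the description.
        if re.search(r"\b(TCGA|TARGET)\b", entry[1]):
            external.append(entry)
        else:
            icgc_only.append(entry)
    return icgc_only, external


sample = [
    "[LIRI-JP] Liver Cancer - RIKEN, JP",
    "[LIHC-US] Liver Hepatocellular carcinoma - TCGA, US",
    "[NBL-US] Neuroblastoma - TARGET, US",
]
icgc_only, external = split_projects(sample)
print([code for code, _ in icgc_only])
print([code for code, _ in external])
```

Only the projects in the first list need to be fetched from the ICGC release pages; the second list is better obtained from the TCGA and TARGET sites, as noted above.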
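3. Choose the study you are interested in; here we choose LIRI-JP.
4. Click LIRI-JP to open its download page, which lists all the LIRI-JP data files. You can download all of them. Downloading is simple: right-click a file and choose "Save as".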
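For more than a handful of files, the right-click step can be scripted. The sketch below is an illustration under stated assumptions rather than an official client: the URL pattern in `BASE_URL` is not given in this article and may change or be unavailable, the helper names `build_url` and `download_project` are hypothetical, and the file name prefixes are the ones listed for LIRI-JP in this walkthrough (other projects may offer a different set).

```python
import shutil
import urllib.request
from pathlib import Path

# ASSUMPTION: this download URL pattern is not stated in the article;
# verify it against the link you get from "Save as" before relying on it.
BASE_URL = "https://dcc.icgc.org/api/v1/download?fn=/current/Projects"

# File name prefixes taken from the LIRI-JP file list in this article.
FILE_PREFIXES = [
    "donor",
    "exp_seq",
    "sample",
    "simple_somatic_mutation.open",
    "specimen",
    "structural_somatic_mutation",
]


def build_url(project, prefix):
    """Build the (assumed) download URL for one file of one project."""
    return f"{BASE_URL}/{project}/{prefix}.{project}.tsv.gz"


def download_project(project, out_dir="."):
    """Download every listed file for a project and return the saved paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for prefix in FILE_PREFIXES:
        target = out / f"{prefix}.{project}.tsv.gz"
        # Stream the response to disk so large files are not held in memory.
        with urllib.request.urlopen(build_url(project, prefix)) as response, \
                open(target, "wb") as handle:
            shutil.copyfileobj(response, handle)
        saved.append(target)
    return saved
```

Calling `download_project("LIRI-JP", "icgc_data")` would then save the files into an `icgc_data` folder. A project that lacks one of the listed files will raise an HTTP error, so adjust `FILE_PREFIXES` to match what the project page actually shows.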
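Next, here is what each of these files contains:
donor.LIRI-JP.tsv.gz: donor (patient) data, i.e. clinical data
exp_seq.LIRI-JP.tsv.gz: expression data (sequencing-based)
sample.LIRI-JP.tsv.gz: sample information
simple_somatic_mutation.open.LIRI-JP.tsv.gz: simple somatic mutation data
specimen.LIRI-JP.tsv.gz: specimen data (can be used to distinguish normal from tumor)
structural_somatic_mutation.LIRI-JP.tsv.gz: structural variation data
Keep in mind that the available data differ from one ICGC project to another. Find the cancer type that suits your own research, then locate the corresponding files for that project.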
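Once downloaded, the files are gzip-compressed, tab-separated tables and can be read with the Python standard library alone. The sketch below reads one file and counts normal versus tumor specimens. The helper names are hypothetical, and the column name `specimen_type` and the `Normal` prefix on its values are assumptions about the specimen file layout; check the header row of your own download before relying on them.

```python
import csv
import gzip
from collections import Counter


def read_tsv_gz(path):
    """Read a gzip-compressed, tab-separated file into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by_tissue(rows, column="specimen_type"):
    """Count specimens labelled normal versus tumor.

    ASSUMPTION: the column name and the "Normal" prefix are not given in
    the article; confirm them against the header of the downloaded file.
    """
    counts = Counter()
    for row in rows:
        label = "normal" if row[column].startswith("Normal") else "tumor"
        counts[label] += 1
    return counts
```

For example, `count_by_tissue(read_tsv_gz("specimen.LIRI-JP.tsv.gz"))` returns the two counts. `read_tsv_gz` loads the whole table into memory, which suits the small donor, sample, and specimen files; the expression and mutation files can be much larger and are better processed row by row.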
You can also purchase the "ICGC Database Mining Video Course" prepared by 生信自学网.
(Editor: 伏泽; WeChat: 18520221056)