I am using the following code to download data from TCGA.
I want to keep only one aliquot per patient, so that every patient ID is unique.
Is there a line of code I should add to do this? Is there also a way to keep or omit specific samples?
library(TCGAbiolinks)

CancerProject <- "TCGA-LGG"
query <- GDCquery(project = CancerProject,
                  data.category = "Transcriptome Profiling",
                  data.type = "Gene Expression Quantification",
                  sample.type = c("Primary Tumor"),
                  workflow.type = "HTSeq - Counts")
# download raw counts for DESeq2
GDCdownload(query)
data <- GDCprepare(query, save = TRUE, save.filename = "expression.rda")
# expression matrix (genes x samples); this is the input to the filtering step
rna <- as.data.frame(SummarizedExperiment::assay(data))
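
To make the goal concrete, here is a minimal sketch of the kind of filtering I have in mind. It assumes the column names of rna are full TCGA barcodes whose first 12 characters identify the patient. The barcodes and the rna_demo matrix below are made up for illustration, and keeping the first aliquot per patient is an arbitrary choice, not necessarily the right one.

```r
# Hypothetical barcodes in TCGA format (not real samples).
barcodes <- c(
  "TCGA-AA-0001-01A-11R-0001-07",
  "TCGA-AA-0001-01B-11R-0001-07",  # second aliquot of the same patient
  "TCGA-BB-0002-01A-11R-0002-07",
  "TCGA-CC-0003-01A-11R-0003-07"
)

# Toy expression matrix standing in for `rna` (genes in rows, samples in columns).
rna_demo <- as.data.frame(
  matrix(1:8, nrow = 2, dimnames = list(c("geneA", "geneB"), barcodes))
)

# The patient ID is assumed to be the first 12 characters of the barcode.
patient_id <- substr(colnames(rna_demo), 1, 12)

# Keep only the first column seen for each patient.
keep <- !duplicated(patient_id)
rna_unique <- rna_demo[, keep, drop = FALSE]

# Omit specific samples by listing their full barcodes.
omit <- c("TCGA-CC-0003-01A-11R-0003-07")
rna_filtered <- rna_unique[, !(colnames(rna_unique) %in% omit), drop = FALSE]
```

If a rule other than "first occurrence" is preferred (for example, a particular vial or plate), the keep vector would have to be built from the corresponding barcode fields instead.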