About outliers and non-separated samples in PCA


Hi all,

I have run PCA on my samples (tumor and normal) for several cancer types, using the HTSeq-counts data from TCGA. I normalized the counts with DESeq2, and the normalized counts for all samples are in the cnt data frame.
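As a rough illustration of what DESeq2 normalization does to the counts, here is a minimal base-R sketch of median-of-ratios size factors, the idea behind DESeq2's `estimateSizeFactors()`. The toy matrix `raw` and the helper `size_factors` are made up for this example; the normalization in this post was done with DESeq2 itself, not with this code.

```r
# Hypothetical toy matrix: genes in rows, samples in columns (not the TCGA data)
raw <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
      5,  10,  20,
      0,   3,   7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("N1", "T1", "T2"))
)

# Median-of-ratios size factors (illustrative helper, not a DESeq2 function)
size_factors <- function(counts) {
  log_counts   <- log(counts)
  log_geo_mean <- rowMeans(log_counts)   # log geometric mean of each gene
  keep <- is.finite(log_geo_mean)        # drop genes with a zero in any sample
  # For each sample: median log-ratio to the per-gene geometric mean
  apply(log_counts[keep, , drop = FALSE], 2,
        function(s) exp(median(s - log_geo_mean[keep])))
}

sf  <- size_factors(raw)
cnt <- sweep(raw, 2, sf, "/")   # divide each sample (column) by its size factor
```

With real data the equivalent result would normally come from `counts(dds, normalized = TRUE)` on a DESeq2 object after `estimateSizeFactors()`.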

Head of cnt:

[Image: first rows of cnt]

Here is my code for PCA:

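library(ggplot2)

# Center each gene (row) across samples, without scaling to unit variance
cnt.scaled <- t(scale(t(cnt), scale = FALSE))
# Genes are the rows here, so prcomp treats the samples (columns) as variables
pc <- prcomp(cnt.scaled)

# One row per sample: loadings on the first three PCs plus the group label (gr)
pcr <- data.frame(pc$rotation[, 1:3], Group = gr)
ggplot(pcr, aes(PC1, PC2, color = Group)) + geom_point()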
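For comparison, here is a minimal sketch of the more common layout, where samples are the rows passed to `prcomp` and the sample coordinates are read from `pc$x` rather than `pc$rotation`. The simulated `cnt` and `gr` below are hypothetical stand-ins for the real objects, and the `log2(x + 1)` transform is an assumption added here, not something the code above does.

```r
library(ggplot2)

# Hypothetical stand-ins for cnt and gr (simulated, not the TCGA data)
set.seed(1)
cnt <- matrix(rnbinom(200 * 10, mu = 100, size = 5), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("S", 1:10)))
gr  <- rep(c("Normal", "Tumor"), each = 5)

# Assumption: log2(x + 1) to dampen the influence of very highly expressed genes
log_cnt <- log2(cnt + 1)

# Transpose so samples are rows; prcomp then centers each gene (column)
pc <- prcomp(t(log_cnt), center = TRUE, scale. = FALSE)

# pc$x holds the sample coordinates (scores) on each component
scores <- data.frame(pc$x[, 1:3], Group = gr)

# Percent of variance explained by each component, for the axis labels
var_pct <- round(100 * pc$sdev^2 / sum(pc$sdev^2), 1)

p <- ggplot(scores, aes(PC1, PC2, color = Group)) +
  geom_point() +
  labs(x = paste0("PC1 (", var_pct[1], "%)"),
       y = paste0("PC2 (", var_pct[2], "%)"))
print(p)
```

Labeling the axes with the variance explained makes it easier to judge how much weight an apparent outlier or a lack of separation on PC1 and PC2 should carry.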
  1. Can I perform differential expression analysis (DEA) for this cancer type?
  2. Should I first remove the outliers marked by the red circle?

[Figure: PCA plot of PC1 vs. PC2 colored by Group, with the outliers circled in red]

  3. How about this cancer type? Can I use this data for DEA? As you can see, the normal and tumor samples do not separate.

[Figure: PCA plot for a second cancer type, where normal and tumor samples do not separate]

Thanks for any help.

Tags: PCA, DESeq2, TCGA, RNA-Seq
