Hi,
I am using WGCNA to construct a network and find significant genes for a trait of interest. After building the network, I chose a module based on its p-value and its correlation with the trait. Then I calculated gene significance values for all genes in that module.
trait_x <- as.data.frame(datTraits[, "trait_x", drop = FALSE])
names(trait_x) = "trait_x"
modNames = substring(names(MEs), 3)
geneModuleMembership = as.data.frame(cor(datExpr, MEs, use = "p"));
MMPvalue = as.data.frame(corPvalueStudent(as.matrix(geneModuleMembership), nSamples));
names(geneModuleMembership) = paste("MM", modNames, sep="");
names(MMPvalue) = paste("p.MM", modNames, sep="");
geneTraitSignificance = as.data.frame(cor(datExpr, trait_x, use = "p"));
GSPvalue = as.data.frame(corPvalueStudent(as.matrix(geneTraitSignificance), nSamples));
names(geneTraitSignificance) = paste("GS.", names(trait_x), sep="");
names(GSPvalue) = paste("p.GS.", names(trait_x), sep="");
##Plotting the graph
module = "blue1"
column = match(module, modNames);
moduleGenes = moduleColors==module;
pdf("blue1.pdf", width = 7, height = 7);
par(mfrow = c(1,1));
verboseScatterplot(abs(geneModuleMembership[moduleGenes, column]),
abs(geneTraitSignificance[moduleGenes, 1]),
xlab = paste("Module Membership in", module, "module"),
ylab = "Gene significance",
main = paste("Module membership vs. gene significance\n"),
cex.main = 1.2, cex.lab = 1.2, cex.axis = 1.2, col = module)
dev.off()
##Create the starting data frame
probes = colnames(datExpr)
geneInfo0 = data.frame(Genes = probes,
moduleColor = moduleColors,
geneTraitSignificance,
GSPvalue)
##Order modules by their significance for trait_x
modOrder = order(-abs(cor(MEs, trait_x, use = "p")));
##Order the genes in the geneInfo variable first by module color, then by geneTraitSignificance
geneOrder = order(geneInfo0$moduleColor, -abs(geneInfo0$GS.trait_x));
geneInfo = geneInfo0[geneOrder, ]
##Save
write.csv(geneInfo, file = "geneInfo_trait_x.csv")
#Hub genes
hub = chooseTopHubInEachModule(datExpr, moduleColors)
write.csv(hub, file = "hub_genes.csv")
I read that some papers used a threshold of 0.2 to choose the significant genes. I am not sure whether that threshold applies to the gene significance values. How can I add a step that keeps only the genes above the threshold?
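A minimal sketch of such a filtering step, assuming the 0.2 cutoff is meant for the absolute gene significance (|GS|). That reading is an assumption, not something confirmed here. The toy `geneInfo` below only stands in for the data frame built earlier (its values are made up), and `gsThreshold` and `geneInfoFiltered` are hypothetical names.

```r
# Toy stand-in for the geneInfo data frame built above (values are made up).
geneInfo <- data.frame(
  Genes = paste0("gene", 1:6),
  moduleColor = c("blue1", "blue1", "blue1", "red", "red", "grey"),
  GS.trait_x = c(0.45, -0.31, 0.10, 0.25, -0.05, 0.60),
  p.GS.trait_x = c(0.001, 0.02, 0.40, 0.04, 0.70, 0.0005)
)

gsThreshold <- 0.2   # assumed cutoff on |GS|
module <- "blue1"    # module of interest

# Keep genes that belong to the module and whose |GS| exceeds the cutoff.
keep <- geneInfo$moduleColor == module &
  abs(geneInfo$GS.trait_x) > gsThreshold
geneInfoFiltered <- geneInfo[keep, ]

print(geneInfoFiltered)
```

With the real data, the same two lines that build `keep` and `geneInfoFiltered` could be placed just before `write.csv()`, writing `geneInfoFiltered` instead of `geneInfo`.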
I am also getting an error for hub genes:
Error in hubs[m] <- colnames(adj)[hub] : replacement has length zero
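A few checks that might help narrow this down, sketched on toy data in base R. This is not a diagnosis: one possible cause, stated here as an assumption, is that the connectivity values for some module are all `NA` (for example because of genes with only missing values), so that `which.max()` returns a zero-length index. Another thing worth verifying on the real data is that `length(moduleColors)` equals `ncol(datExpr)`. The names `moduleSizes`, `allMissing`, `missingPerModule` and `emptyIndex` are hypothetical.

```r
# Toy data: module "red" contains genes whose values are entirely missing.
datExpr <- data.frame(g1 = c(1, 2, 3, 4), g2 = c(2, 1, 4, 3),
                      g3 = NA_real_, g4 = NA_real_)
moduleColors <- c("blue1", "blue1", "red", "red")

# The module labels should line up one-to-one with the genes (columns).
stopifnot(length(moduleColors) == ncol(datExpr))

# Genes per module: very small modules are worth a closer look.
moduleSizes <- table(moduleColors)

# Number of genes in each module that are NA in every sample.
allMissing <- colSums(!is.na(datExpr)) == 0
missingPerModule <- tapply(allMissing, moduleColors, sum)

# which.max() returns a zero-length result when every value is NA,
# which is one way an assignment such as hubs[m] <- ... can fail.
emptyIndex <- which.max(c(NA_real_, NA_real_))

print(moduleSizes)
print(missingPerModule)
print(length(emptyIndex))
```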
I chose one module for the trait of interest. However, there were other significant modules as well. Should I use all significant modules or only the most significant one? I also have multiple traits of interest. Do I need to choose the significant module(s) for each trait and repeat the gene significance calculation for each trait separately? Is there a way to automate this?
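One way the per-trait calculation could be automated is a loop over the trait columns, sketched here on simulated data in base R. The p-value is the standard Student t-test for a correlation, which, as far as I understand, is the same statistic `corPvalueStudent()` computes. The trait name `trait_y` and the objects `gsByTrait` and `gsAll` are hypothetical and used only for illustration.

```r
set.seed(1)
nSamples <- 20

# Simulated stand-ins for datExpr (samples x genes) and datTraits.
datExpr <- as.data.frame(matrix(rnorm(nSamples * 5), nrow = nSamples,
                                dimnames = list(NULL, paste0("gene", 1:5))))
datTraits <- data.frame(trait_x = rnorm(nSamples), trait_y = rnorm(nSamples))

# Compute gene significance and its p-value for every trait in turn.
gsByTrait <- lapply(names(datTraits), function(tr) {
  gs <- cor(datExpr, datTraits[[tr]], use = "p")[, 1]
  # Student t statistic and two-sided p-value for each correlation.
  tStat <- gs * sqrt((nSamples - 2) / (1 - gs^2))
  p <- 2 * pt(abs(tStat), df = nSamples - 2, lower.tail = FALSE)
  data.frame(Genes = colnames(datExpr), trait = tr,
             GS = unname(gs), p.GS = unname(p))
})

# One long table: one row per gene and trait.
gsAll <- do.call(rbind, gsByTrait)
print(head(gsAll))
```

The long table can then be merged with the module colors and filtered per trait, so the same script covers every trait without copying the code.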
Thank you!