WGCNA significant genes


I am using WGCNA to construct a network and find significant genes for a trait of interest. After making the network, I chose a module based on the pvalue and correlation value. Then I calculated gene significance values for all genes in that module.

trait_x <- as.data.frame(datTraits[, "trait_x", drop = FALSE])
names(trait_x) = "trait_x"
modNames = substring(names(MEs), 3)
geneModuleMembership = as.data.frame(cor(datExpr, MEs, use = "p"));
MMPvalue = as.data.frame(corPvalueStudent(as.matrix(geneModuleMembership), nSamples));
names(geneModuleMembership) = paste("MM", modNames, sep="");
names(MMPvalue) = paste("p.MM", modNames, sep="");
geneTraitSignificance = as.data.frame(cor(datExpr, trait_x, use = "p"));
GSPvalue = as.data.frame(corPvalueStudent(as.matrix(geneTraitSignificance), nSamples));
names(geneTraitSignificance) = paste("GS.", names(trait_x), sep="");
names(GSPvalue) = paste("p.GS.", names(trait_x), sep=""); 
##Plotting the graph
module = "blue1"
column = match(module, modNames);
moduleGenes = moduleColors==module;
pdf("blue1.pdf", width = 7, height = 7);
par(mfrow = c(1,1));
verboseScatterplot(abs(geneModuleMembership[moduleGenes, column]),
                   abs(geneTraitSignificance[moduleGenes, 1]),
                   xlab = paste("Module Membership in", module, "module"),
                   ylab = "Gene significance",
                   main = paste("Module membership vs. gene significancen"),
                   cex.main = 1.2, cex.lab = 1.2, cex.axis = 1.2, col = module)     

##Create the starting data frame
probes = colnames(datExpr)
geneInfo0 = data.frame(Genes = probes,
                       moduleColor = moduleColors,
##Order modules by their significance for trait_x
modOrder = order(-abs(cor(MEs, trait_x, use = "p")));
##Order the genes in the geneInfo variable first by module color, then by geneTraitSignificance
geneOrder = order(geneInfo0$moduleColor, -abs(geneInfo0$GS.trait_x));
geneInfo = geneInfo0[geneOrder, ]                                            
write.csv(geneInfo, file = "geneInfo_trait_x.csv")      

#Hub genes
hub = chooseTopHubInEachModule(datExpr, moduleColors)
write.csv(hub, file = "hub_genes.csv") 

I read that a threshold of 0.2 was used in some papers to choose the significant genes. I am not sure if the threshold is for gene significance values? How can I add that step to keep only the genes above threshold value?

I am also getting an error for hub genes:

Error in hubs[m] <- colnames(adj)[hub] : replacement has length zero

I chose one module for the trait of interest. However, there were other significant modules as well. Should I choose all significant modules or the most significant module only? I have multiple traits of interest, do I need to choose significant module/s for each trait and do this calculation for gene significance for all traits separately? Is there a way to automatize this?

Thank you!

Read more here: Source link