r – How to replace row names in DESeq2 rlogTransformation matrix with actual gene name info present on another sheet?

I’m new to R and DESeq2 and I’m trying to run differential expression as below

library(DESeq2)
count_file_names <- grep("counts",list.files("HTSeq_counts"),value=T)
host_type < c("Damaged","Control")
sample_information <-data.frame(sampleName = count_file_names, fileName = count_file_names, condition = host_type)
DESeq_data <- DESeqDataSetFromHTSeqCount(sampleTable = sample_information, directory = "HTSeq_counts", design = ~condition)
colData(DESeq_data)$condition <- factor(colData(DESeq_data)$condition,levels = c('Damaged','Control'))
rld <- rlogTransformation(DESeq_data, blind=T)

When I look at the rld matrix, it has got row names that aren’t annotated and hence, not directly useful; note- I intend to generate plots based on this matrix in downstream steps and so, the corresponding actual gene names are required

print(rld)
class: DESeqTransform 
dim: 33219 38 
metadata(1): version
assays(1): ''
rownames(33219): g100.t1 g1000.t1 ... g9998.t1 g9999.t1
rowData names(7): baseMean baseVar ... dispFit rlogIntercept
colnames(38): Damaged_R1.counts Damaged_R2.counts ...
  Control_R4.counts Control_R5.counts
colData names(2): condition sizeFactor

I have a separate “Annotations.csv” which has the gene name information

gene_names <- read.csv("Annotations.csv", header=TRUE, sep=",", stringsAsFactors=FALSE)
head(gene_names)
    gene_id   name                                             product
1 g38227.t1   Stk10                Serine/threonine-protein kinase 10  
2 g38227.t2   Stk10                Serine/threonine-protein kinase 10  
3 g1000.t1    Ccnh                                          Cyclin-H  
4 g46237.t1 fam136a                                   Protein FAM136A  
5 g100.t1     H2B                                       Histone H2B  
6 g40390.t1   STAC2 SH3 and cysteine-rich domain-containing protein 2

and I would like to replace rownames in rld matrix according with the correct corresponding gene name as specified in gene_names. For example, the rld matrix rownames(33219) should appear as names “H2B CCnh … and so on” instead of “g100.t1 g1000.t1 … and so on”. I tried the below R code

new.rld  <- cbind(name=gene_names$name[ match(rownames(rld), gene_names$gene_id) ], rld)

but it gave me the this error:

Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘bindCOLS’ for signature ‘”character”’

Any help in generating correctly the new.rld matrix would be very much appreciated!

Read more here: Source link