Hello everyone. I have a question about TMM normalization. I’m comparing male versus female samples. Can TMM normalization be affected by the presence of more reads on the X chromosome in the female group?
It sounds like your contrast is based on sex, such that all the samples in one group are one sex and all the samples in the other group are the other sex. In that case, you would expect all of the X-chromosome signal to be different, so comparing X-chromosomal signals does not seem very interesting. Because some of the sequencing reads will be taken up by the X chromosome, fewer reads are available for the non-X signals, and this needs to be accounted for. I think the easiest thing would be to exclude the X-chromosome reads and let the library size adjustment (computed from the remaining reads) take care of that, but TMM should actually be able to deal with the entire set of reads. It might be interesting to try it both ways and see how much the differential analysis changes, both on the X chromosome and elsewhere.
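To make the "reads taken up by the X chromosome" point concrete, here is a small Python sketch using made-up counts (not real data). It compares three options: plain library-size scaling, library-size scaling after dropping the X-linked genes, and a deliberately simplified TMM-style factor. The function `simple_tmm_factor` is a hypothetical helper written for illustration only: it takes an unweighted mean of log ratios and trims on the log ratios alone, whereas edgeR's TMM also trims on average abundance, applies precision weights, and rescales the factors across samples.

```python
import math

# Hypothetical toy counts: 8 autosomal genes followed by 2 X-linked genes.
# Autosomal expression is truly identical in both samples; the female sample
# has twice the X-linked signal. Both libraries have the same depth (4400),
# so the extra X reads in the female sample displace autosomal reads.
male = [110, 220, 330, 440, 550, 660, 770, 880, 220, 220]
female = [100, 200, 300, 400, 500, 600, 700, 800, 400, 400]
is_x = [False] * 8 + [True] * 2


def simple_tmm_factor(sample, ref, trim=0.3):
    """Simplified TMM-style factor (assumption: unweighted, M-value trim only)."""
    n_s, n_r = sum(sample), sum(ref)
    # Per-gene log2 ratio of library-size-scaled counts, skipping zero counts.
    m = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, ref)
        if s > 0 and r > 0
    )
    k = int(len(m) * trim)          # number of genes trimmed from each tail
    kept = m[k:len(m) - k]
    return 2 ** (sum(kept) / len(kept))


def autosomal_ratios(female_size, male_size):
    """Female/male ratio of size-scaled counts for each autosomal gene."""
    return [
        (f / female_size) / (m / male_size)
        for f, m, x in zip(female, male, is_x)
        if not x
    ]


# Option 0: scale by total library size, X reads included.
raw = autosomal_ratios(sum(female), sum(male))

# Option 1: drop X-linked genes and compute library size from the rest.
auto_f = sum(c for c, x in zip(female, is_x) if not x)
auto_m = sum(c for c, x in zip(male, is_x) if not x)
no_x = autosomal_ratios(auto_f, auto_m)

# Option 2: keep all genes, scale the female library size by the factor.
factor = simple_tmm_factor(female, male)
tmm = autosomal_ratios(sum(female) * factor, sum(male))

print("total library size:", [round(r, 3) for r in raw])
print("X genes excluded:  ", [round(r, 3) for r in no_x])
print("TMM-style factor:  ", round(factor, 3), [round(r, 3) for r in tmm])
```

In this toy case plain library-size scaling makes every autosomal gene look lower in the female sample, while both of the other options bring the autosomal ratios back to 1. Real data will not be this clean, which is why running the analysis both ways and comparing the results is worthwhile.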
I also notice that you tagged both DiffBind and RNA-seq in this question, and the answer may differ depending on whether you are trying to normalize mRNA or ChIP/ATAC data. The recommended method for normalizing data in DiffBind is to use counts in background bins, while for RNA-seq you would normalize using only the reads that are counted as mapping uniquely to transcripts.