Analyzing gene expression in different RNAseq datasets


Hello! I really need some assistance here. I came up with an analysis of my own that makes sense to me, but I'm really new to this (I started studying bioinformatics on my own during the pandemic) and I'm not sure whether it is actually valid.

I want to evaluate the expression of epigenetic enzyme genes across spermatogenic cell types by integrating three different datasets. Each comes from a paper that isolated the populations of interest (spermatogonia, spermatocyte, spermatid, sperm), performed RNAseq, and reported RPKM values in GEO. I converted each table to percentile ranks and defined a score: if the mean percentile is above 0.75 and the percentage dispersion coefficient is below 0.2, I call that gene expressed in that cell type. More precisely, the score is 1 (+) above 0.75, 2 (++) above 0.9, and 3 (+++) above 0.95.

I am not interested in a statistical comparison between cell types; I just want to see whether the epigenetic enzymes of the different families are highly expressed or not across cell types. Do you think I can do this? Or should I run the same pipeline on the different FASTQ files and analyze PCA, batch effects, and so on? This is what my table looks like:

[Image: screenshot of the table, not included]
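
For concreteness, the scoring described above could be sketched roughly as follows in Python with pandas. This is only an illustration under several assumptions, not the exact procedure used for the table: the function names `percentile_rank` and `score_genes` are hypothetical, the "percentage dispersion coefficient" is taken to mean the coefficient of variation (standard deviation divided by mean) of a gene's percentile across the datasets, and percentiles are computed per sample over all genes in that table.

```python
import pandas as pd


def percentile_rank(rpkm: pd.DataFrame) -> pd.DataFrame:
    """Rank genes (rows) within each sample (column); values lie in (0, 1]."""
    return rpkm.rank(axis=0, pct=True)


def score_genes(pct_by_dataset: list, max_cv: float = 0.2) -> pd.DataFrame:
    """Score genes for ONE cell type.

    pct_by_dataset: one Series per dataset, indexed by gene, holding that
    gene's percentile rank for the cell type in that dataset.
    """
    # Keep only genes reported in every dataset.
    stacked = pd.concat(pct_by_dataset, axis=1, join="inner")
    mean = stacked.mean(axis=1)
    # Assumption: dispersion = coefficient of variation across datasets.
    cv = stacked.std(axis=1, ddof=1) / mean

    score = pd.Series(0, index=stacked.index, dtype=int)
    score[mean > 0.75] = 1  # +
    score[mean > 0.90] = 2  # ++
    score[mean > 0.95] = 3  # +++
    # Genes whose percentiles disagree too much between datasets get no call.
    score[cv >= max_cv] = 0
    return pd.DataFrame({"mean_pct": mean, "cv": cv, "score": score})


if __name__ == "__main__":
    # Made-up percentiles for five placeholder genes in three datasets.
    genes = ["A", "B", "C", "D", "E"]
    ds1 = pd.Series([0.96, 0.92, 0.80, 0.50, 0.99], index=genes)
    ds2 = pd.Series([0.97, 0.91, 0.78, 0.30, 0.99], index=genes)
    ds3 = pd.Series([0.98, 0.93, 0.79, 0.95, 0.60], index=genes)
    print(score_genes([ds1, ds2, ds3]))
```

In this sketch a gene that passes the 0.75 threshold on average but varies a lot between datasets is scored 0, which is one possible reading of combining the two criteria.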




