How to identify real isomiRs in miRge3.0’s output files


How do you distinguish and extract ‘real’ isomiRs from the exhaustive output of miRge3.0?

I’m trying to do a differential expression analysis on the isomiRs of the miRNAs in my dataset. I’m using miRge3.0 with the -gff option and the other output options (basically every output the program can produce). Looking in a subject’s output folder, I see a .gff3 file as well as files named “miR.Counts.csv”, “miR.RPM.csv”, “isomirs.samples.csv”, “isomirs.csv”, etc.

Looking through the various files, it seems that every collapsed read that does not correspond perfectly to the complete mature sequence of one of the canonical miRNAs is treated as an ‘isomiR’ in the .gff3 file and elsewhere (for example, “isomirs.csv” has 145,000 lines of output). “isomirs.samples.csv” gives a “Top Isomir RPM” value for every canonical miRNA it lists, but no ID, sequence, or anything else that would let me locate this ‘top isomiR’ in any of the other output files.

I feel certain there should be some way of identifying, from miRge3.0’s output, which of the canonical miRNAs have actual evidence for an isomiR and what that isomiR is, but I do not know what it is. Can anyone offer advice?
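
To make concrete what I am after, here is a minimal sketch of the kind of filtering I have in mind. It is not based on miRge3.0’s real file layout: the column names (`miRNA`, `sequence`, `count`), the function name `top_isomirs`, the two thresholds, and the toy rows are all placeholders I made up for illustration. The idea is simply to keep, for each canonical miRNA, the most abundant isomiR sequence, and only if it has enough reads to count as real evidence.

```python
import csv
import io
from collections import defaultdict


def top_isomirs(rows, min_count=10, min_fraction=0.05):
    """Return {miRNA: (sequence, count)} for the most abundant isomiR of each miRNA.

    `rows` is any iterable of dicts with the HYPOTHETICAL keys
    'miRNA', 'sequence' and 'count' (not miRge3.0's actual headers).
    An isomiR is kept only if it has at least `min_count` reads and makes up
    at least `min_fraction` of all isomiR reads assigned to that miRNA.
    Both thresholds are arbitrary placeholders.
    """
    by_mirna = defaultdict(list)
    for row in rows:
        by_mirna[row["miRNA"]].append((row["sequence"], int(row["count"])))

    result = {}
    for mirna, seqs in by_mirna.items():
        total = sum(count for _, count in seqs)
        seq, count = max(seqs, key=lambda sc: sc[1])  # most abundant isomiR
        if count >= min_count and total > 0 and count / total >= min_fraction:
            result[mirna] = (seq, count)
    return result


if __name__ == "__main__":
    # Toy data only: names and sequences are invented, not real miRNAs.
    toy = io.StringIO(
        "miRNA,sequence,count\n"
        "miR-A,AAAC,120\n"
        "miR-A,AAACT,3\n"
        "miR-B,GGGT,2\n"
    )
    print(top_isomirs(csv.DictReader(toy)))
```

What I cannot work out is which miRge3.0 output file and columns would play the role of this input table.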



