How to filter nanopore transcriptome alignments to trust 3′ ends?
I have direct RNA data mapped to the GENCODE transcriptome with minimap2. Finding the 'true' transcript of origin for a read is nontrivial, as there are many secondary alignments with alignment scores very close to the primary. After visualising, I can see that some alignments are to transcripts that extend further 3′ than my alignment. However, because direct RNA sequencing starts at the poly(A) tail and proceeds 3′ to 5′, the 3′ ends of reads are the true end sites.
I want to discard alignments to transcripts whose 3′ end lies more than 100 nt away from the 3′ end of my read.
I've thought about simply extracting transcription end sites (TES) from the GENCODE GTF, but these are genomic coordinates and I need to work with the transcriptome mapping. Another idea is to discard an alignment if its query end is more than 100 nt from the end of my read. But I am not sure how to do this. Any ideas?
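To make the idea concrete, here is a minimal sketch of the filter I have in mind. It assumes SAM output with `@SQ` header lines, so the transcript length (`LN`) serves as the 3′ end in transcriptome coordinates and no GTF is needed. It also assumes that only forward-strand alignments are meaningful for direct RNA reads. The function names and the `max_dist=100` default are my own placeholders, not part of minimap2 or any library.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the transcript
CLIP_OPS = set("SH")          # soft and hard clipping


def transcript_lengths(header_lines):
    """Map transcript name -> length using the @SQ header lines."""
    lengths = {}
    for line in header_lines:
        if line.startswith("@SQ"):
            fields = line.rstrip("\n").split("\t")[1:]
            tags = dict(f.split(":", 1) for f in fields)
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def three_prime_metrics(sam_line, lengths):
    """Return (nt between alignment end and transcript 3' end,
    nt of the read clipped at its 3' end), or None if the record
    is unmapped or on the reverse strand (assumed uninformative here)."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or flag & 0x10 or cigar == "*":
        return None
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    # POS is 1-based, so this is the 1-based inclusive end on the transcript
    ref_end = pos - 1 + sum(n for n, op in ops if op in REF_CONSUMING)
    trailing_clip = 0
    for n, op in reversed(ops):
        if op not in CLIP_OPS:
            break
        trailing_clip += n
    return lengths[rname] - ref_end, trailing_clip


def keep_alignment(sam_line, lengths, max_dist=100):
    """Keep an alignment only if it ends within max_dist nt of the
    transcript 3' end and at most max_dist nt of the read's 3' end
    are left unaligned."""
    metrics = three_prime_metrics(sam_line, lengths)
    if metrics is None:
        return False
    dist_to_tx_end, trailing_clip = metrics
    return dist_to_tx_end <= max_dist and trailing_clip <= max_dist
```

In use, I would collect the header lines once, build the length table, and then call `keep_alignment` on every primary and secondary record for a read. Whether the two criteria should share one threshold is a guess on my part.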
Thanks.