Whole genome sequencing and RNA sequencing data analysis for a bacterial strain
I am new to RNA sequencing work. I have raw fastq files (from both extracted RNA and DNA) for the bacterium “Paucibacter toxinivorans strain IM4”. I cannot find a whole reference genome for this strain, but there is a partial 16S sequence available at NCBI: www.ncbi.nlm.nih.gov/nuccore/1031488746
Do I first need to process the DNA fastq files for whole genome sequencing and then move on to the RNA sequencing analysis? If not, can I use the available partial 16S sequence for alignment?
Can anybody please guide me?
Original title: Whole genome sequencing for a bacterial strain for RNA sequencing
If you have the full genome sequenced, then it’d be preferable to first assemble the genome, annotate it (i.e. determine where the gene regions are) and then align the RNA-seq reads to that annotated assembly. This seems like a pretty good run-down of the different ways to sequence and assemble bacterial genomes; I’m sure there are many more on PubMed.
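To make the order of operations concrete, here is a minimal sketch that only builds the command lines for the three steps (assemble, annotate, align) without running anything. It assumes paired-end Illumina reads and picks SPAdes, Prokka and Bowtie2 purely as examples; none of these tools were named in the question, and other assemblers, annotators or aligners would work just as well. All file names, the `IM4` prefix and the `build_pipeline` helper are hypothetical placeholders.

```python
def build_pipeline(dna_r1, dna_r2, rna_r1, rna_r2, workdir="im4_analysis"):
    """Return the commands (as argument lists) for assembly, annotation, alignment.

    Nothing is executed here; the commands are only put together so they
    can be inspected first. Tool choice and paths are assumptions.
    """
    assembly_dir = f"{workdir}/assembly"
    annotation_dir = f"{workdir}/annotation"
    # SPAdes writes its assembled contigs to contigs.fasta in the output directory.
    contigs = f"{assembly_dir}/contigs.fasta"
    index = f"{workdir}/im4_index"
    sam = f"{workdir}/rna_vs_assembly.sam"

    return [
        # 1. De novo assembly of the DNA reads (assumes paired-end data).
        ["spades.py", "-1", dna_r1, "-2", dna_r2, "-o", assembly_dir],
        # 2. Annotate the contigs, i.e. determine where the gene regions are.
        ["prokka", "--outdir", annotation_dir, "--prefix", "IM4", contigs],
        # 3a. Build an alignment index from the assembled contigs.
        ["bowtie2-build", contigs, index],
        # 3b. Align the RNA-seq reads to your own assembly.
        ["bowtie2", "-x", index, "-1", rna_r1, "-2", rna_r2, "-S", sam],
    ]


if __name__ == "__main__":
    # Placeholder file names; replace with the real fastq files.
    for cmd in build_pipeline("dna_R1.fastq", "dna_R2.fastq",
                              "rna_R1.fastq", "rna_R2.fastq"):
        print(" ".join(cmd))
```

Printing the commands first lets you check paths and tool choices before committing compute time; each list could then be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Read trimming, QC and read counting per gene are deliberately left out of this sketch.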
If I interpret this correctly, you’re saying that you have bulk RNA-seq data (= potentially all transcripts of that one bacterial strain). That is very different from ribosomal DNA (!) sequencing, which is usually applied to a MIX of different bacteria and is typically used simply to identify the species present in the mix. Aligning to a partial 16S sequence would only capture reads from that single rRNA gene and tell you nothing about the rest of the transcriptome, so I don’t see how your data set would benefit from focusing on rRNA genes.
That being said, why are you looking at that data set to begin with?