How to convert mapping bam file to fastq without loseing the mapping information

How to convert mapping bam file to fastq without loseing the mapping information


Hi all,

I want to create my RNA mapping data into a library for further analysis. Now I have bowtie2 mapping data, which is in bam files, I now use bedtools to extract fastq mapping reads from those bam mapping files. (bedtools bamtofastq function) But it seems like the fastq data produced by bedtools would only contain the original sequence name, but not which , for example , chromosomes it mapped to, and where it is.
Below is my bam files, I want to have the the mapped chromosomes and location information in the name of the fastq files, what I could do? Is there a simple way to do it? I have tried to do the replace with ‘sed’ command, but not very successful so far. Below I attached my bam files info and my attempt to replace names with sed command

  1. sed files

    while read p; do echo "$p"; sed -i -e $p mapped.fastq; done < replacename.txt

    replacename.txt in each line, is in 's/original_sequence_name/replaced_mapping_information/g'

  2. my bam files that need to convert to fastq

    SRR1761528.1829103 16  CL1Contig15_2_sc_0.515159_l_155 337 1   50M *   0   0   TTGGAGTGTATTGGGTGCGTTCGTGGCAAAAAATCACTTCGTGATTCGCG  BFHFEECHEIIIHDIIIIHDIIIIIHDHIHIIHEG;GFHFHHDD<FF@@@  AS:i:-5 XS:i:-5 XN:i:0  XM:i:1  XO:i:0  XG:i:0  NM:i:1  MD:Z:32C17  YT:Z:UU
    SRR1761528.942684  16  CL34Contig23_2_sc_0.402716_l_155    34  0   12M1D38M    *   0   0   TTTGGACATATTGAATTTACTGGGTGCGTTCGTGGCAAAAACTCACTTCG  JJJJJJJJJJJJIJJJJJIJJJJJJJJIJJJJJJJJJHGHFHEFDFFCCB  AS:i:-26    XS:i:-26    XN:i:0  XM:i:3  XO:i:1  XG:i:1  NM:i:4  MD:Z:12^G2G1G2T30   YT:Z:UU

For example, I want the CL1Conitg15_2_sc_0.515159_l_155 to be included in the fastq sequence name, to form like SRR1761528.1829103_CL1Conitg15_2_sc_0.515159_l_155.

If anyone has a good idea?





Source link