bedGraphToBigWig: Missing Genome Coordinates

I have a somewhat complicated pipeline in which bam files are converted from bam > bed > bedGraph > bigwig. The final bigwig files are missing genome ranges that have a coverage value of 0, despite a chromosome size file, S288C_R64.fa.fai, being referenced during file type conversion. The missing genome coordinates are causing an error in downstream analysis in R: when I import the bigwig files using rtracklayer ranges are missing (because they are missing in the bigwig file).

How can I force genome coordinates with 0 coverage to be included in the final bigwig file, when converting from a bed file? TYIA!

1) Convert bam to bed (bamtobed, bedtools), then modify read coordinates with awk.

bedtools bamtobed -i input.bam | bedtools sort -g S288C_R64.fa.fai > output.bed

output.bed>
chrI    877 878 K00408:348:HT7MVBBXY:5:1207:11454:46592 42  -
chrI    942 943 K00408:348:HT7MVBBXY:5:1211:24982:31523 35  +
chrI    945 946 K00408:348:HT7MVBBXY:5:1119:5954:27180  42  -
chrI    945 946 K00408:348:HT7MVBBXY:5:2119:3853:33141  42  -
chrI    971 972 K00408:348:HT7MVBBXY:5:2201:1793:46065  42  +

2) Convert bed to bedGraph (genomecov, bedtools)

bedtools genomecov -i output.bed -bg -g S288C_R64.fa.fai | sort -k1,1 -k2,2n > output.bedGraph

output.bedGraph>
chrI    877 878 1
chrI    942 943 1
chrI    945 946 2
chrI    971 972 1

4) Convert bedGraph to bigwig (bedGraphToBigWig). I converted bigwig to wig to get a human readable file to troubleshoot.

bedGraphToBigWig output.bedGraph S288C_R64.fa.fai output.bigwig
bigWigToWig output.bigwig output.wig

output.wig>
#bedGraph section chrI:877-32548
chrI    877 878 1
chrI    942 943 1
chrI    945 946 2
chrI    971 972 1

Note the coordinates above chrI:877-32548

If I convert the same bam file with deeptools, I get the proper “headers” for the genome

bamCoverage  -b input.bam -o output2.bigwig -of bigwig -bs 1 -p max --effectiveGenomeSize 12000000
bigWigToWig output2.bigwig output2.wig

output2.wig>
#bedGraph section chrI:0-33818
chrI    0   877 0
chrI    877 878 1
chrI    878 942 0
chrI    942 943 1
chrI    943 945 0
chrI    945 946 2
chrI    946 971 0
chrI    971 972 1

Note the coordinates above chrI:0-33818, and the inclusion of coverage for every base even if the value is 0.

Read more here: Source link