Circos plot of WGS assembly

Hello singh.bioinfo,

I was able to generate a comparative Circos plot of two plant genome assemblies using someone else’s code plus some manual edits to the configuration files. The first step is to ensure that the two assemblies use different chromosome/contig/scaffold IDs in their FASTA files. The second is to align the two assemblies with Minimap2 to produce a PAF file, along the lines of this example:

minimap2 -x asm10 [First Assembly] [Second Assembly] > output.paf
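
Before running the alignment, the ID check from the first step can be sketched roughly as follows. This is only a minimal sketch: the function name shared_ids is made up for illustration, and it assumes that the sequence ID is the first whitespace-delimited token after ">" on each header line.

```shell
# Hypothetical helper (not part of any tool): print every sequence ID
# that appears in the headers of both FASTA files.
# Assumption: the ID is the first token after ">" on each header line.
shared_ids() {
    awk '/^>/ {
        id = substr($1, 2)                 # drop the leading ">"
        if (FNR == NR) seen[id] = 1        # first file: remember the ID
        else if (id in seen) print id      # second file: report a clash
    }' "$1" "$2"
}
```

Calling shared_ids firstassembly.fasta secondassembly.fasta should print nothing if the IDs are already distinct; any ID it does print would need renaming in one of the assemblies.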

Next, I used the following tool to generate the configuration files that Circos requires to generate the plot:

Note that the “COV” and “SNPS” files are optional, and I did not include them in my use of the program. You will have to create a BED file that indicates which chromosomes/contigs/scaffolds you wish to plot, but this is easy to accomplish by using samtools faidx on your assembly fasta files:

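for i in firstassembly.fasta secondassembly.fasta; do
    echo "$i" >&2                  # progress message on stderr, so it stays out of input.bed
    samtools faidx "$i"            # writes the index $i.fai (column 1 = ID, column 2 = length)
    perl -lane 'print "$F[0]\t1\t$F[1]";' < "$i.fai"   # ID, start, length, tab-separated
done > input.bed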
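
It may be worth sanity-checking input.bed before handing it to the tool. The sketch below is an assumption on my part about what "well-formed" means here (exactly three tab-separated columns and no repeated IDs), and check_bed is a hypothetical name, not something the tool provides.

```shell
# Hypothetical helper: report malformed lines in a three-column BED file.
# Assumptions: fields are tab-separated and each ID should appear once.
# Returns non-zero if any problem is found.
check_bed() {
    awk -F'\t' '
        NF != 3 {
            printf "line %d: expected 3 columns, found %d\n", NR, NF
            bad = 1
        }
        NF == 3 && seen[$1]++ {
            printf "line %d: duplicate ID %s\n", NR, $1
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running check_bed input.bed prints one message per problem line, for example a stray file name or an ID shared by both assemblies, and prints nothing when the file looks consistent.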

From there, follow the instructions in the GitHub repository to generate the set of configuration files. It is then worth trying to run Circos on the main configuration file, but expect some errors. In my experience, you will need to edit the following files to remove duplicate entries and to adjust the plotting scheme to your liking:

  • circos.conf
  • ideogram.conf
  • karyotype.txt
  • ticks.conf
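
To locate duplicates in karyotype.txt, something like the sketch below might help. It assumes the duplicates are repeated chromosome definitions in the standard Circos karyotype layout ("chr - ID LABEL START END COLOR", with the ID in the third field); dup_karyotype_ids is a hypothetical name, and duplicates in the other files would still need checking by eye.

```shell
# Hypothetical helper: list each chromosome ID that is defined more than
# once in a Circos karyotype file.
# Assumption: definition lines look like "chr - ID LABEL START END COLOR".
dup_karyotype_ids() {
    awk '$1 == "chr" {
        # seen[] counts earlier definitions; print on the second one only
        if (seen[$3]++ == 1) print $3
    }' "$1"
}
```

Running dup_karyotype_ids karyotype.txt prints each repeated ID once, which gives you a short list of entries to remove by hand.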

It takes a bit of work to get a first plot, and it may take longer still to tune the files until the plot looks publishable. I recommend playing around with the settings until you are happy with the Circos output.

I hope that helps!

