Sambamba markdup usage help



I need some help regarding the usage of sambamba markdup. I have read the documentation, but I don’t quite understand two of the options.

  1. What is meant by insert size here?

                        size of hash table for finding read pairs (default is 262144 reads);
                        will be rounded down to the nearest power of two;
                        should be > (average coverage) * (insert size) for good performance
  2. To get 100 GB here, should I just write: --sort-buffer-size 102400? The reason I ask is that in sambamba sort you have to specify a unit such as MB or GB after the integer.

                            total amount of memory (in *megabytes*) used for sorting purposes;
                            the default is 2048, increasing it will reduce the number of created
                            temporary files and the time spent in the main thread
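
To make the arithmetic behind both questions concrete, here is a minimal sketch. It assumes that "insert size" means the paired-end fragment length in bases, that "megabytes" means units of 1024 KB (so 100 GB = 102400), and that the two help texts belong to the options --hash-table-size and --sort-buffer-size; those option names and the example coverage and insert size are assumptions to check against `sambamba markdup --help` for the installed version. The helper function names are made up for illustration.

```python
def round_down_pow2(n: int) -> int:
    """Largest power of two <= n, mirroring the 'rounded down' note in the help text."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n.bit_length() - 1)


def smallest_pow2_above(n: int) -> int:
    """Smallest power of two strictly greater than n (n >= 0)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1 << n.bit_length()


def gb_to_mb(gb: int) -> int:
    """Convert gigabytes to megabytes, assuming 1 GB = 1024 MB."""
    return gb * 1024


# Hypothetical example values, not taken from any real data set.
average_coverage = 30
insert_size = 500

# The help text asks for a table larger than coverage * insert size.
lower_bound = average_coverage * insert_size

# Because the value is rounded down to a power of two, pick one that is
# already a power of two and strictly above the lower bound.
hash_table_size = smallest_pow2_above(lower_bound)

sort_buffer_mb = gb_to_mb(100)

# Assumed option names; input and output file names are placeholders.
command = (
    f"sambamba markdup --hash-table-size={hash_table_size} "
    f"--sort-buffer-size={sort_buffer_mb} input.bam output.bam"
)
print(command)
```

Under these assumptions the default of 262144 is already a power of two, so it would only need raising when coverage times insert size exceeds it.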

thx / Jonas



