10X Genomics Supernova Troubleshooting
I am attempting to perform de novo assembly of sunflower with Supernova 2.0.0.
I am having some difficulty getting it to finish within the wallclock limit on the resources I am using. I have a wallclock limit of 48 hours on SDSC Comet (64 cores, 1.4TB memory) and 72 hours on Savio here at UCB (16 cores, 512GB memory).
I typically have not been including --maxreads in my scripts, assuming that using all reads will produce the best quality assembly, but this is not realistic considering my wallclock limits. One question I have is whether or not I should limit the number of reads to what the sequencing company has given in their report. Our sequences are from a HiSeq X, and the report gives the number of reads as 261M. Are “reads” as counted by the sequencing company different from the reads (as in maxreads) counted by Supernova?
Also, do you set localcores and localmem? Or do you just let the program use the resources available on that node?
I should also add that the genome is quite large, at 3.6 Gb. I also expect some heterozygosity.
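As a rough way to frame the read-count question, raw coverage can be estimated as total sequenced bases divided by genome size. The sketch below is only illustrative: the 150 bp read length is an assumption (it is not stated in the post), and the function name is made up for this example. It computes the estimate both ways, since whether the reported 261M refers to individual reads or to read pairs is exactly what is unclear here.

```python
def estimated_coverage(n_reads, read_length_bp, genome_size_bp, paired=False):
    """Raw depth = total sequenced bases / genome size.

    If `paired` is True, n_reads is treated as a count of read pairs,
    so each entry contributes two reads.
    """
    reads_per_entry = 2 if paired else 1
    return n_reads * reads_per_entry * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 3.6e9  # from the post
REPORTED_READS = 261e6  # from the sequencing report
READ_LENGTH_BP = 150    # ASSUMPTION: not stated in the post

as_single = estimated_coverage(REPORTED_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
as_pairs = estimated_coverage(REPORTED_READS, READ_LENGTH_BP, GENOME_SIZE_BP, paired=True)

print(f"{as_single:.1f}x if 261M counts individual reads")
print(f"{as_pairs:.1f}x if 261M counts read pairs")
```

The two interpretations differ by a factor of two, so it is worth confirming with the sequencing provider which one the report uses before choosing a value for maxreads.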
Sorry for the long hiatus!
Turns out, Supernova has an automatic checkpoint (so long as you don’t adjust your script!). I was making the mistake of constantly changing my script to increase efficiency. Each time I submitted the changed script, the scheduler would see it as a new job and overwrite all files from the previously “failed” (due to timeout) job with that same name.
It worked! I ended up with a 25X coverage genome, which I am happy with for what I am using it for.
With 1.45TB of memory and 48 cores, it took about 8-9 days, so still a very long time. Some steps took the entire 48-hour wallclock limit on SDSC’s Comet.
Anyways, thanks for all the help and suggestions; if anyone would like to see my scripts for the jobs I submitted on SDSC Comet, I am happy to provide :).