The Biostar Herald for Tuesday, August 17, 2021

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by lakhujanivijay and Istvan Albert.

On the optimistic performance evaluation of newly introduced bioinformatic methods | SpringerLink

So you read about a cool new method that “improves” on previous methods. But is it really an improvement?

Most research articles presenting new data analysis methods claim that “the new method performs better than existing methods,” but the veracity of such statements is questionable. Our manuscript discusses and illustrates consequences of the optimistic bias occurring during the evaluation of novel data analysis methods, that is, all biases resulting from, for example, selection of datasets or competing methods, better ability to fix bugs in a preferred method, and selective reporting of method variants.
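
The selective-reporting effect mentioned in the abstract can be seen in a small simulation. The sketch below is not from the paper. It assumes ten hypothetical method variants that all have the same true performance (0.80), adds Gaussian benchmark noise, and compares a variant fixed in advance with the variant that happens to score best. All parameter values are illustrative assumptions.

```python
import random
import statistics


def simulate_selective_reporting(n_variants=10, n_datasets=5, true_score=0.80,
                                 noise_sd=0.05, n_trials=2000, seed=0):
    """Return (mean score of a pre-chosen variant, mean score of the best variant)."""
    rng = random.Random(seed)
    single, best = [], []
    for _ in range(n_trials):
        # Every variant has the same true performance; only the noise differs.
        scores = [
            statistics.mean(rng.gauss(true_score, noise_sd) for _ in range(n_datasets))
            for _ in range(n_variants)
        ]
        single.append(scores[0])   # variant fixed before looking at the results
        best.append(max(scores))   # variant picked after looking at the results
    return statistics.mean(single), statistics.mean(best)


honest, optimistic = simulate_selective_reporting()
print(f"variant chosen in advance: {honest:.3f}")
print(f"best of 10 variants:       {optimistic:.3f}")
```

Under these assumptions the best-of-ten score sits above the true value even though no variant is actually better, which is the kind of optimistic bias the authors describe.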

submitted by: Istvan Albert

CLIMB-BIG-DATA | Cloud Infrastructure for Microbial Bioinformatics

Antimicrobial resistance is a critical universal issue and scientists need reliable, fast, reproducible tools for their research. The aim of this hackathon is to improve upon/build/extend bioinformatics tools and methods for the AMR community. This year’s hackathon has a special focus on antimicrobial resistance in bacteria.

submitted by: Istvan Albert

An FM-index of 400k SARS-CoV-2 genomes

Leonardo Martins tweeted that xz can compress 1.4 million SARS-CoV-2 genomes in a 39GB FASTA file down to 74MB. That is a very impressive compression ratio! This reminds me of my earlier work on FM-index construction.

For an experiment, I downloaded ~400k SARS-CoV-2 genomes from EBI’s COVID-19 data portal (GISAID has ~1.5M genomes but imposes restrictions) and generated an FM-index of these sequences in both strands with ropebwt2.
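
For readers who have not met an FM-index before, here is a toy sketch of the idea: build the Burrows-Wheeler transform of the sequences plus their reverse complements, then count pattern occurrences by backward search. This is not how ropebwt2 works internally; the naive suffix sort and the function names `build_fm_index` and `count` are assumptions made for illustration, and the approach only suits tiny inputs.

```python
def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_fm_index(seqs):
    # Index every sequence and its reverse complement, each terminated by '$'.
    text = "".join(s + "$" for seq in seqs for s in (seq, reverse_complement(seq)))
    # Naive suffix sorting: fine for a toy, far too slow for real genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text that sort before c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] for c in alphabet}
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return first, occ, len(bwt)


def count(index, pattern):
    """Count occurrences of pattern (A/C/G/T only) on both strands."""
    first, occ, n = index
    lo, hi = 0, n
    for c in reversed(pattern):      # backward search: extend the match leftwards
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index(["GATTACA", "GATTTCA"])
print(count(index, "GATT"))   # forward-strand hits
print(count(index, "AATC"))   # reverse complement of GATT
```

Because both strands are indexed, a pattern and its reverse complement return the same count.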

submitted by: Istvan Albert

A 39GB file containing SARS-CoV-2 genomes can be compressed to just 74MB when using the xz program.
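
Taking the figures above at face value (and assuming decimal units), 39GB to 74MB is roughly a 500-fold reduction. A plausible reason is that the genomes are nearly identical to one another. The sketch below illustrates this with synthetic data, not real SARS-CoV-2 sequences: it assumes a random 30,000-base reference and 100 copies with 10 substitutions each, and compresses the resulting FASTA with Python's standard `lzma` module, which writes the .xz format.

```python
import lzma
import random


def mutate(genome, n_mutations, rng):
    """Return a copy of genome with n_mutations random substitutions."""
    seq = list(genome)
    for pos in rng.sample(range(len(seq)), n_mutations):
        seq[pos] = rng.choice([b for b in "ACGT" if b != seq[pos]])
    return "".join(seq)


rng = random.Random(42)
reference = "".join(rng.choice("ACGT") for _ in range(30_000))
genomes = [mutate(reference, 10, rng) for _ in range(100)]
fasta = "".join(f">genome_{i}\n{g}\n" for i, g in enumerate(genomes)).encode()

# lzma.compress uses the .xz container by default; preset 9 is the highest level.
compressed = lzma.compress(fasta, preset=9)
ratio = len(fasta) / len(compressed)

print(f"{len(fasta):,} bytes -> {len(compressed):,} bytes ({ratio:.0f}x)")
print(f"reported case: 39 GB / 74 MB is about {39_000 / 74:.0f}x")
```

The exact ratio depends on the number of genomes and how much they differ, so the synthetic figure should not be compared directly with the reported one.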

submitted by: Istvan Albert

Watch the BioC 2021 talks online, 58 videos in total.

submitted by: Istvan Albert

GitHub – GoekeLab/xpore: Identification of differential RNA modifications from nanopore direct RNA sequencing

xPore is a Python package for identification and quantification of differential RNA modifications from direct RNA sequencing.

submitted by: Istvan Albert

GitHub – marbl/merqury: k-mer based assembly evaluation

Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness.
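
To make the k-mer comparison concrete, here is a heavily simplified sketch and not Merqury's implementation. It assumes error-free reads, works with sets of distinct canonical k-mers rather than k-mer counts, and treats every read k-mer as reliable. The consensus quality estimate follows the commonly cited form QV = -10 log10(1 - (shared / total)^(1/k)); the function names and toy sequences are hypothetical.

```python
import math


def canonical_kmers(seq, k):
    """Set of canonical k-mers: the smaller of each k-mer and its reverse complement."""
    comp = str.maketrans("ACGT", "TGCA")
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, kmer.translate(comp)[::-1]))
    return kmers


def evaluate(assembly, reads, k):
    """Return (completeness, QV) for an assembly given high-accuracy reads."""
    asm = canonical_kmers(assembly, k)
    read_kmers = set().union(*(canonical_kmers(r, k) for r in reads))
    shared = asm & read_kmers
    # Completeness: fraction of read k-mers that the assembly recovers.
    completeness = len(shared) / len(read_kmers)
    # Per-base accuracy: a k-mer is correct only if all k of its bases are.
    p_correct = (len(shared) / len(asm)) ** (1 / k)
    error = 1 - p_correct
    qv = math.inf if error == 0 else -10 * math.log10(error)
    return completeness, qv


truth = "GATTACAGTCATGACTAGCA"
reads = [truth[i:i + 10] for i in range(0, len(truth) - 9, 2)]  # overlapping reads
flawed_assembly = "GATTACCGTCATGACTAGCA"                       # one substitution

perfect = evaluate(truth, reads, k=5)
flawed = evaluate(flawed_assembly, reads, k=5)
print("perfect assembly:", perfect)
print("flawed assembly: ", flawed)
```

A single wrong base removes several true k-mers and introduces several k-mers that no read supports, so both completeness and QV drop.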

submitted by: Istvan Albert
