The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here.
This edition of the Herald was brought to you by contributions from Istvan Albert, and was edited by Istvan Albert.
Here is a presentation describing the D4 format and d4tools (t.co/INikd9rWHL) for genomics depth data from quantitative assays (e.g., WGS, ChIP, ATAC, RNA). It describes how the format works, its performance, and the tools/APIs that exist today. t.co/z1HNY1cQlJ
— Aaron Quinlan (@aaronquinlan) August 19, 2021
submitted by: Istvan Albert
Introduction – Refgenie (refgenie.databio.org)
An interesting tool to help with downloading references and also pre-built indices.
Refgenie manages storage, access, and transfer of reference genome resources. It provides command-line and Python interfaces to download pre-built reference genome “assets”, like indexes used by bioinformatics tools. It can also build assets for custom genome assemblies. Refgenie provides programmatic access to a standard genome folder structure, so software can swap from one genome to another.
submitted by: Istvan Albert
Computational biologists: bake-offs are not science. A science rant 1/13
— Steven Salzberg (@StevenSalzberg1) May 26, 2021
It all started out as a good idea: have scientists congregate and decide how well software tools work. Alas, it seems it has gone downhill. A 13-tweet thread from Steven Salzberg explains.
submitted by: Istvan Albert
A benchmark of batch-effect correction methods for single-cell RNA sequencing data | Genome Biology | Full Text (genomebiology.biomedcentral.com)
We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics including kBET, LISI, ASW, and ARI. We also investigate the use of batch-corrected data to study differential gene expression.
submitted by: Istvan Albert
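For readers who have not met the metrics named above, here is a minimal sketch of one of them, the adjusted Rand index (ARI), which scores how well two labelings of the same cells agree after correcting for chance. It is a plain-Python illustration of the standard formula, not the paper's evaluation code; the function name and the toy labels are made up for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: the labelings trivially agree

    # Contingency table cells and its row/column totals.
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    # Number of item pairs that share a cell, a row, or a column.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # best achievable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in a single cluster
    return (sum_cells - expected) / (maximum - expected)


# Toy example: hypothetical cell-type labels versus cluster assignments.
cell_types = ["T", "T", "B", "B"]
clusters = [1, 1, 0, 0]
print(adjusted_rand_index(cell_types, clusters))
```

A score of 1 means the two labelings match up to renaming of the labels, and values near 0 mean the agreement is about what random assignment would give.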
Autocorrect errors in Excel still creating genomics headache (www.nature.com)
Despite geneticists being warned about spreadsheet problems, 30% of published papers contain mangled gene names in supplementary data.
submitted by: Istvan Albert
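If you want to check your own supplementary tables, here is a rough sketch of a screen for gene names that a spreadsheet may already have converted. It assumes the two well-known failure modes: symbols such as SEPT2 or MARCH1 turned into dates like "2-Sep" or "1-Mar", and identifiers such as 2310009E13 turned into scientific notation like "2.31E+13". The function name and patterns are illustrative only and will not catch every date format Excel can produce.

```python
import re

# Month abbreviations as they appear in date-converted cells such as "2-Sep".
_MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# A day number, a dash, and a month abbreviation, e.g. "2-Sep" or "1-Mar".
_DATE_LIKE = re.compile(rf"^\d{{1,2}}-(?:{_MONTHS})$", re.IGNORECASE)

# Scientific notation left behind by identifiers such as 2310009E13.
_SCI_NOTATION = re.compile(r"^\d+(?:\.\d+)?E\+\d+$", re.IGNORECASE)


def looks_mangled(value):
    """Return True if a cell looks like a date- or number-converted gene name."""
    text = value.strip()
    return bool(_DATE_LIKE.match(text) or _SCI_NOTATION.match(text))


# Hypothetical gene-name column exported from a spreadsheet.
column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13"]
suspects = [name for name in column if looks_mangled(name)]
print(suspects)
```

A hit only tells you the original symbol was lost; recovering it still means going back to the source data.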