Tag: K-mer

Genomic signatures associated with maintenance of genome stability and venom turnover in two parasitoid wasps

Genomic features of two Anastatus wasps, A. japonicus and A. fulloi. We employed PacBio high-fidelity (HiFi) long-read sequencing and Illumina short-read sequencing technologies to generate high-quality contigs for two Anastatus wasps, A. japonicus and A. fulloi (Supplementary Tables 1 and 2). These contigs were further scaffolded using Hi-C libraries to…

Building a Simulated Metagenomic Dataset

HackMD note, tagged ‘JPL: Genetic Inventory Project’. Here we’ll create a simulated metagenomic dataset for controlled testing. This dataset was used to determine the Kraken 2 confidence score that best…

KGDCMI: A New Approach for Predicting circRNA-miRNA Interactions From Multi-Source Information Extraction and Deep Learning

Xin-Fei Wang et al. Front Genet. 2022. doi: 10.3389/fgene.2022.958096. eCollection 2022. Affiliations: 1 School of Information Engineering, Xijing University, Xi’an, China. 2 College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China. 3 School of Computer Science, Northwestern Polytechnical University, Xi’an, China.

Mapping reads using kallisto – rna seq analysis

Hi, I’m trying to map reads to a reference genome using kallisto for RNA-seq analysis in the terminal on a Mac, and the following command keeps loading for hours and won’t run. I’m not exactly sure where I’ve gone wrong. kallisto index…

Mitogenome-wise codon usage pattern from comparative analysis of the first mitogenome of Blepharipa sp. (Muga uzifly) with other Oestroid flies

Outcome of DNA sequencing, assembly, and validation. In this study, total DNA was initially isolated from the finely chopped, full-grown pupa of Blepharipa sp. The NanoDrop spectrophotometer (1294 ng/μl) and the Qubit fluorometer (732.8 ng/μl) both showed that the concentration of total DNA in the sample was at an optimum level for mitochondrial DNA enrichment. The Tape Station profile showed…

Reference-based alignment using MUSKET

I’m running MUSKET on my dataset trimmed_data.tar.gz using 1000 threads, 2000 threads, and 4000 threads on an HPC. I’ve been unable to obtain any results because the software seems to be running for a long time. ./../musket-1.1/musket -k 90 600000000 -p 1000 -zlib 9 -ino…

CRISPR-VAE: A Method for Explaining CRISPR/Cas12a Predictions, and an Efficiency-aware gRNA Sequence Generator

Abstract: Deep learning has shown great promise in the prediction of gRNA efficiency, which helps optimize engineered gRNAs and thus has greatly improved the usage of CRISPR-Cas systems in genome editing. However, the black box prediction of deep learning methods does not provide adequate explanation to the factors…

Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

Sequencing data. We used publicly available sequencing data from the GIAB consortium [45], 1000 Genomes Project high-coverage data [46] and Human Genome Structural Variation Consortium (HGSVC) [4]. All datasets include only samples consented for public dissemination of the full genomes. Statistics and reproducibility. For generating the assemblies, we used all 14 samples for…

SAWRPI: A Stacking Ensemble Framework With Adaptive Weight for Predicting ncRNA-Protein Interactions Using Sequence Information

Zhong-Hao Ren et al. Front Genet. 2022. doi: 10.3389/fgene.2022.839540. eCollection 2022. Affiliations: 1 School of Information Engineering, Xijing University, Xi’an, China. 2 School of Computer Science, Northwestern Polytechnical University, Xi’an, China.

Should I trim adapter sequences and filter by phred score, before alignment by salmon? : bioinformatics

First, trimming adapters is definitely necessary as they are essentially a form of contamination. For quality trimming and filtering I would highly recommend reading the following: “Trimming of sequence reads alters RNA-Seq gene expression estimates”. Essentially they show that aggressive trimming is a problem. To quote from the Conclusions: The…

Frontiers | Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation

Introduction: The study of microbial environments has benefited from the sequencing revolution, in which technological improvements decreased the cost of DNA sequencing and increased the number of sequenced nucleic bases. For approximately 20 years (depending on how we define the term metagenomics), it has allowed the decryption of the microbial composition…

Role of mobile genetic elements in the global dissemination of the carbapenem resistance gene blaNDM

Wu, W. et al. NDM metallo-β-lactamases and their bacterial producers in health care settings. Clin. Microbiol. Rev. 32, e00115–18 (2019). Yong, D. et al. Characterization of a new metallo-β-lactamase gene, bla NDM-1, and a novel erythromycin esterase gene carried on a unique genetic structure in Klebsiella pneumoniae sequence type 14…

Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

Introduction: Next-generation sequencing (NGS) has revolutionized many areas of biological research (1, 2), providing ever-more data at an ever-decreasing cost. One such area is microbiome research, the study of microbes in their theater of activity using metagenomic sequencing (3). Here, deep short-read sequencing, and improving performance of long-read sequencing, are…

Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture. Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…

A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches

BLEND is a mechanism that can efficiently find fuzzy seed matches between sequences to significantly improve the performance and accuracy while reducing the memory space usage of two important applications: 1) finding overlapping reads and 2) read mapping. Finding fuzzy seed matches enables BLEND to find both 1) exact-matching seeds…

ncRNA | Free Full-Text | Common Features in lncRNA Annotation and Classification: A Survey

CONC (2006); SVM; eukaryotes (both protein-coding and non-coding genes); peptide length, amino acid composition, predicted secondary structure content, mean hydrophobicity, percentage of residues exposed to solvent, sequence compositional entropy, number of homologues, alignment entropy; 10-fold CV on protein-coding: F1-score: 97.4%, Precision: 97.1%, Recall: 97.8%; on non-coding: F1-score:…

Interpreting kmer spectrum

Could someone provide an intuitive description of how to interpret a k-mer spectrum? I understand that the plot shows how many kmers appear a certain number of times, but could someone describe to me what valuable information we can gain from visualizing kmer counts this way?…
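
As a rough illustration of what the plot summarizes, a k-mer spectrum can be computed by counting every k-mer and then counting how many distinct k-mers share each multiplicity. The sketch below is a toy in plain Python and is not taken from the original thread; the function names `kmer_counts` and `kmer_spectrum` and the example sequence are hypothetical. On real sequencing reads, k-mers seen only once or twice are often sequencing errors, while the main peak tends to sit near the k-mer coverage of the genome.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq, skipping k-mers that contain N."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


# Hypothetical toy sequence, chosen only so that some 3-mers repeat.
toy = "ACGTACGTTT"
counts = kmer_counts(toy, 3)
spectrum = kmer_spectrum(counts)

for multiplicity in sorted(spectrum):
    print(multiplicity, spectrum[multiplicity])
```

Plotting multiplicity on the x-axis against the number of distinct k-mers on the y-axis gives the spectrum; for real data a dedicated k-mer counter is the practical choice.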

Assembling all transcripts for an individual gene? (using single sequence to seed the assembly)

Let’s say I have a candidate gene and I believe that in an individual sample, the genome sequence differs from the reference, which then interferes with alignment. Is there a way for me to do…

DNA Sequence Classification Based on Milvus

Introduction: DNA sequencing is a popular concept in both academic research and practical applications, such as gene traceability, species identification, and disease diagnosis. As every industry looks for more intelligent and efficient research methods, artificial intelligence has attracted much attention, especially from the biological and medical domains. More and…

Where can I get, or how can I make, a mappability track for the hg38 assembly?

Lucky you @manojmumar_bhosale, I worked on a similar problem recently and therefore have a bash script you can use. Required tools: GEM library from here; UCSC’s wigToBigWig from here (I chose binary for Linux 64…

a k-mer counter in Rust using the rust-bio and rayon crates

Tool: krust, a k-mer counter in Rust using the rust-bio and rayon crates. I hope this isn’t inappropriate as a Share a Tool post; it’s more about getting feedback and seeing if anybody here is interested in this project (github.com/suchapalaver/krust). It’s a k-mer counter written in Rust. I’ve…

Prevalence and Molecular Characteristics Based on Whole Genome Sequenc

Introduction: Tuberculosis, caused by Mycobacterium tuberculosis, remains one of the top 10 causes of death worldwide and the leading cause of death from a single infectious agent (ranking above HIV/AIDS) [1]. In 2020, the World Health Organization (WHO) reported that 7.1 million people with tuberculosis were newly diagnosed and notified in 2019,…

The Biostar Herald for Tuesday, August 17, 2021

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by lakhujanivijay,…

k-mer counters – presence/absence matrix

Hi lizabe, You’re right that this tutorial is out of date. The --matrix option is no longer valid as an option to jellyfish count. However, I don’t think its original intent was to do what you wanted anyway. It doesn’t write out a binary…
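
For readers after the presence/absence matrix itself, one way to picture it is as a table with one row per k-mer and one column per sample, holding 1 where the k-mer occurs in that sample and 0 otherwise. The following is a minimal sketch in plain Python under that assumption; it does not reproduce any jellyfish option, and the names `kmer_set`, `presence_absence`, and the two toy samples are hypothetical. In practice the per-sample k-mer sets would more likely come from a k-mer counter's output than from raw strings.

```python
def kmer_set(seq, k):
    """Return the set of distinct k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(samples, k):
    """Build a k-mer x sample matrix of 0/1 values.

    samples maps a sample name to its sequence (hypothetical input shape).
    Returns (kmers, names, matrix), where matrix[i][j] is 1 if kmers[i]
    occurs in the sample called names[j], and 0 otherwise.
    """
    sets = {name: kmer_set(seq, k) for name, seq in samples.items()}
    kmers = sorted(set().union(*sets.values()))
    names = list(samples)
    matrix = [[1 if kmer in sets[name] else 0 for name in names]
              for kmer in kmers]
    return kmers, names, matrix


# Two hypothetical toy samples that share some 3-mers.
samples = {"s1": "ACGTA", "s2": "CGTAC"}
kmers, names, matrix = presence_absence(samples, 3)

print("kmer", *names)
for kmer, row in zip(kmers, matrix):
    print(kmer, *row)
```

Holding every k-mer set in memory only works for small inputs; larger datasets would need a disk-backed or sketch-based approach.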
