Tag: K-mer

A graph-based genome and pan-genome variation of the model plant Setaria

Variation and evolution in Setaria: We collected genome-wide resequencing data for 630 wild (S. viridis), 829 landrace and 385 modern cultivated accessions from the Setaria genus with an average sequencing depth of ~15×, of which 1,004 were newly generated and 840 were from previous studies [16,21] (Supplementary Table 1). After aligning…

A self-transmissible plasmid from a hyperthermophile that facilitates genetic modification of diverse Archaea

Lederberg, J., Cavalli, L. L. & Lederberg, E. M. Sex compatibility in Escherichia coli. Genetics 37, 720–730 (1952). Elisabeth, G., Günther, M. & Manuel, E. Conjugative plasmid transfer in gram-positive bacteria. Microbiol. Mol. Biol. Rev. 67, 277–301 (2003). de la…

Biomedicines | Free Full-Text | High-Accuracy ncRNA Function Prediction via Deep Learning Using Global and Local Sequence Information

1. Introduction: In recent years, growing access to massive transcriptome sequencing technologies has led to the discovery of an increasing number of novel transcripts from various species. The majority of these transcripts result in non-coding ribonucleic acid (ncRNA) molecules, short sequences of RNA that, with the exception of a small…

Characterization of metagenome-assembled genomes from the International Space Station | Microbiome

Metagenome-assembled bacterial genomes: Out of the 42 ISS metagenomes submitted to NCBI, only PMA-treated metagenomes (n = 21) representing the viable/intact cells were used for generating bacterial MAGs. Characteristics of MAGs (n = 46) such as genome size (2.6 to 6.6 Mb), completeness, contamination percentage, the average mean coverage, number…

Illumina Complete Long Reads software analysis workflow for human WGS

Introduction: Next-generation sequencing (NGS) enables scientists to decipher the genome for a deeper understanding of biology. Proven Illumina sequencing by synthesis (SBS) chemistry combined with award-winning DRAGEN secondary analysis delivers whole-genome sequencing (WGS) data with outstanding accuracy [1,2]. DRAGEN Multigenome (graph) further improves mapping accuracy in challenging regions by ~50% [1]. Still,…

The inchworm process failed. Trinity running error.

Hello everyone, I’m trying to perform a de novo transcriptome assembly using Trinity and am having many issues. The last time, I got the attached inchworm error. Warning, Trinity cannot determine which version of Java is being used. Version 1.7 is required…

Chromosome-level genome assemblies from two sandalwood species provide insights into the evolution of the Santalales

Genome sequencing and assembly: We sequenced and assembled genomes for the sandalwood species S. album and S. yasi (Fig. 1). In total, ~23 Gb and ~25 Gb of clean short reads of S. album and S. yasi were obtained for the genomic survey, respectively (Supplementary Tables 1 and 2). According to k-mer analysis, the…

Comparative genome features and secondary metabolite biosynthetic potential of Kutzneria chonburiensis and other species of the genus Kutzneria

Adamek, M., Spohn, M., Stegmann, E. & Ziemert, N. Mining bacterial genomes for secondary metabolite gene clusters. Methods Mol. Biol. 1520, 23–47 (2017). Belknap, K. C., Park, C. J., Barth, B. M. & Andam, C. P. Genome mining of biosynthetic and chemotherapeutic gene clusters in Streptomyces…

CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization | BMC Bioinformatics

Datasets: To verify the effectiveness of the CircSSNN, we adopted 37 circRNA datasets as benchmark datasets following the baselines we compared [15, 16]. We first downloaded the datasets from the circRNA interactome database (circinteractome.nia.nih.gov/). Subsequently, we obtained 335,976 positive samples and 335,976 negative samples following the process of iCircRBP-DHN [17]….

An unusual tandem kinase fusion protein confers leaf rust resistance in wheat

Plant material: Bread wheat accessions Transfer (TA5524), WL711, TA5605, Ae. umbellulata accession TA1851 and Ae. triuncialis accession TA10438 were obtained from the Wheat Genetics Resource Center (WGRC). TcLr9 (Transfer/6*Thatcher) is a near-isogenic line carrying Lr9 from Transfer in the genetic background of the susceptible wheat line Thatcher. TcLr9 and TA5605…

A preliminary study of the use of MinION sequencing to specifically detect Shiga toxin-producing Escherichia coli in culture swipes containing multiple serovars of this species

Tarr, P. I., Gordon, C. A. & Chandler, W. L. Shiga toxin-producing Escherichia coli and haemolytic uremic syndrome. Lancet 365, 1073–1086 (2006). Koudelka, G. B., Arnold, J. W. & Chakraborty, D. Evolution of STEC virulence: Insights from the antipredator activities of Shiga toxin-producing E. coli. Int. J. Med….

RPI-EDLCN: An Ensemble Deep Learning Framework Based on Capsule Network for ncRNA-Protein Interaction Prediction

Noncoding RNAs (ncRNAs) play crucial roles in many cellular life activities by interacting with proteins. Identification of ncRNA-protein interactions (ncRPIs) is key to understanding the function of ncRNAs. Although a number of computational methods for predicting ncRPIs have been developed, the problem of predicting ncRPIs remains challenging. It has always…

Hybrids of RNA viruses and viroid-like elements replicate in fungi

Ribozyme search of the Sequence Read Archive: Observing that ribozymes are sufficiently short to be captured on a short sequence read (less than 100 nt), we reasoned it would be possible to screen large volumes of sequencing data to identify libraries potentially containing ribozyme agents. To this end we adapted…

A high-quality chromosomal-level genome assembly of Greater Scaup (Aythya marila)

Ethics statement: All animal experimental procedures were approved by the Biomedical Ethics Committee of Qufu Normal University (approval number: 2022001). Sampling and sequencing: The experimental sample is a wounded male duck found during the wild bird survey in Jiangsu, China, which died unexpectedly during rescue. We dissected the sample and…

error in Genome Mapping by BWA tools in Linux

$ gmap_build -D:\btau8refflat.gtf Unknown option: D:btau8refflat.gtf -k flag not specified, so building main hash table with default 15-mers -j flag not specified, so building regional hash tables with default 6-mers gmap_build: Builds a gmap database for a genome to be used by GMAP or GSNAP. Part of GMAP package, version…

Introducing GPMeta: Ultrarapid GPU-accelerate | EurekAlert!

Image: Runtime of GPMeta versus existing solutions. Credit: BGI Genomics. Metagenomic sequencing (mNGS) is a powerful diagnostic tool to detect causative pathogens in clinical microbiological testing. Rapid and accurate classification of metagenomic sequences is a critical procedure for pathogen identification in the dry-lab step of mNGS tests. However, this…

Phenotypic and Genetic Analysis of KPC-49

Introduction: The worldwide dissemination of carbapenem-resistant Enterobacteriaceae (CRE), particularly carbapenem-resistant K. pneumoniae (CRKP), poses a significant risk to public health. CRKP can cause various infections, such as urinary tract infections, bloodstream infections, and pneumonia, leading to high morbidity and mortality [1]. Prevention and control of K. pneumoniae infection are becoming more…

kallisto bootstrap / conda installation problem

I have used kallisto in the past, but am now struggling to get it to work on a new computer (MacBook M1). When I download kallisto using brew and try to run kallisto quant, I get an error and no bootstraps are generated: ‘Warning: kallisto…

An apicomplexan parasite drives the collapse of the bay scallop population in New York

Lafferty, K. D., Porter, J. W. & Ford, S. E. Are diseases increasing in the ocean? Ann. Rev. Ecol. Evol. Syst. 35, 31–54 (2004). Ward, J. R. & Lafferty, K. D. The elusive baseline of marine disease: Are diseases in ocean ecosystems increasing? PLoS Biol. 2, 542–547…

Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree

State-of-the-art phylogenomic pipelines require many steps, which can be both time consuming and error prone (Fig. 1a). With Read2Tree, we directly process raw sequencing reads and reconstruct sequence alignments for conventional tree inference methods (Fig. 1b and Supplementary Fig. 1). We start by aligning raw reads to nucleotide sequences derived…

Co-evolution of large inverted repeats and G-quadruplex DNA in fungal mitochondria may facilitate mitogenome stability: the case of Malassezia

Burger, G., Gray, M. W. & Lang, B. F. Mitochondrial genomes: Anything goes. Trends Genet. 19, 709–716 (2003). Hawksworth, D. L. & Lücking, R. Fungal diversity revisited: 2.2 to 3.8 million species. Microbiol. Spectr. 5, 5–4 (2017). Theelen, B., Christinaki, A. C.,…

NGS: Sequence QC – Texas A&M HPRC

Evaluation: FastQC. GCATemplates available: grace, terra. module spider FastQC. After running FastQC via the command line, you can ssh to an HPRC cluster with X11 forwarding enabled (the -X option) and view the images using the eog tool. From your desktop: ssh -X username@grace.hprc.tamu.edu From your FastQC working…

removing lines of code from a function?

I’m working on a project for a bioinformatics class. We are given various DNA strings and an integer k. The project’s goal is to identify a k-mer motif that minimises the total of the Hamming distances between the motif and each DNA string. So, first, look at…
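
The task described in this excerpt (find the k-mer with the smallest total Hamming distance to a set of DNA strings) can be sketched by brute force. This is a minimal illustration, not the poster's code: the function names are hypothetical, the alphabet is assumed to be A/C/G/T, and the distance to each string is assumed to be the minimum over all of its k-length windows. It enumerates all 4^k candidates, so it is only practical for small k.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def min_distance_to_string(pattern: str, dna: str) -> int:
    """Smallest Hamming distance between pattern and any window of dna."""
    k = len(pattern)
    return min(hamming(pattern, dna[i:i + k]) for i in range(len(dna) - k + 1))


def median_string(dna_strings: list[str], k: int) -> tuple[str, int]:
    """Return the k-mer minimising the summed distance to every string.

    Assumes every string has length >= k. Ties are broken by taking the
    first candidate in lexicographic order.
    """
    best_pattern = ""
    best_total = len(dna_strings) * k + 1  # larger than any possible total
    for letters in product("ACGT", repeat=k):
        pattern = "".join(letters)
        total = sum(min_distance_to_string(pattern, s) for s in dna_strings)
        if total < best_total:
            best_pattern, best_total = pattern, total
    return best_pattern, best_total


if __name__ == "__main__":
    # ACGT occurs exactly in all three strings, so the total distance is 0.
    print(median_string(["ACGTACGT", "TTACGTTT", "GGGACGTG"], 4))  # ('ACGT', 0)
```

The exhaustive search keeps the logic easy to check against a hand calculation; for larger k a heuristic such as greedy or randomized motif search would be the usual substitute.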

Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling | Genome Biology

Benchmark setup: We first developed a basecalling benchmarking framework enabling new and existing basecalling algorithms to be easily compared. Moreover, our benchmark facilitates the study of individual components of basecallers, as different combinations of basecaller components can readily be evaluated. The framework is divided into two main components: (i) standardized…

Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species

Giovannoni, J. J. Genetic regulation of fruit development and ripening. Plant Cell 16, S170–S180 (2004). Tieman, D. et al. A chemical genetic roadmap to improved tomato flavor. Science 355, 391–394 (2017). Peralta, I. E., Spooner, D. M. & Knapp, S….

The Biostar Herald for Monday, April 03, 2023

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by Istvan…

poor classification using qiime2 – User Support

Good morning, I am experiencing some difficulties getting results even though my pipeline has not changed. Specifically, what I obtain is a rather poor classification: half of the sequences (and a very low number of OTUs in addition, e.g. 900) are just attributed to Bacteria or OD1. So I think…

Chromosome-level genome assembly of the critically endangered Baer’s pochard (Aythya baeri)

Ethics statement: All animal handling and experimental procedures were approved by the Qufu Normal University Biomedical Ethics Committee (approval number: 2022001). Sample and sequencing: Baer’s pochard tissue for whole-genome sequencing was obtained from a dead individual that had strayed into a fishing net in Shandong (China). The muscle tissue that…

Multi-faceted metagenomic analysis of spacecraft associated surfaces reveal planetary protection relevant microbial composition

2023 Mar 22;18(3):e0282428. doi: 10.1371/journal.pone.0282428. eCollection 2023. Sarah K Highlander, Jason M Wood, John D Gillece, Megan Folkerts, Viacheslav Fofanov, Tara Furstenau, Nitin K Singh, Lisa Guan, Arman Seuylemezian, James N Benardini, David M Engelthaler…

Preadapted to adapt: underpinnings of adaptive plasticity revealed by the downy brome genome

Bradley, B. A. et al. Cheatgrass (Bromus tectorum) distribution in the intermountain western United States and its relationship to fire frequency, seasonality, and ignitions. Biol. Invasions 20, 1493–1506 (2018). Balch, J. K., Bradley, B. A., D’Antonio, C. M. & Gomez-Dans, J. Introduced annual grass increases regional fire…

Functional metagenomics uncovers nitrile-hydrolysing enzymes in a coal metagenome

Introduction: Cyanide-containing compounds are known as nitriles and are widely distributed in the natural environment. They are generated by different plants in various forms, such as ricinine, phenyl acetonitrile, cyanogenic glycosides, and β-cyanoalanine (Sewell et al., 2003). Anthropogenic activities have substantially influenced the production of vast quantities of nitrile…

Diagnostic Performance of mNGS in Detecting IAI

Introduction: An intra-abdominal abscess is a collection of pus or infected fluid located inside or near the liver, kidneys, pancreas, spleen, or other abdominal organs [1]. Unlike skin abscesses with obvious signs of redness and swelling [2], intra-abdominal abscesses occur less frequently and are often difficult to identify, of which patients may…

Jellyfish Output

Hi, I used the command below to count the k-mers in a genome. jellyfish count -m 3 -s 5M -L 1 B_amyloliquefaciens_CIAD-IB72.fna -o 3mers_jellyfish_output/B_amyloliquefaciens_CIAD-IB72_3.jf --text and here is a part of the output that I get: I want my output to only include the k-mers and their…
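
To make concrete what a "k-mers and their counts" listing looks like, here is a minimal pure-Python sketch of k-mer counting for k = 3. It is not a replacement for Jellyfish and makes no claim to reproduce its output format: the function names are hypothetical, k-mers are counted on the forward strand only, and windows containing characters other than A/C/G/T are skipped (an assumption made here for simplicity).

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer on the forward strand of sequence."""
    seq = sequence.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows with ambiguous bases such as N (illustrative choice).
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def format_counts(counts: Counter) -> str:
    """Render one 'kmer count' pair per line, sorted by k-mer."""
    return "\n".join(f"{kmer} {n}" for kmer, n in sorted(counts.items()))


if __name__ == "__main__":
    print(format_counts(count_kmers("ACGTACG", 3)))
    # ACG 2
    # CGT 1
    # GTA 1
    # TAC 1
```

A two-column text listing like this is easy to load into a spreadsheet or a data frame for downstream comparison between genomes.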

QUAST Genome Assembly Quality Assessment

Genetic variation studies often involve analyzing samples from a previously studied species. For instance, it is of interest to examine genomes of various cultivars, strains, or populations of the same species. In such cases, it may be necessary to perform de novo DNA-Seq assembly to obtain the genome of the…

implementation of k-mer counting from krakenuniq to kraken2

Unless I am reading this wrong, the authors say that KrakenUniq is only compatible with Kraken 1 databases, not Kraken 2. You may be able to choose a classifier using the advice on that page.

Hybrid de novo genome assembly and comparative genomics of three different isolates of Gnomoniopsis castaneae

Crous, P. et al. Fungal planet description sheets: 107–127. Pers. Mol. Phylogeny Evol. Fungi 28, 138–182. doi.org/10.3767/003158512X652633 (2012). Visentin, I. et al. Gnomoniopsis castanea sp. nov. (Gnomoniaceae, Diaporthales) as the causal agent of nut rot in sweet chestnut. J. Plant Pathol. 94, 411–419. doi.org/10.4454/JPP.FA.2012.045 (2012). …

Detection of Streptococcus pyogenes M1UK in Australia and characterization of the mutation driving enhanced expression of superantigen SpeA

Walker, M. J. et al. Disease manifestations and pathogenic mechanisms of Group A Streptococcus. Clin. Microbiol. Rev. 27, 264–301 (2014). Carapetis, J. R., Steer, A. C., Mulholland, E. K. & Weber, M. The global burden of group A streptococcal diseases. Lancet Infect. Dis. 5,…

MMseqs error: Filter prefilter died

Hi all, I’m testing MMseqs to assign taxonomy. Everything runs smoothly until the taxonomy assignment step, where I get: “Error: orf filter prefilter died”. Has anyone experienced this and knows a workaround? I posted this issue on MMseqs’ Github page but got…

A wheat kinase and immune receptor form host-specificity barriers against the blast fungus

Wheat blast, caused by Pyricularia oryzae (syn. Magnaporthe oryzae) pathotype Triticum, was first identified in Brazil in 1985 (ref. 1). The pathogen subsequently spread to cause epidemics in other regions of Brazil and neighbouring countries, including Bolivia and Paraguay (ref. 2). Outbreaks of wheat blast occurred in Bangladesh in 2016, and the…

K-mer sequencing vs sequence-alignment

I am a complete rookie in bioinformatics, so please bear with me and use simple language (I am a computer scientist) 🙂 How can we use k-mers to find out if a gene is similar to our query string? For example: We have a reference…
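
One common alignment-free answer to this kind of question is to compare the sets of k-mers of the two sequences. The sketch below computes a Jaccard index over k-mer sets; the function names are hypothetical, the choice of k and of Jaccard (rather than, say, containment or a sketching method such as MinHash) is an assumption for illustration, and it is not taken from the original thread.

```python
def kmer_set(sequence: str, k: int) -> set[str]:
    """All distinct overlapping k-mers of sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_jaccard(query: str, reference: str, k: int) -> float:
    """Jaccard index |A & B| / |A | B| of the two k-mer sets.

    Returns 0.0 when neither sequence is long enough to contain a k-mer.
    """
    a, b = kmer_set(query, k), kmer_set(reference, k)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # ACGT -> {AC, CG, GT}; ACGA -> {AC, CG, GA}; 2 shared out of 4 distinct.
    print(kmer_jaccard("ACGT", "ACGA", 2))  # 0.5
```

A score near 1 suggests the sequences share most of their k-mers, while a score near 0 suggests little shared content; unlike an alignment, it says nothing about where the matches are or in what order they occur.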

Genetic mapping of microbial and host traits reveals production of immunomodulatory lipids by Akkermansia muciniphila in the murine gut

Animal studies: Animal care and study protocols were approved by the AAALAC-accredited Institutional Animal Care and Use Committee of the College of Agricultural Life Sciences at the University of Wisconsin-Madison (UW-Madison). All experiments with mice were performed under protocols approved by the UW-Madison Animal Care and Use Committee (Protocol number…

comparing two metagenomics data sets

Hello all, I have a shotgun metagenomic dataset (20 samples) of paired-end reads. I want to compare my data to another dataset that has been published and is available online. I am unsure how to do it. Please let me know if you have an idea. Thanks…

Draft genomes of Blastocystis subtypes from human samples of Colombia | Parasites & Vectors

Andersen LO, Bonde I, Nielsen HB, Stensvold CR. A retrospective metagenomics approach to studying Blastocystis. FEMS Microbiol Ecol. 2015. doi.org/10.1093/femsec/fiv072. Audebert C, Even G, Cian A, Loywick A, Merlin S, Viscogliosi E, et al. Colonization with the enteric protozoa Blastocystis is associated with increased diversity of human…

A chromosome-level genome assembly of Plantago ovata

Genome assembly and chromosome identification: A Plantago ovata genome reference was generated by utilizing a total of 5.98 million (7 cells, 40.21 Gb, N50 = 10.45 Kb, 50 bp–121.17 Kb) PacBio long reads and 636.5 million (47.74 Gb) Hi-C short reads. PacBio reads were used to assemble contigs, while Hi-C reads were used to achieve chromosome-level assembly. The final…

Annelid functional genomics reveal the origins of bilaterian life cycles

Hall, B. K. & Wake, M. H. in The Origin and Evolution of Larval Forms (eds Hall, B. K. & Wake, M. H.) 1–19 (Academic Press, 1999). Nielsen, C. Animal phylogeny in the light of the trochaea theory. Biol. J. Linn. Soc. 25, 243–299 (2008). Garstang, W….

Need help understanding reference transcriptome and where to download

Hello, Apologies for a pretty elementary question. I tried my best to answer it using resources online but I find many tutorials/explanations out there difficult to understand. I am trying to quantify human rnaseq data using salmon. The reason I am using salmon is because I would like to perform…

Determinants of associations between codon and amino acid usage patterns of microbial communities and the environment inferred based on a cross-biome metagenomic analysis

Data collection: Metagenomic project information was collected from the MGnify metagenomic database [31]. Currently (September 2021), microbiome data (sequence, taxonomic, and functional information, etc.) of 325,323 environmental samples can be found in this database. Often, microbes from similar ecological communities have been studied by different groups at different times and locations….

In Silico Validation Of NcRNA-ncRNA Interaction Sites With NcRNAs Represented By K-mers Features

A recent catalogue of human transcriptome, namely CHESS database, assembled from RNA sequencing experiments as a part of the Genotype-Tissue Expression (GTEx) Project reported more non-coding RNA genes (21,856) than protein-coding (21,306), revealing an unexpectedly vast amount of transcriptional noise (Pertea et al, 2018). In this study, we introduce…

Genomic signatures associated with maintenance of genome stability and venom turnover in two parasitoid wasps

Genomic features of two Anastatus wasps, A. japonicus and A. fulloi: We employed PacBio high-fidelity (HiFi) long-read sequencing and Illumina short-read sequencing technologies to generate high-quality contigs for two Anastatus wasps, A. japonicus and A. fulloi (Supplementary Tables 1 and 2). These contigs were further scaffolded using Hi-C libraries to…

Building a Simulated Metagenomic Dataset

Tags: ‘JPL: Genetic Inventory Project’. Here we’ll create a simulated metagenomic dataset for controlled testing. This dataset was used to determine the Kraken 2 confidence score that best…

KGDCMI: A New Approach for Predicting circRNA-miRNA Interactions From Multi-Source Information Extraction and Deep Learning

Xin-Fei Wang et al. Front Genet. 2022. doi: 10.3389/fgene.2022.958096. eCollection 2022. Affiliations: 1 School of Information Engineering, Xijing University, Xi’an, China. 2 College of Grassland and Environment Sciences, Xinjiang Agricultural University, Urumqi, China. 3 School of Computer Science, Northwestern Polytechnical University, Xi’an, China.

Mapping reads using kallisto – rna seq analysis

Hi, I’m trying to map reads to a reference genome using kallisto for RNA-seq analysis, using the terminal on a Mac, and the following command keeps loading for hours and won’t run. I’m not exactly sure where I’ve gone wrong. kallisto index…

Mitogenome-wise codon usage pattern from comparative analysis of the first mitogenome of Blepharipa sp. (Muga uzifly) with other Oestroid flies

Outcome of DNA sequencing, assembly, and validation: In this study, total DNA was first isolated from the finely chopped, full-grown pupa of Blepharipa sp. Both the NanoDrop spectrophotometer (1294 ng/μl) and the Qubit fluorometer (732.8 ng/μl) showed that the concentration of total DNA in the sample was at an optimum level for mitochondrial DNA enrichment. The Tape Station profile showed…

Reference-based alignment using MUSKET

I’m running MUSKET on my dataset trimmed_data.tar.gz using 1000 threads, 2000 threads, and 4000 threads on an HPC. I’ve been unable to obtain any results because the software seems to be running for a long time. ./../musket-1.1/musket -k 90 600000000 -p 1000 -zlib 9 -ino…

CRISPR-VAE: A Method for Explaining CRISPR/Cas12a Predictions, and an Efficiency-aware gRNA Sequence Generator

Abstract: Deep learning has shown great promise in the prediction of gRNA efficiency, which helps optimize the engineered gRNAs, and thus has greatly improved the usage of CRISPR-Cas systems in genome editing. However, the black box prediction of deep learning methods does not provide adequate explanation of the factors…

Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes

Sequencing data: We used publicly available sequencing data from the GIAB consortium [45], 1000 Genomes Project high-coverage data [46] and Human Genome Structural Variation Consortium (HGSVC) [4]. All datasets include only samples consented for public dissemination of the full genomes. Statistics and reproducibility: For generating the assemblies, we used all 14 samples for…

SAWRPI: A Stacking Ensemble Framework With Adaptive Weight for Predicting ncRNA-Protein Interactions Using Sequence Information

Zhong-Hao Ren et al. Front Genet. 2022. doi: 10.3389/fgene.2022.839540. eCollection 2022. Affiliations: 1 School of Information Engineering, Xijing University, Xi’an, China. 2 School of Computer Science, Northwestern Polytechnical University, Xi’an, China.

Should I trim adapter sequences and filter by phred score, before alignment by salmon? : bioinformatics

First, trimming adapters is definitely necessary as they are essentially a form of contamination. For quality trimming and filtering I would highly recommend reading the following: Trimming of sequence reads alters RNA-Seq gene expression estimates Essentially they show that aggressive trimming is a problem. To quote from the Conclusions: The…

Frontiers | Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation

Introduction: The study of microbial environments has benefited from the sequencing revolution, where technology improvement decreased the DNA sequencing cost and increased the number of sequenced nucleic bases. For approximately 20 years (depending on how we define the term metagenomics), it has allowed the decryption of the microbial composition…

Role of mobile genetic elements in the global dissemination of the carbapenem resistance gene blaNDM

Wu, W. et al. NDM metallo-β-lactamases and their bacterial producers in health care settings. Clin. Microbiol. Rev. 32, e00115–18 (2019). Yong, D. et al. Characterization of a new metallo-β-lactamase gene, blaNDM-1, and a novel erythromycin esterase gene carried on a unique genetic structure in Klebsiella pneumoniae sequence type 14…

Continue Reading Role of mobile genetic elements in the global dissemination of the carbapenem resistance gene blaNDM

Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

INTRODUCTION Next-generation sequencing (NGS) has revolutionized many areas of biological research (1, 2), providing ever-more data at an ever-decreasing cost. One such area is microbiome research, the study of microbes in their theater of activity using metagenomic sequencing (3). Here, deep short-read sequencing, and improving performance of long-read sequencing, are…

Continue Reading Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud-infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…

Continue Reading Petabase-scale sequence alignment catalyses viral discovery

A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches

BLEND is a mechanism that efficiently finds fuzzy seed matches between sequences, significantly improving performance and accuracy while reducing the memory usage of two important applications: 1) finding overlapping reads and 2) read mapping. Finding fuzzy seed matches enables BLEND to find both 1) exact-matching seeds…

Continue Reading A Fast, Memory-Efficient, and Accurate Mechanism to Find Fuzzy Seed Matches

ncRNA | Free Full-Text | Common Features in lncRNA Annotation and Classification: A Survey

CONC; 2006; SVM; Eukaryotes (both protein-coding and non-coding genes); peptide length, amino acid composition, predicted secondary structure content, mean hydrophobicity, percentage of residues exposed to solvent, sequence compositional entropy, number of homologues, alignment entropy; 10-fold CV on protein-coding: F1-score: 97.4%, Precision: 97.1%, Recall: 97.8%. On non-coding: F1-score:…

Continue Reading ncRNA | Free Full-Text | Common Features in lncRNA Annotation and Classification: A Survey

Interpreting kmer spectrum

Could someone provide an intuitive description of how to interpret a k-mer spectrum? I understand that the plot shows how many k-mers appear a certain number of times, but could someone describe what valuable information we can gain from visualizing k-mer counts this way?…

Continue Reading Interpreting kmer spectrum
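
As a rough illustration of what the question above describes, a k-mer spectrum can be sketched as a histogram of k-mer multiplicities: for each count c, how many distinct k-mers occur exactly c times. The sketch below is plain Python with hypothetical function names (`count_kmers`, `kmer_spectrum`) and toy reads; it is not taken from the linked post and ignores reverse complements. In real sequencing data, k-mers seen only once or twice tend to come from sequencing errors, while the main peak usually sits near the k-mer coverage of the genome.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_spectrum(kmer_counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(kmer_counts.values())


# Toy reads, invented for illustration only.
reads = ["ACGTACGT", "CGTACGTA", "ACGTTTTT"]
counts = count_kmers(reads, k=4)
spectrum = kmer_spectrum(counts)

# Print the spectrum as (multiplicity, number of distinct k-mers) pairs.
for multiplicity in sorted(spectrum):
    print(multiplicity, spectrum[multiplicity])
```

Plotting multiplicity on the x-axis against the number of distinct k-mers on the y-axis gives the usual spectrum plot. For real data sets a dedicated counter is the practical choice; this sketch only shows the bookkeeping.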

Assembling all transcripts for an individual gene? (using single sequence to seed the assembly)

Let’s say I have a candidate gene and I believe that, in an individual sample, the genome sequence differs from the reference, which then interferes with alignment. Is there a way for me to do…

Continue Reading Assembling all transcripts for an individual gene? (using single sequence to seed the assembly)

DNA Sequence Classification Based on Milvus

Introduction DNA sequencing is a popular concept in both academic research and practical applications, such as gene traceability, species identification, and disease diagnosis. While every industry is hungry for more intelligent and efficient research methods, artificial intelligence has attracted much attention, especially from the biological and medical domains. More and…

Continue Reading DNA Sequence Classification Based on Milvus

Where can I get, or how can I make, a mappability track for hg38 assembly

Lucky you, @manojmumar_bhosale: I worked on a similar problem recently and therefore have a bash script you can use. Required tools: GEM library from here; UCSC’s wigToBigWig from here (I chose binary for Linux 64…

Continue Reading Where can I get, or how can I make, a mappability track for hg38 assembly

a k-mer counter in Rust using the rust-bio and rayon crates

I hope this isn’t inappropriate as a Share a Tool post; it’s more about getting feedback and seeing if anybody here is interested in this project: github.com/suchapalaver/krust It’s a k-mer counter written in Rust. I’ve…

Continue Reading a k-mer counter in Rust using the rust-bio and rayon crates

Prevalence and Molecular Characteristics Based on Whole Genome Sequenc

Introduction Tuberculosis, caused by Mycobacterium tuberculosis, remains one of the top 10 causes of death worldwide and the leading cause of death from a single infectious agent (ranking above HIV/AIDS).1 In 2020, World Health Organization (WHO) reported that 7.1 million people with tuberculosis were newly diagnosed and notified in 2019,…

Continue Reading Prevalence and Molecular Characteristics Based on Whole Genome Sequenc

The Biostar Herald for Tuesday, August 17, 2021

The Biostar Herald publishes user submitted links of bioinformatics relevance. It aims to provide a summary of interesting and relevant information you may have missed. You too can submit links here. This edition of the Herald was brought to you by contribution from Istvan Albert, and was edited by lakhujanivijay,…

Continue Reading The Biostar Herald for Tuesday, August 17, 2021

k-mer counters – presence/absence matrix

Hi lizabe, You’re right that this tutorial is out of date. The --matrix option is no longer valid as an option to jellyfish count. However, I don’t think its original intent was to do what you wanted anyway. It doesn’t write out a binary…

Continue Reading k-mer counters – presence/absence matrix
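
For readers wondering what a k-mer presence/absence matrix looks like, here is a minimal sketch in plain Python. It does not use Jellyfish or reproduce the answer in the linked post; the function names (`kmer_set`, `presence_absence_matrix`) and the sample sequences are assumptions made up for illustration. Each row is a k-mer, each column a sample, and a cell is 1 if the k-mer occurs in that sample and 0 otherwise.

```python
def kmer_set(sequence, k):
    """Return the set of distinct k-mers in one sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_absence_matrix(samples, k):
    """Build a binary k-mer x sample matrix.

    samples: dict mapping a sample name to its sequence (hypothetical input).
    Returns (sorted k-mers, sample names, matrix as a list of rows).
    """
    per_sample = {name: kmer_set(seq, k) for name, seq in samples.items()}
    # Rows cover every k-mer seen in at least one sample.
    all_kmers = sorted(set().union(*per_sample.values()))
    names = list(samples)
    matrix = [
        [1 if kmer in per_sample[name] else 0 for name in names]
        for kmer in all_kmers
    ]
    return all_kmers, names, matrix


# Toy samples, invented for illustration only.
samples = {"sample_a": "ACGTAC", "sample_b": "CGTTAC"}
kmers, names, matrix = presence_absence_matrix(samples, k=3)

# Print one row per k-mer, with one 0/1 column per sample.
print("kmer", *names)
for kmer, row in zip(kmers, matrix):
    print(kmer, *row)
```

This keeps every k-mer set in memory, so it only suits small inputs. For real data, one option would be to export per-sample counts from a dedicated k-mer counter and merge them in the same way.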