Tag: BLAST

BlastX through Biopython

I have an unknown gene segment in the Human_gene.txt file and I want to run blastx (translated nucleotide) using the BLAST module of Biopython, with an E-value threshold of 0.0001, displaying 50 residues of the query and subject for each match. I am trying this…
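
The search described above could be sketched roughly as follows. This is a minimal sketch, not the poster's code: it assumes the standalone NCBI BLAST+ `blastx` binary rather than the Biopython wrapper the question asks about, and the function names, the `nr` database choice, and the output file name are hypothetical. It only assembles the command line; it does not run it.

```python
import shlex


def build_blastx_command(query_path, db="nr", evalue=0.0001,
                         out_path="blastx_result.xml", remote=True):
    """Assemble a BLAST+ blastx command line as a list of arguments.

    The database and output file name are illustrative assumptions.
    """
    cmd = [
        "blastx",
        "-query", query_path,      # nucleotide query, translated by blastx
        "-db", db,                 # protein database to search
        "-evalue", str(evalue),    # E-value threshold from the question
        "-outfmt", "5",            # XML output
        "-out", out_path,
    ]
    if remote:
        cmd.append("-remote")      # send the search to NCBI servers
    return cmd


def first_residues(aligned, n=50):
    """Return the first n columns of an aligned query or subject string."""
    return aligned[:n]


cmd = build_blastx_command("Human_gene.txt")
print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` once BLAST+ is installed, and `first_residues` can be applied to the aligned query and subject strings of each match after parsing.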

Bioinformatics with basic local alignment search tool (BLAST) and fast alignment (FASTA)

Article, 2014. In: Journal of Bioinformatics and Sequence Analysis, ISSN 2141-2464, Volume 6, 1, Pages 1-6, 2014. DOI: 10.5897/ijbc2013.0086. Abstract: Following advances in DNA and protein sequencing, the application of computational approaches in analysing biological data has become a very important aspect of biology. Evaluating similarities between biological sequences…

SARS-COV-2 Blast : bioinformatics

Hello everyone. I have multiple short SARS-CoV-2 sequences in the range of 500-800 nucleotides that I need to BLAST. I get a good result on NCBI BLAST, but does anyone here use AudacityInstant, which belongs to GISAID? That site aligns only sequences longer than 10000 nucleotides, as you…

FastQ_7 April 2022(1) – Copy.pptx – What is the FASTA format? The FASTA format is the “workhorse” of bioinformatics. It is used to represent sequence

The FASTA format is not “officially” defined – even though it carries the majority of data information on living systems. Its origins go back to a software tool called Fasta written by David Lipman (a scientist that later became, and still is, the director of NCBI) and William R. Pearson of the University of Virginia. The tool itself has (to some…

BenchSci hiring Bioinformatics Engineer (Remote) in Toronto, Ontario, Canada

BenchSci’s vision is to bring novel medicine to patients 50% faster by 2025. We’re achieving it by empowering scientists with the world’s most advanced biomedical artificial intelligence. Backed by F-Prime, Gradient Ventures (Google’s AI fund), and Inovia Capital, our platform accelerates science at 15 top-20 pharmaceutical companies and over 4,300…

NcbiblastpCommandline alignment results are different from blast webpage

What you are trying to do is fairly simple, and you are complicating it by: 1) not providing your sequences so that someone can reproduce your attempt; 2) giving a result in a form that is impossible to read. Be honest, can you make any sense of the result you…

All vs All blast not self hit? Orthogroup clustering and single copy genome?

Hey guys, I have a slightly weird question about BLAST self hits. I’ve been doing some work on single-copy genome construction using the reciprocal best BLAST hit (RBBH) method. As I have something like 100+ annotated genomes, I concatenated all annotated CDS into one FASTA file and ran makeblastdb with…
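
One way self hits might be dropped from an all-vs-all search is sketched below. The excerpt is truncated, so this is only an assumption about what is needed: it presumes tabular output (`-outfmt 6`) whose first two columns are the query and subject IDs, and the function name and example rows are made up for illustration.

```python
def drop_self_hits(tabular_lines):
    """Yield BLAST tabular rows (as field lists) where query ID != subject ID."""
    for line in tabular_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if fields[0] != fields[1]:  # column 1 = qseqid, column 2 = sseqid
            yield fields


# Hypothetical rows in the default 12-column tabular layout.
example = [
    "geneA\tgeneA\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t555",
    "geneA\tgeneB\t92.5\t298\t22\t0\t1\t298\t3\t300\t1e-120\t430",
]
kept = list(drop_self_hits(example))
print(kept)
```

Filtering after the search keeps the database intact, so the same concatenated FASTA file can still serve every genome as both query and subject.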

Description, Programming Languages, Similar Projects of Bioconda Recipes

Conda recipes for the bioconda channel. 4571 Projects Similar to Bioconda Recipes Sequenceserver Intuitive local web frontend for the BLAST bioinformatics tool Homebrew Bio 🍺🔬 Bioinformatics formulae for the Homebrew package manager (macOS and Linux) Galaxy Data intensive science for everyone. Travel Guide “A travel guide to suggest activities you…

Ubuntu Manpage: Bio::Tools::Seg – parse “seg” output

Provided by: libbio-perl-perl_1.7.2-2_all
NAME
    Bio::Tools::Seg – parse "seg" output
SYNOPSIS
    use Bio::Tools::Seg;
    my $parser = Bio::Tools::Seg->new(-file => 'seg.fasta');
    while ( my $f = $parser->next_result ) {
        if ($f->score < 1.5) {
            print $f->location->to_FTstring, " is low complexity\n";
        }
    }
DESCRIPTION
    "seg" identifies low-complexity regions on a protein sequence. It is…

Qiime2 Exclude Seqs with FASTQ as query data.

Hello, I am working with FASTQ files and I want to filter them based on their alignment with reference sequences in FASTA format. I decided to use QIIME2 for this. So I imported both FASTA and FASTQ files to the required…

Issues with searching Swissprot #25

Eddykay310 Hi @cruizperez Please help me understand the problem here and how I can fix it. I have successfully generated my DBs but I get this error during analysis. The .dmnd files do not exist in the folders as the error says but I don’t know how I can generate…

Creating local nt blast database : bioinformatics

Hi all, I’m trying to create a local nt BLAST database. My eventual goal is to create a subset based on a taxonomic group, to be used on a cluster with limited storage space. It seems the only way to do this, though, is to start with the whole database…

Does anyone know? how to sort one16s rDNA from multiple 16s rDNA in one strains for phylogenetic tree construction?

Hi, I extracted 16S rDNA sequences from the complete genome of each strain using barrnap. The output is given below. When I opened each strain’s file…

What is ClustalW? Tutorial of How to Use ClustalW

ClustalW is a computer tool of significant importance in bioinformatics. Biologists and statisticians primarily use it for multiple sequence alignment. Many versions of ClustalW have become available over the development of the algorithm. How to perform a search on ClustalW? ClustalW homepage 1. Go to…

Extensively drug resistant E. coli LZ00114

Introduction Escherichia coli is a common Gram-negative opportunistic pathogen that causes invasive host infections through virulence factors such as flagella, toxin secretion, and adhesins. According to the source of the infection, pathogenic E. coli can be classified as intestinal (diarrheagenic) and extraintestinal (ExPEC). Uropathogenic E. coli (UPEC) is the most…

Introduction to the BLAST Suite and BLASTN | Michael Agostino

In Chapter 2 we learned how to search databases with text queries. All of these were exact matches—that is, we were expecting to find the exact accession number or exactly spelled words. In this chapter, a much harder database-searching problem is introduced. How do you find matches when your query…

BLASTn using R

Hello, I have around 2000 DNA nucleotide sequences (60 bases long), each stored in a row of an Excel sheet. I want to run BLAST on each one of them individually and extract the “Description” of the first hit. For example: suppose on the NCBI BLAST website…

“No such file or directory: ‘test.xml”

Biopython NcbiblastpCommandline not working: “No such file or directory: ‘test.xml”
from Bio.Blast.Applications import NcbiblastpCommandline
blastp = r"C:\NCBI\blast-BLAST_VERSION+\bin\blastp.exe"
blastp_cline = NcbiblastpCommandline(blastp, query=r"C:/NCBI/blast-BLAST_VERSION+/bin/test.fasta", db=r'C:/NCBI/blast-BLAST_VERSION+/bin/bos_protein.fasta', outfmt=5, evalue=0.00001, out=r"C:/NCBI/blast-BLAST_VERSION+/bin/test.XML")
blastp_cline
from Bio.Blast import NCBIXML
with open("test.XML") as result_handle:
    E_VALUE_THRESH=0.01
    blast_records = NCBIXML.parse(result_handle)
    blast_record = NCBIXML.read(result_handle)
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect…

Design and implementation of a novel pharmacogenetic assay for the identification of the CYP2D6*10 genetic variant | BMC Research Notes

Methods The study was conducted at the Human Genetics Unit, Faculty of Medicine, University of Colombo. It was an experimental study, where a novel assay was designed for the targeted variant, and a cohort of hormone receptor positive breast cancer patients were genotyped for the CYP2D6*10 variant using the optimized…

Frontiers | Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation

Introduction The study of the microbial environments has benefited from the sequencing revolution, where technology improvement decreased the DNA sequencing cost and increased the number of sequenced nucleic bases. For approximately 20 years (depending on how we define the term metagenomics), it has allowed the decryption of the microbial composition…

How can I find genes located in the same region (overlapping) of the chromosome ?

I take the BAM file as input and perform RNA-Seq. The program prints out a list of genes to which the reads match. Some of the genes in the list overlap in the same…

From scientific name to taxonomy information entrez

Hi all, I have a txt file with a list of scientific names of plants and I would like to obtain a final file with taxonomy information. For example, if one of my organisms is Acalypha hispida, I would like to obtain…

Google Researchers Use Machine Learning Approach To Annotate Protein Domains

Source: www.nature.com/articles/s41587-021-01179-w.epdf Proteins play an important part in the construction and function of all living organisms. Each protein is made up of a chain of amino acid building blocks. Much like an image might have numerous things, a protein can have multiple components, known as protein domains. Researchers have been…

Index of /~psgendb/local/biopython-1.55.old/Scripts/xbbtools

Name              Last modified     Size  Description
Parent Directory                    –
nextorf.py        2010-10-07 10:28  9.1K
test.fas          2010-10-07 10:28  517
testrp.fas        2010-10-07 10:28  50K
xbb_blast.py      2010-10-07 10:28  4.7K
xbb_blastbg.py    2010-10-07 10:28  2.3K
xbb_help.py       2010-10-07 10:28  2.2K
xbb_search.py     2010-10-07 10:28  5.0K
xbb_sequence.py   2010-10-07 10:28  399
…

Role of mobile genetic elements in the global dissemination of the carbapenem resistance gene blaNDM

Wu, W. et al. NDM metallo-β-lactamases and their bacterial producers in health care settings. Clin. Microbiol. Rev. 32, e00115–18 (2019). Yong, D. et al. Characterization of a new metallo-β-lactamase gene, bla NDM-1, and a novel erythromycin esterase gene carried on a unique genetic structure in Klebsiella pneumoniae sequence type 14…

Optimization of cerebrospinal fluid microbial DNA metagenomic sequencing diagnostics

We implemented a metagenomic DNA sequencing methodology to unbiasedly detect microbial species in CSF samples from patients with CNS symptoms in which a pathogen or EBV had been detected (Additional 3: Table 1). Samples positively identified with pathogen-specific quantitative PCR (qPCR), 16S rRNA gene sequencing or bacterial/mycotic culture in CSF…

Bioconductor – TAPseq

DOI: 10.18129/B9.bioc.TAPseq     This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see TAPseq. Targeted scRNA-seq primer design for TAP-seq Bioconductor version: 3.12 Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using…

biopython – How to blastp with fasta file that contains ~50 sequences

I’m trying to blastp multiple amino acid sequences using Biopython. I just can’t seem to get it right, and I can’t figure out from the handbook how to do this. I have come up with the following:
open("proteins_PROT.fasta","r")
from Bio.Blast.Applications import NcbiblastpCommandline
cline = NcbiblastpCommandline(query="proteins_PROT.fasta", db="nr", evalue=0.001, remote=True, ungapped=True)
NcbiblastpCommandline(cmd='blastp', query="proteins_PROT.fasta",…

peroxisomal multifunctional enzyme type 2-like, maker-scaffold366_size194251-snap-gene-0.19 (gene) Tigriopus kingsejongensis

Associated RNAi Experiments Homology BLAST of peroxisomal multifunctional enzyme type 2-like vs. L. salmonis genes Match: EMLSAG00000010112 (supercontig:LSalAtl2s:LSalAtl2s668:190059:194758:1 gene:EMLSAG00000010112 transcript:EMLSAT00000010112 description:"augustus_masked-LSalAtl2s668-processed-gene-1.1") HSP 1 Score: 102.064 bits (253), Expect = 2.195e-25, Identity = 65/191 (34.03%), Positives = 101/191 (52.88%), Query Frame = 0 Query: 134 GKVALVTGAGGGLGKAYALLLASRGASVVVNDLGGSRTGEGQSSKAADEVVNEIRQKGGKAV-----GNYDSVEDGEAVIKTALDNFGRIDIVINNAGILRDRSIGRTSDSDWDLVQKVHLRGAFQVIRAAWPHMKKQKYGRIINTSSVAGIFGNFGQSNYSSAKAGLIGLTSTLAIEGERSGIQANVIVP 319 GKVAL+TGA G+G++ A+L A…

Using AnnoTree to Get More Assignments, Faster, in DIAMOND+MEGAN Microbiome Analysis

INTRODUCTION Next-generation sequencing (NGS) has revolutionized many areas of biological research (1, 2), providing ever-more data at an ever-decreasing cost. One such area is microbiome research, the study of microbes in their theater of activity using metagenomic sequencing (3). Here, deep short-read sequencing, and improving performance of long-read sequencing, are…

Hinted by Clinical Misclassification of a Neisseria mucosa Strain

The taxonomy of the genus Neisseria remains confusing, particularly regarding Neisseria mucosa and Neisseria sicca. In 2012, ribosomal multi-locus sequence typing reclassified both as N. mucosa, but data concerning 17 N. sicca strains remain available in GenBank. The continuous progress of high-throughput sequencing has facilitated ready accessibility of whole-genome data,…

Butterfly eyespots evolved via cooption of an ancestral gene-regulatory network that also patterns antennae, legs, and wings

Although the hypothesis of gene-regulatory network (GRN) cooption is a plausible model to explain the origin of morphological novelties (1), there has been limited empirical evidence to show that this mechanism led to the origin of any novel trait. Several hypotheses have been proposed for the origin of butterfly eyespots,…

Bioinformatics Research Scientist (Blue Sky Initiative), Memphis, Tennessee

M. Madan Babu’s Group and the Center for Data-Driven Discovery in the Department of Structural Biology are seeking a highly driven, full-time Machine Learning Research Scientist to support the Kalodimos and Babu Groups on the Blue Sky Initiative “Seeing the Invisible in Protein Kinases.” This project is supported by $35…

Errors

/blast/moderated Leave a mail with to get this resolved. Contact: VIB / UGent, Bioinformatics & Evolutionary Genomics, Technologiepark 927, B-9052 Gent, BELGIUM. +32 (0) 9 33 13807 (phone), +32 (0) 9 33 13809 (fax). Don’t hesitate to contact the in case of problems with the website! You are visiting an outdated page of the BEG/Van…

Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud-infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…

taxonomy – Assign multiple taxids to a sequence when constructing a local BLAST database

I recently had a script fail due to poor handling of BLAST output. The BLAST -outfmt staxids field usually returns a single taxid, but occasionally it returns two or more taxids separated by a semicolon, such as 556514;701533. Fixing the script to handle this should be fairly straightforward. But the…
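
Handling the multi-taxid case on the parsing side might look something like the sketch below. It is not the poster's script: the function name is hypothetical, and skipping non-numeric values (such as the `N/A` placeholder BLAST prints when a field has no value) is an assumption about the desired behaviour.

```python
def parse_staxids(field):
    """Split a staxids value such as '556514;701533' into a list of ints.

    Non-numeric tokens (for example 'N/A') are skipped rather than raising.
    """
    taxids = []
    for token in field.split(";"):
        token = token.strip()
        if token.isdigit():
            taxids.append(int(token))
    return taxids


print(parse_staxids("556514;701533"))
print(parse_staxids("556514"))
```

Always returning a list, even for a single taxid, means downstream code has one code path instead of special-casing the semicolon.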

NCBI looking for testers for a new web-only (for now) clustered `nr` database

Find details about how to participate by going to this link. Clustered nr is the standard NCBI nr database clustered with each sequence within 90% identity and 90% length to other members of the cluster. Your…

(PDF) AntiHunter: searching BLAST output for EST antisense transcripts | Alessandro Guffanti

An intronic transposon insertion associates with a trans-species color polymorphism in Midas cichlid fishes

Conflicting results suggest a missing variant In order to narrow down candidates for the causal genetic variant, we performed genome-wide association mapping separately in individual lake populations (previously, association mapping was only performed across the whole species flock5). Interestingly, despite clear association peaks in the crater lakes (Fig. 1a, b), the…

CRISPR-Cas12a ribonucleoprotein-mediated gene editing in the plant pathogenic fungus Magnaporthe oryzae

Jun Huang et al. STAR Protoc. 2021 Dec 24;3(1):101072. doi: 10.1016/j.xpro.2021.101072. eCollection 2022 Mar 18. Affiliation: Department of Plant Pathology, Kansas State University, Manhattan, KS, USA. Free PMC article.

Bioinformatics HW5.docx – 1 Run a Delta-BLAST with the silkworm insulin protein(P26726 Limit to human proteins in the RefSeq_Protein database How many

1. Run a Delta-BLAST with the silkworm insulin protein (P26726). Limit to human proteins in the RefSeq_Protein database. How many total sequences? b. How many human homologs appear to have the insulin domain (irrespective of the e-value threshold)? …

BLAST | ICGRC

In bioinformatics, BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences. A BLAST search enables a researcher to compare a query sequence with a library or database of sequences, and…

MCScanX: not found – githubmate

Hi CJ-Chen, Thanks for developing this tool. I am having an issue when I try to use the Quick Run MCScanX Wrapper. Error log here: [Debug…All Standard Error Info will show as following:…] Curr log file:/tmp/TBtools.14595798488989250861.20210723103734.log Curr java version:11.0.11 Curr TBtools version:1.09854 Maxmum Memory for Curr TBtools: 4162846720 curVersion:1.09854:force Fetch Location:200 Factor:0.1516546094367225…

makeblastdb creating multiple files of unexpectedly large sizes

I have a set of 100 amino acid sequences and I want to perform a BLASTP search against the refseq_protein database. Accordingly, I set up the standalone version of BLAST (Version 2.11.0+) and downloaded the refseq_protein database from NCBI using the following code: wget ftp.ncbi.nlm.nih.gov/refseq/release/complete/*.faa.gz The database gets downloaded…

sequence alignment – Help with MinION sequencing data species identification

Hi I’m new to bioinformatics and have just completed my first run on the MinION (long read sequencing Oxford Nanopore Technologies). I was hoping someone could direct me towards R packages, workflow, tutorials or guides that will help me identify species that are present in my sample mainly for fungi…

Zooplankton diversity monitoring strategy for the urban coastal region using metabarcoding analysis

1. Eyun, S. Phylogenomic analysis of Copepoda (Arthropoda, Crustacea) reveals unexpected similarities with earlier proposed morphological phylogenies. BMC Evol. Biol. 17, 23 (2017). 2. Eyun, S. et al. Evolutionary history of chemosensory-related gene families across the Arthropoda. Mol. Biol. Evol. 34, 1838–1862 (2017). …

bioinformatics – Local BLAST NCBI C++ Exception

I’m getting an error trying to use BLAST v2.12 against a local nt database. I’ve downloaded nt twice from the FTP server, thinking the first copy was corrupt, but that didn’t change anything. My command is: blastn -db nt -num_threads 8 -outfmt "6 qseqid sacc stitle ssciname nident…

Add module descriptors for javadoc

$ mvn javadoc:aggregate
…
[INFO] --- maven-javadoc-plugin:3.0.1:aggregate (default-cli) @ biojava-legacy ---
[ERROR] no module descriptor for org.biojava:biojava-legacy
[ERROR] no module descriptor for org.biojava:bytecode
[ERROR] no module descriptor for org.biojava:core
[ERROR] no module descriptor for org.biojava:alignment
[ERROR] no module descriptor for org.biojava:biosql
[ERROR] no module descriptor for org.biojava:blast
[ERROR] no module…

ncRNA | Free Full-Text | Common Features in lncRNA Annotation and Classification: A Survey

CONC 2006 SVM Eukaryotes (both protein-coding and non-coding genes) peptide length, amino acid composition, predicted secondary structure content, mean hydrophobicity, percentage of residues exposed to solvent, sequence compositional entropy, number of homologues, alignment entropy 10-fold CV on protein-coding: F1-score: 97.4% ☼ Precision: 97.1% ☼ Recall: 97.8% ◙ On non-coding: F1-score:…

Blast command line pipeline not working

Hello, I am now running a local BLAST pipeline on macOS. The goal is to take the intervals of the 5 best hits and then extract the SNP variants from multiple vcf.gz files. But I am facing an error which I cannot solve…

Summer Intern -Bioinformatics – Roche – Pleasanton

Job facts: Summer Intern – (Bioinformatics). The Summer @ Roche Intern Program has been developed to provide students with a fun yet rewarding summer through hands-on experience and numerous opportunities to network with other interns as well as employees in the organization. Additionally, we help our students meet their…

18S gene not present in genome assembly?

I am designing PCR primers to amplify a region of the 18S rRNA gene of Penicillium expansum. As the template for primer design, I use the consensus sequence of a multiple sequence alignment of 18S sequences obtained from the SILVA database. When…

Load .gb file in R

Hello community, I am trying to load the .gb file I got after blast on ncbi. Tried a few libraries, like “genbankr”, “read.gb”, but without success. I got the following error: library(read.gb) read.gb(“orig_with_blast.gb”, DNA = TRUE, Type = “full”, Source = “File”) Error in eval(parse(text = Order[i])) : object ‘ORGANISM’…

How to retrieve fasta sequence after local blast?

Hello, I have created a BLAST database using a reference genome. Then, I have performed a local BLAST search on the command line using a gene of interest. I have obtained some hits with the usual BLAST information. Now, I want to…

Blast database built error

makeblastdb -in nt -dbtype nucl -out nt
Building a new DB, current time: 12/09/2021 15:55:27
New DB name: /home/internal/Databases/NT/NT_Jul2021/nt
New DB title: nt
Sequence type: Nucleotide
Deleted existing Nucleotide BLAST database named /home/internal/Databases/NT/NT_Jul2021/nt
Keep MBits: T
Maximum file size: 1000000000B
No volumes were created.
Error:…

Issue with installing QIIME2 2021.11 on Windows 10 – Technical Support

Hi QIIME support team, I’m attempting to install QIIME2 on my Windows 10 machine. I installed Anaconda3, then set up conda to run in Git Bash: echo “. ${PWD}/conda.sh” >> ~/.bashrc Once I restarted Git Bash and activated Conda, I installed python-wget because installation of wget kept getting the following…

Making a FASTA file from a segment of a DNA sequence

Hello everyone, I have copied a segment of a known DNA sequence and I want to turn this segment of DNA into a FASTA file in order to BLAST it against a custom-made database. I mostly work…
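
Turning a copied segment into a FASTA record could be done along these lines. This is only a sketch: the record ID, the 60-column line width, and the function name are illustrative choices, not requirements of the format or anything stated in the post.

```python
def to_fasta(seq_id, sequence, description="", width=60):
    """Format a sequence as a single FASTA record with wrapped lines."""
    # Remove any whitespace or line breaks picked up when copying.
    seq = "".join(sequence.split()).upper()
    header = f">{seq_id} {description}".rstrip()
    # Break the sequence into fixed-width lines.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header] + lines) + "\n"


record = to_fasta("segment1", "acgt" * 20)
print(record, end="")
```

The returned string can be written to a file with an ordinary `open(path, "w")` call and then passed to BLAST as the query.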

How to extract homologous sequence data from multiple .vcf.gz files?

Hello, I have short read data from multiple samples stored as scaffolds.vcf.gz files. I have some gene sequences of interest. I want to find the closest homologous sequence of the respective genes from all the other samples. At first,…

Scripts for BGC analysis in large MAGs and results of their application to soil metagenomes within Chernevaya Taiga RSF-funded project

This repository includes scripts for analysis of biosynthetic gene clusters (BGCs) in large metagenome assemblies. All scripts were created within the Chernevaya Taiga project funded by the Russian Science Foundation (grant 19-16-00049). The repository also contains the results of applying the scripts to four hybrid (Illumina + ONT) assemblies of various…

igBLAST query/options error

When I try to run this command: igblastn -germline_db_V $GERMLINE_DB"/human_gl_HV" -germline_db_J $GERMLINE_DB"/human_gl_HJ" -germline_db_D $GERMLINE_DB"/human_gl_HD" -organism human -domain_system imgt -query $WORKDIR"/"$FILE".fasta" -auxiliary_data $IGBLASTDIR"/optional_file/human_gl.aux" -outfmt 7 -num_threads 4 -num_alignments_V 5 -out $FILE"_tab.igblast" I get this error: BLAST query/options error: Germline annotation database human/human_V could not be found in…

increasing word size extremely slows down the search

Hello, I need to blastp a genome (15,000 seqs) against another genome (12,000 seqs) using Biopython. I decided to use local BLAST and query the genome 1 FASTA file against the genome 2 database (made by the makeblastdb command with second genome…

Getting coordinates for a given sequence motif

I would like to get all coordinates for a given series of short sequence motifs, e.g. GATTACA (and its reverse complement TGTAATC), from a reference genome, but want to do it in a way that allows me to get the coordinates of…
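
A simple way to collect such coordinates on both strands is sketched below. The excerpt is cut off before the exact requirement is stated, so the 0-based half-open coordinates, the strand labels, and the function names are assumptions, and the example sequence is invented.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_coordinates(sequence, motif):
    """Return sorted (start, end, strand) tuples, 0-based and half-open.

    A '-' hit marks where the motif's reverse complement occurs on the
    given strand. Overlapping occurrences are reported.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A zero-width lookahead lets finditer report overlapping matches.
        for match in re.finditer(f"(?={re.escape(pattern)})", sequence):
            hits.append((match.start(), match.start() + len(pattern), strand))
    return sorted(hits)


hits = motif_coordinates("CCGATTACAGGTGTAATCTT", "GATTACA")
print(hits)
```

For a whole reference genome the same function can be applied per chromosome, although a dedicated tool is likely to be faster on large inputs.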

Finding a sequence

Hi, I have a sequence from ITS1 (MT895507, GenBank accession) and I need to find similar sequences in other works. My goal is to find how many sequences are similar to mine in that other work (PRJNA335788). I was BLASTing the SRA with my sequence…

High-Throughput Metabolic Profiling for Model Refinements of Microalgae

This protocol demonstrates the use of a phenotype microarray (PM) technology platform to define metabolic requirements of Chlamydomonas reinhardtii, a green microalga, and refine an existing metabolic network model. The phenotype micro-array technology is an effective high-throughput method that functionally determines cellular metabolic activities in response to a wide array…

Finding orthologues via BLAST : bioinformatics

I want to find orthologues for a certain gene to determine if it’s conserved among related taxa. For this I’m using BLAST but I’m still unsure whether I’m using the tool in the right way. Should I use the protein database for BLASTing or the nucleotide database (or…

Continue Reading Finding orthologues via BLAST : bioinformatics

Blast multiple sequences at once with NCBI blastn : bioinformatics

Hi all, I am designing a pool of probes for fluorescent HCR in situ hybridization. I use blastn (blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) to check the specificity of each probe by entering for instance “GACATTTACTTATGTGCAGAGAAAC XX CAATTTCCACGGACGCACTGACGTG” as the query and Mus musculus as the Search Set. I usually do it one by one but…

Continue Reading Blast multiple sequences at once with NCBI blastn : bioinformatics

hw 3 genomics .pdf – Question 1 About how many different regions in your contig appear to have one or more matches to Swissprot Which family of genes

Question 1: About how many different regions in your contig appear to have one or more matches to Swissprot? Which family of genes seems to predominate in the latter half of the contig? There appear to be 4 regions and the Hox gene seems to predominate the output. Question 2:…

Continue Reading hw 3 genomics .pdf – Question 1 About how many different regions in your contig appear to have one or more matches to Swissprot Which family of genes

BLAST comparision and parsing output in particular format

I have a query sequence, say query: NNNNNNNNNNNNNNNNNN. Database 1: Homo sapiens. Database 2: Mycobacterium tuberculosis. I compared the query sequence with the above two databases individually using standalone BLAST and got the results as, e.g., Result1.txt and Result2.txt. Now, I…

Continue Reading BLAST comparision and parsing output in particular format

samtools vs libdna – compare differences and reviews?

The number of mentions indicates the total number of mentions that we’ve tracked plus the number of user suggested alternatives. Stars – the number of stars that a project has on GitHub. Growth – month over month growth in stars. Activity is a relative number indicating how actively a project…

Continue Reading samtools vs libdna – compare differences and reviews?

De novo assembly of mRNA-seq has many transcripts binding to introns or genome parts (not protein coding genes)

Hi all, For the first time, I am working on some de novo mRNA-seq data already analyzed by a company (the same one that performed the sequencing). The samples are from a species for which we don’t have an annotated genome. They assembled the transcriptome with Trinity (which resulted in more…

Continue Reading De novo assembly of mRNA-seq has many transcripts binding to introns or genome parts (not protein coding genes)

getting sscinames in viral blastx query

I am running a local blast instance on my computer and I am trying to extract the sscinames from the database. I downloaded the viral database from ftp.ncbi.nlm.nih.gov/refseq/release/viral and created the blast database (with taxID) via: grep -i "Viral|virus" /reference/blast_database/refseq/RefSeq-release74.catalog &> viral_taxid.txt…

Continue Reading getting sscinames in viral blastx query

National Center for Biotechnology Information | Psychology Wiki

The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health. The NCBI is located in Bethesda, Maryland and was founded in 1988. The NCBI houses genome sequencing data…

Continue Reading National Center for Biotechnology Information | Psychology Wiki

mcscanx

I have made the files as given in the example BLAST and GFF files. xyz.blast contains: Eg01_t003090.1 Eg01_t003090.1 100.00 306 0 0 1 306 1 306 5e-173 594 Eg01_t003090.1 Eg01_t003100.1 98.38 309 2 1 1 306 1 309 1e-171 590 Eg01_t003090.1 Mba03_g02610.1 93.46 306 19 1 1 306 1…

Continue Reading mcscanx
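
The rows quoted in the mcscanx excerpt look like 12-column tabular BLAST output. A minimal Python sketch for reading such rows follows, assuming the columns are the default `-outfmt 6` fields (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The excerpt does not state the format, so that column layout is an assumption, and the `Hit` and `parse_hit` names are hypothetical.

```python
from collections import namedtuple

# Assumed column order: the default fields of BLAST+ tabular output.
FIELDS = ("qseqid sseqid pident length mismatch gapopen "
          "qstart qend sstart send evalue bitscore").split()
Hit = namedtuple("Hit", FIELDS)

def parse_hit(line):
    """Convert one whitespace-separated tabular BLAST row into a typed Hit."""
    cols = line.split()
    if len(cols) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}")
    return Hit(cols[0], cols[1], float(cols[2]),
               *map(int, cols[3:10]),          # length through send
               float(cols[10]), float(cols[11]))

# The two complete rows shown in the excerpt.
rows = [
    "Eg01_t003090.1 Eg01_t003090.1 100.00 306 0 0 1 306 1 306 5e-173 594",
    "Eg01_t003090.1 Eg01_t003100.1 98.38 309 2 1 1 306 1 309 1e-171 590",
]
hits = [parse_hit(row) for row in rows]

# Example filter: keep only hits where query and subject differ.
non_self = [hit for hit in hits if hit.qseqid != hit.sseqid]
for hit in non_self:
    print(hit.qseqid, hit.sseqid, hit.pident, hit.evalue)
```

Raising on an unexpected column count makes a truncated or mis-delimited row fail loudly instead of being silently mis-parsed.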

The Chloranthus sessilifolius genome provides insight into early diversification of angiosperms

1. The Plant List. The Plant List—A Working List of All Plant Species (Royal Botanic Gardens, Kew and Missouri Botanical Garden, 2019). www.theplantlist.org/. Retrieved 20 Aug 2019. 2. Christenhusz, M. J. M. & Byng, J. W. The number of known plants species in the world and its annual increase. Phytotaxa…

Continue Reading The Chloranthus sessilifolius genome provides insight into early diversification of angiosperms

Chromosome-scale genome assembly of the high royal jelly-producing honeybees

1. Knecht, D. & Kaatz, H. H. Patterns of larval food production by hypopharyngeal glands in adult worker honey bees. Apidologie 21, 457–468, doi.org/10.1051/apido:19900507 (1990). 2. Kamakura, M. Royalactin induces queen differentiation in honeybees. Nature 473, 478–483, doi.org/10.1038/nature10093 (2011). 3. Ramadan,…

Continue Reading Chromosome-scale genome assembly of the high royal jelly-producing honeybees

Compare two protein FASTA files and give a excel that show header with the same sequence

Dear All, I have two files, file1.fasta and file2.fasta. Both contain some identical sequences but with different headers. I want to know the correspondence between the headers of the two FASTA files and may…

Continue Reading Compare two protein FASTA files and give a excel that show header with the same sequence
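
One way to approach the header-matching question above is sketched below in plain Python. It assumes exact (case-insensitive) sequence identity is the matching criterion and that a CSV file, which Excel can open, is an acceptable output. The function names, column names and toy records are hypothetical.

```python
import csv
import io
from collections import defaultdict

def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def shared_sequences(handle1, handle2):
    """Return (file1_header, file2_header) pairs whose sequences are identical."""
    by_sequence = defaultdict(list)
    for header, seq in read_fasta(handle1):
        by_sequence[seq.upper()].append(header)
    pairs = []
    for header2, seq in read_fasta(handle2):
        for header1 in by_sequence.get(seq.upper(), []):
            pairs.append((header1, header2))
    return pairs

# Toy in-memory stand-ins for file1.fasta and file2.fasta.
file1 = io.StringIO(">a1\nMKT\nAYI\n>a2\nMSS\n")
file2 = io.StringIO(">b1\nMSS\n>b2\nMKTAYI\n>b3\nMQQ\n")
pairs = shared_sequences(file1, file2)

# Write the header correspondence as CSV (here to a string buffer).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["file1_header", "file2_header"])
writer.writerows(pairs)
print(out.getvalue())
```

With real files, the `io.StringIO` objects would be replaced by `open("file1.fasta")` and `open("file2.fasta")`, and `out` by a file opened with `newline=""`.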

Benchmarking different approaches for Norovirus genome assembly in metagenome samples | BMC Genomics

Assembly Raw data obtained from eight human Norovirus samples passed FASTQC (v0.11.5, Babraham Bioinformatics) quality filters regarding the parameters per base sequence quality, per sequence average quality, N content and adapter sequences after the trimming steps described in the methods section. Mean read length was 100 bp as expected from library…

Continue Reading Benchmarking different approaches for Norovirus genome assembly in metagenome samples | BMC Genomics

Running Assembly Jobs on the Cluster with Checkpointing

NERSC Tutorial, 2/12/2013, Alicia Clum. How We Use Genepool? • Assembly – Fungal, Microbial, Metagenome • Alignments • Error correction • Kmer matching/counting • Tool benchmarking • Data preprocessing – Linker trimming, changing quality formats, changing read formats, etc. • Post assembly improvement •…

Continue Reading Running Assembly Jobs on the Cluster with Checkpointing

kseq compatible DNA fastA/Q encoding and compression library : bioinformatics

Hi all!! I would like to share with you this little project I have been working on for a while now. I would greatly appreciate if you find bugs or manage to break it with your own data. sqzlib is a little fastA/Q encoding library that uses zlib or zstd…

Continue Reading kseq compatible DNA fastA/Q encoding and compression library : bioinformatics

#1000359 – FTBFS: test failure: External MBEDTLS version mismatch

Debian Bug report logs. Reported by: Stefano Rivera <stefanor@debian.org>. Date: Mon, 22 Nov 2021 02:15:02 UTC. Severity: serious. Found in version python-biopython/1.79+dfsg-1. Fix blocked by 1000358: ncbi-blast+: Please remove the mbedtls version check…

Continue Reading #1000359 – FTBFS: test failure: External MBEDTLS version mismatch

Please rebuild against MBEDTLS 2.16.11

Package: ncbi-blast+ Version: 2.11.0+ds-1 Severity: normal Affects: python-biopython Running blastn outputs: Critical: External MBEDTLS version mismatch: 2.16.9 headers vs. 2.16.11 runtime This causes python-biopython to FTBFS: ====================================================================== FAIL: test_blastn (test_NCBI_BLAST_tools.CheckCompleteArgList) Check all blastn arguments are supported. ---------------------------------------------------------------------- Traceback (most recent call last): File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.9/build/Tests/test_NCBI_BLAST_tools.py", line 420, in test_blastn self.check("blastn", Applications.NcbiblastnCommandline)…

Continue Reading Please rebuild against MBEDTLS 2.16.11

Fusion of the Paired Box 3 (PAX3) and Myocardin (MYOCD) Genes in Pediatric Rhabdomyosarcoma

Abstract Background/Aim: Fusions of the paired box 3 gene (PAX3 in 2q36) with different partners have been reported in rhabdomyosarcomas and biphenotypic sinonasal sarcomas. We herein report the myocardin (MYOCD on 17p12) gene as a novel PAX3-fusion partner in a pediatric tumor with adverse clinical outcome. Materials and Methods: A…

Continue Reading Fusion of the Paired Box 3 (PAX3) and Myocardin (MYOCD) Genes in Pediatric Rhabdomyosarcoma

Index of /~psgendb/local/biopython-1.64.old/Bio

Name Last modified Size Description Parent Directory   –   Affy/ 2014-05-29 05:25 –   Align/ 2014-06-11 10:27 –   AlignIO/ 2014-06-11 10:27 –   Alphabet/ 2014-06-11 10:27 –   Application/ 2014-05-29 05:25 –   Blast/ 2014-05-29 05:25 –   CAPS/ 2014-05-29 05:25 –   Cluster/ 2014-05-29 05:25 –  …

Continue Reading Index of /~psgendb/local/biopython-1.64.old/Bio

Genotyping of intraspecies polymorphisms of Sporothrix globosa using partial sequence of mitochondrial DNA – Mochizuki – – The Journal of Dermatology

1 INTRODUCTION Sporotrichosis is the most predominant and worldwide deep-seated dermatomycosis. The causative fungi, Sporothrix spp., which inhabit soil, cause lesions when inoculated into skin or subcutaneous tissue through tiny wounds. Sporothrix schenckii had long been regarded as the only species causing sporotrichosis until Marimon et al.1, 2 conducted molecular…

Continue Reading Genotyping of intraspecies polymorphisms of Sporothrix globosa using partial sequence of mitochondrial DNA – Mochizuki – – The Journal of Dermatology

blastn or tblastx for nucleotide blasts?

When we have both a nucleotide query sequence and a nucleotide database is it best to use tblastx? I read that tblastx is superior because it helps account for wobble bases in codons, but in many publications I simply see people using blastn….

Continue Reading blastn or tblastx for nucleotide blasts?

Frontiers | Metagenomic Analysis Reveals New Microbiota Related to Fiber Digestion in Pigs

Introduction Corn and soybean meal are the main components of high energy and high protein diets for pigs and are also the main raw materials of food products for human consumption, fermentation, and bioenergy industry (Sevillano et al., 2018). As the arable land of food crops was limited whereas the…

Continue Reading Frontiers | Metagenomic Analysis Reveals New Microbiota Related to Fiber Digestion in Pigs

(PDF) Alkahest NuclearBLAST : a user-friendly BLAST management and analysis system | Charles Opperman

Continue Reading (PDF) Alkahest NuclearBLAST : a user-friendly BLAST management and analysis system | Charles Opperman

Single cell genomics reveals plastid-lacking Picozoa are close relatives of red algae

1. Strassert, J. F. H., Irisarri, I., Williams, T. A. & Burki, F. A molecular timescale for eukaryote evolution with implications for the origin of red algal-derived plastids. Nat. Commun. 12, 1879 (2021). 2. Burki, F., Roger, A. J., Brown, M. W. &…

Continue Reading Single cell genomics reveals plastid-lacking Picozoa are close relatives of red algae

Age- and gender-independent association of XRCC1 Arg399Gln

Introduction Chronic myeloid leukemia (CML) is a member of myeloproliferative neoplasms characterized by the acquired reciprocal chromosomal translocation, t(9;22) (q34; q11) which occurs as a result of translocation of ABL1 (Abelson murine leukemia) gene from chromosome 9 and its fusion with the BCR (breakpoint cluster region) gene on chromosome 22.1…

Continue Reading Age- and gender-independent association of XRCC1 Arg399Gln

protein sequence database slideshare

The N-terminal amino acid of the protein can be cleaved off. As we can see from the image below, starting from the 1990s, PDB content growth … A high quality sequence alignment gives the idea about … Additionally most PMF algorithms assume that the peptides come from a single protein. Retrieve/ID…

Continue Reading protein sequence database slideshare

How to extract certain CDS from a GenBank file in linux terminal

Hi, I have a large GenBank file that contains multiple records. I got this file after performing BLAST against my query sequence. This .gb file contains all the results (full record in GenBank format for each accession)…

Continue Reading How to extract certain CDS from a GenBank file in linux terminal

Genome of the estuarine oyster provides insights into climate impact and adaptive plasticity

1. Hoegh-Guldberg, O. & Bruno, J. F. The impact of climate change on the world’s marine ecosystems. Science 328, 1523–1528 (2010). 2. Chou, C. et al. Increase in the range between wet and dry season precipitation. Nat. Geosci. 6, 263–267 (2013). 3. Li,…

Continue Reading Genome of the estuarine oyster provides insights into climate impact and adaptive plasticity

16S rRNA merge forward and reverse primers

Hi all, I have received 16S rRNA sequencing data for ~100 samples. I want to use these sequences to identify the samples to genus level using the NCBI BLAST 16S bacteria and archaea database. For each sample I have a forward…

Continue Reading 16S rRNA merge forward and reverse primers

I have a problem with blastx against database nr.

Hello, I built the database for nr with the following command: $ makeblastdb -dbtype prot -in nr, which went well and generated .pal, .psq and .phr files. Then I ran blastx with the following command: $ blastx -db nr -query…

Continue Reading I have a problem with blastx against database nr.

Set database size in MMseqs2 (like -dbsize in blast)

Hello all, I would like to search a sequence database using MMseqs2. Because the database might change in size during my analyses, I would like to be able to specify a fixed database size to avoid creating inconsistencies in reported…

Continue Reading Set database size in MMseqs2 (like -dbsize in blast)

BLAST help : bioinformatics

Hello, I’m an undergrad Biochem student and we’re currently learning how to use BLAST and interpret the results, and I’m quite confused about how to interpret them. I got a set of DNA sequences that belong to Drosophila and the task is to identify the gene using…

Continue Reading BLAST help : bioinformatics

Different gene rearrangements of the genus Dardanus (Anomura: Diogenidae) and insights into the phylogeny of Paguroidea

1. Boore, J. L. Animal mitochondrial genomes. Nucl. Acids Res. 27, 1767–1780 (1999). 2. Gyllensten, U., Wharton, D., Josefsson, A. & Wilson, A. C. Paternal inheritance of mitochondrial DNA in mice. Nature 352, 255–257 (1991).…

Continue Reading Different gene rearrangements of the genus Dardanus (Anomura: Diogenidae) and insights into the phylogeny of Paguroidea

Presence absence matrix from blast results

I have many BLAST output files named after genomes, which look like this. The first column of each file contains all the identified query UIDs. I want to make a presence-absence matrix in CSV format in which a column would…

Continue Reading Presence absence matrix from blast results

How to identify corresponding chromosomes and coordinates of a species for query genes from a another species

I have a list of genes from species A and the reference genome and GFF3 of species B. I want to find the homologs of the species A genes in species B. I am…

Continue Reading How to identify corresponding chromosomes and coordinates of a species for query genes from a another species

blastp Error: NCBI C++ Exception: ncbi::CObject::ThrowNullPointerException()

blastp Error: NCBI C++ Exception: ncbi::CObject::ThrowNullPointerException() – Attempt to access NULL pointer. Hi, I am trying to blast a fasta file of protein sequences against the non-redundant database on an HPC. I run the following command: cat prot/split_fasta/master.dataframe.tide-tandem.protein.part_001.fa | parallel --GNU --block 100k --recstart '>' --pipe '/home/users/nus/e0470749/ncbi-blast-2.8.1+/bin/blastp -query -…

Continue Reading blastp Error: NCBI C++ Exception: ncbi::CObject::ThrowNullPointerException()