Tag: msa
Using ColabFold to predict protein structures | by Natan Kramskiy | Jan, 2024
A few hours before the CASP14 (14th Critical Assessment of Structure Prediction) meeting, the latest biennial structure prediction experiment where participants build models of proteins given their amino acid sequences, this image went viral on Twitter. Ranking of participants in CASP14, as per the sum of the Z-scores of their…
Plot the Guide Strand with different optional seeds
plot_seeds {SeedMatchR}, R Documentation. Description: Plot the Guide Strand with different optional seeds. Usage: plot_seeds(guide.seq). Arguments: guide.seq, the guide (a.k.a. anti-sense) sequence oriented 5′ > 3′. Sequence must be greater than 8 bp. Value…
extract the original sequence from multiple sequence alignment (MSA)
extract the original sequence from multiple sequence alignment (MSA) 0 An interesting article in the supplementary material contains an alignment of proteins that are important to me. It looks like this: >sp1_lox4 ————HVPRS–TE——YYT—————————— ————————————–TRN————-P—– ———–AY—SP—-HVYSPPVT——————SEPDRIRF-D-G -SD——–IA–TSVGA-Y-P-T-ST—-V—-P——————–S >sp2_pb —————-S———T-L————–PKRQRTA-FTNNQLLEL EKEFHYNKYLCRSRRIEIAKALSLTERQ———-V———–KIWFQNRRMK YKKVN—TGF-E-SPD——————-GM———————- ———–MK———————————————– —————————————PE——————- In addition to amino acids, there are only…
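For the question above, a minimal Python sketch, assuming the alignment is available as plain FASTA text and that gaps are written as "-" (or "."): recovering each original sequence then amounts to deleting the gap characters. The function names read_fasta and ungap and the sample records are hypothetical, chosen only for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


def ungap(records, gap_chars="-."):
    """Return the records with every gap character removed."""
    table = str.maketrans("", "", gap_chars)
    return {header: seq.translate(table) for header, seq in records.items()}


# Hypothetical two-record alignment, wrapped over several lines.
example = """>seq_a
--HVPRS--TE
--YYT--
>seq_b
S--T-L
"""

originals = ungap(read_fasta(example))
for header, seq in originals.items():
    print(f">{header}\n{seq}")
```

If the alignment uses other placeholder symbols, they can be passed through gap_chars.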
Q&A Report from the workshop: Exploring EMBL-EBI sequence analysis tools and managing bioinformatics workflows | PDF | Sequence Alignment
Q&A Report from the workshop. Question: What is the best MSA tool? Clustal 2 and Clustal Omega are the same. How would we enter multiple sequences? Because there is only one input box. Could the legend explaining symbols (*, -,…) be shown in the result window? What is the max number of sequences one…
MSA using ggtree msaplot
Hi, I’m trying to do a protein sequence alignment of orthologues of a ~1000 amino acid (aa) protein, in R. Normal MSA tools display boxes for each aa, which makes the alignment figure too big for my purposes. The only MSA tool that I found that…
Resin acids play key roles in shaping microbial communities during degradation of spruce bark
Bark preparation Spruce bark was obtained from the Iggesund pulp and paper mill (Iggesund, Holmen AB, Sweden), from a bark pile resulting from stripping of spruce logs at the mill after harvest, with the average age of trees at harvest being ~70 years. The bark was left to dry at…
sequence alignment – BioPython bootstrap is not reliable?
I think this is a bug. It seems to work if you do this, creating an equivalent Alignment object instead of a MultipleSeqAlignment to give the bootstrap step: from Bio.Align import Alignment alignment2 = Alignment(list(alignment)) consensus_tree = bootstrap_consensus(alignment=alignment2, times=50, tree_constructor=constructor, consensus=majority_consensus) bootstrap_consensus calls bootstrap_trees, which makes however many randomly shuffled…
new FREM platform delivers 98% accuracy for rapid screening and genotyping
In a recent study published in eBioMedicine, researchers developed the Flexible, Robust, Equipment-free Microfluidic (FREM) platform for malaria screening and Plasmodium species genotyping. Study: A versatile microfluidic platform for malaria infection screening and Plasmodium species genotyping. Background: Malaria, a worldwide health concern caused by Plasmodium species, needs precise detection and genotyping…
Yes .. BBMap can do that!
NOTE: This collection was originally posted at SeqAnswers.com. Creating a copy here to preserve the information. Part I is available here: Yes .. BBMap can do that! – Part I : bbmap (aligner), bbduk (scan/trim), repair (fix PE reads) and reformat (format conversions). Part II is available here: Yes .. BBMap can…
Multi-domain and complex protein structure prediction using inter-domain interactions from deep learning
Overview of the method DeepAssembly is designed to automatically construct multi-domain protein or complex structure through inter-domain interactions from deep learning. Figure 1 shows an overview of the DeepAssembly protocol. Starting from the input sequence of multi-domain protein (or protein complex), DeepAssembly first generates multiple sequence alignments (MSAs) from genetic databases…
Alfalfa vein mottling virus, a novel potyvirid infecting Medicago sativa L. | Virology Journal
Plant material: Five alfalfa plants (stems and leaves) were sampled from each of the four different fields, 10–15 acres in size, located in Yuma County, Arizona, USA. Geographic coordinates of the alfalfa fields and the adjacent crops are shown in Table 1. Table 1: Geographic locations of alfalfa fields. Total…
Identification of constrained sequence elements across 239 primate genomes
De novo assembly and repeat-masking To maximize the species diversity of primates in our analyses, we newly sequenced and assembled the genomes of 187 different primate species, initially presented in refs. 11,23, for which no other reference genome assembly was available. In brief, each individual was sequenced with 150 bp paired…
Development and evaluation of specific polymerase chain reaction assays for detecting Theileria equi genotypes | Parasites & Vectors
Knowles DP, Kappmeyer LS, Stiller D, Hennager SG, Perryman LE. Antibody to a recombinant merozoite protein epitope identifies horses infected with Babesia equi. J Clin Microbiol. 1992;30:3122–6. Ueti MW, Palmer GH, Kappmeyer LS, Statdfield M, Scoles GA, Knowles DP. Ability of the vector tick…
Biophysical properties of NaV1.5 channels from atrial-like and ventricular-like cardiomyocytes derived from human induced pluripotent stem cells
hiPSC cultures and cardiomyocyte differentiation: The hiPSC lines CBRCULi001-A54 and CBRCULi008-A55 were generated from control lymphoblastoids of a 44-year-old male and a 75-year-old female, respectively, and they were reprogrammed at the LOEX core facility (Quebec City, QC, Canada). All the work with hiPSCs was approved by the CIUSSS de la Capitale-Nationale ethics committee (Project…
Vision Transformer from scratch using PyTorch | by Mickael Boillaud | Nov, 2023
Before we get to the real subject of the Vision Transformer, it is essential to see how we arrived at this idea. So let’s embark on a chronological exploration of the evolution of image classification in computer vision! We trace its roots to Yann LeCun, the pioneer of Convolutional Neural Networks (ConvNets). These early…
Predicting multiple conformations via sequence clustering and AlphaFold2
AlphaFold2 (AF2) [1] has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein’s biological function often depends on multiple conformational substates [2], and disease-causing point mutations often cause population changes within these substates [3,4]. We demonstrate that clustering a multiple sequence alignment (MSA) by sequence similarity enables AF2…
When will RNA get its AlphaFold moment? | Nucleic Acids Research
Abstract The protein structure prediction problem has been solved for many types of proteins by AlphaFold. Recently, there has been considerable excitement to build off the success of AlphaFold and predict the 3D structures of RNAs. RNA prediction methods use a variety of techniques, from physics-based to machine learning approaches….
Enhancing alphafold-multimer-based protein complex structure prediction with MULTICOM in CASP15
The comparison between MULTICOM servers and other CASP15 server assembly predictors According to the CASP15 official assessment (see the official ranking predictioncenter.org/casp15/zscores_multimer.cgi), MULTICOM_qa and MULTICOM_deep servers ranked 3rd and 5th among all CASP15 assembly server predictors. The MULTICOM human predictors (MULTICOM_human and MULTICOM) ranked 7th and 10th among all CASP15…
Analysis of microbial composition and sharing in low-biomass human milk samples: a comparison of DNA isolation and sequencing techniques
Victora CG, Bahl R, Barros AJD, França GVA, Horton S, Krasevec J, et al. Breastfeeding in the 21st century: Epidemiology, mechanisms, and lifelong effect. Lancet. 2016;387:475–90. Bardanzellu F, Fanos V, Strigini FAL, Artini PG, Peroni DG. Human breast milk: Exploring the linking ring among emerging components…
Single-cell transcriptomics reveals the brain evolution of web-building spiders
Animals for single-cell sequencing: Adult samples of the aerial web-building spider (Hylyphantes graminicola) were collected from Anci district, Langfang, Hebei, China (39° 31.90’ N, 116° 38.15’ E) between September and October 2020. Collected spiders used for brain dissection were housed individually in a glass tube (Φ12 mm × 80 mm) under temperature- and humidity-controlled conditions (24–26 °C and 50–60%…
Dryad | Data — Mitochondrial DNA from Borsuka Cave
README: Multiple sequence alignment of newly reconstructed and published human mtDNA. One multiple sequence alignment (MSA) used for tree building and molecular branch shortening for the associated publication. It contains both previously published and newly reconstructed human mtDNA genomes. 1. Aligned_human_mtDNA_Borsuka.fst: This MSA is aligned to the revised Cambridge reference…
Building customized database using HHblits
Hi, sorry for asking this naive question. I have 7000 FASTA sequences (not MSA) and I want to build a customized database of these and search against themselves using HHblits. I am following the HH-suite tutorial (github.com/soedinglab/hh-suite/wiki#building-customized-databases) but I am getting an error every time…
Unraveling Protein Structures: The Revolution of AlphaFold
Contents: Introduction; The Central Dogma of Molecular Biology; Experimental Methods for Protein Structure Determination; The AlphaFold Artificial Intelligence System; AlphaFold’s Architecture Overview; Section 1: Generating the Initial Multiple Sequence Alignment (MSA); Section 2: The Evoformer Neural Network; Section 3: The Structure Module; Iterative Process…
Topological links in predicted protein complex structures reveal limitations of AlphaFold
Identification of topological links in protein complexes To demonstrate why new algorithms are needed to identify topological links in protein‒protein complex structures, we applied existing methods to 4 structures predicted by AlphaFold-Multimer v2.2.0 (Fig. 1a–d) and 4 experimental complex structures from the Protein Data Bank (PDB, Fig. 2a–d). For these 4 experimental…
Mycobacterium tuberculosis Sub Lineage 4.2.2/SIT149 as DR
Introduction: Antimicrobial resistance is a hidden global pandemic that was associated with over 4.9 million deaths in 2019 alone, and the burden is highest in low-resource settings.1 Drug-resistant tuberculosis (DR-TB) caused by Mycobacterium tuberculosis (Mtb) complex (MTBC), which is resistant to one or more anti-TB drugs, is a leading global public…
Solved R/Rstudio question: I have this R code and I am
R/Rstudio question: I have this R code and I am supposed to write a Market Segmentation Analysis based on the data I extract out of the SAS file I loaded into my R code. I also have the CSV file I want my data loaded into. My question is how…
The Fundamentals of Metagenomics | Devpost
Inspiration Computational Biology allows for the intersection of biology and computing for technological innovations and optimizations in genomics, modeling systems, biology, phylogenetics, etc. In our project, we chose to focus on studying the structure and function of sequences from a community of organisms, like on human skin, in the soil,…
[Solved] You have sequenced the human dyskerin pseudouridine synthase 1…
You have sequenced the human dyskerin pseudouridine synthase 1 (DKC1) gene from several individuals with the goal of finding variations between individuals. You asked the crazy swamp bioinformatician to work on a multiple sequence alignment for you, but they have made the unorthodox choice to give…
Index of /~psgendb/birchhomedir/local/pkg/ugene/data/cmdline
Name                     Last modified      Size
Parent Directory         –
align-clustalo.uwl       2019-09-24 01:05   3.0K
align-clustalw.uwl       2019-09-24 01:05   5.1K
align-kalign.uwl         2019-09-24 01:05   2.8K
align-mafft.uwl          2019-09-24 01:05   2.7K
align-tcoffee.uwl        2019-09-24 01:05   3.1K
align-to-reference.uwl   2019-09-24 01:05   3.1K
align.uwl                2019-09-24 01:05   2.7K
convert-msa.uwl          2019-09-24 01:05   1.4K
…
EasyCGTree: a pipeline for prokaryotic phylogenomic analysis based on core gene sets | BMC Bioinformatics
EasyCGTree was implemented in the Perl programming language (www.perl.org/) and was built using a collection of published reputable tools, including Clustal Omega version 1.2.4 [12]; consense from PHYLIP version 3.698 [13]; FastTree version 2.1 [14]; hmmbuild and hmmsearch from HMMER version 3.0 (hmmer.org/); IQ-TREE version 2.1.1 [15]; trimAl version 1.2 [16];…
incompatible with python3-biopython > 1.79
Source: prody Version: 2.3.1+dfsg-3 Severity: serious Justification: FTBFS Tags: sid ftbfs Forwarded: github.com/prody/ProDy/issues/1723 Hello, prody FTBFS with python3-biopython > 1.79: ====================================================================== FAIL: testBuildMSAlocal (prody.tests.sequence.test_analysis.TestBuildMSA.testBuildMSAlocal) ———————————————————————- Traceback (most recent call last): File “/<<PKGBUILDDIR>>/.pybuild/cpython3_3.11_prody/build/prody/tests/sequence/test_analysis.py”, line 1210, in testBuildMSAlocal assert_array_equal(expect, result) File “/usr/lib/python3/dist-packages/numpy/testing/_private/utils.py”, line 985, in assert_array_equal assert_array_compare(operator.__eq__, x, y, err_msg=err_msg, File “/usr/lib/python3.11/contextlib.py”,…
Solved 2. Post the residue number of the catalytic Lysine
2. Post the residue number of the catalytic Lysine (Lys, K) in CadA of E. coli K-12 MG1655? (For example: If the lysine was at position 11 along the amino acid sequence, then you would provide: K11) 3. Show an alignment (MSA) of the family that you…
Unraveling the secrets of the prevalent Staphylococcus strain
In a recent study published in Microbial Genomics, researchers investigated the genomes of a group of Staphylococcus capitis isolates from neonates. Study: Characterisation of neonatal Staphylococcus capitis NRCS-A isolates compared with non NRCS-A Staphylococcus capitis from neonates and adults. Background: NRCS-A, a clone of S. capitis,…
Ivanhoe Mines Reports Record Quarterly Production of 103,947 Tonnes from Kamoa-Kakula Copper Complex for Q3 2023
Kamoa-Kakula milled 2.24 million tonnes of ore during the quarter at an average grade of 5.55% copper. Kamoa-Kakula has produced 301,336 tonnes copper year-to-date, well on track to achieve 2023 guidance. Phase 3 concentrator, smelter and hydropower project on schedule for Q4 2024 start. Kamoa 1 and Kansoko underground access…
The leaderless communication peptide (LCP) class of quorum-sensing peptides is broadly distributed among Firmicutes
Leaderless communication peptide (LCP) system is broadly distributed To assess the distribution of SIP-like LCP-based qs systems across bacteria, we employed a large-scale search strategy that includes: (i) search for RopB homologs across a dataset of 129,001 bacterial genomes and 9,421 reference metagenomics-assembled genomes (MAGs), (ii) probing the genomic vicinity…
DNA-bridging by an archaeal histone variant via a unique tetramerisation interface
Chromatin isolation and MNase digestion M. jannaschii DSM 2661 cells were grown in 100 l fermenters in minimal medium containing 0.3 mM K2HPO4, 0.4 mM KH2PO4, 3.6 mM KCl, 0.4 M NaCl, 10 mM NaHCO3, 2.5 mM CaCl2, 38 mM MgCl2, 22 mM NH4Cl, 31 µM Fe(NH4)2(SO4)2, 1 mM C6H9NO6, 1.2 µM MgSO4, 0.4 mM CuSO4, 0.3 µM MnSO4, 36 nM FeSO4, 36 nM CoSO4, 3.5 nM…
AlphaMissense revolutionizes genetic mutation analysis for disease prediction
In a recent article published in the journal Science, researchers presented AlphaMissense, a highly accurate protein structuring model adapted from AlphaFold (AF) to predict and characterize human proteome-wide missense variants’ pathogenicity at a single amino acid substitution level. Study: Accurate proteome-wide missense variant effect prediction with AlphaMissense. It…
microRNA and circRNA in Parkinson’s Disease and atypical parkinsonian syndromes
doi: 10.1016/bs.acc.2023.03.002. Epub 2023 Mar 28. Affiliations: (1) 1st Department of Neurology, Medical School, Aeginition Hospital, National and Kapodistrian University of Athens, Athens, Greece. Electronic address: abougea@med.uoa.gr. (2) 1st Department of Neurology, Medical School, Aeginition Hospital, National and Kapodistrian University of Athens, Athens, Greece. Anastasia…
Replace ambiguous characters in fasta MSA
Hi everyone, I have an MSA that I feed into software that does not deal with Ns, and many of the sequences in my MSA (~20%) have at least a couple of them. I am looking for a program that can compute…
Visualization of multiple sequence alignment quality
I have a multiple sequence alignment output, and I wish to visualize the quality of the output as shown in this paper (bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-484#author-information). That paper doesn’t appear to have a downloadable program, unfortunately. The author’s email address doesn’t work. What program will produce visualizations of MSA quality like…
Exploring the Druggable Conformational Space of Protein Kinases Using AI-Generated Structures, bioRxiv – Bioinformatics
bioRxiv – Bioinformatics. Pub Date: 2023-09-01, DOI: 10.1101/2023.08.31.555779. Noah B Herrington, David Stein, Yan Chak Li,…
Multiple sequence alignments – parallelisation
I have a folder with approx. 3,000 FASTA files. Each FASTA file corresponds to one gene (orthogroup) and contains multiple sequences (orthologues from multiple species). I want to do multiple sequence alignments in ClustalO or MUSCLE for each of these files (genes). I…
The ‘protein-folding problem’ and its solution
In CASP13, DeepMind’s performance was remarkable, with a significant margin over the next competitor. But CASP14 saw the improved AlphaFold blow the competition out of the water. It not only made the best prediction for 88 out of 97 target sequences, but the accuracy of the predictions was unprecedented and…
MenT nucleotidyltransferase toxins extend tRNA acceptor stems and can be inhibited by asymmetrical antitoxin binding
MenAT1 sequence analysis Analysis of gene neighbourhoods for rv0078B (menA1) was performed using default settings in FlaGs (www.webflags.se/). Output sequences for MenA1 and cognate MenT1 homologues were then used to perform sequence alignments using MUSCLE (www.ebi.ac.uk/Tools/msa/muscle/), then formatted in Jalview (www.jalview.org/), sorting by pairwise alignment. Residues of interest were then…
Fortress Biotech Reports Second Quarter 2023 Financial Results and Recent Corporate Highlights
Total net revenue was $17.4 million in the second quarter of 2023, a 40% increase from $12.4 million in the first quarter of 2023. Positive topline results from two Phase 3 clinical trials evaluating DFD-29 demonstrated achievement of…
OpenProteinSet provides open source training data for structural biology at scale
Summary OpenProteinSet provides a massive dataset of the same quality as the one used to train AlphaFold 2, which was not made available to the research community. Proteins are the workhorses of life. Understanding their sequences and structures is key to tackling challenges ranging from designing new enzymes to developing…
How to get image of multiple sequence alignment that is labeled by row?
I have been looking at many, many different multiple sequence alignment software packages. There’s a post that I was using earlier from BioStars, but I can’t find it anymore. ADOMA doesn’t produce anything: there is no error message, just blank HTML. ESPript, which requires faxing or snail-mailing a license request,…
Is there a command line tool that takes an alignment FASTA as input and outputs an image?
Hello! I’ve been trying to find software that does what the post title describes, but without luck. I tried installing JalView but it won’t install for some reason: “Installer User…
Printable visualizations of large-scale alignments
I would like to visualize a large (“large” as in 20 bacterial genomes) multi-sequence alignment such that (a) the sequences are wrapped within pages, (b) the individual nucleotide letters remain visible (at least minutely), and (c) nucleotide differences compared to a consensus are highlighted….
Chemically programmed STING-activating nano-liposomal vesicles improve anticancer immunity
Rational design of STING-activating pro-drugs for liposomal formulation A recently discovered non-nucleotide STING agonist, MSA-2, was selected to test the validity of our drug design. To facilitate the liposomal formulation of MSA-2, four synthetic MSA-2 derivatives (compounds 1–4) were initially constructed via ester bonds using MP-typed alkanols of varying lengths…
What is Cancer immunotherapy targeting STING?
Over the past decade, cancer immunotherapy using immune checkpoint inhibitors has achieved unprecedented success in cancer treatment; however, only a small fraction (10-35%) of patients can derive clinical benefit from this treatment. Therefore, there is…
Comment: search for intron conservancy across species
the process of calculating conservation scores does not, so far as i am aware, change dependent upon functional annotations I’m not interested in functional annotation, I didn’t mention it. Not sure why it might be related to my question. 1) generate multiple sequence alignment Once again, my aim is to…
Gastrointestinal symptoms of long COVID-19 related to the ectopic colonization of specific bacteria that move between the upper and lower alimentary tract and alterations in serum metabolites | BMC Medicine
Wang D, Hu B, Hu C, Zhu F, Liu X, Zhang J, et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus–infected pneumonia in Wuhan, China. JAMA. 2020;323(11):1061–9. Zhao Y, Zhao Z, Wang Y, Zhou Y, Ma Y, Zuo W. Single-cell RNA expression…
Integrating full and partial genome sequences to decipher the global spread of canine rabies virus
RABV sequence and metadata acquisition and composition All 25,787 available sequences for the five genes of the RABV genome were downloaded from the NCBI Virus database5. After quality control, the RABV data set was reduced to 14,752 sequences that spanned 121 countries and were extracted from 192 different host species…
The Next Frontier For Large Language Models Is Biology
David Baker (University of Washington), Demis Hassabis (DeepMind) and George Church (Harvard) have helped pioneer the field of AI-driven protein design. Photo source: U of W, Royal Society, Harvard. Large language models like GPT-4 have taken the world by storm thanks to their astonishing command of natural language….
Remove sequences with (50% gaps) from MSA
How do I remove sequences from my MSA that contain 50% gaps? I know there are various posts about removing columns with gaps. But I’m looking for a simple script to identify alignments within my MSA that have >50% gaps “-” and…
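A minimal Python sketch of such a filter, assuming the alignment has already been read into (name, sequence) pairs and that "-" is the only gap character. The names gap_fraction and filter_gappy, the 0.5 threshold default and the sample records are hypothetical; sequences with exactly 50% gaps are kept here, which is one reading of ">50%".

```python
def gap_fraction(seq, gap_chars="-"):
    """Fraction of positions in an aligned sequence that are gaps."""
    if not seq:
        return 0.0
    gaps = sum(seq.count(g) for g in gap_chars)
    return gaps / len(seq)


def filter_gappy(records, max_gap_fraction=0.5):
    """Keep only records whose gap fraction does not exceed the threshold."""
    return [
        (name, seq)
        for name, seq in records
        if gap_fraction(seq) <= max_gap_fraction
    ]


# Hypothetical aligned records of equal length.
records = [
    ("a", "ACGT--"),  # 2/6 gaps, kept
    ("b", "A-----"),  # 5/6 gaps, removed
    ("c", "ACG---"),  # exactly 3/6 gaps, kept
]

kept = filter_gappy(records)
for name, seq in kept:
    print(f">{name}\n{seq}")
```

Changing the comparison to a strict "<" would also drop sequences sitting exactly at the threshold.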
Patch-clamp studies and cell viability assays suggest a distinct site for viroporin inhibitors on the E protein of SARS-CoV-2 | Virology Journal
Liu J, Xie W, Wang Y, Xiong Y, Chen S, Han J, Wu Q. A comparative overview of COVID-19, MERS and SARS: review article. Int J Surg. 2020;81:1–8. Farag NS, Breitinger U, Breitinger HG, El Azizi MA. Viroporins and inflammasomes: a key to understand…
AlphaFold: Advancements in Machine Learning for Protein Structure Prediction and Analysis
The AlphaFold Method: Revolutionizing Protein Structure Prediction The AlphaFold method is making huge advancements in the field of machine learning. This revolutionary technology has significantly improved the accuracy of predicting protein structures. In this article, we will provide an overview of the AlphaFold network and discuss its key features and…
Align A Sequence Against A Pre-Made Alignment?
Thanks in advance for any advice. I am working on predicting structures for VHH antibody molecules and so I am exploring different options for the MSA step. Assuming I produced a .a3m alignment file for a query sequence. Also that the query…
Ivanhoe Mines Reports Record Quarterly Production of 103,786 Tonnes from Kamoa-Kakula Copper Complex for Q2 2023
Kamoa-Kakula achieves 11% quarter-on-quarter increase in copper production. Kamoa-Kakula milled a record 2.2 million tonnes of ore during the quarter at an average grade of 5.2% copper. Kamoa-Kakula sets a new weekly production record…
Is it possible to get a pairwise distance matrix from a mafft alignment?
I’ve googled around a bit. From reading the publications and software documentation, it’s clear that there is a distance matrix being used but it doesn’t seem possible to obtain. There is rumor of a --distout flag…
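Independently of what MAFFT exposes, a pairwise distance matrix can be computed from the finished alignment itself. The Python sketch below uses the simple p-distance (fraction of differing positions among columns where neither sequence has a gap); this is an assumption made for illustration and is not the distance MAFFT uses internally. The names p_distance and distance_matrix and the sample sequences are hypothetical.

```python
def p_distance(a, b, gap_chars="-"):
    """Proportion of mismatches over columns where both sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x in gap_chars or y in gap_chars:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x != y:
            differences += 1
    # With no comparable columns the distance is undefined.
    return differences / compared if compared else float("nan")


def distance_matrix(alignment):
    """Return (names, matrix) for a dict of {name: aligned sequence}."""
    names = list(alignment)
    matrix = [
        [p_distance(alignment[i], alignment[j]) for j in names]
        for i in names
    ]
    return names, matrix


# Hypothetical three-sequence alignment.
alignment = {"s1": "ACGT-A", "s2": "ACGTTA", "s3": "AGGT-C"}
names, matrix = distance_matrix(alignment)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{d:.2f}" for d in row))
```

Model-corrected distances (for example Jukes-Cantor or Kimura) could replace p_distance without changing the matrix-building step.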
ColabFold – E-Learning@VIB
ColabFold is an initiative by Milot Mirdita, Sergey Ovchinnikov and Martin Steinegger, providing more user-friendly access to AlphaFold prediction models. The different Google Colab notebooks can be found at github.com/sokrypton/ColabFold. Its main contributions in comparison with the official notebook are as follows: Use of MMSeqs2 for the MSA search, resulting…
Run AlphaFold-Multimer – E-Learning@VIB
After successfully predicting a monomer structure using AlphaFold, only slight changes to the setup are required for predicting a protein complex with AlphaFold-Multimer. Information can be found at elearning.vib.be/courses/alphafold/lessons/alphafold-on-the-hpc/topic/extra-alphafold-multimer/. An example complex, which will be discussed in further exercises, is the SARS-CoV-1 RBD bound by the cross-reactive single-domain antibody SARS…
Universal whole-genome Oxford nanopore sequencing of SARS-CoV-2 using tiled amplicons
Clinical RNA SARS-CoV-2 isolate For surveillance studies, a set of residual nasopharyngeal swab specimens positive for SARS-CoV-2 qRT-PCR from the Republican Diagnostic Centre (RDC) (umc.org.kz/en/) and private laboratory KDL “Olymp” (www.kdlolymp.kz/) were collected across the Republic of Kazakhstan between 2020 and 2022. 341 sequences that passed quality control are deposited…
RCAC – Knowledge Base: Biocontainers: bbtools
bbtools. Introduction: BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. Docker hub: hub.docker.com/r/staphb/bbtools Home page: jgi.doe.gov/data-and-tools/software-tools/bbtools/ Versions: 39.00 Commands: Xcalcmem.sh a_sample_mt.sh addadapters.sh addssu.sh…
Using nanopore sequencing to identify fungi from clinical samples with high phylogenetic resolution
Fungal genomic DNA DNA extracted from Aspergillus niger (kindly provided by Dr. Takamitsu Imoto, Medical Research Institute, Kitano Hospital, Osaka, Japan) and a mock community DNA standard (Mycobiome Genomic DNA Mix, MSA-1010; ATCC, Manassas, VA, USA) were used to evaluate the validity of the sequencing methodology. The mock community standard…
Exonerate fails to find match
Hi, I’m currently using Exonerate to find a match between these two sequences (see below). The percentage of identity between the sequences at both the nucleotide (72.5%) and amino acid (90% – 3’5′ Frame) levels seems quite high to me. Here are the sequences:…
Reuse MSA for multiple alphafold complex models?
I am using a local installation of AlphaFold 2 for predicting the structures of two-protein complexes. Now, I plan to model complexes of a big protein A with many alternative (small) partners B1, B2, B3 etc. It seems a waste of CPU…
Count SNP per read
Hello, I have 6 samples of near-whole length sequenced HIV FASTA files. The samples cover three different time points and there is a gene +/- sample per time-point. The samples have different numbers of reads of around ~8000 bp. I have performed a multi-seq alignment…
Exploring the Global Radio Headset Market and its Role in Enhancing Efficiency and Safety
Press release, published June 2, 2023. As per the study initiated by Evolve Business Intelligence, the global Radio Headset market size accounted for USD 4.5 Billion in 2022, growing at a CAGR of 6.1% from 2023 to 2033. Radio headsets are designed to provide reliable and efficient communication in demanding…
De novo protein design by inversion of the AlphaFold structure prediction network
Abstract De novo protein design enhances our understanding of the principles that govern protein folding and interactions, and has the potential to revolutionize biotechnology through the engineering of novel protein functionalities. Despite recent progress in computational design strategies, de novo design of protein structures remains challenging, given the vast size…
Genome assembly statistical tools
I’ll follow up what a few others have mentioned, but I like stats.sh within the BBTools package for raw assembly stats. I’ll then polish my assembly and pare it down to the targeted contigs (eg. those annotated as bacteria/viruses/your genome of interest/etc.) then I like to use quast/metaquast to evaluate…
Considering gaps in calculating conservation score from MSA
Dear all, I was looking for a good way to calculate conservation scores over columns in an MSA. I usually use Kullback-Leibler-Divergence (kl_divergence) or Shannon entropy. However, I would like to know if it makes sense to penalize gaps, when calculating…
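One possible way to penalize gaps, sketched in Python under stated assumptions: compute Shannon entropy over the non-gap residues of a column, turn it into a conservation value between 0 and 1, and scale that value by the fraction of non-gap positions. This is a heuristic offered for illustration, not an established answer to the question; the names column_entropy and gap_weighted_conservation and the alphabet_size default of 20 (amino acids) are hypothetical choices.

```python
import math
from collections import Counter


def column_entropy(column, gap_chars="-"):
    """Shannon entropy (bits) of the non-gap residues in one alignment column."""
    residues = [c for c in column if c not in gap_chars]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gap_weighted_conservation(column, alphabet_size=20, gap_chars="-"):
    """Conservation in [0, 1], scaled down by the fraction of gaps in the column."""
    if not column:
        return 0.0
    non_gap = sum(1 for c in column if c not in gap_chars)
    if non_gap == 0:
        return 0.0  # an all-gap column carries no conservation signal
    max_entropy = math.log2(alphabet_size)
    conservation = 1.0 - column_entropy(column, gap_chars) / max_entropy
    return conservation * (non_gap / len(column))


# Hypothetical alignment; zip(*rows) yields its columns.
rows = ["AAC-", "AAG-", "A-C-", "A-GT"]
scores = [gap_weighted_conservation("".join(col)) for col in zip(*rows)]
print([round(s, 3) for s in scores])
```

Treating the gap as a 21st symbol inside the entropy itself is an alternative; the scaling used here keeps the entropy term and the gap penalty separate so each can be inspected on its own.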
Solved During a particular baseball tryout, 8 baseball
During a particular baseball tryout, 8 baseball players were given 40 pitches each. Of each player’s 40 pitches, 20 pitches came from left-handed pitchers and 20 came from right-handed pitchers. Of the 20 pitches that each player saw from each of the pitchers, 10 pitches were curve balls and 10…
Efficient site-specific integration in CHO-K1 cells using CRISPR/Cas9-modified donors
Vector construction The Cas9/sgRNA vector (all-in-one) (GenBank accession no. OQ579018) contained sgRNA and Cas9-2A-mCherry expression cassettes. The sgRNA (5′-ATGCAGAACTAGAGTACAGC-3′) targets the phiC31 pseudo-attP intergenic site located on chromosome 3 of CHO-K1 genome (GenBank accession no. APMK01032147.1) (Fig. 1a). To construct the Cas9-mSA/sgRNA vector (GenBank accession no. OQ579019), the monomeric streptavidin (mSA)…
Identifying Mutation Frequency Changes in MSAs over time?
Hello, I have 6 sample FASTA files in total across 3 time points: weeks 3, 5 and 7. There are two samples per time point, Control and Positive for a specific gene. These samples are YU2 HIV viral transcriptomes. In each…
An unusual tandem kinase fusion protein confers leaf rust resistance in wheat
Plant material Bread wheat accessions Transfer (TA5524), WL711, TA5605, Ae. umbellulata accession TA1851 and Ae. triuncialis accession TA10438 were obtained from the Wheat Genetics Resource Center (WGRC). TcLr9 (Transfer/6*Thatcher) is a near-isogenic line carrying Lr9 from Transfer in the genetic background of the susceptible wheat line Thatcher. TcLr9 and TA5605…
Identifying signatures of positive selection in human populations from North Africa
Henn, B. M. et al. Genomic ancestry of North Africans supports back-to-Africa migrations. PLOS Genet. 8, e1002397 (2012). Arauna, L. R. et al. Recent historical migrations have shaped the gene pool of Arabs and Berbers in North Africa. Mol. Biol. Evol. 34, 318–329…
Enabling accurate and early detection of recently emerged SARS-CoV-2 variants of concern in wastewater
Wastewater sample collection, RNA extraction, and sequencing. Houston Water collected and provided weekly 24-hour time-weighted composite influent (raw wastewater) samples from 39 wastewater treatment plants (WWTPs) in Houston covering a service area of approximately 580 square miles and serving over 2.3 million people. In total, 2637 samples were analyzed. Untreated wastewater…
Confused about the GERP++ output
2 hours ago Doozy • 0 Hi guys, I am new to GERP++ and found that mendel.stanford.edu/SidowLab/downloads/gerp/ is not available. Though I installed the GERP suite with conda and tried to work out how it is used in bitbucket.org/bucklerlab/msa_pipeline/src/master/ (I am confused by the output headers GERP_ExpSubst and GERP_RejSubstScore), I…
Extract sequence subset from multiple sequence alignment based on position in a specific species.
I would like to generate a logo for the amino acid domain surrounding a particular mutation of interest. I am using the “msa” package in R to generate the multiple sequence alignment based on the full peptide sequence, then I would like to extract a short subset of the consensus…
Introduction to Computational Evolutionary Biology
R is a very flexible programming language, and it allows developers to create their own data structures (called classes) for their packages. Over the years, some packages have become so popular that the classes they use to store data are now used as the “standard” representations for particular types of data….
Peptide-encoding mRNA barcodes for the high-throughput in vivo screening of libraries of lipid nanoparticles for mRNA delivery
We reasoned that an ideal screening system for functional mRNA delivery would be model independent so that it could be applied in any preclinical model of disease, would consist of multiple measures of protein production that are each orthogonal to any others such that multiple formulations could be tested within…
CRISPR-Cas13a-powered electrochemical biosensor for the detection of the L452R mutation in clinical samples of SARS-CoV-2 variants | Journal of Nanobiotechnology
Liang Y, Lin H, Zou L, Deng X, Tang S. Biosens Bioelectron. 2022;205:114098. Telenti A, Hodcroft EB, Robertson DL. Cold Spring Harb Perspect Med. 2022;12:a041390. Shrestha LB, Foster C, Rawlinson W, Tedla N, Bull RA. Reviews in Medical…
Maintaining the coverage filter in mmseqs for cascaded clustering
Hi everyone, hopefully we have some experienced mmseqs users here who can help me with an issue regarding cascaded clustering. I am a fairly new user of mmseqs and have run into some unexpected behavior that I am unable to resolve. I am attempting to cluster a database…
Efficient evolution of human antibodies from general protein language models
Acquiring amino acid substitutions via language model consensus. We select amino acid substitutions recommended by a consensus of language models. We take as input a single wild-type sequence \(x = (x_1, \ldots, x_N) \in \mathcal{X}^N\), where \(\mathcal{X}\) is the set of amino acids, and \(N\) is the sequence length. We also require a set of…
AlphaFold Spreads through Protein Science | May 2023
By Chris Edwards. Communications of the ACM, May 2023, Vol. 66 No. 5, Pages 10-12. DOI: 10.1145/3586582. Credit: Veronica Falconieri Hays. Two years ago, as the COVID-19 pandemic swept across the world, researchers at DeepMind, the artificial intelligence (AI) and research laboratory subsidiary of Alphabet Inc., demonstrated how it could use machine…
Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree
State-of-the-art phylogenomic pipelines require many steps, which can be both time consuming and error prone (Fig. 1a). With Read2Tree, we directly process raw sequencing reads and reconstruct sequence alignments for conventional tree inference methods (Fig. 1b and Supplementary Fig. 1). We start by aligning raw reads to nucleotide sequences derived…
Subgenome-aware analyses suggest a reticulate allopolyploidization origin in three Papaver genomes
Reconstruction of gene and macro-synteny trees for the three studied Papaver species Syntenic blocks within the three Papaver species were identified with OrthoFinder v2.3.15. Orthologous and paralogous relationships, as well as orthogroups, were inferred using the parameters “-M msa -T fasttree” based on proteome sequences from multiple species. The resulting…
Parkinson’s cure ‘inevitable’ after biomarker breakthrough
The Michael J. Fox Foundation for Parkinson’s Research (MJFF) has announced what it says is the ‘most significant breakthrough yet’ in the search for a Parkinson’s biomarker: a biological test for Parkinson’s disease. The test demonstrates high diagnostic accuracy, differentiates molecular subtypes and detects disease in individuals before cardinal movement…
Is a vaccine for Parkinson’s disease possible?
Finding a truly effective treatment for Parkinson’s disease – that goes beyond simply managing symptoms – has long been a challenging task and, to this day, there are no available therapeutic options that can effectively slow or stop the underlying disease. However, research and trials to find treatments are ongoing…
Programmable protein delivery with a bacterial contractile injection system
Plasmid construction The PVCpnf structural and accessory region (pvc1-16) and payload and regulatory region (Pdp1, Pnf and regulatory genes PAU_RS16570-RS24015) were synthesized de novo (GenScript) and cloned into pAWP78 and pBR322 backbones, respectively. All manipulations involving payload and regulatory plasmids (pPayload) involved standard PCR amplification with Phusion Flash 2x Master…
Is there a function to get the number of aligned sites between pairs of sequences in a multiple sequence alignment in R?
Hi, I am working on multiple sequence alignments and I want to obtain the number of aligned sites between each pair of aligned sequences (in other words,…
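The excerpt cuts off before the poster defines "aligned sites", and the question asks about R. Purely as a sketch of the idea, the Python below assumes an aligned site is a column where both sequences of a pair have a residue rather than a gap. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def aligned_sites(seq_a, seq_b, gap_chars="-."):
    """Count columns where neither sequence has a gap (assumed definition)."""
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a not in gap_chars and b not in gap_chars
    )


def pairwise_aligned_sites(msa):
    """Map each unordered pair of sequence names to its aligned-site count.

    `msa` is a dict of name -> aligned sequence, all of equal length.
    """
    return {
        (name_a, name_b): aligned_sites(msa[name_a], msa[name_b])
        for name_a, name_b in combinations(msa, 2)
    }


if __name__ == "__main__":
    toy_msa = {"s1": "AC-GT", "s2": "A--GT", "s3": "-CCG-"}  # hypothetical
    for pair, count in pairwise_aligned_sites(toy_msa).items():
        print(pair, count)
```

If "aligned sites" should instead mean identical residues, the condition inside `aligned_sites` would need an extra `a == b` check.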
Study of the error correction capability of multiple sequence alignment algorithm (MAFFT) in DNA storage | BMC Bioinformatics
MSA is a parameter-free error correction method whose performance is determined by the sequence copies used (or sequencing depth). We encode (00-A, 01-T, 10-G, 11-C) a text file named “The Grandmother” into 140 DNA sequences of 120 bases (8 bases for index and 112 bases for data). The error rate…
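The two-bit mapping quoted above (00-A, 01-T, 10-G, 11-C) can be sketched in Python as follows. The mapping and the 8 + 112 base layout come from the excerpt; how the index is actually encoded and how a short final chunk is padded are not stated there, so the big-endian strand counter and the zero-byte padding below are assumptions, and all function names are hypothetical.

```python
BITS_TO_BASE = {"00": "A", "01": "T", "10": "G", "11": "C"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bytes(data: bytes) -> str:
    """Encode bytes as DNA, two bits per base (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_bases(bases: str) -> bytes:
    """Invert encode_bytes; len(bases) must be a multiple of 4."""
    bits = "".join(BASE_TO_BITS[base] for base in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def to_strands(data: bytes, data_bases: int = 112, index_bases: int = 8):
    """Split data into strands of index_bases + data_bases bases.

    Assumptions: the index is the strand number as a big-endian integer,
    and the last chunk is padded with zero bytes.
    """
    bytes_per_strand = data_bases // 4
    strands = []
    for number, start in enumerate(range(0, len(data), bytes_per_strand)):
        chunk = data[start:start + bytes_per_strand].ljust(bytes_per_strand, b"\x00")
        index = encode_bytes(number.to_bytes(index_bases // 4, "big"))
        strands.append(index + encode_bytes(chunk))
    return strands


if __name__ == "__main__":
    for strand in to_strands(b"The Grandmother"):
        print(len(strand), strand)
```

With this layout each 120-base strand carries 28 bytes of payload, so decoding a strand means stripping the first 8 bases and passing the rest to `decode_bases`.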
Dryad | Data — Multiple sequence alignments of newly reconstructed and published cervid and human mtDNA
Assigning prehistoric objects to specific individuals is usually impossible outside of burial contexts. Here we present a non-destructive method for gradually releasing DNA from ancient bone and tooth artifacts. Application of the method to an Upper Paleolithic deer tooth pendant from Denisova Cave (Russia) resulted in the recovery of DNA…
Strainphlan shallow shotgun sequencing data – StrainPhlAn
Dear all, I recently tried to run strainphlan3 on shallow shotgun sequencing data of 166 skin samples (sequencing depth 12M reads). In the example below, I show the phylogenetic tree of the most abundant species, Cutibacterium acnes, across all my samples. My first question is: can you use strainphlan3 on shallow shotgun…
Protein structure and folding pathway prediction based on remote homologs recognition using PAthreader
PAthreader overview The pipeline of PAthreader is illustrated in Fig. 1, and the details are presented in the Methods section. First, multi-peak distance profiles are obtained by our in-house DeepMDisPre, which may predict multiple possible distances for flexible protein regions. Structure profiles are extracted from PAcluster80, a master structure database built…
Adaptations of Pseudoxylaria towards a comb-associated lifestyle in fungus-farming termite colonies
Genome reduction is associated with a termite comb-associated lifestyle For our studies, we collected fungus comb samples originating from mounds of Macrotermes natalensis, Odontotermes spp., and Microtermes spp. termites and were able to obtain seven viable Pseudoxylaria cultures (X802 [Microtermes sp.], Mn132, Mn153, X187, X3-2 [Macrotermes natalensis], and X167, X170LB [Odontotermes…
UniProt id to MSA
I have UniProt ids of 4000 prokaryotic proteins. I have to do multiple sequence alignments of those proteins. Would anyone suggest to me how to do that?…
phylogenetics – Phylogeny building in R from FASTA files:
The formal question I’ve been given is which aligner I would use. There is a tension between building a data pipeline for ape and using the latest and greatest aligner. It’s a trade-off, and the compromise would be msaClustalOmega(), but the rationale is complicated. I strictly use muscle5, or specifically the muscle -super5 option…