Tag: UniRef90

Error Message from Diamond in humann v3.8 – HUMAnN

Hello, I am trying to run HUMAnN v3.8 on a particular sample with an input of around 34 million. The sample has gone through KneadData and MetaPhlAn 4 separately, so the input for HUMAnN v3.8 is the FASTQ file from KneadData and the taxonomic profile from MetaPhlAn 4. I tried this approach for various samples separately with resources as…

UniRef90 to UniRef50 conversion

Hi, I have a long list of UniRef90 IDs. Is it possible to convert all of them to their respective UniRef50 IDs? If yes, how? Thanks

Error when running HUMAnN – HUMAnN

Hi, I keep getting an error when running HUMAnN v3.7. HUMAnN v3.7 was installed with conda: conda install humann -c biobakery --solver=libmamba Before installing, the environment was created and channels were configured as per the instructions. I downloaded the databases as follows: humann_databases --download chocophlan full /home/swijegun/humann/databases/ --update-config yes humann_databases --download…

HUMAnN3 – mapping between UniRef90 to EC-enzymes – HUMAnN

Dear developers and users, among the utility_mapping accessory files in HUMAnN 3 there is a file called map_level4ec_uniref90.txt.gz, which maps level-4 EC numbers (four-field enzyme codes) to UniRef90 protein IDs. I was wondering how, or from where, I can generate such a mapping table myself. If, for example, I’d like to use the latest UniRef90 (which is updated ~8…
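
To make the question concrete, here is a minimal sketch of reading such a mapping and inverting it. It assumes a tab-separated layout in which each line holds one EC number followed by its UniRef90 members; that layout, the function names, and the toy IDs are assumptions for illustration and are not taken from the post. It reads an existing table and does not regenerate one from a newer UniRef90 release.

```python
import gzip
import io
from collections import defaultdict


def read_mapping(handle):
    """Parse lines shaped like: EC<TAB>UniRef90_x<TAB>UniRef90_y (assumed layout)."""
    mapping = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        mapping.setdefault(fields[0], set()).update(f for f in fields[1:] if f)
    return mapping


def invert(mapping):
    """Turn EC -> {UniRef90 IDs} into UniRef90 ID -> {EC numbers}."""
    inverse = defaultdict(set)
    for key, members in mapping.items():
        for member in members:
            inverse[member].add(key)
    return dict(inverse)


# Toy, made-up content standing in for a real .txt.gz mapping file.
toy = "1.1.1.1\tUniRef90_A\tUniRef90_B\n2.7.7.7\tUniRef90_B\n"
buffer = io.BytesIO(gzip.compress(toy.encode("utf-8")))

with gzip.open(buffer, "rt") as handle:
    ec_to_uniref = read_mapping(handle)

uniref_to_ec = invert(ec_to_uniref)
```

For a file on disk, `gzip.open(path, "rt")` can replace the in-memory buffer.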

MUSCLE in ShortBRED

Hi, I installed ShortBRED yesterday with conda: conda install -c biobakery shortbred When I tried to run shortbred_identify.py to create markers from the CARD database with conda activate /cluster/projects/nn8021k/Conda-env/my_shortbred shortbred_identify.py --goi /cluster/projects/nn8021k/Databases/CARD_327/markers_shortbred/protein_homolog_model.fasta --ref /cluster/projects/nn8021k/Databases/humann_database/uniref/uniref90.fasta --threads 32 --clustid 0.95 --markers /cluster/projects/nn8021k/Databases/CARD_327/markers_shortbred/CARD_markers.faa --tmp /cluster/projects/nn8021k/Databases/CARD_327/markers_shortbred/Temp I got this error: Invalid command…

Clustered bacterial RefSeq?

Hi all, I am sure I am missing something obvious, and hope someone can point me in the right direction. I was wondering if there are datasets similar to UniRef90/UniRef50, but built on bacterial RefSeq genome sequences, e.g. by clustering using something like ANI? Basically…

docker – python: can’t open file ‘/home/administrator/alphafold/run_alphafold.py’: [Errno 2] No such file or directory

I’m trying to run Docker for AlphaFold, but it says that there is no ‘/home/administrator/alphafold/run_alphafold.py’ in the directory even though the file exists. The code I’m trying to run: python3 docker/run_docker.py --fasta_paths=your_protein.fasta --max_template_date=2022-01-01 --data_dir=/data8/Alphafold_database --output_dir=/data8/Alphafold_output_dir Note: this is just a test to check whether Docker is working (your_protein.fasta…

How can I speed up wget from UniProt for UniRef90/50 fasta?

I think you need a program that can create multiple download streams. Here is one example: github.com/aria2/aria2 With aria2, a week ago I downloaded a compressed UniRef90 file in ~12 hours.

Efficient evolution of human antibodies from general protein language models

Acquiring amino acid substitutions via language model consensus. We select amino acid substitutions recommended by a consensus of language models. We take as input a single wild-type sequence \(x = (x_1, \ldots, x_N) \in \mathcal{X}^N\), where \(\mathcal{X}\) is the set of amino acids and \(N\) is the sequence length. We also require a set of…
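
The excerpt is cut off before it states the selection rule, so the following is only a sketch of one plausible reading. It assumes that a model "recommends" a substitution when it assigns the new residue a higher probability than the wild-type residue at that position, and that a substitution is kept when at least `min_models` models agree. The function name, the threshold parameter, and the probabilities are all hypothetical.

```python
def consensus_substitutions(wild_type, model_probs, min_models):
    """Return (position, wild-type residue, new residue, votes) tuples.

    model_probs holds one entry per model; each entry is a list with one
    dict per sequence position mapping residue -> probability.
    Assumed rule: a model votes for a residue if it scores above wild type.
    """
    votes = {}
    for probs in model_probs:
        for i, wt in enumerate(wild_type):
            wt_p = probs[i].get(wt, 0.0)
            for aa, p in probs[i].items():
                if aa != wt and p > wt_p:
                    votes[(i, aa)] = votes.get((i, aa), 0) + 1
    return sorted(
        (i, wild_type[i], aa, n) for (i, aa), n in votes.items() if n >= min_models
    )


# Two made-up "models" scoring a two-residue toy sequence.
wild_type = "AC"
model_1 = [{"A": 0.2, "G": 0.7, "C": 0.1}, {"C": 0.9, "A": 0.1}]
model_2 = [{"A": 0.3, "G": 0.6, "C": 0.1}, {"C": 0.4, "A": 0.6}]

strict = consensus_substitutions(wild_type, [model_1, model_2], min_models=2)
lenient = consensus_substitutions(wild_type, [model_1, model_2], min_models=1)
```

Here `strict` keeps only A→G at position 0, which both toy models prefer over wild type, while `lenient` also keeps C→A at position 1, which only the second model prefers.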

An error occurred when running StrainPhlAn (4.0.6): [Error] Phylogeny can not be inferred. Too many samples were discarded – StrainPhlAn

Yunan, April 21, 2023, 11:11am. Dear authors, I am encountering an error when running StrainPhlAn (version 4.0.6) using the pipeline provided on GitHub (StrainPhlAn 4 · biobakery/MetaPhlAn Wiki · GitHub). The error message is “[Error] Phylogeny can not be inferred. Too many samples were discarded”. I am using the data provided in…

[tblastn] Examining 5 or more matches is recommended

Hi, I am trying to run Sma3s for annotation and it gives the following warning: “[tblastn] Examining 5 or more matches is recommended”. Can this warning be ignored, or will it cause issues in the output file? ./sma3s_v2.pl -i pan_genome_reference.fa -d uniref90.fasta -nucl -go -b -p -num_threads 20 -goslim…

Error when running sma3s for proteome annotation

Hello everyone, I have been trying to run Sma3s for protein annotation, following the author’s instructions (UPOBioinfo Group). However, when running the command ./sma3s.pl -i query_dataset.fasta -d uniref90.fasta -goslim I got the message “Problem with blastdbcmd. It could be due to…

Predicting protein folding from single sequences with Meta AI ESM-2

Emergence of structure when scaling language models to 15 billion parameters. (A) Predicted contact probabilities (bottom right) and actual contact precision (top left) for PDB 3LYW. A contact is a positive prediction if it is within the top L most likely contacts for a sequence of length L. (B to…
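
The top-L rule in the caption can be sketched as follows: rank all residue pairs by predicted contact probability, keep the L highest for a sequence of length L, and report the fraction that are true contacts. The function name and the toy numbers are hypothetical, and filters that evaluations often add, such as a minimum sequence separation between the two residues, are left out.

```python
def top_l_precision(contact_probs, true_contacts, seq_len):
    """Precision over the seq_len most probable predicted contacts.

    contact_probs maps residue pairs (i, j) with i < j to a probability;
    true_contacts is the set of pairs that are real contacts.
    """
    ranked = sorted(contact_probs, key=contact_probs.get, reverse=True)[:seq_len]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Made-up predictions for a toy sequence of length 4.
predicted = {
    (0, 2): 0.9,
    (0, 3): 0.8,
    (1, 3): 0.7,
    (1, 2): 0.6,
    (0, 1): 0.2,
    (2, 3): 0.1,
}
actual = {(0, 2), (1, 3), (2, 3)}

precision = top_l_precision(predicted, actual, seq_len=4)
```

With these toy values the four top-ranked pairs contain two true contacts, so `precision` is 0.5.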

How could I “recreate” UniRef50/UniRef90 with MMSEQS2?

UniRef50/UniRef90 are really useful clustered databases. I’m interested in trying a similar nested clustering approach with my own protein database. Are there specific commands that were used for UniRef clustering with MMseqs2? I couldn’t find these documented anywhere.

Alphafold in Linux – HHblits error

When I try to run AlphaFold on a cluster from the Ubuntu shell, I get this error from HHblits: RuntimeError: HHblits failed stdout: stderr: 17:40:03.749 ERROR: Could find neither hhm_db nor a3m_db! Does anybody know why this might be happening? I assumed it would be an issue with the…

How to map Uniref90 genes (IDs) to Antibiotic resistant genes databases?

I have a gene abundance table for my metagenomics data. I want to extract just the antibiotic resistance genes from the table and analyze them separately. The genes are defined based on the UniRef90 database. I am wondering how…

Uniref90 database – HUMAnN – The bioBakery help forum

Hello. I performed metagenome analysis using HUMAnN 3 with the full UniRef90 database (20.7GB). I found many UniRef100 IDs in the file analyzed with the UniRef90 database. Why are UniRef100 IDs included in the UniRef90 database? Did I make a mistake somewhere? Thank you.

Do you mean 1) that you saw IDs that looked like UniRef100_XYZ or 2) that…

Efficient way of mapping UniProt IDs to representative UniRef90 IDs?

You can do this directly on UniProt: www.uniprot.org/uploadlists/ Just paste or upload your list of UniProt IDs, and select “UniProtKB AC/ID” in the “From” field and “UniParc” in the “To” field. I’ve also written a script, pasted below, that can do this with some useful options: $ uniprot_map.pl -h uniprot_map.pl…

Custom genetic database – Deepmind/Alphafold

It is possible, but only with a code change in data/pipeline.py: If the database is a FASTA file, you could add a new Jackhmmer searcher for that database. You can take a look at the jackhmmer_uniref90_runner and basically follow the same logic for your database. If the database is a…

Install alphafold on the local machine, get out of docker.

AlphaFold This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document. Any publication that discloses findings arising from…

Pfam – get only one representative fasta sequence per family

Hey, can you help me with getting only one representative FASTA sequence per family? Is there a way to simply do that? Cheers, X

It’s not trivial. You could use the sequences from the…

Download UniProt databases via HTTP/HTTPS instead of FTP? (Google Apps Scripts Question)

Hello, does anyone know of a mirror to download the UniRef90 database via HTTP or HTTPS, or something supported by Google Apps Scripts? I want to download the UniRef90 database, but it is not possible via FTP…

Any idea about extracting bacteria related proteins(or protein clusters) from uniref90 database?

Hi guys, I want to run blastp to align some microbial DNA sequences (FASTQ) against the UniRef90 protein catalog, in order to answer some questions about protein abundance in my data. The thing is that the UniRef90 FASTA file…
