Tag: SED

NifH database for taxonomic assignment in qiime2 – General Discussion

JThurston (Josh Thurston), January 19, 2024, 8:05pm: Hi all! I’m currently working through a Qiime2 pipeline analysing Illumina MiSeq paired-end amplicon data. I’ve successfully analysed bacterial (16S) amplicons, from importing and filtering through to taxonomic assignment. However, I also have amplicon data for a functional gene (nifH; nitrogenase for N2…

Extract fasta sequence from gff3 file

Hi everyone, I have a lot of .gff3 files with the CDS features and, below them, the FASTA sequence. The sequence is separated from the CDS features like this: ##FASTA >NZ_NZ_LR130533.1 I would like to extract all the FASTA sequences into new fasta…
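
One possible approach, sketched under the assumption that each file contains a single `##FASTA` directive and that everything after it is sequence data. The helper name `gff3_fasta` and the file names in the usage comment are placeholders, not taken from the post.

```shell
# Print everything that follows the "##FASTA" directive of a GFF3 file.
# gff3_fasta is a hypothetical helper name; input is read from stdin.
gff3_fasta() {
    # Delete from line 1 up to and including the first "##FASTA" line.
    sed '1,/^##FASTA/d'
}

# Usage (placeholder file names):
# gff3_fasta < sample.gff3 > sample.fasta
```

For many files, the same function could be called inside a `for f in *.gff3` loop, writing each result to a file named after the input.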

Function ‘SubBackward0’ returned nan values in its 1th output – autograd

I am trying to implement a model where the forward function calls an external function that computes the values using the model’s parameters. class R_model(torch.nn.Module): def __init__(self,) : super().__init__() self.Kx = torch.nn.Parameter(torch.randint(-200, -100, (1,)).float()*0.0001) self.Ky = torch.nn.Parameter(torch.randint(-200, -100, (1,)).float()*0.0001) self.Kz = torch.nn.Parameter(torch.randint(-200, -100, (1,)).float()*0.0001) def forward(self): return generate_r(Kx=self.Kx, Ky=self.Ky,…

Species coverage in the NCBI protein NR database ?

Hi Biostars, I am currently trying to build a Eukaryote version of the NCBI NR database and I am not really sure that I fully understand how the NR is implemented. Here is the code that I’m using to do so: #!/usr/bin/bash ############## # DOWNLOAD FULL NR ############## baseURL="https://ftp.ncbi.nlm.nih.gov/blast/db/"…

Pruning with --indep-pairwise with plink 1.9

I’m new to PLINK and I would like to obtain a file with SNPs in approximate linkage equilibrium. Here is my script and the outputs of each step. If someone could tell me if there is an error in the script because at…

STAR output

Hello, I am trying to map with STAR but it is not clear to me why I am not getting the SAM/BAM mapping file, could you help me? [epola@mazorka alignment_STAR]$ ls -lh total 13M -rw-rw-r-- 1 epola epola 13M Nov 17 12:34 SRR22164928SJ.out.tab -rw------- 1 epola epola…

Pflowtts Pytorch – Open Source Agenda

P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting Authors: Sungwon Kim, Kevin J Shih, Rohan Badlani, Joao Felipe Santos, Evelina Bhakturina, Mikyas Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro Affiliations: NVIDIA Status: Generated first sample (Check LJSpeech_Sample_100_epochs.wav) on 11/16/2023. Unofficial implementation of the paper P-Flow: A Fast…

[main_samview] fail to read the header from “human_g1k_v37.annotate.fasta”.

Hi, I tried to add the prefix “chr” to the chromosome names in a FASTA file like this: sed 's/^>/>chr/' human_g1k_v37.fasta > human_g1k_v37.annotate.fasta However, after that, I failed to view the header of the new FASTA file: samtools view -H human_g1k_v37.annotate.fasta >>> [main_samview] fail to read…
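
A likely reason for the error is that `samtools view` reads alignment files (SAM/BAM/CRAM), not FASTA, so it finds no alignment header to print. A minimal sketch for checking the renamed sequence names with plain `grep` instead; the helper names `add_chr_prefix` and `fasta_headers` are made up for illustration.

```shell
# Prefix every FASTA header with "chr" (the sed command from the post).
add_chr_prefix() {
    sed 's/^>/>chr/'
}

# List only the header lines of a FASTA stream.
fasta_headers() {
    grep '^>'
}

# Usage with the file names from the post:
# add_chr_prefix < human_g1k_v37.fasta > human_g1k_v37.annotate.fasta
# fasta_headers < human_g1k_v37.annotate.fasta | head
```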

How to swap UMIs?

Hi all, I have a fastq file. My UMI is TCAAT, but unfortunately it appears at the end of the header line, and after the alignment this information was lost. How can I move it? Expected output: @NS500595:901:HCCFKAFX5:1:11101:20769:1089_TCAAT 1:N:0:GGGGGGGG+AGATCTCG_TCAAT Input I have: @NS500595:901:HCCFKAFX5:1:11101:20769:1089 1:N:0:GGGGGGGG+AGATCTCG_TCAAT ATGTGGGAAACTCGACTGCATAATTTGTGGTAGTGGGGGACTGCGTTCGCGCTTTCCCCT + EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEE…
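
A rough sketch of one way to produce the expected header with `awk`. It assumes unwrapped four-line FASTQ records and that the UMI is whatever follows the last underscore on the header line; the function name `umi_to_read_id` is hypothetical.

```shell
# Copy the trailing UMI of each FASTQ header onto the read ID
# (the part before the first space), so it survives alignment.
umi_to_read_id() {
    awk 'NR % 4 == 1 {
             n = split($0, parts, "_")     # UMI = text after the last underscore
             sub(/ /, "_" parts[n] " ")    # insert "_UMI" before the first space
         }
         { print }'
}

# Usage (placeholder file names):
# umi_to_read_id < reads.fastq > reads.umi.fastq
```

Selecting header lines by position (`NR % 4 == 1`) rather than by a leading `@` avoids touching quality lines that happen to start with `@`.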

How To Get Chromosome Position Given Rs Number?

I have a list of a few hundred SNPs given by rs number. I want to get the chromosome and position for each SNP. For example: input: rs4477212 output: chr1:82154 You can download this information…

Comparative genomics and genome-wide SNPs of endangered Eld’s deer provide breeder selection for inbreeding avoidance

De novo genome assemblies and genome annotation We assembled a de novo genome of a seven-year-old male SED from Ubon Ratchathani Zoo using a combination of Illumina short-reads (92.94 × coverage) and PacBio long-reads (61.6 × coverage) (GenBank accession number: JACCHN000000000). Additionally, we used MGI short-reads (52.15 × coverage) to assemble a de novo genome of…

PIGx ChIP-seq pipeline error

Hi Lisa, You also need to modify the gtf annotation file using: sed '/^#/d' annotation_file.gtf > annotation_file_no_header.gtf Best, Alex > On 12. Oct 2022, at 15:07, Bora Uyar <borauy…@gmail.com> wrote: > > You would need to check how your fasta headers look and how the chromosomes are represented in…

Fixed effect, random, or both in generalized additive mixed model (GAMM)? – rstudio

Hello, I need help identifying if a predictor variable needs to be a fixed effect, a random effect, or if both are necessary. I understand a fixed effect to mean “a variable of interest” and a random effect to be something that represents a structural component, like a sample design…

r – Problems with Rprofile, dont load at startup

I have R (4.2.2) and RStudio (2023.06.2) installed on a macOS system. Before I updated RStudio I had no problem, but with these versions I don’t know how to load .Rprofile when RStudio starts. The default working directory for RStudio is ~/R and my ~/.Rprofile is in my home directory: file.path(Sys.getenv("HOME"),…

ACSA reflects on W Cape socio-economic development as it celebrates 30 years

Airports Company South Africa’s (ACSA) contribution to the economic growth and development of South Africa extends beyond the numbers, telling a story of a key enabler of economic growth, transformation and socio-economic development. Elelwani Tshikovhi As ACSA celebrates its 30th anniversary this year, it is fitting to note its massive…

main-armv7-default][biology/viennarna] Failed for viennarna-2.6.3 in build

You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/ampere2/data/main-armv7-default/p08943441f26e_s6e92fc9309/logs/viennarna-2.6.3.log Build URL: pkg-status.freebsd.org/ampere2/build.html?mastername=main-armv7-default&build=p08943441f26e_s6e92fc9309 Log: =>> Building biology/viennarna build started at Mon Oct 16…

Edit fasta header for TSA submission

Hi everyone, I’ve been trying to edit the headers of my fasta file, which I intend to upload to NCBI TSA. I can’t seem to successfully upload my file to TSA and, if I’m not mistaken, it could be because of the header format….

Help with interpreting GO enrichment resutls using goseq – usegalaxy.org support

ding66, October 11, 2023, 6:35pm: Hi there! I’m new to RNA-Seq results analysis. By following the tutorial “Reference-based RNA-Seq data analysis”, I have so far completed mapping, annotation, and differential expression analysis. Now the next step that I want to do is gene enrichment…

18S taxonomy assignment SILVA database formatting

Hi Bioinformatic community, I would like to classify 18S data (V7) of Fungi with assignTaxonomy from dada2. For that I downloaded SILVA_132_SSURef_tax_silva.fasta.gz from the SILVA website and need to format it, which I do with a Linux command-line one-liner. But some species in the database have a different number…

When nature turns deadly: A look at Abrin

What is abrin? In certain parts of Asia and Australia grows a flowering plant of the bean family Fabaceae, Abrus precatorius, more commonly known as jequirity bean or rosary pea [1]. It’s a delicate perennial climber, but extremely invasive and classified as a weed in several countries. However, what…

linux – Output file name not being correctly named for Bash

The file name format is like this: 4digitnumber_S_R1_001.fastq.gz. To give you an example 3145_S2_R1_001.fastq.gz I’m trying to have my output file name not include _R1_001 part but it keeps including the full file name. I am not sure why it’s not giving me the correct output file name format that…
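
A small sketch of stripping the `_R1_001.fastq.gz` suffix with shell parameter expansion. The desired output name is cut off in the excerpt, so the `.trimmed.fastq.gz` suffix and the helper name `sample_prefix` below are assumptions for illustration only.

```shell
# Return the sample prefix of a read file, e.g. 3145_S2 for
# 3145_S2_R1_001.fastq.gz. sample_prefix is a hypothetical helper.
sample_prefix() {
    name=${1##*/}                              # drop any directory part
    printf '%s\n' "${name%_R1_001.fastq.gz}"   # drop the fixed suffix
}

for f in 3145_S2_R1_001.fastq.gz; do
    out="$(sample_prefix "$f").trimmed.fastq.gz"   # illustrative output name
    echo "$f -> $out"
done
```

Parameter expansion keeps everything inside the shell, so no `sed` or `basename` call is needed per file.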

Types of blood tests a doctor may order

No single test can confirm that a person has lupus, but a doctor may use several types of blood tests to reach a lupus diagnosis. These include tests to look for antinuclear antibodies (ANA) in the blood. Systemic lupus erythematosus (SLE), or lupus, is a chronic autoimmune disease resulting from…

main-arm64-default][biology/viennarna] Failed for viennarna-2.6.3 in build

You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/ampere2/data/main-arm64-default/p85ccf094713a_sc584bb9cac1/logs/viennarna-2.6.3.log Build URL: pkg-status.freebsd.org/ampere2/build.html?mastername=main-arm64-default&build=p85ccf094713a_sc584bb9cac1 Log: =>> Building biology/viennarna build started at Sat Sep 23…

Salmon index not progressing

Hi! I am having an issue with salmon index formation, since I cannot use STAR due to a limited amount of RAM (as per my recent post). I tried to follow this tutorial on how to create a decoy-aware transcriptome, as well as doing this directly, and I…

Starting Server for Non-Default Users in JupyterHub: 500 Internal Server Error – JupyterHub

I am encountering an issue with starting a server for non-default users in JupyterHub. When attempting to start a server for a user named “mahdi” (or any other non-default user), I receive the following error message in the JupyterHub container logs: [I 2023-09-20 07:40:15.461 JupyterHub provider:659] Creating oauth client jupyterhub-user-mahdi…

failed reading from temporary file

STAR error: EXITING because of FATAL ERROR: failed reading from temporary file. Hello all, I’m attempting to use STAR to map some RNA-seq data, but keep getting some sort of error. Here is the command I used: for i in `ls *_clean.fastq | sed 's/_clean.fastq//'`; do STAR --runThreadN 20…

Error executing process > consensus_classification in NanoCLUST

Hello, I’m trying to run NanoCLUST with my 16S sequence data. I run it on a Linux Ubuntu 20.04 machine. When I use the command: nextflow run main.nf -profile docker --reads "*.fastq" --db "db/16S_ribosomal_RNA" --tax "db/taxdb/" I get the following terminal output:…

FASTA file of fixed length

Hi, I have a FASTA file like this: >1 TCAAGAGGGGTGAATGTGTTTCGCATGCACAAGGGACAGGAGTCT >2 ATCAGAGCTGGTGGGGTGGAGAGACAGAAACAAGTGGGAGAAGGT >3 TTATACCTACCTTATAGATAAGGAAATTGAAGCTTATAGAGTTTA >4 ATTTTTCCTTATGATACTCTATTGCCTCTCCATGGATAAAGACAG >5 AAACTCCTGACCTCAGGTGATCCACCTGCCTCGGCCTCCCAAAGT >6 TGCACACCTTCAGAACTGTGAACCAAATAAACCTCTCTTCTTTAAAATTATTCATCCTCT GGTATTCCTTTATAACAA >7 CTCTTGATGTCATTTCACTTCGGATTCTTCTTTAGAAAACTTGGA Every sequence has a fixed length of 45 nt, but some sequences, like sequence no. 6, are longer. There are some more…

Corrupted FASTq files with missing “+” under some sequences.

Hi, I have been trying to recover corrupted fastq files. I had a decompression error: invalid compressed data--crc error. I got around the crc error by using gzrecover and then used seqkit sana to fix sequence inconsistencies. Now, the…

How to remove fasta headers in a multifasta file and write file name as a fasta header?

I have a fasta file named 119XCA.fasta, as shown below: >cellulase ATGCTA >gyrase TGATGCT >16s TAGTATG I need to remove all the fasta headers, keep the sequences one by one and need to…

How to limit fasta header to 40 characters?

I have FASTA headers with long annotation names, but the program it will be run through for proteomics has a limit of roughly 40 characters or else it will crash. The file starts off like this: >TRINITY_DN0_c0_g1_i1.p1 – RecName: Full=E3 ubiquitin-protein…
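
A minimal sketch of truncating header lines with `awk`, assuming the 40-character limit counts the leading `>` and that sequence lines must stay untouched. The function name `truncate_headers` and its optional width argument are illustrative.

```shell
# Cut every FASTA header line to at most N characters (default 40);
# sequence lines are passed through unchanged.
truncate_headers() {
    awk -v max="${1:-40}" '/^>/ { print substr($0, 1, max); next } { print }'
}

# Usage (placeholder file names):
# truncate_headers 40 < proteins.fasta > proteins.short.fasta
```

Truncation can make two headers identical, so it may be worth checking for duplicates afterwards, for example with `grep '^>' proteins.short.fasta | sort | uniq -d`.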

Cannabis seeds on the move: Exploring shipping policies and trends across seed banks

This article delves deep into the current shipping policies and trends among these seed banks, providing a panoramic understanding of the seed distribution landscape. In recent years, the cannabis industry has burgeoned, primarily due to changing regulations, scientific research, and shifting societal attitudes. One subset of the cannabis industry that…

efetch from NCBI E-utilities returns “curl error s 400 & 500” and takes a very long time

I run this command to download ~4,000 gene sequences for the invA gene for taxonomy# 28901. It works fine for smaller datasets, but … but takes a very long time and never finishes for…

Installing R and RStudio on Linux for Data Analysis

R is a versatile programming language and environment designed specifically for data analysis and statistical computing, making it an incredible choice for data-driven work. R has gained significant popularity across the data science, data analysis, data visualization, and statistical communities due to its extensive capabilities and active user community. In…

Rstudio can’t find CMAKE even though it is in /usr/local/bin – Package Management

I have been attempting to install a package that requires cmake. However, Rstudio can’t seem to find it for some reason: R version 4.3.1 (2023-06-16) -- "Beagle Scouts" Copyright (C) 2023 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin22.4.0 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY….

PacBio 16S pipeline

QIIME2 analysis pipeline 0. activate conda environment conda activate qiime2-2019.10 1. flip sequences to the same direction mkdir raw_data_rc/ mkdir raw_data_cat/ parallel -j 8 'seqtk seq -r {} > raw_data_rc/{/.}_rc.fastq' ::: rawdata/*.fastq parallel -j 8 --link 'cat {1} {2} > raw_data_cat/{1/.}_cat.fastq' ::: rawdata/*.fastq ::: raw_data_rc/*_rc.fastq 2. trim primers mkdir trimmed_reads/…

bamCoverage fails in bam files with large number of small contigs in headers

Hi, I plan to use bamCoverage from Deeptools to get bw files, but it looks like the thread is dead as it never finishes (no errors). I have hundreds of thousands of short unlocalized/random contigs in…

vcf file chr notation

I have a single VCF file named 'ALL.wgs.shapeit2_integrated_snvindels_v2a.GRCh38.27022019.sites.vcf.gz'. The issue at hand is that the file uses different chromosomal notation and lacks the 'chr' prefix. Like this: ##fileformat=VCFv4.3 ##FILTER=<ID=PASS,Description="All filters passed"> ##fileDate=11032019_15h52m43s ##source=IGSRpipeline ##reference=ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa ##contig=<ID=1> ##contig=<ID=2> ##contig=<ID=3> ##contig=<ID=4> ##contig=<ID=5> ##contig=<ID=6> ##contig=<ID=7> ##contig=<ID=8> ##contig=<ID=9> ##contig=<ID=10> ##contig=<ID=11> ##contig=<ID=12> ##contig=<ID=13> ##contig=<ID=14> ##contig=<ID=15> ##contig=<ID=16>…

Apply Plink2 Score – Error Invalid chromosome code

I am trying to run a calculator tool for polygenic scores called pgsc_calc (The Polygenic Score Catalog Calculator pipeline) that runs with nextflow and docker in linux, with my own VCF file. It’s failing at step 8: process > PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:PLINK2_SCORE ERROR ~ Error executing process > 'PGSCATALOG_PGSCALC:PGSCALC:APPLY_SCORE:PLINK2_SCORE (NG13RY1WV.vcf.gz chromosome ALL effect…

Invasion success of a Lessepsian symbiont-bearing foraminifera linked to high dispersal ability, preadaptation and suppression of sexual reproduction

Simberloff, D. et al. Impacts of biological invasions: What’s what and the way forward. Trends Ecol. Evol. 28, 58–66 (2013). Bellard, C., Cassey, P. & Blackburn, T. M. Alien species as a driver of recent extinctions. Biol. Lett. doi.org/10.1098/rsbl.2015.0623 (2016). …

Bash script with command line options gets stuck and doesn’t set default values for variables

I am pretty green when it comes to bash scripts and completely new to command line functionality in bash. I tried my hand at a script which is supposed to be useable both with command line arguments as well as manual setting of variable values, if the user prefers to…

transcripts missing from tx2gene

How can I know the reference transcriptome used in the pre-computed index? You can download the fasta transcriptome file archive (fasta, .fai index and chrom.sizes) used for that index here: refgenomes.databio.org/v3/assets/archive/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/fasta_txome?tag=default This should get you the table you need $ grep "^>E" 2230c535660fb4774114bfa966a62f823fdb6d21acf138d4.fa |…

Bioinformatics Analyst III, Spatial Biology, CGR job with Frederick National Laboratory for Cancer Research

Bioinformatics Analyst III, Spatial Biology, CGR. Job ID: req3667; Employee Type: exempt full-time; Division: Clinical Research Program; Facility: Rockville: 9615 MedCtrDr; Location: 9615 Medical Center Drive, Rockville, MD 20850 USA. The Frederick National Laboratory is a Federally Funded Research and Development Center (FFRDC) sponsored by the National Cancer Institute (NCI) and operated by Leidos…

Useful Bash Commands To Handle Fasta Files

You will probably get a lot of different answers because there are many ways to parse fasta files with Bash and tools like grep, awk and sed. Here are some suggestions. To extract ids, just use the following: grep -o -E "^>\w+" file.fasta | tr -d ">" A useful step…
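
Note that `\w+` stops at the first non-word character, so an ID such as `NZ_LR130533.1` would be cut at the dot. A hedged alternative sketch with `sed` that keeps everything up to the first whitespace, plus a sequence counter; the function names `fasta_ids` and `fasta_count` are made up for this example.

```shell
# Print each sequence ID: the header text between ">" and the first whitespace.
fasta_ids() {
    sed -n 's/^>\([^[:space:]]*\).*/\1/p'
}

# Count the sequences in a FASTA stream (one header line per sequence).
fasta_count() {
    grep -c '^>'
}

# Usage:
# fasta_ids < file.fasta
# fasta_count < file.fasta
```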

BEDOPS gtf2bed conversion error with Ensembl GTF

You can generate BED files (from e.g. GTF file of the Ensembl release) by executing the following command in Linux Shell: # For genes grep -P "\tgene\t" your_ensembl.gtf | cut -f1,4,5,7,9 | \ sed 's/[[:space:]]/\t/g' | sed 's/[;|"]//g' | \ awk -F $'\t' 'BEGIN { OFS=FS } { print $1,$2-1,$3,$6,".",$4,$10,$12,$14…

HMM gets zero or 1 hits when many more expected

Hi all, My ultimate goal is to understand the phylogeny of a set of restriction-modification enzymes among certain genomes. For this, I have done the following: Downloaded all RM genes DNA sequences into psych_rm_genes.fna from REBASE Cleaned rebase file…

main-amd64-default][biology/viennarna] Failed for viennarna-2.5.1 in build

You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/beefy18/data/main-amd64-default/p8fb94260154e_s510fd83138/logs/viennarna-2.5.1.log Build URL: pkg-status.freebsd.org/beefy18/build.html?mastername=main-amd64-default&build=p8fb94260154e_s510fd83138 Log: =>> Building biology/viennarna build started at Sat Jul 15…

Jupyterhub: Kernels in different environments not working – Kernels

Hi, I’m not able to get the xeus-cling and R kernels to work in jupyterlab or notebook started from jupyterhub. My system: Ubuntu 22.04 with miniconda3 installed in /opt. I have an environment for jupyterhub (jupyterhubenv), one for xeus-cling (xeusclingenv) and one for R (R-env). It works when I install “nb_conda_kernels” and start the…

Mapping to mtDNA and then align the unmapped

Hello all, I have aligned my samples against the mitochondrion genome of the species I work with. My idea was that after this I would keep the unmapped ones (which would be the nuclear reads), and then align these against the…

Bug#1040953: bookworm-pu: package sra-sdk/3.0.3+dfsg-6~deb12u1

Package: release.debian.org Severity: normal Tags: bookworm User: release.debian….@packages.debian.org Usertags: pu X-Debbugs-Cc: sra-…@packages.debian.org Control: affects -1 + src:sra-sdk [ Reason ] Per #1039621, the new libngs-jni package accidentally wound up with bad content (unexpanded variables in the key symlink’s source *and* target) that rendered it useless. [ Impact ] This package’s…

Subject:[QIIME2.2023.5] Need help with Qiime2 installation: ResolvePackageNotFound error – Technical Support

Subject: Need help with Qiime2 installation: ResolvePackageNotFound error Dear Qiime2 Community, I hope this message finds you well. I am currently facing an issue during the installation of Qiime2 and would greatly appreciate your assistance in resolving it. During the installation process, after following the Qiime2 instructions, I encountered the…

Extract information from files

hello, how are you? I have several table files with the following information, separated by tabs: GH5_8 Bacteria Actinoalloteichus fjordicus ADI127-7 APU14662.1 GH5_8 Bacteria Actinoalloteichus hoggarensis DSM 45943 ASO20105.1 GH5_8 Bacteria Actinoalloteichus sp. AHMU CJ021 AUS77477.1 GH5_8 Bacteria Actinoalloteichus sp. GBA129-24 APU20630.1 GH5_8 Bacteria Actinobacteria…

Contig labels in BAM off by 1, how do I fix it?

After a lot of hairpulling over why strelka wouldn’t run on my bam files, I found that 18 contig labels were wrong, and they’re off by one (picture below). I’ve figured out which labels should be changed…

Running RStudio in own/local Galaxy instance – usegalaxy.org support

Hello, I successfully managed to set up my own local Galaxy instance. Now I am trying to integrate the interactive tools, starting with RStudio. After adding it to the tool_conf.xml and starting RStudio in Galaxy, I get the following error: sed: can’t read /etc/services.d/nginx/run: No such file or directory chmod: changing permissions…

(error) qiime feature-classifier fit-classifier-naive-bayes – Technical Support

I wanted to see V3 and V4 regions as well as all other regions in the Classify taxonomy step, so I downloaded the latest version (138.1) of the data from the Download tab on the SILVA homepage and followed the steps below. However, I got an error message in the ‘qiime…

grep – Keeping DNA sequence after changing FASTA header on command line

I have a FASTA header that looks like this: >7c8250ef-c89f-4d42-9d48-12c8fe245fb2 runid=606f271fc97598006ba5a922136a2c304cef75a5 sampleid=Pool12-1 read=19008 ch=301 start_time=2021-07-03T08:48:18Z barcode=barcode01 And I am able to change it to the desired output here: >7c8250ef-c89f-4d42-9d48-12c8fe245fb2_001 Using this command: grep '^>' 001_old.fasta | cut -d ' ' -f 1,8 | sed '/^>/s/$/_001/' > 001_new.fasta However it completely…
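
The `grep '^>'` step passes only header lines downstream, which is why the sequences disappear from the output. A sketch that edits headers in place with `sed` alone and leaves sequence lines untouched; the function name `rename_headers` and its suffix argument are hypothetical.

```shell
# On header lines only: drop everything after the first whitespace,
# then append "_<suffix>". Sequence lines pass through unchanged.
rename_headers() {
    suffix=$1
    sed -e '/^>/ s/[[:space:]].*$//' -e "/^>/ s/\$/_${suffix}/"
}

# Usage with the file names from the post:
# rename_headers 001 < 001_old.fasta > 001_new.fasta
```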

randomreads.sh only produces reads for chr1 to chr7

I used randomreads.sh from bbmap to generate reads from a fasta file generated with FastaAlternateReferenceMaker from GATK. It seems that no matter which options I choose, the script stops generating reads at chromosome 7, even though the fasta file contains all contigs from…

how to add the sample name to the end read headers

I would need to add the sample name at the end of all the read headers in that fasta sample. For example I have #Sample1 #>read1 #ATGC #Sample2 #>read1 #ATGC Desired output: #Sample1 #>read1/Sample1 #ATGC #Sample2 #> read1/Sample2…

write output files with default name

I have prepared a shell script file and for the output I want to have a default name, but something is wrong. Can anybody revise this command? calldir=/profile/variant/input/ base=$(echo $sam | sed "s/.sam.*/_sorted/g") sam=/profile/variant/input/s5000W_b2.bam --output $calldir/$(basename $base)_series_call.vcf.gz But in the output the file…
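
Two things stand out in the snippet as quoted: `$sam` is used before it is assigned, and the pattern `.sam.*` does not match a file ending in `.bam`. A sketch of one possible fix, assuming the intended name is the BAM basename followed by `_sorted_series_call.vcf.gz` (the excerpt is cut off, so that target name is a guess):

```shell
# Assign the input first, then derive the output name from it.
calldir=/profile/variant/input
sam=/profile/variant/input/s5000W_b2.bam

base="$(basename "$sam" .bam)_sorted"          # s5000W_b2_sorted
output="$calldir/${base}_series_call.vcf.gz"

echo "$output"
```

The resulting `$output` value can then be passed to the caller's `--output` option.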

[slurm-users] Need to free up memory for running more than one job on a node

Hello, (This is my first time submitting a question to the list) We have a test-HPC with 1 login node and 2 computer nodes. When we submit 90 jobs onto the test-HPC, we can only run one job per node. We seem to be allocating all memory to the one…

Getting sequences from fastq file using Grep command

I have been trying to get a sequence (e.g. GCGAGCCCCACATCGCCCCCCCGATTGTAATAAATAA) from a fastq file (file.fastq) I have and output a fq file. I have tried the command: grep -A 2 -B 1 'GCGAGCCCCACATCGCCCCCCCGATTGTAATAAATAA' file.fastq | sed '/--/d' > output.fq I got…
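
One caveat with the pipeline above: `sed '/--/d'` removes every line containing `--`, which can include quality lines, and `grep` may also match the pattern inside a quality or header line. A more conservative sketch that walks the file in four-line records with `awk`, assuming unwrapped FASTQ; the function name `fastq_grep` is hypothetical.

```shell
# Print complete FASTQ records whose sequence line contains the given
# literal string. Assumes each record is exactly four lines.
fastq_grep() {
    awk -v pat="$1" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 {
            # index() does a plain substring search, not a regex match
            if (index(seq, pat)) { print header; print seq; print plus; print $0 }
        }'
}

# Usage with the names from the post:
# fastq_grep GCGAGCCCCACATCGCCCCCCCGATTGTAATAAATAA < file.fastq > output.fq
```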

Grepping through API payloads with Gron

Introduction If you have spent any time reading some of my older articles, you know I am a fan of jq. In my article on How to Exploit APIs with cURL I showed how to parse API responses with it. I went even further when I showed how to extract…

Metagenomic highlight contrasting elevational pattern of bacteria- and fungi-derived compound decompositions in forest soils

Bomble YJ, Lin CY, Amore A, Wei H, Holwerda EK, Ciesielski PN, Donohoe BS, Decker SR, Lynd LR, Himmel ME (2017) Lignocellulose deconstruction in the biosphere. Curr Opin Chem Biol 41:61–70. doi.org/10.1016/j.cbpa.2017.10.013 Cardenas E, Kranabetter JM, Hope G, Maas KR, Hallam S, Mohn WW (2015)…

Computational Scientist/ Spatial Biology – Frederick National Lab for Cancer Research

The Frederick National Laboratory is a Federally Funded Research and Development Center (FFRDC) sponsored by the National Cancer Institute (NCI) and operated by Leidos Biomedical Research, Inc. The lab addresses some of the most urgent and intractable problems in the biomedical sciences in cancer and AIDS, drug development and first-in-human…

A Beginner’s Guide to Perform Molecular Dynamics Simulation of a Membrane Protein using GROMACS — GROMACS tutorials https://tutorials.gromacs.org documentation

Building the protein-membrane system in CHARMM-GUI We are now ready to embed the protein structure in the membrane in the proper location and orientation and construct the membrane composition we desire. To do this, we utilized the CHARMM-GUI input Generator, a handy web-based tool to generate GROMACS inputs for the…

How to extract haplotype data from phased bcf files

Hello, I have filtered/processed phased bcf files from wgs. I would like to extract the haplotype data per sample, so that I have a tab delim file which looks like this: Sample Chr Pos hap1 hap2 AW23 chr1 1234 A…

Continue Reading How to extract haplotype data from phased bcf files
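
As a rough illustration of the table layout asked for in the haplotype question above, the following Python sketch turns one phased genotype into a Sample/Chr/Pos/hap1/hap2 row. It assumes the per-sample fields (reference allele, alternate alleles, and a GT string such as 0|1) have already been pulled out of the BCF by some other tool; the function names are hypothetical and the allele values in the usage note are made up for illustration.

```python
def split_haplotypes(ref, alts, gt):
    """Return (hap1, hap2) alleles for a phased diploid genotype such as '0|1'."""
    if "|" not in gt:
        raise ValueError(f"genotype {gt!r} is not phased")
    alleles = [ref] + list(alts)
    # '.' marks a missing allele call in a VCF/BCF genotype
    return tuple("." if idx == "." else alleles[int(idx)] for idx in gt.split("|"))


def haplotype_row(sample, chrom, pos, ref, alts, gt):
    """Format one tab-delimited line: Sample, Chr, Pos, hap1, hap2."""
    hap1, hap2 = split_haplotypes(ref, alts, gt)
    return "\t".join([sample, chrom, str(pos), hap1, hap2])
```

For example, `haplotype_row("AW23", "chr1", 1234, "A", ["G"], "0|1")` gives the line `AW23`, `chr1`, `1234`, `A`, `G` separated by tabs. Depending on the bcftools version, the `%TGT` tag of `bcftools query` may make this translation step unnecessary.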

Docker Error while running nf-core/rnaseq pipeline

I have run an nf-core pipeline with the following parameters: nextflow run nf-core/rnaseq -r 3.10.1 --input samplesheet.csv --outdir outputlatest --fasta chr22_with_ERCC92.fa -profile docker --gtf chr22_with_ERCC92.gtf --skip_multiqc true --skip_dupradar true --skip_stringtie true --aligner star_salmon --pseudo_aligner salmon --max_memory 3.5GB --max_cpus 4 Receiving an error related…

Continue Reading Docker Error while running nf-core/rnaseq pipeline

Differentially expressed gene with high log2foldchange by DESeq2; but not meaningful at the individual level

Hi all, I am working with RNA-Seq data from human samples (24 cases, 20 controls) to find differentially expressed genes. My RNA-Seq data is unstranded. Here are the commands that I used to align the fastq files: ls *_1P.fastq.gz | parallel --bar -j8 'R2=$(echo {} | sed s/_1/_2/) && out=$(echo {} |…

Continue Reading Differentially expressed gene with high log2foldchange by DESeq2; but not meaningful at the individual level

Trying to Change the Formatting of a Graph in R using ggplot2 – RStudio IDE

I want to modify this output graph so that the Wall and Restored bars are next to each other for each site. Wall and Restored are the site types. This is the graph that I would like it to resemble in terms of formatting the bar placements: Here is my…

Continue Reading Trying to Change the Formatting of a Graph in R using ggplot2 – RStudio IDE

How to Split 3000 WGS CRAM files into 1Mbp length chunks

Hello, I have 3000 WGS CRAM files and I want to split them into 1Mbp chunks. I want to split with exact genomic coordinate locations, e.g. starting from 1 to 1000000bp, 1000001bp to 2000000bp, 2000001bp to 3000000bp, etc.…

Continue Reading How to Split 3000 WGS CRAM files into 1Mbp length chunks
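
The coordinate arithmetic in the CRAM-splitting question above (1 to 1000000, 1000001 to 2000000, and so on) can be sketched in Python as follows. This is only a sketch of the window calculation, not of the splitting itself; the function names are hypothetical, and the contig length is assumed to be known already, for example from the file header or a reference index.

```python
def windows(contig_length, size=1_000_000):
    """Yield 1-based, inclusive (start, end) windows covering a contig."""
    for start in range(1, contig_length + 1, size):
        # The last window is clipped to the end of the contig
        yield start, min(start + size - 1, contig_length)


def region_strings(contig, contig_length, size=1_000_000):
    """Format the windows as 'contig:start-end' region strings."""
    return [f"{contig}:{s}-{e}" for s, e in windows(contig_length, size)]
```

For a contig of 2,500,000 bp this yields three windows, the last one ending at 2500000. If such region strings are passed to a region query like `samtools view`, reads overlapping a window boundary will typically be returned for both neighbouring windows, which may or may not be what is wanted.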

Error in Adding 1000Genomes Ancestral Allele info: Using VCF tools fill-aa

Hi, I am trying to add ancestral alleles to 1000 Genomes Phase3 VCF files. I have used the “human_ancestor_GRCh37_e59.tar.bz2” files as the ancestral allele input. The steps I have used are: cat human_ancestor_3.fa | sed 's,^>.*,>1,' | bgzip…

Continue Reading Error in Adding 1000Genomes Ancestral Allele info: Using VCF tools fill-aa

angular – Cannot get the data from openAPI service – returns undefined

I have this service that I got from processing an openAPI spec file (.yaml) and the thing is I do not get the filtered data I want. Here is the method from my service : /** * Search cases with filters * @param status status of cases to return (all…

Continue Reading angular – Cannot get the data from openAPI service – returns undefined

Edit and re-head BAM file

Hi there, I have a BAM file which needs to be edited and re-headed. I’m aware of how to do so; the problem is that for some reason the sed command I’m using does not catch the sequence I have to remove… Below,…

Continue Reading Edit and re-head BAM file

Unable to create environment – Technical Support

Tried to create an environment using Conda and was not able to do so. Have copy-pasted the message below. Would be grateful to know what the issue is and how to resolve it. (base) C:\Users\Mathangi Janakiraman>wget data.qiime2.org/distro/core/qiime2-2023.2-py38-linux-conda.yml --2023-05-11 12:54:47-- data.qiime2.org/distro/core/qiime2-2023.2-py38-linux-conda.yml Resolving data.qiime2.org (data.qiime2.org)... 54.200.1.12 Connecting to data.qiime2.org (data.qiime2.org)|54.200.1.12|:443... connected. ERROR: cannot verify…

Continue Reading Unable to create environment – Technical Support

No differentially expressed genes after multiple testing correction in mice

Hi all, I am working with RNA-seq data from mice (group A N=3 vs group B N=3). The mice are littermates, of which group A overexpresses a human transgene, which I verified. I have had .cram files from mouse…

Continue Reading No differentially expressed genes after multiple testing correction in mice

Convert Accession Numbers in blast HIT output to Full Taxonomy

I have the Hit table output from a BlastWeb search which presents itself basically like this:
M_A00619 | XM_034926345.1 | 100.000
M_A00619 | OV754683.1 | 95.588
M_A00619 | OV754677.1 | 95.588
M_A00619 | OV737695.1 | 95.588
I want to…

Continue Reading Convert Accession Numbers in blast HIT output to Full Taxonomy
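
Before any taxonomy lookup, the accession column has to be pulled out of the hit table quoted above. A minimal Python sketch of that first step is shown here, assuming the table really is pipe-separated with three fields per row (query, accession, percent identity) as in the excerpt; the function names are hypothetical, and the taxonomy lookup itself is not attempted.

```python
def parse_hits(text):
    """Parse 'query | accession | identity' rows into (str, str, float) tuples."""
    hits = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        query, accession, identity = fields
        try:
            hits.append((query, accession, float(identity)))
        except ValueError:
            continue  # skip header-like rows whose third field is not a number
    return hits


def unique_accessions(hits):
    """Return accessions in first-seen order, without duplicates."""
    return list(dict.fromkeys(accession for _, accession, _ in hits))
```

The resulting accession list could then be handed to whatever accession-to-taxonomy resource is available, which is the part the original question is actually about.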

Changing a fasta header

Hi, I have an annotated fasta file and I want to add the word that follows ‘Similar to’ at the first position after >:
>_Anouracaudifer_00017283-RA transcript Name:"Similar to Chid1 Chitinase domain-containing protein 1 (Rattus norvegicus OX=10116)" offset:0 AED:0.30 eAED:0.30 QI:0|0|0|1|1|1|12|0|393
ATGAAGGCGCTCCTGCATGTGCTCTGGCTCACTCTGGCCTGCGGCTCTGCTCACACCACCCTGTCGAAGTCGGATGCCAAGAAGTCTGCCTCCAAGACACTGCAGGAGAAGACTCAGCTCTCAGAGACACCTGTGCAGGACCGGGGTCTGGTGGTAACAGACCCCCGAGCCGAGGACG
I want the output…

Continue Reading Changing a fasta header

Changing a fasta header

Hi, I have an annotated fasta file and I want to add the word that follows ‘Similar to’ at the first position after >:
>_Anouracaudifer_00017283-RA transcript Name:"Similar to Chid1 Chitinase domain-containing protein 1 (Rattus norvegicus OX=10116)" offset:0 AED:0.30 eAED:0.30 QI:0|0|0|1|1|1|12|0|393
ATGAAGGCGCTCCTGCATGTGCTCTGGCTCACTCTGGCCTGCGGCTCTGCTCACACCACCCTGTCGAAGTCGGATGCCAAGAAGTCTGCCTCCAAGACACTGCAGGAGAAGACTCAGCTCTCAGAGACACCTGTGCAGGACCGGGGTCTGGTGGTAACAGACCCCCGAGCCGAGGACG
I want the output…

Continue Reading Changing a fasta header

Error while running nf-core/rnaseq pipeline

Hello guys! I am trying to run the nf-core/rnaseq pipeline with the following parameters: nextflow run nf-core/rnaseq -r 3.10.1 --input samplesheet.csv --outdir output --fasta chr22_with_ERCC92.fa -profile docker --gtf chr22_with_ERCC92.gtf --max_memory 200GB I keep getting a persistent error: WARN: Got an interrupted exception while taking…

Continue Reading Error while running nf-core/rnaseq pipeline

hpc – Slurm – Execute a lot of serial jobs parallel

Batch script to run many serial jobs in parallel on an HPC with Slurm: I want to run a large number of independent serial jobs in parallel using Slurm. However, I run into the maximum number of 100 jobs that a user can submit. Therefore only 100 jobs are processed simultaneously…

Continue Reading hpc – Slurm – Execute a lot of serial jobs parallel

Answer: R scripting

Your code works for me, as long as you (1) fix the column heading to make sure Location is capitalized, and (2) make sure your data frame is actually 3 columns. If it is only a single column, I get the error you get. Your data should be in 3…

Continue Reading Answer: R scripting

find and replace between two files

Hi all, I know there’s a way to do this within Unix, but I cannot figure out how to do it with the functions that I know (grep, sed, awk, cut, paste). I am dealing with output from blast, so I thought I would try to see if anyone in…

Continue Reading find and replace between two files

Help in replicating LDSC heritability estimates

Hi, I am trying to replicate the heritability estimates based on the insomnia GWAS summary statistics using LDSC. However, I have encountered a problem as my estimates seem to be only about half of the original estimates listed in Table S1. Despite my efforts to locate the error, I have…

Continue Reading Help in replicating LDSC heritability estimates

Change sequence ID in fastq file generated by bcftools mpileup

Hi everybody! I’m currently working on an HHV8 genetic study and I am facing an issue with my bcftools command. I want to generate consensus sequences using the bcftools mpileup command and BAM files. However, all ID get…

Continue Reading Change sequence ID in fastq file generated by bcftools mpileup

GFF/GTF file error / featureCounts

Hi all, I am trying to generate a count matrix for sorted BAM files, using featureCounts on Linux. I have a non-model organism (bacteria), so I generated the annotation file using both PROKKA and RAST. I used all of the following files in featureCounts: PROKKA.gff, RAST.gff, RAST.gtf, and a gffread-converted PROKKA.gtf file. But still facing…

Continue Reading GFF/GTF file error / featureCounts

Editing fasta headers

Yes, I can help you with this. You can use a scripting language like Python to automate this task. Here is some Python code that you can use to rename the headers of your fasta files: In this code, you need to replace “/path/to/fasta/files/” with the path to the directory…

Continue Reading Editing fasta headers

1000 genomes hg38 with dbSNP rsid

Hi, does anyone know where I can download the latest version of 1000 Genomes, on build hg38, in VCF format (or PLINK format), that ALSO contains the dbSNP RSid in the VCF ID field? I looked at the IGSR website, dbSNP, UCSC, etc. So…

Continue Reading 1000 genomes hg38 with dbSNP rsid

Mapping paired end reads with ngm and samtools, using prefixes and suffixes for creating vcf eventually

So, I have problems with a script for mapping and with creating SAM and BAM files to eventually get to a VCF. My input files look like this: 262 files, paired reads, with…

Continue Reading Mapping paired end reads with ngm and samtools, using prefixes and suffixes for creating vcf eventually

Nextflow memory issues custom config -c

Hi all, I am trying to run nextflow on my laptop: nextflow run nf-core/rnaseq \ --input samplesheet.csv \ --genome mm10 \ -profile docker I am having issues with memory: Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQC_UMITOOLS_TRIMGALORE:FASTQC (KO_3)' Caused by: Process requirement exceed available memory --…

Continue Reading Nextflow memory issues custom config -c

ggplot2 – ggplot: “No non-missing arguments to min/max; returning Inf”

I’m attempting to recreate this plot (my version: lat/lon by year), but keep getting these warnings after running the ggplot code: sms2 |> mutate(fCYR = factor(CYR)) |> ggplot(aes(x = Longitude, y = Latitude, fill = est, group = fCYR)) + geom_raster(aes(x = Longitude, y = Latitude, fill = est, group…

Continue Reading ggplot2 – ggplot: “No non-missing arguments to min/max; returning Inf”

prefix extraction and preparation for mapping and variant calling

Hello humans, I am struggling with a bash script that should actually work as far as I can see. I need to extract prefixes of 262 files in a directory that contains reads. I will map them for later variance…

Continue Reading prefix extraction and preparation for mapping and variant calling

Xenocell – Error in classify reads

Hello everyone, I’m trying to run Xenocell on my dataset. I have some problems executing the “classify reads” step. The command terminates after starting the classification (“terminate called after throwing an instance of ‘std::ios_base::failure’”), and I don’t know how to fix the error. Any help would be appreciated. Thank you!…

Continue Reading Xenocell – Error in classify reads

docker – Permissions error running NextFlow RNAseq test pipeline

I’ve been trying to run a minimal example of the NextFlow RNAseq pipeline, like so: nextflow run nf-core/rnaseq -r 3.10.0 -profile test,docker --outdir /home/kai/RNASeq/rnaseq_test/test_output However, this appears to return the error below: Error executing process > 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED (genome_gfp.gtf)' Caused by: Process `NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED (genome_gfp.gtf)` terminated with an error exit status (1)…

Continue Reading docker – Permissions error running NextFlow RNAseq test pipeline

Realigning BAM files to new reference

Hi, I am looking to create a panel of normals for a somatic variant caller. For normals, I have been provided with a set of WES bam files that have been preprocessed according to GATK best practices. However, they have been aligned to another reference genome than my case samples…

Continue Reading Realigning BAM files to new reference

Failure working with the tmp directory

I have a Nextflow workflow in which the GATK DepthOfCoverage process fails to work with the defined tmp directory (--tmp-dir tmp): process pf_read_depth { tag "tag" scratch true publishDir … input: tuple val(pair_id), path(pf_bam) path refdir output: file("final_${pair_id}.tsv") script: """ samtools index…

Continue Reading Failure working with the tmp directory

Introducing Twilio’s OpenAPI Specification GA

Today, we are thrilled to share the news that we have officially open-sourced the OpenAPI specification for every Twilio API. As a commitment to supporting and streamlining the development process for our users, we have long provided helper libraries and tooling in various popular programming languages and environments. With this…

Continue Reading Introducing Twilio’s OpenAPI Specification GA

Command line training – genotoul-bioinfo

The GenoToul bioinformatics platform, Sigenae and SaAB (MIAT) offer a catalog of training sessions. If you need bioinformatics training on tools which are not covered in the existing catalog, please feel free to contact us (please add “Request for training” in the subject of your request). For example we have…

Continue Reading Command line training – genotoul-bioinfo

Alignment File Processing | Variant Analysis

Learning objectives: differentiate between query-sorted and coordinate-sorted alignment files; describe and remove duplicate reads; process a raw SAM file for input into a BAM for GATK. The processing of the alignment files (SAM/BAM files) can be done either with samtools or Picard, and they are for the most part interchangeable….

Continue Reading Alignment File Processing | Variant Analysis

PhD Position to Develop Machine Learning Methods for Microbiome Analysis

Looking for a highly motivated PhD student for Computational Biology research, with an algorithm development focus. The Ecological and Evolutionary Signal-processing (EESI) and Informatics lab is doing a restart from the pandemic and will be composed of a dynamic,…

Continue Reading PhD Position to Develop Machine Learning Methods for Microbiome Analysis

HISAT2 paired end multiple files loop error

Hi, I got stuck running hisat2 in a loop. My input files are here, and here is my loop code: for f in `ls -1 *_1_fp.fastq.gz | sed 's/_1_fp.fastq.gz//' ` do hisat2 -rna-strandness RF -x GRCm39 -1 ${f}_1_fp.fastq.gz -2 ${f}_2_rp.fastq.gz 2> ${f}.log|…

Continue Reading HISAT2 paired end multiple files loop error
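
The loop in the HISAT2 question above derives each sample prefix by stripping the forward-read suffix and then builds the mate file name from it. The same pairing step can be sketched in Python as a way to check, before any alignment is launched, that every forward file actually has a mate. The suffixes are taken from the excerpt; the function name is hypothetical, and this does not attempt to diagnose the error the poster hit.

```python
def pair_reads(filenames, r1_suffix="_1_fp.fastq.gz", r2_suffix="_2_rp.fastq.gz"):
    """Map each sample prefix to its (forward, reverse) file names.

    Forward files without a matching reverse file are left out.
    """
    names = set(filenames)
    pairs = {}
    for name in sorted(names):
        if name.endswith(r1_suffix):
            sample = name[: -len(r1_suffix)]
            mate = sample + r2_suffix
            if mate in names:
                pairs[sample] = (name, mate)
    return pairs
```

In practice the file list might come from something like `os.listdir(".")`; comparing the number of pairs with the number of forward files is a quick way to spot unpaired samples.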