Tag: GREP

Trying to edit VCF file

Hi, I’ve been trying to take some samples out of a file, but it appears it has only taken some of the information out. When I tried to run code I had in R that works for all the samples, it gave me an…


Filtering mitochondrial reads from ATAC-Seq aligned reads- what to do with reads that have MT in RNEXT field

Hi all, I am trying to filter mitochondrial reads from my ATAC-seq data after trimming with Trimmomatic and then aligning with Bowtie2. After searching through many pipelines, I have found 2 ways that people often do this (both using inputs that are sorted and indexed BAM files): 1) with samtools…


scripts/generate_errors.pl – third_party/github.com/ARMmbed/mbedtls – Git at Google

    #!/usr/bin/env perl
    # Generate error.c
    #
    # Usage: ./generate_errors.pl or scripts/generate_errors.pl without arguments,
    # or generate_errors.pl include_dir data_dir error_file

    use strict;

    my ($include_dir, $data_dir, $error_file);

    if( @ARGV ) {
        die "Invalid number of arguments" if scalar @ARGV != 3;
        ($include_dir, $data_dir, $error_file) = @ARGV;

        -d $include_dir or die "No such…


scripts/ecc-heap.sh – third_party/github.com/ARMmbed/mbedtls – Git at Google

    #!/bin/sh
    # Measure heap usage (and performance) of ECC operations with various values of
    # the relevant tunable compile-time parameters.
    #
    # Usage (preferably on a 32-bit platform):
    # cmake -D CMAKE_BUILD_TYPE=Release .
    # scripts/ecc-heap.sh | tee ecc-heap.log

    set -eu

    CONFIG_H='include/mbedtls/config.h'

    if [ -r $CONFIG_H ]; then :; else
        echo…


What Are The Most Common Stupid Mistakes In Bioinformatics?

While I of course never have stupid mistakes…ahem…I have many “friends” who: forget to check both strands; generate random genomic sites without avoiding masked (NNN) gaps; confuse genome freezes and even species; but I’m sure there are some other very…


VCF header line counting

Hello happy bioinformaticians 🙂 This may be a very simple question, but how can I count the lines (rows) in the header of a VCF? It can be done manually, but I want to get an accurate result. Thanks, BG
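
One way to approach the header-counting question above, as a minimal Python sketch rather than a definitive answer: it assumes that "header" means every leading line starting with `#` (the `##` meta-information lines plus the `#CHROM` column line). The function names `count_vcf_header_lines` and `open_vcf` and the in-memory example are illustrative, not taken from the post.

```python
import gzip
import io


def count_vcf_header_lines(handle):
    """Count the leading lines that start with '#'.

    This covers the '##' meta-information lines and the '#CHROM' column line.
    Counting stops at the first record line.
    """
    count = 0
    for line in handle:
        if not line.startswith("#"):
            break
        count += 1
    return count


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


# Small in-memory example so the sketch runs without a real file.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "##source=demo\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t50\tPASS\t.\n"
)
header_lines = count_vcf_header_lines(example)
print(header_lines)
```

For a real file, something like `with open_vcf("calls.vcf.gz") as handle:` would replace the in-memory example; the file name is a placeholder.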


[slurm-users] sbatch mem-per-gpu and gres interaction

Hello everybody, I am observing an interaction between the --mem-per-gpu, --cpus-per-gpu and --gres settings in sbatch which I do not understand. Basically, if the job is submitted with --gres=gpu:2 the --mem-per-gpu and --cpus-per-gpu settings appear to be observed. If the job is submitted with --gres=gpu:a100:2 the settings appear to be ignored…


Segmentation fault Biopython pairwise alignment

Hi everybody! I’m working on my own pairwise sequence alignment program in Python. I use the pairwise2.align command from Biopython. When I use it with small sequences it works. I put the code below (2 for a match, -2 for…


Upgrade to PyTorch 2.0 – DEV Community

Why Upgrade? Upgrade Objectives: Python ≥ 3.8, ≤ 3.11; CUDA ≥ 11.7.0; cuDNN ≥ 8.5.0.96; PyTorch ≥ 2.0.0. “We expect that with PyTorch 2, people will change the way they use PyTorch day-to-day.” “Data scientists will be able to do with PyTorch 2.x the same things that they did with 1.x,…


grep value from html file

I have 200 HTML files that contain information such as Filename, Filetype, Total Sequences, etc. Please see the attached screenshot. I need to grep the Filename and Total Sequences from the Value column (in this screenshot I need IGM17-B_S162_read_1.fastq and the value 9237623) and…
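
A possible Python alternative to grep for this task, offered as a sketch under a stated assumption: each label sits in one table cell (`<td>`) and its value in the cell immediately after it. The actual HTML layout is not shown in the post, so the pattern and the sample markup below are hypothetical and may need adjusting.

```python
import re

# Assumed layout: <td>Label</td><td>Value</td>. This is a guess at the report
# structure, not something confirmed by the original post.
ROW = re.compile(
    r"<td>\s*(Filename|Total Sequences)\s*</td>\s*<td>\s*([^<]*?)\s*</td>",
    re.IGNORECASE,
)


def extract_values(html):
    """Return a dict mapping each matched label to its value cell."""
    return {match.group(1): match.group(2) for match in ROW.finditer(html)}


# Hypothetical fragment built from the two values quoted in the question.
sample = (
    "<table>"
    "<tr><td>Filename</td><td>IGM17-B_S162_read_1.fastq</td></tr>"
    "<tr><td>File type</td><td>Conventional base calls</td></tr>"
    "<tr><td>Total Sequences</td><td>9237623</td></tr>"
    "</table>"
)
values = extract_values(sample)
print(values["Filename"], values["Total Sequences"])
```

Looping this over the 200 files would be a matter of reading each file's text and calling `extract_values` on it.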


Problems with CP2K+PLUMED build

After successfully building the local.psmp architecture of CP2K, I tried to install CP2K+PLUMED. This failed, as I describe below.

1) I built PLUMED from source without errors, using the standard procedure:

    ./configure --prefix=/storage/home/stm9/group/SOFTWARE/plumed-2.8.2/exe
    make
    make install

2) I created an architecture file, local_PLUMED.psmp, by modifying local.psmp,…


r – First I had an error with GLIBCXX_3.4.30 and now I can’t create any more conda environments

I was using R in RStudio under a conda environment with various bioconductor packages. But suddenly I ran into this error when I tried to load a package:

    ImportError: /home/user/anaconda3/envs/dmcgb/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /lib/x86_64-linux-gnu/libLLVM-13.so.1)

It is not the first time that I have this error so I…


Subset FASTA file for taxonomy (phylum name) in R

Dear all, I would like to subset a FASTA file so that I get the sequences belonging to a certain phylum (in my case: Nematoda). The headers of the FASTA file start with the phylum name, so I thought this…


[slurm-users] Question about PMIX ERROR messages being emitted by some child of srun process

Hi, So I’m testing the use of Open MPI 5.0.0 pre-release with the Slurm/PMIx setup currently on the NERSC Perlmutter system. First off, if I use the PRRte launch system, I don’t see the issue I’m raising here. But many NERSC users prefer to use the srun “native” launch…


count protein-coding genes per contig

If you have an annotation file, for example the following GTF from human:

    1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";
    1 havana transcript 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5";…
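
The counting that the title describes could be sketched in Python roughly as follows. This is an illustration, not the method from the original answer (which is truncated above). It assumes a tab-separated GTF in which `gene` rows carry a `gene_biotype` attribute, as in the excerpt; the function name and the second and later sample rows are made up for the example.

```python
import re
from collections import Counter

BIOTYPE = re.compile(r'gene_biotype "([^"]+)"')


def count_protein_coding_genes(gtf_lines):
    """Count 'gene' features with gene_biotype "protein_coding" per contig."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only gene rows, so transcripts and exons are not double-counted
        match = BIOTYPE.search(fields[8])
        if match and match.group(1) == "protein_coding":
            counts[fields[0]] += 1
    return counts


# The first row mirrors the excerpt; the others are invented for illustration.
sample = [
    '1\thavana\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972"; gene_biotype "transcribed_unprocessed_pseudogene";\n',
    '1\tdemo\tgene\t65419\t71585\t.\t+\t.\tgene_id "DEMO_GENE_1"; gene_biotype "protein_coding";\n',
    '1\tdemo\ttranscript\t65419\t71585\t.\t+\t.\tgene_id "DEMO_GENE_1"; gene_biotype "protein_coding";\n',
    '2\tdemo\tgene\t38814\t46870\t.\t-\t.\tgene_id "DEMO_GENE_2"; gene_biotype "protein_coding";\n',
]
per_contig = count_protein_coding_genes(sample)
for contig, n in sorted(per_contig.items()):
    print(contig, n)
```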


Get protein information from ensemblbacteria using interpro

So I am trying to access protein information using InterPro IDs on ensemblbacteria. I have written MySQL code in R; however, I can’t quite figure out how to get protein information from the IDs programmatically. I have put in…


filter reads in BAM having a tag

Does anyone have a simple solution for filtering reads in a BAM/SAM file having a certain TAG? This came up trying to filter out reads from 10x without a proper CB tag defined (which is causing trouble in downstream analysis tools). I’m surprised…
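
As one illustration of the idea, here is a small Python sketch that works on SAM text, assuming the BAM has already been converted to SAM with its header (for example via `samtools view -h`). It keeps header lines and any read whose optional fields (column 12 onward) include the requested tag. The function names and sample reads are hypothetical, and this is not necessarily the approach the thread settled on.

```python
def has_tag(sam_line, tag):
    """Return True if an optional field (column 12 onward) starts with 'TAG:'."""
    fields = sam_line.rstrip("\n").split("\t")
    return any(field.startswith(tag + ":") for field in fields[11:])


def filter_sam(lines, tag="CB"):
    """Yield header lines and alignment lines that carry the given tag."""
    for line in lines:
        if line.startswith("@") or has_tag(line, tag):
            yield line


# Made-up records: r1 has a CB tag, r2 does not.
sample = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tCB:Z:AAAC-1\tNH:i:1\n",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\n",
]
kept = list(filter_sam(sample))
print("".join(kept), end="")
```

Checking only columns 12 and later avoids false matches on read names or sequences that happen to contain the text `CB:`.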


Filtering rows based on a list if all the entries of the list correspond to a specific row value

Hi, I have a table with millions of columns and hundreds of rows. There are two columns namely Genome and Gene. I also have a list of genes of interest…


Introducing Slurm | Princeton Research Computing – SLURM Examples –

OUTLINE

On all of the cluster systems (except Nobel and Tigressdata), users run programs by submitting scripts to the Slurm job scheduler. A Slurm script must do three things: prescribe the resource requirements for the job; set the environment; specify the work to be carried out in the form of shell commands. Below…


name ‘torch’ is not defined

Are you encountering the Python error NameError: name ‘torch’ is not defined right now? If you don’t have any idea how to troubleshoot this error, then continue reading. In this article, we’ll show you how you can fix the NameError: name ‘torch’ is not defined in a simple way. Before we start, this…


Solved GETTING THE SAME ERROR IN while running the line of

GETTING THE SAME ERROR IN while running the line of rule provide the activity names on the dots displayed on the scatter plot rules,

    library(arules)
    library(arulesViz)
    library(data.table)
    library(ggplot2)
    data <- fread("severeinjury.csv")
    data <- data[grep("^23", data$Primary.NAICS.Code),]
    length(data$Nature)
    length(data$Part.of.Body)
    length(data$Event)
    sum(is.na(data$Nature))
    sum(is.na(data$Part.of.Body))
    sum(is.na(data$Event))
    data <- data[complete.cases(data[, c("Nature", "Part.of.Body", "Event")]),]
    data[, Nature :=…


Problem extracting the species marker genes from metaphlan4 database – StrainPhlAn

Hi, I ran into a problem extracting species marker genes from the metaphlan4 database in step 2 of the tutorial on identifying strain transmission events. The version information was as follows:

    $ strainphlan -v
    Wed May 3 20:42:37 2023: StrainPhlAn version 4.0.3 (24 Oct 2022)

when I run

    module load Anaconda3
    source activate metaphlan4
    module load Bowtie2
    filename=$(cat ~/WORKSPACE/result/wzy/mother_infant/metaphlan4/SGB_input_transmission.txt…


Modular Docs – Inference Engine Python demo

This is a preview of the Modular Inference Engine. It is not publicly available yet and APIs are subject to change. If you’re interested, please sign up for early access. The Modular Inference Engine is the world’s fastest unified inference engine, designed to run any TensorFlow or PyTorch model on…


Convert Accession Numbers in blast HIT output to Full Taxonomy

I have the Hit table output from a BlastWeb search which presents itself basically like this:

    M_A00619 | XM_034926345.1 | 100.000
    M_A00619 | OV754683.1 | 95.588
    M_A00619 | OV754677.1 | 95.588
    M_A00619 | OV737695.1 | 95.588

I want to…


How to demultiplex a pooled fastq sequence file and extract each sample sequences

Hello all, I have a pooled sequence file named “ERR1806550_1.fastq.gz” containing single-end sequences. Now, I want to demultiplex this sequence file and extract 37 sample sequences of my interest from it. These are the barcode sequences…


deep learning – PyTorch not detecting AMD GPU although ROCM installed on Ubuntu 20.04 LTS

OS Version: Ubuntu 20.04 LTS
PyTorch Version: 2.0
ROCM version: 5.0.2

I installed a fresh copy of Ubuntu 20.04 LTS on my desktop with AMD Radeon RX 5700 XT GPU. Both ROCM and PyTorch installed fine. However, PyTorch is not able to detect GPU. Any pointers here?

    $ python -c…


[SOLVED] Runtimeerror: couldnt install torch

We often run into errors like “RuntimeError: Couldn’t install torch.” It is one of the most common errors that developers may encounter while running their code. The “RuntimeError: Couldn’t install torch” error typically occurs because there is a problem with installing the PyTorch library on your system. We will…


tx2gene.txt : transcript-to-gene mapping file

Hi, I am trying to quantify gene counts from transcript abundance (from kallisto, salmon etc.) using tximport. For that I have to create a transcript-to-gene mapping file. How can I create this? I created one from GCF_013265735.2_USDA_OmykA_1.1_rna.fasta (rainbow trout) from NCBI…


rna seq – Why is there antisense sequence in RNAseq data

I’m looking at RNAseq data from CCLE. The data is paired-end. Take the cell line Hs578T and the gene HRAS as an example. The cell line carries a G12D mutation (c.35G>A), so the change in the CDS is:

    ggc ggtgtgggca agagtgcgct g – Wildtype CDS
    gAc ggtgtgggca agagtgcgct g – Mutant…


pangenome – Create a diagram venn

Hello, I would like to know if you can help me. I want to make a Venn diagram with the presence and absence data (.Rtab) from Roary (example fragment, the real list is about 8000 genes):

    Gene StrainA StrainB StrainC
    group_633 1 0…


How do i search a FASTA database by sequence in seqkit?

You could do it using seqkit grep or locate but in this case you should use a proper search program like blat instead.


Question regarding the output of BCFtools merge tool for VCF files

Sorry for repeating the question here, as I did not get enough of an answer last time: I have 6 VCF files that contain SNPs only and were produced by GATK. Each VCF represents one individual animal from breed X, so they are biological replicates. I also have another 6 files from…


Illumina HumanHT-12 V3.0 expression beadchip reading data

Edit November 28, 2020: Further reproducible code: A: GPL6883_HumanRef-8_V3_0_R0_11282963_A (illumina expression beadchip) — Most Illumina ‘chip’ studies that I have seen on GEO do not contain the raw data IDAT files. You can start with the tab-delimited file, but will also require the annotation file (contained in the *_RAW.tar file),…


How to find newly submitted accessions in NCBI

Dear all, I want to automate a process to identify newly submitted plant accessions in NCBI. I am scanning the NCBI FTP server, but I have not yet found any address to locate all SRA accessions. ftp.ncbi.nlm.nih.gov/ Does anybody have an…


find and replace between two files

Hi all, I know there’s a way to do this within Unix, but I cannot figure out how to do it with the functions that I know (grep, sed, awk, cut, paste). I am dealing with output from blast, so I thought I would try to see if anyone in…


zap.sh api scan config

I would like to use zap.sh or zap.jar to scan an OpenAPI API, but I have not had much luck yet. (Docker is not an option.) I have a problem with the API scan using the jar (it is also a problem with zap.sh). I have already installed required…


Either at.20377 doesn’t exist or the content differs.

Source: at
Version: 3.2.5-1
Severity: serious
Control: tags -1 bookworm-ignore
User: debian…@lists.debian.org
Usertags: regression

Dear maintainer(s), Your package has an autopkgtest, great. However, it fails on arm(64|el|hf) since September 2022 (and slightly longer on s390x). Can you please investigate the situation and fix it? I copied some of the output…


Using chroot and PAM to hide directories from users on an HPC cluster

I recently needed to make the group’s cluster computing environment available to a third party that was not fully trusted, and needed some isolation (most notably user data under /home), but also needed to provide a normal operating environment (including GPU, Infiniband, SLURM job submission, toolchain management, etc.). After thinking…


PyTorch 2.0 distribution that uses cuda only if available?

Hey folks, after upgrading to torch==2.0 yesterday, I have found that I am no longer able to run torch programs if the system doesn’t have CUDA. Here’s my observation on the various distributions: # PyTorch 2.0 for use with CUDA, CUDA libs get installed as pip deps if unavailable on…


error when attempting to install qiime2 2023.2 – Technical Support

dnfarsi (Dominic Farsi) April 10, 2023, 11:14am 1 Hello I appear to be having a similar issue. However I don’t have the (core dumped). I am using Ubuntu 22.04 LTS and installed miniconda and then natively installed qiime2. When using any qiime command I get this illegal instruction. Interestingly it…


r – Problems installing Biostrings. Failing to install GenomeInfoDb

I have seen this issue recur and have tried many options over the last two days, but none yielded a correct installation of any of these packages. I used BiocManager as suggested in other issues and also tried to install from local source; nothing seems to be working. This issue started…


Users of spack-based GROMACS installations beware of possible performance loss! – User discussions

Hi, It has recently come to our attention that default Spack builds of GROMACS use RelWithDebInfo instead of Release, which is the default in our build system. Due to the lower optimization levels in RelWithDebInfo, such builds will run up to 20% slower than release builds. Therefore, I strongly recommend to…


Bug#1033820: node-snapdragon: autopkgtest regression: Cannot find module ‘snapdragon-node’

On 4/3/23 21:55, Paul Gevers wrote:
> Hi yadd,
>
> On 03-04-2023 05:42, Yadd wrote:
>> I’m unable to reproduce this issue: there is a link that provides
>> snapdragon-node inside snapdragon-capture-set:
>
> I could by running the following on my laptop:
> paul@mulciber ~ $ autopkgtest --no-built-binaries node-snapdragon…


VEP-like tool for sequence ontology and HGVS annotation of VCF files

Mehari is a software package for annotating VCF files with variant effect/consequence. The program uses hgvs-rs for projecting genomic variants to transcripts and proteins and thus has high prediction quality. Other popular tools offering variant effect/consequence prediction include: Mehari offers predictions that aim to mirror VariantValidator, the gold standard for…


what does “exp1” mean in the gage() function?

@james-w-macdonald-5106

All the columns except for ‘stat.mean’ come from the input data. As an example, using the help page for gage:

    data(gse16873)
    cn=colnames(gse16873)
    hn=grep('HN',cn, ignore.case =TRUE)
    dcis=grep('DCIS',cn, ignore.case =TRUE)
    data(kegg.gs)
    data(go.gs)
    #go.gs with the…


Problem with fastq-dump

Hi, I am absolutely new to NGS data analysis and have just started working in CentOS. I installed sratoolkit with the commands:

    conda create -n sratoolkit_env -y
    conda activate sratoolkit_env
    conda install -c bioconda sra-tools -y

Then, as given in the Biostar Handbook (Bioinformatics Data…


Converting an output de-novo transcriptome assembled with Trinity to a .gff3 file

Hello! I’ve de novo assembled a transcriptome with Trinity, resulting in Trinity.fasta, whose headers look like this: >TRINITY_DN29256_c0_g1_i1 len=323 path=[0:0-322] followed, on the next line, by the sequence. To run an external downstream analysis with an R script,…


samtools idxstats not removing ChrM

I am trying to remove ChrM from my ChIP-seq data. Below is my pipeline for one sample up to where I am having the issue (samtools idxstats). The output file from samtools idxstats is the same size as the input so it doesn’t look…


Functional metagenomics uncovers nitrile-hydrolysing enzymes in a coal metagenome

Introduction

Cyanide-containing compounds are known as nitriles and are widely distributed in the natural environment. They are generated by different plants in various forms, such as ricinine, phenyl acetonitrile, cyanogenic glycosides, and β-cyanoalanine (Sewell et al., 2003). Anthropogenic activities have substantially influenced the production of vast quantities of nitrile…


how to make a .tbi file of .gtf.gz?

Hello, I have a .gtf.gz file which I am going to use in Python code. The pysam module requires an indexed file for the .gtf.gz. How can I index that file? Thank you in advance.


Please give me a grep command to get Gene IDS and TPM values from a stringtie output gtf file

Hi, Could anyone please give me a grep command to get gene_id and the respective TPM values from a StringTie output file? My result output file looks like the following…
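
Since the example output is cut off above, the following is only a hedged Python sketch of one way to pull these values. It assumes a tab-separated GTF in which `transcript` rows carry `gene_id "..."` and `TPM "..."` attributes. The function name and sample rows are illustrative and should be checked against the real file.

```python
import re

GENE_ID = re.compile(r'gene_id "([^"]+)"')
TPM = re.compile(r'TPM "([^"]+)"')


def gene_tpm_pairs(gtf_lines):
    """Yield (gene_id, TPM) for each transcript row that has both attributes."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment lines at the top of the file
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # exon rows are assumed not to carry TPM
        gene = GENE_ID.search(fields[8])
        tpm = TPM.search(fields[8])
        if gene and tpm:
            yield gene.group(1), float(tpm.group(1))


# Invented rows in the assumed layout.
sample = [
    "# hypothetical header comment\n",
    'chr1\tStringTie\ttranscript\t100\t900\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1"; cov "12.5"; FPKM "3.2"; TPM "7.9";\n',
    'chr1\tStringTie\texon\t100\t900\t1000\t+\t.\tgene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1";\n',
    'chr2\tStringTie\ttranscript\t50\t400\t1000\t-\t.\tgene_id "STRG.2"; transcript_id "STRG.2.1"; cov "1.0"; FPKM "0.2"; TPM "0.5";\n',
]
pairs = list(gene_tpm_pairs(sample))
for gene_id, tpm in pairs:
    print(gene_id, tpm)
```

Note that this reports one value per transcript; a gene with several transcripts appears several times, and whether to sum them is a separate decision.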


Tool To Find Out If Fastq Is In Sanger Or Phred64 Encoding?

Is there a simple tool I can use to quickly find out if a FASTQ file is in Sanger or Phred64 encoding? Ideally something that tells me ‘Encoding XX’ somewhere in the terminal output.
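
A common heuristic for this question is to look at the range of quality characters. The Python sketch below is an approximation, not a dedicated tool. It assumes four-line FASTQ records, treats any character with ASCII code below 59 as evidence of Phred+33 (Sanger), and treats codes above 74 with nothing below 59 as evidence of Phred+64. The thresholds and the function name are choices made for this illustration.

```python
def guess_fastq_encoding(fastq_lines):
    """Guess the quality offset from the range of quality characters seen."""
    lowest, highest = None, None
    for index, line in enumerate(fastq_lines):
        if index % 4 != 3:  # the fourth line of each record holds the qualities
            continue
        codes = [ord(char) for char in line.strip()]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return "unknown"      # no quality lines were seen
    if lowest < 59:
        return "Phred+33"     # characters this low do not occur with a +64 offset
    if highest > 74:
        return "Phred+64"     # characters this high are unusual with a +33 offset
    return "ambiguous"        # the observed range fits both encodings


# Two made-up single-record inputs.
sanger_like = ["@r1\n", "ACGT\n", "+\n", "II5!\n"]
phred64_like = ["@r1\n", "ACGT\n", "+\n", "hhfb\n"]
print("Encoding", guess_fastq_encoding(sanger_like))
print("Encoding", guess_fastq_encoding(phred64_like))
```

Scanning only the first few thousand records of a large file is usually enough for this kind of guess, though a small sample can land in the ambiguous range.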


How to extract phased haplotypes from GATK HaplotypeCaller

I would like to extract the physically phased haplotypes from a VCF file generated by GATK’s HaplotypeCaller on Illumina data of some isolates from different yeast (S. cerevisiae) strains. According to this FAQ: In the format field of a PGT (Pre-Implantation Genetic Testing) VCF, you may find a description similar…


samtools idxstats versus samtools view command

Hi, I have mapped RNA-seq data to the human genome concatenated with a viral genome (26 chromosomes in total) with bowtie and need to get some numbers to calculate FPKM values manually for one viral gene, to retrieve the “total number of reads”…


no module named torch.fx [SOLVED]

In this post, you will learn how to resolve the ModuleNotFoundError: No module named ‘torch.fx’ error, which Python programmers may encounter. Before we proceed to the solutions, we will first discuss the meaning and usage of ‘torch.fx’. What is torch.fx? The…


Resolving abbreviated bacterial names

Hi community, I am working in the area of text mining and working with full-text articles. I encounter a number of bacterial names as well as their abbreviated forms, but I have issues resolving the text, for example “M. chelonae is a rapidly growing mycobacterium.” So…


Extract reads within given region, and their mates

Hi there, I want to extract all the reads located inside a given location. I would like to extract also the mates of those reads, because I’m working with PE data. I know that I can extract the reads using samtools…


featureCounts with NCBI T2T not capturing all genes

Hello, My team would greatly appreciate assistance with running featureCounts using the human NCBI T2T assembly (T2T-CHM13v2.0) as a reference; when we run it we end up with nearly 14,000 fewer genes than what the annotation supposedly contains. What (if any) modifications can be made to run Subread or RSubread…


How to extract FASTA headers in R

I have downloaded a reference uniprotkb FASTA file. How can I extract only the FASTA headers of each gene (row-wise) into a CSV file using R?


Rstudio and conda: GLIBCXX_3.4.30 not found – RStudio IDE

I get the following error when running library(stringi) and many other packages in RStudio. I do not get this error if I run R in the terminal. >> library(stringi) Error: package or namespace load failed for ‘stringi’ in dyn.load(file, DLLpath = DLLpath, …): unable to load shared object ‘/home/user/miniconda3/envs/r-4.2/lib/R/library/stringi/libs/stringi.so’: /lib/x86_64-linux-gnu/libstdc++.so.6:…


About next_reneighbour – LAMMPS Development

Dear LAMMPS Developers and users, I am struggling to understand the next_reneighbor flag in fixes. A few fixes use this flag and set it to update->ntimestep, which forces reneighboring immediately. I am using a couple of fixes: bond/create and bond/break. My confusion is: does the reneighboring take place when…


Bwa mem different alignment results for the same reference genome

I used a genome A and an A+B genome to build two indexes, A.db and AB.db, with bwa. The reads can be aligned to A alone, but only the B genome is aligned in the results for AB. I…


clusterProfiler for KEGG enrichment (non-model species) Over-Representation Analysis

Hi there! I would like to perform KEGG enrichment with some differentially expressed gene data from RNAseq data. I am working on a non-model organism. I have 1) KEGG to GeneName Mapping

    head(expr5_FS_final)
      KEGG   unigene_FS
    1 K02727 FS_gene_1
    2 K17277 FS_gene_3
    3 K17307 FS_gene_10
    4 K14453 FS_gene_11
    5 K14700 FS_gene_11…


LAMMPS hangs with OpenMPI – LAMMPS Installation

Dear all, I am compiling LAMMPS 8Feb23 on an old cluster. Here are the details:

OS: Linux “Ubuntu 16.04.4 LTS” 4.13.0-39-generic
Compiler: GNU C++ 5.4.0 20160609 with OpenMP not enabled
C++ standard: C++11
MPI v3.1: Open MPI v4.1.5, package: Open MPI otello@vikos Distribution, ident: 4.1.5, repo rev: v4.1.5, Feb 23,…


Processing Tandem Repeats Finder (Trf) Output For Downstream Motif Analysis

I have used Tandem Repeats Finder (TRF) for tandem repeat search in my FASTA files. Output looks like this:

    Sequence: ENSG01
    Parameters: 2 5 7 80 10 50 2000
    1053 1139 4 22.2 4 67 2 62 28 4…


Can someone help me with searching overlapping values between two files?

I have two files, one file (file1.csv) contains a single column with 1200 values, 11-digits long, For instance: 00000001111 00000001152 etc. Another file (file2.csv) contains 8 columns and consists of several tabs, in the third column there are…
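
The overlap search described here could look roughly like the Python sketch below. Because the description of file2.csv is truncated, two things are assumed and should be verified: that the values to compare sit in the third column, and that the file is comma-delimited. Values are kept as strings so the leading zeros survive. The function name and sample data are made up.

```python
import csv
import io


def overlapping_values(single_column_rows, table_rows, column_index=2):
    """Return the sorted values present in both inputs.

    column_index=2 is the third column of the second file (an assumption).
    """
    wanted = {row[0].strip() for row in single_column_rows if row}
    found = {
        row[column_index].strip()
        for row in table_rows
        if len(row) > column_index
    }
    return sorted(wanted & found)


# In-memory stand-ins for file1.csv and file2.csv.
file1 = io.StringIO("00000001111\n00000001152\n")
file2 = io.StringIO(
    "a,b,00000001152,d,e,f,g,h\n"
    "a,b,00000009999,d,e,f,g,h\n"
)
shared = overlapping_values(csv.reader(file1), csv.reader(file2))
print(shared)
```

If the second file turns out to be tab-delimited, passing `delimiter="\t"` to `csv.reader` would be the only change needed.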


PhD Position to Develop Machine Learning Methods for Microbiome Analysis

Looking for a highly motivated PhD student for Computational Biology research, with an algorithm development focus. The Ecological and Evolutionary Signal-processing (EESI) and Informatics lab is doing a restart from the pandemic and will be composed of a dynamic,…


Using LiftOver to change genomic build

Hi, all – Two questions about using LiftOver: The .bed file changes after using LiftOver. Correct me if I’m wrong, but I can just use the .bim and .fam file from before LiftOver as those do not change? I have used LiftOver to…


Getting a curl: (22) The requested URL returned error: 500 ERROR

Hi dear friends, I am trying to use a list of assemblies (one per line) to download some data from NCBI. Example of the list:

    GCA_937921735.1
    GCA_937897655.1
    GCA_902386345.1
    GCA_902386385.1
    GCA_902386595.1

I am using this command (that I think…


reads aligned concordantly exactly 1 time

Good evening, I’d like to compare the alignment quality of hisat2, bowtie2 and bwa for my files. The first 2 packages output the percentage of reads aligned concordantly exactly 1 time; bwa does not, because it does not output an alignment summary. The samtools flagstat report is not enough, because it outputs…


Bioconductor, how to select a subset of samples in an ExpressionSet?

I’m working on an R script that downloads gene expression data from GEO, through Bioconductor and the getGEO() function. These commands download all the 436 samples of the repository, but I’m only interested in 157 of them. Precisely, I’m interested in handling only the “samples collection:ch1” column with values “”on…


Automated dbSNP lookup by rsID position, plus genome build liftover

Hola, just passing by to say ‘hi’. Please post bugs / suggestions as comments to this tutorial.

rsID to position GRCh38

    cat rsids.list
    rs1296488112
    rs1226262848
    rs1225501837
    rs1484860612
    rs1235553513
    rs1424506967

    cat rsids.list | while read rsid ; do pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' |…

Continue Reading Automated dbSNP lookup by rsID position, plus genome build liftover

[slurm-users] srun: Job step aborted

Hi all, I’m facing the following issue with a DGX A100 machine: I’m able to allocate resources, but the job fails when I try to execute srun. A detailed analysis of the incident follows: $ salloc -n1 -N1 -p DEBUG -w dgx001 --time=2:0:0 salloc: Granted job allocation 1278…

Continue Reading [slurm-users] srun: Job step aborted

finding error to run edgeR , please check my code to be helpful for finding error and solving it. difficulty in finding the next steps of the code because of the occurring errors.

library(edgeR) counts <- read.delim("GSE116959_series_matrix.txt", row.names = 1) head(counts) data <- read.table("annotation.txt", header = TRUE, sep = "\t") data head(data) d0 <- DGEList(counts = counts, group = factor(counts)) d0 dim(d0) d0.full <- d0 # keep the old one in case we mess up countsPerMillion <- cpm(d0) summary(countsPerMillion) countCheck <- countsPerMillion > 1 head(countCheck) keep <- which(rowSums(countCheck)…

Continue Reading finding error to run edgeR , please check my code to be helpful for finding error and solving it. difficulty in finding the next steps of the code because of the occurring errors.

DE Analysis on cells from a patient derived mouse xenograft with high levels of mouse count “contamination”

I am performing a differential expression analysis for collaborators. The overall biological design from my collaborators is as follows: 1) Received patient sample. 2) Amplified patient sample using patient derived xenograft (PDX) in a mouse host. 3) Extracted cells from mouse and enriched for human cells by positive selection using…

Continue Reading DE Analysis on cells from a patient derived mouse xenograft with high levels of mouse count “contamination”

sage starts from command line but not from desktop menu

Package: sagemath Version: 9.5-6 Severity: normal X-Debbugs-Cc: jorge.m…@gmail.com Dear Maintainer, Sage fails to start from the graphical menu with “Failed to execute default Terminal Emulator” “Input/output error”. The problem is confined to the menu launcher: sage starts without problem from the command line (e.g. typing “sage -n” from the terminal emulator). …

Continue Reading sage starts from command line but not from desktop menu

Stop BLAST from phoning home

Some time back I learned from Devon Ryan on the bird app (no link because I have stopped using said app) that BLAST phones home every time you use it, by default. I was never aware of this until I saw the post and I’m not really a fan of…

Continue Reading Stop BLAST from phoning home

extract pattern using grep/sed

Hi Pm4.1LM10m04850 0.24924Pm4.1LM01m05240 0.02328Pm4.1LM01m11200 -0.02328Pm4.1LM01m11050 0.02899Pm4.1LM03m10920 0.04638Pm4.1LM00m08740 -0.04638Pm4.1LM09m10890 0.18085Pm4.1LM05m02500 0.23509Pm4.1LM03m01390 This is my query data above. I want to exclude the digits that are in bold and put the rest in rows like this: Pm4.1LM10m04850 Pm4.1LM01m05240 Pm4.1LM01m11200 and so on. I tried using the grep command: grep "Pm4.1LM[0-9]"…

Continue Reading extract pattern using grep/sed
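The extraction described in this question can be sketched in Python with a regular expression. This is only an illustration: the identifier shape (`Pm4.1LM`, two digits, `m`, five digits) is an assumption inferred from the sample in the excerpt, and `extract_ids` is a hypothetical helper name, not something from the original thread.

```python
import re

# Assumed ID shape, inferred from the sample data: "Pm4.1LM", two digits, "m", five digits.
ID_PATTERN = re.compile(r"Pm4\.1LM\d{2}m\d{5}")

def extract_ids(text):
    """Return every identifier found in text, in order of appearance."""
    return ID_PATTERN.findall(text)

sample = "Pm4.1LM10m04850 0.24924Pm4.1LM01m05240 0.02328Pm4.1LM01m11200"
for gene_id in extract_ids(sample):
    print(gene_id)  # one identifier per row
```

Because the pattern has a fixed width, it still works where a number runs straight into the next identifier without a separator. Note that the dots are escaped; an unescaped `.` would match any character.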

failed // cgroups v1 problem

Hi, I’m experiencing a strange issue related to a CPU swap (8352Y -> 6326) on two of our nodes. I adapted the slurm.conf to accommodate the new CPU: slurm.conf: NodeName=ice27[57-58] CPUs=64 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 Realmemory=257550 MemSpecLimit=12000 which is also what slurmd -C autodetects: NodeName=ice2758 CPUs=64 Boards=1 SocketsPerBoard=2 CoresPerSocket=16…

Continue Reading failed // cgroups v1 problem

Hub Error about SQLite3 Version – Zero to JupyterHub on Kubernetes

sam123 February 8, 2023, 5:01pm Hi there, I rebuilt the Hub Docker image based on Amazon Linux 2. When I tried to run it locally, I got this SQLite version error: sqlalchemy.exc.NotSupportedError: (sqlite3.NotSupportedError) deterministic=True requires SQLite 3.8.3 or higher The default SQLite that comes with Amazon Linux 2 is 3.7.17. However, I…

Continue Reading Hub Error about SQLite3 Version – Zero to JupyterHub on Kubernetes
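A quick way to see which SQLite library a given Python interpreter is actually linked against is to ask the standard `sqlite3` module. The sketch below is an assumption-labeled illustration, not part of the original thread: `sqlite_is_new_enough` is a hypothetical helper, and the `(3, 8, 3)` minimum is taken from the error message quoted above.

```python
import sqlite3

# Minimum version quoted in the error message above.
REQUIRED = (3, 8, 3)

def sqlite_is_new_enough(required=REQUIRED, found=None):
    """Compare the linked SQLite library version against a minimum."""
    if found is None:
        # Version of the SQLite C library in use, not of the Python module.
        found = sqlite3.sqlite_version_info
    return tuple(found) >= tuple(required)

print("SQLite library version:", sqlite3.sqlite_version)
print("meets 3.8.3 requirement:", sqlite_is_new_enough())
```

Running this inside the rebuilt image shows whether Python picked up the system library or a newer one.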

Is My Bam File Sorted ?

You can use the sort order (SO) flag in the header to check if the file has been sorted: % samtools view -H 5_110118_FC62VT6AAXX-hg18-unsort.bam @HD VN:1.0 SO:unsorted % samtools view -H 5_110118_FC62VT6AAXX-hg18-sort.bam @HD VN:1.0 SO:coordinate Unfortunately samtools index will work on both types…

Continue Reading Is My Bam File Sorted ?
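The header check in this answer can be sketched in Python. The function below is a hypothetical helper that takes header text (for example, the output of `samtools view -H`) and returns the `SO` value from the `@HD` line. It only reports what the header claims, which is not proof that the records are in that order.

```python
def sort_order(header_text):
    """Return the SO value from the @HD line of a SAM header, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None  # @HD line present but without an SO tag
    return None  # no @HD line at all

# The @SQ line here is made up for illustration.
print(sort_order("@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000"))
```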

linux – Check if folder contains files with extensions and write directories into categories

unary operator expected is because [ and * (in your *fastq.gz) work independently. [ is not shell syntax. [ is a regular command (a builtin in Bash, but still a command) and ] is its last argument, a mandatory one. Anything in between is an argument too. The shell expands…

Continue Reading linux – Check if folder contains files with extensions and write directories into categories
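For comparison, the same directory check can be sketched in Python, where a pattern that matches nothing simply yields an empty result instead of changing the number of arguments passed to a test command. This swaps the shell for `pathlib`; `dirs_with_extension` and the directory names are hypothetical and chosen only for the demonstration.

```python
import tempfile
from pathlib import Path

def dirs_with_extension(root, pattern="*fastq.gz"):
    """Split the immediate subdirectories of root by whether they contain a match."""
    with_files, without_files = [], []
    for sub in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # any() stops at the first match; zero matches just gives False.
        (with_files if any(sub.glob(pattern)) else without_files).append(sub.name)
    return with_files, without_files

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "run1").mkdir()
    (Path(root) / "run1" / "a_R1.fastq.gz").touch()
    (Path(root) / "run2").mkdir()
    has_fastq, no_fastq = dirs_with_extension(root)

print(has_fastq, no_fastq)
```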

Using Entrez Utilities to query the Nucleotide database by collection_date

Greetings, I was wondering if there was a way to query the NCBI nucleotide database using E-utilities by collection_date. In the image below I retrieved GenBank file data. Column E is the collection date. Is the only way to…

Continue Reading Using Entrez Utilities to query the Nucleotide database by collection_date

molecular modeling – Model coordination complex using GROMACS or CP2K

Here is how you can get the structure from PubChem, read the SMILES into Avogadro2, do a MMFF94 (classical) optimization, and then a single-point energy calculation using NWChem (RHF, 6-31G*). The Avogadro2 Input builder will create a CP2K file if you prefer. For GROMACS, there is extensive documentation that includes…

Continue Reading molecular modeling – Model coordination complex using GROMACS or CP2K

22.10 – Configuring MySQL for SLURM

I’m having problems getting SLURM (for job scheduling) to work with a MySQL database. I was using this as a reference, but perhaps I misunderstood something in it. If someone can let me know what I’ve missed, that would be great… This is SLURM 21.08 on Ubuntu 22.10. I’m using…

Continue Reading 22.10 – Configuring MySQL for SLURM

Add Information to Protein Fasta Headers

Hi, I have a protein FASTA file whose headers look like ‘>evm.model.chr.9.52’. There are 30k+ proteins. I have performed functional annotation and also added all of that information to the gene structure we get from EVM. The thing is, in those files I had columns, so…

Continue Reading Add Information to Protein Fasta Headers

email – Troubleshooting slurm e-mail settings

I am trying to setup a slurm installation and I have advanced towards the e-mail stage. So far I do not receive any mails. I have a working setup using msmtp-mta and msmtp. When I batch a script the slurmctld log shows email msg to **@**: Slurm Job_id=73 Name=example_script.sh Began,…

Continue Reading email – Troubleshooting slurm e-mail settings

How to Calculate Allele Frequency from a VCF File?

I have a VCF file with 200 samples (mitochondrial genome of Plasmodium falciparum). Here is a pic to take a look at: And a few relevant lines from the actual file: ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed"> ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT…

Continue Reading How to Calculate Allele Frequency from a VCF File?
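As a rough sketch of what the AC and AF fields in that header mean, the function below recomputes them from the genotype columns of a single VCF data line: AC is the count of each ALT allele among called genotypes, AN is the number of called alleles, and AF is AC divided by AN. It assumes tab-separated columns with a `GT` entry in FORMAT and handles both haploid calls (as expected for a mitochondrial genome) and `/` or `|` separated calls. `allele_stats` and the example record are hypothetical, not taken from the poster's file.

```python
import re

def allele_stats(vcf_line):
    """Return (AC per ALT allele, AN, AF per ALT allele) computed from the GT fields."""
    cols = vcf_line.rstrip("\n").split("\t")
    n_alt = len(cols[4].split(","))          # column 5 is ALT
    gt_index = cols[8].split(":").index("GT")  # column 9 is FORMAT
    counts = [0] * n_alt
    called = 0
    for sample in cols[9:]:
        gt = sample.split(":")[gt_index]
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing call: counts towards neither AC nor AN
            called += 1
            if allele != "0":
                counts[int(allele) - 1] += 1
    freqs = [c / called if called else 0.0 for c in counts]
    return counts, called, freqs

# Made-up record: four haploid samples, one of them missing.
record = "\t".join(["chrM", "100", ".", "A", "T", ".", "PASS", ".", "GT", "0", "1", "1", "."])
print(allele_stats(record))
```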

Tool for aligning short protein sequences

Hi, I have a file that looks like: >ref_frame=1 XFKKNLAFLQKKAKEFSSEQTRANSPTRRELQVWGRDNNSPSEA >ref_frame=2 FLKKIWPSYKKRPKNFLQSRPEPTAPPEESFRSGVETTTPPQKQ >ref_frame=3 F*KKSGLPTKKGQRIFFRADQSQQPHQKRASGLG*RQQLPLRSR >read1_frame=1 FFKKNLAFLQKKAKEFSSEQTRANSPTRRELQVWGRDNNSPSEA >read1_frame=2 FLKKIWPSYKKRPKNFLQSRPEPTAPPEESFRSGVETTTPPQKQ >read1_frame=3 F*KKSGLPTKKGQRIFFRADQSQQPHQKRASGLG*RQQLPLRSR I want to do a protein alignment where I align each read frame against each ref frame. What tool can I use to…

Continue Reading Tool for aligning short protein sequences

High ref mismatch rate after liftOver from 23andme hg19 to hg38

I lifted some 23andme files from hg19 to hg38 using the following workflow in R calling samtools, plink and liftOver: library(tidyverse) #set working directory to data directory trio_wd <- str_glue(here::here(),'/trio/K/') #create file list for raw data file_list <- str_c(trio_wd,dir(trio_wd)) %>% str_extract('genome.+\\d.txt') %>% str_extract('^(?:(?!admix).)+$') %>% unique() %>% {.[!is.na(.)]} %>% str_c(trio_wd,.) #liftover loop…

Continue Reading High ref mismatch rate after liftOver from 23andme hg19 to hg38

r – Calibri font on Mac in ggplot

I am on Mac and need to use the Calibri font for all ggplots, but I cannot make it work with either the extrafont or showtext packages: font_import(prompt = FALSE, pattern = "calibri") returns Scanning ttf files in /Library/Fonts/, /System/Library/Fonts, /System/Library/Fonts/Supplemental, ~/Library/Fonts/ … Extracting .afm files from .ttf files… Error in data.frame(fontfile…

Continue Reading r – Calibri font on Mac in ggplot

Redirect Stdout From twoBitToFa in Bash

Hi there, I’ve been going round and round trying to figure out a way to redirect stdout to use with seqtk. I’ve read many posts that are similar, so I am confident the answer is out there, but since I’m having such a hard time I figured I might as…

Continue Reading Redirect Stdout From twoBitToFa in Bash

Compressing BAM, SAM, CRAM | Genozip

How good is Genozip at compressing BAM files? See Benchmarks. Compressing a BAM, SAM or CRAM file: In the rest of this page we will give examples of BAM files. Genozip is also capable of compressing SAM files, and with some limitations, CRAM files as well. …

Continue Reading Compressing BAM, SAM, CRAM | Genozip

lazy loading failed, unable to load shared object rtracklayer.so

Hello! I am working on analyzing a dataset I created with the 10x Chromium Single Cell Multiome kit. In order to add gene annotation to the ATAC data, I am trying to install and use the “EnsDb.Mmusculus.v79” and “BSgenome.Mmusculus.UCSC.mm10” packages with bioconductor. The same ERROR has come up repeatedly whenever…

Continue Reading lazy loading failed, unable to load shared object rtracklayer.so

Samtools Convert Sam To Bam With Code Examples

In this session, we’ll work through converting a SAM file to BAM with samtools. The code that follows illustrates this. # Basic syntax: samtools view -S -b sam_file.sam > bam_file.bam # Where:…

Continue Reading Samtools Convert Sam To Bam With Code Examples

Number of sequences in RefSeq.

Dear colleagues, there is something I cannot understand. When I download all the genomic sequences from the RefSeq database and count them, I see that there are far fewer records than stated in the release notes (123394 organisms ftp.ncbi.nlm.nih.gov/refseq/release/release-notes/RefSeq-release214.txt). What am I doing wrong? 1. wget ftp.ncbi.nlm.nih.gov/genomes/refseq/assembly_summary_refseq.txt 2….

Continue Reading Number of sequences in RefSeq.

Subset row-entries according to a list

Hello! I want to subset a selected dataset (a list of entries) from a big data file. I have a list named “contig.list” that looks like this: Contig_339241_4 Contig_1004621_3 Contig_1666_1 Contig_836268_32 Contig_1479_10 Contig_640297_1 Contig_365838_1 .. I want to subset the entries of this…

Continue Reading Subset row-entries according to a list
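One way to sketch this kind of subsetting in Python is to load the wanted IDs into a set and keep only the rows whose first column is in it. The excerpt does not show the layout of the big data file, so the tab-separated layout with the contig ID in the first column is an assumption, and `subset_rows` is a hypothetical helper.

```python
def subset_rows(id_lines, data_lines, sep="\t"):
    """Keep the rows of data_lines whose first column appears in id_lines."""
    # A set gives constant-time lookups, which matters for a big data file.
    wanted = {line.strip() for line in id_lines if line.strip()}
    return [row for row in data_lines if row.split(sep, 1)[0].strip() in wanted]

# Both arguments can be open file handles or lists of lines; values here are made up.
ids = ["Contig_339241_4\n", "Contig_1666_1\n"]
data = ["Contig_1666_1\t12.5\n", "Contig_99_9\t3.1\n", "Contig_339241_4\t0.7\n"]
for row in subset_rows(ids, data):
    print(row, end="")
```

Matching on the whole first column, rather than on a substring, avoids `Contig_1666_1` also selecting `Contig_1666_10`.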

Rsubread featurecounts

Hi there, I seem to be getting this error when reading in a BAM file which was generated by pbmm2 align on PacBio data. I have tried to google the error message but there are no results. I wonder if anyone has ideas on what the error…

Continue Reading Rsubread featurecounts

Merge multiple text files to create a combined dataframe and rename columns in R – General

Hi, I have multiple .txt files (each file contains 4 columns; an identifier Gene column, a raw_counts and other columns). I would like to merge those files into a combined dataframe using the common gene column. I was able to import multiple .txt files together, merge based on identifier column,…

Continue Reading Merge multiple text files to create a combined dataframe and rename columns in R – General

Unable to install bioconda packages in conda environments

From your command line it appears you are on Windows. There are several versions of pybedtools on bioconda; however, if I grep through them, they are all for the Linux platform. If you’re on Windows 10, you could consider setting up the Windows Subsystem for Linux (and possibly Xming), installing…

Continue Reading Unable to install bioconda packages in conda environments