Tag: GREP
Installation with GPU – User discussions
GROMACS version: 2023.3. GROMACS modification: Yes/No. I’m trying to install GROMACS with SYCL GPU support using $ cmake … -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=SYCL -DGMX_SYCL_HIPSYCL=on but I got the error: CMake Error at cmake/gmxManageSYCL.cmake:77 (message): HipSYCL build requires Clang compiler, but GNU is used. Call Stack (most recent call first): CMakeLists.txt:667 (include) and…
mysql – Trying to Authenticate the Slurm User via Keys Instead of Password Using the pam Plugin on MariaDB
bcl2fastq problems with the yield
Dear people, I am currently trying to demultiplex the raw data files I got from NovaSeq with the bcl2fastq tool. The code seems to work, as I am getting output; the problem is that the index mismatch rate is 99 to 100,…
bcl2fastq troubleshooting all reads dumped to “Undetermined”
Hi everyone, Another lab ran a single-end sequencing run on a NextSeq for us, but now they can’t properly demultiplex them. I’m trying to see if I can figure it out. I run bcl2fastq (newest version) on the files, but all reads are dumped to Undetermined_S0_L001_R1_001.fastq.gz I’ve got a SampleSheet.csv…
400 : Bad Request OAuth state missing from cookies – JupyterHub
I have set up two JupyterHub services on a single server, running on different ports: root@e2e-94-254:~# netstat -lntup | grep node tcp 0 0 0.0.0.0:8000 0.0.0.0:* LISTEN 683069/node tcp 0 0 0.0.0.0:20000 0.0.0.0:* LISTEN 682679/node tcp 0 0 127.0.0.1:20002 0.0.0.0:* LISTEN 682679/node tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN 683069/node I have set up nginx as…
Issues with Chromosome Encoding and VCF Annotation in dbSNP Alpha Release
Body: Hello, Biostars Community, I am working on creating a custom database of variants using the VCF from the latest dbSNP alpha release available at ftp.ncbi.nih.gov/snp/population_frequency/latest_release/. I have encountered a couple of issues that I’m hoping someone might help me resolve. Firstly, the chromosome encoding uses RefSeq IDs (e.g., NC_000007.12)…
How to resolve the error of protein lacking a stop codon when using GenomeThreader for homology prediction?
Dear all, the error message and running process are as follows. Thank you for your answers. makeblastdb -in pudorinus.fa -parse_seqids -dbtype nucl -out index/pu & nohup tblastn -query all.pep.fa -out pu.blast -db index/pu -outfmt…
How to Install autodock-vina software package in Ubuntu 16.04 LTS (Xenial Xerus)
The autodock-vina software package provides docking of small molecules to proteins. You can install it on Ubuntu 16.04 LTS (Xenial Xerus) by running the commands given below in the terminal: $ sudo apt-get update $ sudo apt-get install autodock-vina…
r – ggplot colorbar align to top when using facets
I am using facetted plots via ggplot that contain a colorbar. I want to scale the colorbar to the size of the facetted plot. Following the ideas of @AllanCameron in this post regarding a single plot and tweaking the function to account for both the strip size and the space…
Help! RStudio starts to a blank screen after update; clean install worked, but not after time machine backup. – RStudio IDE
Error Information: Description of issue – After updating to macOS Sonoma, RStudio is a blank window. I only see the name of the toolbar. There are no options in the toolbar either; it says empty. The R app works fine. Attempted steps taken to fix – Reinstalled apps; reinstalled macOS Sonoma (didn’t…
Where do these snpeff annotation come from?
I am annotating a VCF with annotations from snpEff, which I eventually want to parse for predicted loss-of-function variants. I want to understand the annotations better and document how they are produced. I run this command: snpEff "hg38"…
Comparing 3 Data Sets using DeSeq and Heatmaps
Hi all, I am new to bioinformatics analysis, so I’d appreciate if someone could check my code for the goal I am trying to achieve. I have 3 samples – Wild Type (WT) FoxP3-TCF-HEB (I have 3 replicates of this) TCFKO I have defined these in the sample information csv…
Species coverage in the NCBI protein NR database ?
Hi Biostars, I am currently trying to build a Eukaryote version of the NCBI NR database and I am not really sure that I fully understand how the NR is implemented. Here is the code that I’m using to do so: #!/usr/bin/bash ############## # DOWNLOAD FULL NR ############## baseURL="https://ftp.ncbi.nlm.nih.gov/blast/db/"…
Extracting list of identical items from several excel files
Hi guys, I have 10 Excel files with several columns and rows. The column headers are similar in all files, but for the rows some items are similar while others are not. Below is an excerpt for one of the…
SNPs of a specific mouse strain
Hi, I wonder how can I get SNPs for a particular mouse strain like C57BL6. I have downloaded a mouse reference vcf from ftp.ebi.ac.uk/pub/databases/mousegenomes/REL-2112-v8-SNPs_Indels/mgp_REL2021_snps.rsID.vcf.gz Its header is #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 129P2_OlaHsd 129S1_SvImJ 129S5SvEvBrd A_J AKR_J B10.RIII BALB_cByJ BALB_cJ BTBR_T+_Itpr3tf_J BUB_BnC3H_HeH C3H_HeJ C57BL_10J C57BL_10SnJ C57BL_6NJ…
Functional filter for whole-genome sequencing data identifies HHT and stress-associated non-coding SMAD4 polyadenylation site variants >5 kb from coding DNA
Summary Despite whole-genome sequencing (WGS), many cases of single-gene disorders remain unsolved, impeding diagnosis and preventative care for people whose disease-causing variants escape detection. Since early WGS data analytic steps prioritize protein-coding sequences, to simultaneously prioritize variants in non-coding regions rich in transcribed and critical regulatory sequences, we developed GROFFFY,…
Wpad-basic-mbedtls check_conflicts_for wpad-basic-wolfssl – Installing and Using OpenWrt
I have been running 23.05.0 for some time now, and when I do an opkg update I get (after upgrading upgradable): root@slate:~# opkg list-upgradable wpad-basic-mbedtls - 2023-09-08-e5ccbfc6-4 - 2023-09-08-e5ccbfc6-6 root@slate:~# opkg upgrade wpad-basic-mbedtls Upgrading wpad-basic-mbedtls on root from 2023-09-08-e5ccbfc6-4 to 2023-09-08-e5ccbfc6-6… Collected errors: * check_conflicts_for: The following packages conflict with wpad-basic-mbedtls: * check_conflicts_for: wpad-basic-wolfssl…
merge .pdata and .xdata sections from host object files
diff --git a/src/cmd/cgo/internal/test/callback_windows.go b/src/cmd/cgo/internal/test/callback_windows.go new file mode 100644 index 0000000..95e97c9 --- /dev/null +++ b/src/cmd/cgo/internal/test/callback_windows.go @@ -0,0 +1,133 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package cgotest + +/* +#include <windows.h> +USHORT backtrace(ULONG FramesToCapture, PVOID *BackTrace) { +#ifdef _AMD64_ + CONTEXT context; +…
Print IP Addresses and Hostnames From Host File on Linux
The host file, located at /etc/hosts on Linux systems, is a crucial part of the networking configuration. It maps hostnames to IP addresses, allowing the system to resolve domain names locally before querying external DNS servers. Sometimes, it becomes necessary to view the entries in this file for troubleshooting or…
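A minimal sketch of the task named in the title, assuming the standard hosts-file layout (an IP address followed by one or more hostnames, with `#` starting a comment). The function names `parse_hosts` and `print_hosts` are hypothetical and are not taken from the linked article.

```python
def parse_hosts(text):
    """Return a list of (ip, [hostnames]) pairs from hosts-file text."""
    entries = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # A valid entry needs an address and at least one hostname.
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def print_hosts(path="/etc/hosts"):
    """Print each address with its hostnames, one entry per line."""
    with open(path) as handle:
        for ip, names in parse_hosts(handle.read()):
            print(ip, " ".join(names))
```

Calling `print_hosts()` on a Linux system reads `/etc/hosts`; `parse_hosts` can be used on any string with the same layout.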
Optimizing Language Model Training: A Practical Guide to SLURM | by Viktorciroski | Nov, 2023
In the dynamic world of deep learning, pushing the boundaries of language models often bumps into the memory limits of individual GPUs, like the NVIDIA GeForce RTX 3090. With 24 GB of GDDR6X memory, it’s a powerhouse, but models such as Llama 2 can still stress these resources, causing headaches…
[Kernel-packages] [Bug 2007050] Autopkgtest regression report (initramfs-tools/0.142ubuntu15.1)
All autopkgtests for the newly accepted initramfs-tools (0.142ubuntu15.1) for mantic have finished running. The following regressions have been reported in tests triggered by the package: initramfs-tools/0.142ubuntu15.1 (armhf) Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1]. people.canonical.com/~ubuntu-archive/proposed-…
r – Adding dotted line to guide_colourbar() in a ggplot2 plot legend
To do this reliably without having to manually change each plot, you could create a new guide. The basic idea is described in Teunbrand’s answer to a related question about modifying the colour bar here library(grid) library(ggplot2) guide_linedcolourbar <- function(…, threshold = 1.5) { guide <- guide_colourbar(…, threshold = threshold)…
How To Get Chromosome Position Given Rs Number?
I have a list of a few hundred SNPs given by rs number. I want to get the chromosome and position for each SNP. For example: input: rs4477212 output: chr1:82154 You can download this information…
Sage 10.0 released
Congratulations to Volker and to all Sage contributors on the new release. 2023-05-21 01:00 UTC, Matthias Köppe: Thanks. > Some notes: > > – The list of “New Contributors” was also automatically generated > by GitHub. It did not know the contributors of previous versions. > I’ve manually removed many…
bedgraph-to-bigwig error – usegalaxy.eu support
Adrian1 November 12, 2023, 10:30pm 1 tool: bedgraph-to-bigwig input: bedgraph files from MACS2 output problem: result bigwig file has 0 lines tool standard output: hashMustFindVal: 'GL456210.1' not found tool standard error: grep: write error: Broken pipe Error running wigToBigWig. more details: I used mm39 as the database/build for all my inputs,…
Bug#1055669: bcftools: test_vcf_merge failures on armhf: Bus error
Source: bcftools Version: 1.18-1 Severity: serious Tags: ftbfs Justification: ftbfs Control: forwarded -1 github.com/samtools/bcftools/issues/2036 Dear Maintainer, bcftools currently ftbfs on armhf due to multiple test_vcf_merge failures with Bus error[1]. I already informed upstream[2]. This bug is mostly to keep track of the issue on Debian side and eventually comment on possible Debian specific…
10 AI Code Generation Trends: See Examples
Artificial Intelligence (AI) has become an integral part of various disciplines, and software development is no exception. With the advent of AI, code generation has witnessed significant advancements. Today, AI applications can independently generate code, simplifying and accelerating the software development process. Exploring AI and code generation Introducing AI-generated coding…
Handle repository with different VCS folders.
Reported by Simon Tournier <zimon.toutoune@gmail.com> * guix/hash.scm (vcs-file?): Add optional argument for passing VCS kind of the file/repository. (file-hash*): Adjust accordingly. * guix/scripts/hash.scm (guix-hash)[file-hash]: Detect VCS kind of the file/repository and pass it. Change-Id: I8e286c3426ddefd664dc3a471d5a09e309824faa --- guix/hash.scm | 18 ++++++++++++------ guix/scripts/hash.scm | 18 +++++++++++++----- 2 files changed, 25 insertions(+), 11…
How to use awk and grep -v option to exclude several patterns from several lines
I am trying to use a combination of awk and grep to filter several files and to exclude a couple of patterns. I have tried several options and I just get errors. My commands that…
Differential Expression Analysis using Bioconductor (RStudio) and GEO2R (GEO)
Hello everyone, I’ve been having the same question for a while now. I’m also conducting my own analysis of differential expression on a microarray dataset in R. However, the data is different from the results obtained using GEO2R. Here’s my line of code: my_id <- "GSE80178" gse <- getGEO(my_id, GSEMatrix…
How to efficiently count missense mutations from an annotated vcf file?
Hi! I am currently working on my undergraduate study about the frequency of missense mutations in early and advanced stages of early luminal breast cancer. The VCF file contains 47 transcriptomic samples – 12 early (stage II) and 35 advanced…
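One possible way to approach the counting question is to stream the file and test the INFO column for the consequence term. The sketch below is only an illustration: it assumes the annotation tool wrote the Sequence Ontology term `missense_variant` into the INFO field (column 8), which the excerpt does not confirm, and the function name `count_missense` is hypothetical.

```python
def count_missense(vcf_lines, term="missense_variant"):
    """Count VCF records whose INFO column mentions the given term.

    Assumption: the annotation lives in INFO (8th tab-separated column).
    Lines are consumed one at a time, so memory use stays flat.
    """
    count = 0
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and the header row
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 8 and term in fields[7]:
            count += 1
    return count
```

Passing an open file handle (for example from `gzip.open(path, "rt")`) works because the function only iterates over lines. This counts records, not per-sample genotypes; a per-sample tally would also need the genotype columns.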
Using easyPubMed and scholar package to get all citations of your paper
This is a tutorial on downloading all the citations for the articles present in any Google Scholar Profile. Use Case You might want to do it for your CV or help a friend. Update your lab website with the latest publication list. You might want to add all your published…
How to get just protein_coding genes using biomart in R
Dear all, I would like help with getting just protein_coding genes from a gene expression file using biomart. What I have is a file of expression for all mouse (mm10) genes with Ensembl gene names, and I need to…
Building multi-tenant JupyterHub Platforms on Amazon EKS
Introduction In recent years, there’s been a remarkable surge in the adoption of Kubernetes for data analytics and machine learning (ML) workloads in the tech industry. This increase is underpinned by a growing recognition that Kubernetes offers a reliable and scalable infrastructure to handle these demanding computational workloads. Furthermore, a…
How to count fasta sequences efficiently using (or not ) biopython
This is not a very memory friendly way of counting sequences from a multi fasta, any ideas to improve this? generator = SeqIO.parse("test_fasta.fasta","fasta") sizes = [len(rec) for rec in SeqIO.parse("test_fasta.fasta", "fasta")] I’m avoiding using tools like grep since…
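A memory-friendlier alternative to building a list of every record is to stream the file and keep only a running length per record. The sketch below uses plain Python rather than Biopython; the helper names `fasta_lengths` and `count_records` are hypothetical, and it assumes a well-formed FASTA file in which every record starts with a `>` header line.

```python
def fasta_lengths(lines):
    """Yield (header, sequence_length) one record at a time."""
    header, length = None, 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, length  # emit the previous record
            header, length = line[1:], 0
        elif header is not None:
            length += len(line.strip())
    if header is not None:
        yield header, length  # emit the final record


def count_records(lines):
    """Count records without storing them."""
    return sum(1 for _ in fasta_lengths(lines))
```

Both functions accept any iterable of lines, so an open file handle can be passed directly and only one line is held in memory at a time.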
coverage of dnase-seq narrow peak file of genome
First, generate intervals for hg19 (perhaps stripping out non-nuclear and mitochondrial chromosomes): $ fetchChromSizes hg19 | awk -v FS="\t" -v OFS="\t" '{ print $1, "0", $2; }' | grep -v "[_*_|MT]" | sort-bed - > hg19.nuc.bed To calculate coverage: $ bedmap --skip-unmapped --delim '\t' --echo --bases-uniq --echo-ref-size --bases-uniq-f hg19.nuc.bed <(sort-bed…
How to find OS duration, status information of patients in GSE1159 or GSE6891 (Microarray data analysis)
Hello. I want to analyze the survival of AML patients in GSE1159 and GSE6891. However, the datasets for these series do not have survival data (OS, status). Many papers have shown survival curves…
main-powerpc64le-default][misc/pytorch] Failed for pytorch-1.13.1_1 in build
You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/foul2/data/main-powerpc64le-default/pbc0e38d0f08e_s0afcac3e37/logs/pytorch-1.13.1_1.log Build URL: pkg-status.freebsd.org/foul2/build.html?mastername=main-powerpc64le-default&build=pbc0e38d0f08e_s0afcac3e37 Log: =>> Building misc/pytorch build started at Tue Oct 24…
A Bioconductor workflow for processing, evaluating,…
Introduction Proteins are responsible for carrying out a multitude of biological tasks, implementing cellular functionality and determining phenotype. Mass spectrometry (MS)-based expression proteomics allows protein abundance to be quantified and compared between samples. In turn, differential protein abundance can be used to explore how biological systems respond to a perturbation….
r – Using a dataframe name from the dataframes list in the axis name in a ggplot plot
I have a list of data frames. Based on this, I prepare ggplot charts: co_gpl <- lapply(co_lst, function(x) { ggplot(x) + geom_line (aes (x = `Czas`, y = get(names(x[grep("Temp. zasilania", colnames (x))])) ), color = "red3", size = 0.3) + geom_line (aes (x = `Czas`, y = get(names(x[grep("Temp. powrotu", colnames…
How to find OS duration, status information of patients in GSE1159 (Microarray data analysis)
Hello, many papers showed survival analysis using GSE1159 (N Engl J Med 2004; 350:1617-1628, etc.). Although I got the pData for GSE1159 in R, I didn’t find OS or status information for the patients. How to get…
Filtering a 10X generated .bam file based on a list of barcodes
Hello everyone, Basically, I have clustered and annotated the barcodes in R, then I wanted to look at reads in several particular clusters in IGV. I generated a barcode list following the 10X tutorial as shown in the picture below: briefly, I subset the clusters in R, then tagged the barcodes with…
Why are some HUMAnN features in [name]_pathabundance.tsv missing taxonomy?
I’m using HUMAnN 3.8. I made a custom HUMAnN protein database using UniRef50 mappings. My command was the following: INPUT=veba_output/preprocess/S4/output/joined.fastq.gz OUTPUT=test_output DMND_DB_DIR=veba_output/misc/diamond_database/ NUM_THREADS=1 humann --input ${INPUT} --output ${OUTPUT} --protein-database ${DMND_DB_DIR} --threads ${NUM_THREADS} --bypass-nucleotide-search --input-format fastq.gz --translated-identity-threshold 50 --translated-query-coverage-threshold 80 --search-mode uniref50 --id-mapping veba_output/misc/humann_uniref_annotations.tsv Everything ran as expected; however, I’m not…
[PAM Error 7] Authentication failure – JupyterHub
rick63 October 12, 2023, 8:54am 1 Hi, I installed JupyterHub 4.0.2 on my Linux PC (running Ubuntu 22.04). I can access JupyterHub with my system user login. Now I want to let another user (for example “salvatore”) access JupyterHub. I created a generic user, added this user…
Is Guix full-source bootstrap a lie?
One of the biggest concerns, in my humble opinion, about the current state of this awesome story is non-deterministic compilation, especially at early stages, for example gash-boot. $ guix build -e '(@@ (gnu packages commencement) gash-boot)' $ guix build -e '(@@ (gnu packages commencement) gash-boot)' --check guix build: error:…
Alternative for grep in a for loop
Hello Stars, I have two files list1.txt and list2.txt which look like this: cat list1.txt AT4G38910 3:17541308-17542307 AT4G38910 3:17639717-17640716 AT4G24540 1:25400514-25401513 AT4G24540 1:3398359-3399358 AT1G27730 1:4463470-4463858 AT1G27730 1:10073550-10074358 cat list1.txt | wc -l 650000 and cat list2.txt MYB94 AT3G47600 3:17541308-17542307 VPS29 AT3G47810 3:17639717-17640716…
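A common alternative to calling grep once per line is to index one file in a dictionary and look each line of the other file up in it, so each file is read only once. The excerpt is cut off before the goal is stated, so the sketch below assumes the two files should be matched on the coordinate column (column 2 of list1.txt, column 3 of list2.txt); the function name `join_on_coordinate` is hypothetical.

```python
from collections import defaultdict


def join_on_coordinate(list1_lines, list2_lines):
    """Yield (gene1, coordinate, name2, gene2) for matching coordinates.

    Assumption: list1 lines are "gene coordinate" and list2 lines are
    "name gene coordinate", separated by whitespace.
    """
    index = defaultdict(list)
    for line in list2_lines:
        fields = line.split()
        if len(fields) >= 3:
            index[fields[2]].append((fields[0], fields[1]))
    for line in list1_lines:
        fields = line.split()
        if len(fields) >= 2:
            # Dictionary lookup replaces one grep call per line.
            for name, gene in index.get(fields[1], []):
                yield fields[0], fields[1], name, gene
```

With 650000 lines in list1.txt, this does one pass over each file instead of one scan of list2.txt per line of list1.txt.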
Determine INDELs number (both classes separately) from reference and graph-based VCF files
Hi there, this is more so of a hint/suggestion post than a real question since I could manage to find some related posts here on Biostars but appreciate a feedback on the procedure/results for the analysis. In principle, I’m trying to compare the bwa-mem_GATK pipeline working on the linear reference…
[slurm-users] Slurm powersave
I’m experimenting with slurm powersave and I have several questions. I’m following the guidance from slurm.schedmd.com/power_save.html and the great presentation from our own slurm.schedmd.com/SLUG23/DTU-SLUG23.pdf. I am running slurm 23.02.3. 1) I’m not sure I fully understand ReconfigFlags=KeepPowerSaveSettings. The documentation says that, if set, an “scontrol reconfig” command will preserve the current state of…
18S taxonomy assignment SILVA database formatting
Hi Bioinformatic community, I would like to classify 18S data (V7) of Fungi with assignTaxonomy from dada2. For that I downloaded SILVA_132_SSURef_tax_silva.fasta.gz from the SILVA website and need to format it, which I do with a Linux command-line one-liner. But some species in the database have a different number…
bedtools intersect by position & strand not working even with common regions
I want to extract only sites (strand-aware) that appear in both D-A-3_modpileup_5mC.chr1.bed and consensus_HighQual_motif_sites.bed, but it’s not working even though they have overlapping sites. bedtools intersect -a D-A-3_modpileup_5mC.chr1.bed -b consensus_HighQual_motif_sites.bed -s Here’s a common site with the same…
Software tool to filter productive sequences
Hello, I have a fasta file with different amino acid sequences, for example: >abc HSTSDSAQTMFPVALLLLAAGSCVKGEQLTQPTSVTVQPGQRLTITCQVSYSLGTYFTAW IRQPAGKGLEWIGMRSTGASYYKDSLKNKFSIDLDTSSKTVTLNGQNVQPEDTAVYYCAR APSRGFDYWGKGTMVTITSATPKGPTVFPL >def TARQIQHKPCFL*LCCCWQLDHV*RVNS*HSRPL*LCSQVNV*PSPVRSLILLVPTSQLG SDSLQEKDWSGLE*DLLELHTTKIH*RTSSVST*TLPAKL*L*MDRMCSLKTLLCITVPE RPVGVLTTGGKAPWSPSPRPPQRDQLCFL* >ghi GSQHVRFSTNHVSCSSAAVGSWIMCEG*TVDTADLCDCAARSTSDHHLSGLLFSW*LLHS LDQTACRKRTGVDWEQIYWSCILQRFIKEQVQYRLRHFQQNCDSKWTECAA*RHCCVLLC QTTGSGSWLLGERHHGHHHLGHPKGTNCVSS and I want to filter out the sequences that are “productive” from the “non-productive” ones. By “non-productive” I…
132releng-armv7-quarterly][misc/pytorch] Failed for pytorch-1.13.1_1 in configure
You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/ampere1/data/132releng-armv7-quarterly/2be22e0743b5/logs/pytorch-1.13.1_1.log Build URL: pkg-status.freebsd.org/ampere1/build.html?mastername=132releng-armv7-quarterly&build=2be22e0743b5 Log: =>> Building misc/pytorch build started at Mon Sep 25…
Microbiome and metabolome in home-made fermented soybean foods of India revealed by metagenome-assembled genomes and metabolomics
Grep-chhurpi, peha, peron namsing and peruñyaan are lesser-known home-made fermented soybean foods prepared by the native people of Arunachal Pradesh in India. Present work aims to study the microbiome, their functional annotations, metabolites and recovery of metagenome-assembled genomes (MAGs) in these four fermented soybean foods. Metagenomes revealed the dominance of…
Just adding the full trace that sage produces
Package: sagemath Version: 9.5-6 Followup-For: Bug #1052051 X-Debbugs-Cc: jordi.burguet.cast…@gmail.com Dear Maintainer, When running sage, there is an ImportError related to libsingular-Singular-4.3.1.so (full trace below). From what I can see, python3-sage depends on libsingular4m3n0: $ apt depends python3-sage | grep libsingular Depends: libsingular4m3n0 (>= 1:4.3.1-p3+ds) Depends: libsingular4-dev (>= 1:4.2.1-p2+ds-3) but…
Salmon index not progressing
Hi! I am having an issue with Salmon index creation; I cannot use STAR due to a limited amount of RAM (as per my recent post). I tried to follow this tutorial on how to create a decoy-aware transcriptome as well as doing this directly and I…
ensembldb::getGenomeTwoBitFile() not working for many species
That’s sort of a roundabout way of getting a TwoBit file. Why not query directly? > z <- query(hub, c("xenopus tropicalis","twobit")) > z AnnotationHub with 103 records # snapshotDate(): 2023-04-25 # $dataprovider: Ensembl # $species: Xenopus tropicalis, x… # $rdataclass: TwoBitFile # additional mcols(): # taxonomyid, genome, # description, #…
I need to retrieve a set of protein and mRNA sequences
Using EntrezDirect: $ esearch -db gene -query '7157' | elink -db gene -target nuccore -name gene_nuccore_refseqrna | efetch -format fasta | grep ">" >NR_176326.1 Homo sapiens tumor protein p53 (TP53), transcript variant 14, non-coding RNA >NM_001407264.1 Homo sapiens tumor protein p53 (TP53), transcript variant 10, mRNA >NM_001407263.1 Homo sapiens tumor…
Starting Server for Non-Default Users in JupyterHub: 500 Internal Server Error – JupyterHub
I am encountering an issue with starting a server for non-default users in JupyterHub. When attempting to start a server for a user named “mahdi” (or any other non-default user), I receive the following error message in the JupyterHub container logs: [I 2023-09-20 07:40:15.461 JupyterHub provider:659] Creating oauth client jupyterhub-user-mahdi…
Usage of the `singleuser.image.name` configuration – Zero to JupyterHub on Kubernetes
Hello, members. Could someone tell me how to specify a private image registry? (Usage of the singleuser.image.name configuration) 1. Environment Z2JH: 3.0.3 k8s: v1.27.4 OS: Ubuntu 22.04 2. Question Could you tell me how to specify a private image registry? (Is it possible to specify http instead of https?) Could you…
How to Install bioperl-run software package in Ubuntu 16.04 LTS (Xenial Xerus)
The bioperl-run software package provides BioPerl wrapper scripts. You can install it on Ubuntu 16.04 LTS (Xenial Xerus) by running the commands given below in the terminal: $ sudo apt-get update $ sudo apt-get install bioperl-run bioperl-run is installed…
How to Install bioperl software package in Ubuntu 16.04 LTS (Xenial Xerus)
The bioperl software package provides Perl tools for computational molecular biology. You can install it on Ubuntu 16.04 LTS (Xenial Xerus) by running the commands given below in the terminal: $ sudo apt-get update $ sudo apt-get install bioperl…
Identify genes for mapped reads with combined human-7HPV genome index
Hi, I have created a combined genome index with a human genome and 7 HPV genomes. I am running STAR aligner against this index with a number of cancer samples. From the idxstats file, I can see some mapped reads for a particular HPV chromosome: M14119.1 7931 6 0 If…
GERP++ (gerpcol) error on a test data
Hi guys, I’m wondering if there’s a tutorial or experienced users of GERP++, as I’m doing something wrong and need your help. In particular, I can’t get GERP++ to work even on a test data set. For example, when I run the first part…
How to order a gff3 file by coordinates
I have discovered that my gff3 file is not in order at the time of defining the gene, mRNA and CDS. An example: LG1 phytozomev10 gene 10835748 10846741 . - . ID=gene00257-v1.0-hybrid.v1.1;Name=gene00257-v1.0-hybrid LG1 phytozomev10 mRNA 10835748 10846741 . - . ID=mrna00257.1-v1.0-hybrid.v1.1;Name=mrna00257.1-v1.0-hybrid;pacid=27244575;longest=1;Parent=gene00257-v1.0-hybrid.v1.1 LG1 phytozomev10 CDS 10846566 10846741 . - 2 ID=mrna00257.1-v1.0-hybrid.v1.1.CDS.1;Parent=mrna00257.1-v1.0-hybrid.v1.1;pacid=27244575…
Is there a tool that sorts gtf files?
gff3sort.pl seems to make sure lines having no “Parent=” attribute come before those having it, if chrom and start position are the same. I think with standard Unix programs it should go like this: $ (grep -v "Parent=" sortme.gtf; grep "Parent=" sortme.gtf) | sort -k1,1 -k4,4n -s EDIT: Shouldn’t we have to…
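The same idea as the grep and sort pipeline quoted above can be sketched in Python, whose built-in sort is stable just like `sort -s`. This is an illustration only: the function name `sort_gtf_lines` is hypothetical, comment and blank lines are skipped as an added assumption, and chromosome names are compared by code point rather than by locale.

```python
def sort_gtf_lines(lines):
    """Sort by chromosome and start; parentless features win ties."""
    records = [l for l in lines if l.strip() and not l.startswith("#")]
    # Mirror the two grep calls: lines without Parent= go first.
    ordered = [l for l in records if "Parent=" not in l]
    ordered += [l for l in records if "Parent=" in l]

    def key(line):
        fields = line.split("\t")
        return (fields[0], int(fields[3]))  # column 1, numeric column 4

    # sorted() is stable, so ties keep the parentless-first order.
    return sorted(ordered, key=key)
```

Because ties on chromosome and start keep their input order, a gene line ends up before the mRNA line that names it as parent when both begin at the same position.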
Selecting MOUSE gene set libraries in Enrichr
I see there are currently 212 gene set libraries available in Enrichr (maayanlab.cloud/Enrichr/index.jsp#). I am running Enrichr on the web and not in R. In scrolling through the 212 gene set libraries, I see there are mouse libraries available (i.e. KEGG_2019_Mouse) but…
How to remove fasta headers in a multifasta file and write file name as a fasta header?
I have a fasta file, namely 119XCA.fasta, as shown below: >cellulase ATGCTA >gyrase TGATGCT >16s TAGTATG I need to remove all the fasta headers, keep the sequences one by one and need to…
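One reading of the request is sketched below: drop every existing header, write a single header built from the file name without its extension, and keep each sequence line as it is. The excerpt is cut off, so this output layout is an assumption, and the function name `rename_with_filename` is hypothetical.

```python
from pathlib import Path


def rename_with_filename(path_name, lines):
    """Return output lines: one header from the file stem, then sequences."""
    stem = Path(path_name).stem  # "119XCA.fasta" -> "119XCA"
    out = [">" + stem]
    for line in lines:
        line = line.strip()
        # Keep sequence lines, drop the original headers and blanks.
        if line and not line.startswith(">"):
            out.append(line)
    return out
```

The result can be written back with `"\n".join(...)`; joining the sequence lines into a single string instead would give one concatenated record.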
How many ‘novel’ splice junctions/splice events are reasonably expected from human RNA,
Hello all, I was just wondering what a reasonable percentage of ‘novel’ splice junctions/splice events is for human RNAseq data using the program junction_annotation.py. I am new to RNAseq and just running some published human RNAseq data through my pipeline in order to familiarize myself with the programs and protocols….
Top 25 RStudio Interview Questions and Answers
RStudio, a premier integrated development environment (IDE) for R programming language, has established itself as an indispensable tool for statisticians, data scientists and researchers. With its user-friendly interface, it provides powerful coding tools and makes the process of data analysis and visualization simpler and more effective. Its wide range of…
[slurm-users] Coexisting jobs with gres/shard and gres/gpu in the same GPU
Hi everyone, we have recently enabled sharding to allow GPU sharing by multiple jobs. According to SLURM documentation: once a GPU has been allocated as a gres/gpu resource it will not be available as a gres/shard (and vice versa). However, we had the situation where, on nodes with a single…
Issue with dbNSFP using SnpSift
Hi everyone, I’m having trouble annotating a VCF with SnpSift and the dbnsfp option. From the documentation, running: java -jar SnpSift.jar dbnsfp -v -db path/to/db path/to/vcf > out.vcf …should annotate for all fields in the database. However, it seems that my files are only…
[slurm-users] Nodes stay drained no matter what I do
Hi Rob – Thanks for this suggestion. I’m sure I restarted slurmd on the nodes multiple times with nothing in the slurm log file on the node, but after # tail -f /var/slurm-llnl/slurmd.log # systemctl restart slurmd I started to get errors in the log which eventually led me to the…
cell ranger custom gtf file
I wish to make a custom gtf file using a multiline fasta file which has multiple transcripts. e.g., >NM_001282823.1 prolactin receptor (PRLR), mRNA GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT >NM_001682822.1 SNAP25 (SNAP25), mRNA GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT >NM_001287822.1 CACNA1F (CACNA1F), mRNA GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT Is there a way I could make a gtf file…
command to extract SRA fastq data summary
Hi, I was trying to calculate the total read length of all the samples present in a BioProject from the command line: code 1: esearch -db bioproject -query "PRJNA438426" | efetch -format docsum | xtract -pattern DocumentSummary -element RunTotalBases which is giving me…
PAMauthenticator with dummy Users and Passwords – JupyterHub
I want to use a PAMAuthenticator to define user names and dummy passwords for a Python lecture. I did this in jupyterhub_config.py as follows: from jupyterhub.auth import PAMAuthenticator from jupyterhub.spawner import SimpleLocalProcessSpawner c.JupyterHub.authenticator_class = PAMAuthenticator c.PAMAuthenticator.open_sessions = False c.JupyterHub.spawner_class = SimpleLocalProcessSpawner c.Authenticator.whitelist = {'user01','user02'} c.PAMAuthenticator.dummy_passwords = {'user01': 'passwordUser01','user02': 'passwordUser02'} Login…
500 – INTERNAL SERVER ERROR
There are a few common causes for this error code including problems with the individual script that may be executed upon request. Some of these are easier to spot and correct than others. File and Directory Ownership The server you are on runs applications in a very specific way in…
Trimming of reads in miRNA-Seq data
Dear All, I have been trying to filter out reads from Fastq files from miRNA-Seq that we received. The read structure looks like the one shown in the figure below. I can use Cutadapt to filter out the adapter (we have the adapter…
Error executing nf-core/metaboigniter pipeline
I ran this command: export NXF_VER=22.10.8; nextflow run nf-core/metaboigniter -profile test The error obtained: Error executing process > 'get_software_versions' Caused by: Process `get_software_versions` terminated with an error exit status (127) Command executed: echo 1.0.1 > v_pipeline.txt echo 22.10.8 > v_nextflow.txt Rscript -e "cat(as.character(packageVersion('CAMERA')),'\n')" &> v_camera.txt…
[main-armv7-default][misc/pytorch] Failed for pytorch-1.13.1_1 in configure
You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/ampere2/data/main-armv7-default/pd101ab5189f9_sb231322dbe/logs/pytorch-1.13.1_1.log Build URL: pkg-status.freebsd.org/ampere2/build.html?mastername=main-armv7-default&build=pd101ab5189f9_sb231322dbe Log: =>> Building misc/pytorch build started at Mon Aug 21…
TCGA gene expression quantitation batch information
I added samples to the cart at the GDC data portal, then downloaded them. I merged them with the R code below. Hope this helps. library(data.table) library(tidyverse) ##You can generate gdc_sample_sheet.tsv at GDC data portal index=read.table("gdc_sample_sheet.tsv",sep="\t",header=TRUE) index=index[order(index$Sample.ID),] ##read files setwd("where_you_download_your_data") expr_file=index$File.Name mat=do.call(cbind,lapply(as.character(expr_file),function(x){fread(x,header=T,sep="\t")[,c(4)]})) exp_mat=read.table(as.character(index$File.Name[1]),sep="\t",header=T) mat=data.frame(exp_mat$gene_id,exp_mat$gene_name,exp_mat$gene_type,mat) mat=mat[5:nrow(mat),] colnames(mat)=c("ensembl_gene_id","hgnc_symbol","gene_biotype",index$new_id) ensg_id=unlist(strsplit(as.character(mat$ensembl_gene_id),split="[.]")) ensg_id=ensg_id[grep("ENSG*",ensg_id)] mat$ensembl_gene_id=ensg_id write.table(mat,"TCGA.tsv",row.names =…
Pisces doesn’t like high-quality reads when there is a soft-clip affecting the full read.
When using Pisces, I get the following error. System.Exception: RACP2-6poolv4_P5-A_FINAL_SORTED.bam: Error processing chr 'chr7': Failed to process variants for MN01972:49:000H5KYMY:1:11102:26356:2968 … 150S ---> System.Exception: Failed to process variants for MN01972:49:000H5KYMY:1:11102:26356:2968 … 150S at Pisces.Domain.Logic.CandidateVariantFinder.ProcessCigarOps(Read alignment, String refChromosome, Int32 readStartPosition, String chromosomeName) at Pisces.Logic.SomaticVariantCaller.Execute() at Pisces.Processing.Logic.BaseGenomeProcessor.ProcessByBam(BamWorkRequest workRequest, String chrName) --- End…
CDS phase 0,1,2 in GFF format
The question was asked before in Calculate CDS phase in gff3 format; Negative value in “phase” line of a gff3 file. What does it mean?; etc… but I still don’t get it. So let’s use an existing GFF3 file: github.com/samtools/bcftools/blob/develop/test/csq/ENST00000580206/short.gff The GFF3 is valid in 'bcftools csq' This is…
efetch from NCBI E-utilities returns “curl error s 400 & 500” and takes a very long time
I ran this command to download ~4,000 sequences of the invA gene for taxonomy ID 28901. It works fine for smaller datasets, but … but takes a very long time and never finishes for…
Building a mappability mask with SNPable
I am trying to build a mappability mask with Heng Li’s SNPable program (lh3lh3.users.sourceforge.net/snpable.shtml). The instructions there are pretty brief and do not detail all the necessary steps, which makes them challenging for novice bioinformaticians. Suppose the reference genome is genome.fa. Copy-pasted, the given instructions are: Extract all…
a cross-platform, efficient, practical and pretty CSV/TSV toolkit
Hi all, I’d like to share another practical toolkit of mine, csvtk, after introducing SeqKit yesterday. Introduction Similar to the FASTA/Q formats in bioinformatics, CSV/TSV are basic and ubiquitous file formats in both bioinformatics and data science. People…
Installing R and RStudio on Linux for Data Analysis
R is a versatile programming language and environment designed specifically for data analysis and statistical computing, making it an incredible choice for data-driven work. R has gained significant popularity across the data science, data analysis, data visualization, and statistical communities due to its extensive capabilities and active user community. In…
How to run quantifier from miRDeep2 (miRNA seqs from MirGeneDB)
Dear all, I want to perform miRNA quantification using quantifier.pl from miRDeep2. By default miRDeep2 is compatible with miRBase, but I have to perform the quantification with mature and precursor sequences (FASTA format) of miRNAs from MirGeneDB. I ran…
Unintended behaviour when trying to remove gene version from ENSG
Hi all, When I remove the gene version numbers from the ENSG IDs, the versions of genes with the “_PAR_” suffix, e.g. “ENSG00000002586.20_PAR_Y” “ENSG00000124333.16_PAR_Y” “ENSG00000124334.17_PAR_Y” “ENSG00000167393.18_PAR_Y” “ENSG00000169084.15_PAR_Y”, aren’t being removed. I have tried using the following (obtained from stack) with…
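A pattern that only strips a trailing ".<digits>" misses these IDs because the version is followed by "_PAR_Y" and is no longer at the end of the string. A minimal sketch that handles both forms, assuming IDs look like the examples quoted above (the helper name strip_version and its keep_par switch are illustrative, not from the post):

```python
import re

# ENSG accession, a dot, the version digits, and an optional _PAR_Y suffix.
_ENSG = re.compile(r"^(ENSG\d+)\.\d+(_PAR_Y)?$")


def strip_version(gene_id, keep_par=False):
    """Remove the version from an ENSG ID, with or without the PAR suffix.

    IDs that do not match the expected pattern are returned unchanged.
    """
    match = _ENSG.match(gene_id)
    if match is None:
        return gene_id
    base = match.group(1)
    par = match.group(2) or ""
    return base + par if keep_par else base


if __name__ == "__main__":
    for gene in ["ENSG00000124333.16", "ENSG00000002586.20_PAR_Y"]:
        print(gene, strip_version(gene), strip_version(gene, keep_par=True))
```

Keeping the suffix (keep_par=True) avoids collapsing the X and Y pseudoautosomal copies onto one identifier. Dropping it produces duplicate IDs that may need to be merged or de-duplicated downstream.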
transcripts missing from tx2gene
How can I know the reference transcriptome used in the pre-computed index? You can download the FASTA transcriptome file archive (fasta, .fai index and chrom.sizes) used for that index here: refgenomes.databio.org/v3/assets/archive/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4/fasta_txome?tag=default This should get you the table you need $ grep "^>E" 2230c535660fb4774114bfa966a62f823fdb6d21acf138d4.fa |…
bash – /dev/tty: No such device or address error
In the current directory, I have a list of multi-resolution .mcool files. I want to run the following code using EagleC and store the output (including the .stderr and .log files) in the ./../EagleC_output directory. Code: for mcool_file in *.mcool; do OUTPUT=$(echo $mcool_file | grep -oE "[A-Z0-9]*\.hg38" | cut -d…
[124amd64-quarterly][misc/pytorch] Failed for pytorch-1.13.1_1 in build
You are receiving this mail as a port that you maintain is failing to build on the FreeBSD package build server. Please investigate the failure and submit a PR to fix build. Maintainer: y…@freebsd.org Log URL: pkg-status.freebsd.org/beefy2/data/124amd64-quarterly/1c331580481c/logs/pytorch-1.13.1_1.log Build URL: pkg-status.freebsd.org/beefy2/build.html?mastername=124amd64-quarterly&build=1c331580481c Log: =>> Building misc/pytorch build started at Thu Jul 27…
Useful Bash Commands To Handle Fasta Files
You will probably get a lot of different answers because there are many ways to parse fasta files with Bash and tools like grep, awk and sed. Here are some suggestions. To extract ids, just use the following: grep -o -E "^>\w+" file.fasta | tr -d ">" A useful step…
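For comparison, the ID-extraction one-liner can be mirrored in Python. This is only a sketch of the same idea, and the function name fasta_ids is made up here. As with grep's \w+, the match stops at the first character that is not a letter, digit or underscore, so a versioned accession such as NM_001282823.1 is cut at the dot.

```python
import re

# Header lines start with '>'; capture the leading run of word characters.
# re.ASCII keeps \w limited to [A-Za-z0-9_], close to what grep matches.
_HEADER_ID = re.compile(r"^>(\w+)", re.ASCII)


def fasta_ids(lines):
    """Yield the leading word-character ID of every FASTA header line."""
    for line in lines:
        match = _HEADER_ID.match(line)
        if match:
            yield match.group(1)


if __name__ == "__main__":
    records = [
        ">NM_001282823.1 prolactin receptor (PRLR), mRNA",
        "GCCAAGAGACTGGGAGTC",
        ">seq2",
        "ACGT",
    ]
    print(list(fasta_ids(records)))  # ['NM_001282823', 'seq2']
```

If the full accession including the version is wanted, splitting the header on whitespace and taking the first field is a common alternative.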
Extracting Atoms using make_ndx or select – User discussions
GROMACS version: 2022.2 GROMACS modification: No Hi @jalemkul @hess, I have been trying to use gmx make_ndx and gmx select to extract atoms from a base PDB for making an index file. It seems neither option gives the correct atoms associated with the residue. Is this a known phenomenon of Gromacs or am…
Problem with mamba/conda env create
Hello everyone, I am new to creating environments and not an expert in bioinformatics, so maybe this is a dumb question. I tried to install packages, but I encountered this warning message: Command: mamba env create --quiet…
LD search for multi-allelic variants
Of course. Here is an example: rs1557550 rs111368459 rs1632969 rs17206070 rs281861394. All these variants are present in my .bed file and are multi-allelic. To make my .bed files, I retained only the first instance of a variant when it was a multi-allelic variant that had multiple bi-allelic entries:…
[slurm-users] MaxMemPerCPU not enforced?
Hello, Matthew Brown <brow…@vt.edu> writes: > Minimum memory required per allocated CPU. … Note that if the job’s > --mem-per-cpu value exceeds the configured MaxMemPerCPU, then the > user’s limit will be treated as a memory limit per task Ah, thanks, I should’ve read the documentation more carefully. From my limited tests…
Problem with mamba/conda install
Hello everyone, I am new to creating environments and not an expert in bioinformatics, so maybe this is a dumb question. I tried to install packages, but I encountered this warning message: Encountered problems while solving: - nothing…
[slurm-users] Custom Gres for SSD
Hi Shunran, we do something very similar. I have nodes with 2 SSDs in a RAID1 mounted on /local. We defined a gres resource just like you and called it local. We define the resource in the gres.conf like this: # LOCAL NodeName=hpc-node[01-10] Name=local and add the resource in counts of…
[slurm-users] Flag OVERLAP in advanced reservation
Hello, I observe strange behavior of advanced reservations that have OVERLAP in their flag list. If I create two advanced reservations on different sets of nodes and a particular username is configured to have access only to the one with the flag OVERLAP, then the username can also run jobs on nodes in…
Problem with goseq – error
Hello, I’m trying to do GO analysis using goseq but I get this error: < In pcls(G) : initial point very close to some inequality constraints Loading required package: AnnotationDbi Loading required package: stats4 Loading required package: BiocGenerics Loading required package: paral>> as input I had this: To add more…