Tag: cluster

parallel processing – Forcing subsequent SLURM jobs to wait until first job is done?

I am running 1000 jobs on a cluster, using a sbatch job array. I’ve set up my code such that if the job array index is set to 0, precomputations are executed and saved to file; the jobs 1-999 then access these precomputations. The precomputations in job 0 take much…
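One common way to get this ordering, sketched here as an assumption rather than as the answer given in the original thread, is to submit index 0 as its own array job and then submit indices 1-999 with an `afterok` dependency on it. The script name `job.sh` is hypothetical; `--parsable`, `--array` and `--dependency=afterok:<jobid>` are standard `sbatch` options. The helpers below only build the command lines, so they can be inspected before anything is submitted.

```python
def precompute_command(script):
    """Command that submits only array index 0 and prints a bare job ID."""
    return ["sbatch", "--parsable", "--array=0", script]


def dependent_command(script, parsable_output, last_index=999):
    """Command that submits indices 1..last_index, held until the first job succeeds.

    `parsable_output` is the text printed by `sbatch --parsable`,
    which has the form "jobid" or "jobid;cluster".
    """
    job_id = parsable_output.strip().split(";")[0]
    return [
        "sbatch",
        f"--dependency=afterok:{job_id}",
        f"--array=1-{last_index}",
        script,
    ]


if __name__ == "__main__":
    # Hypothetical job ID, as it would be printed by the first submission.
    print(" ".join(precompute_command("job.sh")))
    print(" ".join(dependent_command("job.sh", "12345\n")))
```

With `afterok`, the dependent tasks stay pending until the precomputation job finishes with exit code zero, and they never start if it fails.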

Petabase-scale sequence alignment catalyses viral discovery

Serratus alignment architecture Serratus (v0.3.0) (github.com/ababaian/serratus) is an open-source cloud-infrastructure designed for ultra-high-throughput sequence alignment against a query sequence or pangenome (Extended Data Fig. 1). Serratus compute costs are dependent on search parameters (expanded discussion available: github.com/ababaian/serratus/wiki/pangenome_design). The nucleotide vertebrate viral pangenome search (bowtie2, database size: 79.8 MB) reached processing rates…

Novel diagnostic biomarkers for keloid based on GEO database

Introduction Keloid is excessive fibrosis of the skin that extends beyond the area of injury and does not regress.1 Keloid can occur in the joints and mouth after several years of severe injury, including burns, chemical injury, wound, and surgical incision.2 Keloids on the joints affect the quality of life,…

In-Depth Analysis of Bacillus anthracis 16S rRNA Genes and Transcripts Reveals Intra- and Intergenomic Diversity and Facilitates Anthrax Detection

Various ratios of 16S-BA/BC-alleles constitute possible explanations for differences in FISH signals of cells of diverse B. anthracis strains (Fig. 1). Indeed, we found a significant correlation between 16S-BA/BC-allele ratios in sequenced genomes and mean intensities of the Cy3 FISH signals targeting the 16S-BA-allele (tested with the cor.test function in R, Pearson’s…

[gmx-users] Is such hardware suitable for gromacs

Your mail looks like spam at first glance. Surely, they can be used to run Gromacs. In fact, Gromacs welcomes both Intel’s and AMD’s modern CPUs with SIMD capability. To build a small cluster, you also need some networking devices: a gigabit Ethernet switch and some good cables. The rest is software installation…

NCBI looking for testers for a new web-only (for now) clustered `nr` database

Find details about how to participate by going to this link. Clustered nr is the standard NCBI nr database clustered with each sequence within 90% identity and 90% length to other members of the cluster. Your…

r – How to parallelize future_pmap() across multiple slurm nodes

I have access to a large computing cluster with many nodes each of which has >16 cores, running Slurm 20.11.3. I want to run a job in parallel using furrr::future_pmap(). I can parallelize across multiple cores on a single node but I have not been able to figure out the…

H2O is an in-memory platform for distributed, scalable machine learning

H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Flow notebook/web interface, and works seamlessly with big data technologies like Hadoop and Spark. H2O provides implementations of many popular algorithms such as Generalized Linear Models (GLM), Gradient…

Correlation Distance Metric and Sum of Squared Errors

The sum of squared error is more easily implemented than the correlation distance metric, so I would advise you to use biopython together with the following helper function. It should compute the sum of squared errors for you from the data (assumed to be a numpy array) and biopython’s clusterid…
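The helper itself is cut off in this excerpt, so the following is only a minimal sketch of what such a function could look like, not the original author's code. It assumes `data` is a 2-D numpy array with one row per item and `clusterid` is the array of cluster labels, one per row, such as the first value returned by Biopython's `Bio.Cluster.kcluster`. The function name `sum_of_squared_errors` is hypothetical.

```python
import numpy as np


def sum_of_squared_errors(data, clusterid):
    """Sum of squared Euclidean distances from each row to its cluster centroid."""
    data = np.asarray(data, dtype=float)
    clusterid = np.asarray(clusterid)
    sse = 0.0
    for label in np.unique(clusterid):
        members = data[clusterid == label]   # rows assigned to this cluster
        centroid = members.mean(axis=0)      # per-column mean of those rows
        sse += float(((members - centroid) ** 2).sum())
    return sse


if __name__ == "__main__":
    points = [[0, 0], [0, 2], [10, 10], [10, 14]]
    labels = [0, 0, 1, 1]
    print(sum_of_squared_errors(points, labels))
```

In the small example, the first cluster contributes 1 + 1 and the second 4 + 4, so the total is 10.0.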

Cannot install Phyloseq and dada2

Hello all, I have been having issues installing packages that I really need to use. Basically, I cannot download either Phyloseq or dada2 and I believe it’s because I don’t have GenomeInfoDbData. But at the same time, I cannot install GenomeInfoDbData because I can’t seem to update the dependencies (“fansi”…

[slurm-users] Questions about scontrol reconfigure / reconfig

Hello, I have some questions about adding nodes in configless mode. My version of SLURM is 21.08.5. I have included logs below to make the message easier to read. First, is “scontrol reconfigure” equivalent to “scontrol reconfig”? Then, I see strange behaviour when adding a node. I have a healthy…

Senior Manager, Bioinformatics and Genomics Data Scientist

Job Description Do you want to be part of an inclusive team that works to develop innovative therapies for patients? Every day, we are driven to develop and deliver innovative and effective new medicines to patients and physicians. If you want to be part of this exciting work, you belong…

gromacs 2021.5 – Download, Browsing & More

Fossies, the Fresh Open Source Software Archive. Contents of gromacs-2021.5.tar.gz (14 Jan 16:58, 38023772 Bytes). About: GROMACS performs molecular dynamics, i.e. simulates the Newtonian equations of motion for systems with hundreds to millions of particles (designed for biochemical molecules…

apache spark – Pyspark kernel in jupyterhub, access to master remotely

I have two kernels for Spark, one to run locally and one to run against a cluster. Is there a way to set an environment variable for my Spark master so that users don’t have to define the master in SparkContext in the kernel that is meant to talk to the Spark…

Identification of a regulatory pathway inhibiting adipogenesis via RSPO2

Integration of APC scRNA-seq data reveals heterogeneity of adipocyte progenitor cells In a previous study9, we defined Lin−Sca1+CD142+ APCs as adipogenesis regulatory (Areg) cells and demonstrated that these cells are both refractory toward adipogenesis and control adipocyte formation of APCs through paracrine signaling. In contrast, Merrick et. al.4 observed that Lin−CD142+ cells…

The role of ATXR6 expression in modulating genome stability and transposable element repression in Arabidopsis

Significance The plant-specific H3K27me1 methyltransferases ATXR5 and ATXR6 play integral roles connecting epigenetic silencing with genomic stability. However, how H3K27me1 relates to these processes is poorly understood. In this study, we performed a comprehensive transcriptome analysis of tissue- and ploidy-specific expression in a hypomorphic atxr5/6 mutant and revealed that the…

Genetic diversity and selection in Puerto Rican horses

Horses have been considered one of our most prized possessions, used for travel, work, food, and pleasure for at least five and a half millennia17,18,19,20. Nevertheless, the ancestry of various horse breeds and their characteristic traits remains unclear21. In this paper, we describe the patterns and the origins of genetic…

Running Jobs on Titan

Running Jobs on Titan Table of Contents Titan’s Job Scheduler – SLURM Documentation Translating to SLURM commands from other workload managers Basic SLURM Commands squeue sinfo scontrol sbatch scancel Titan’s Environment Module System – LMOD Listing all available modules on Titan Loading a module into your environment Listing all modules…

Error in print &molecular_dipoles under &LOCALIZE in cp2k

May I ask how to print the dipole moment per molecule of the system in cp2k? I tried to do print molecular_dipoles in LOCALIZE:    &LOCALIZE       METHOD CRAZY       USE_HISTORY       &PRINT         &MOLECULAR_DIPOLES           FILENAME…

Identification of differentially expressed genes in AF

Defeng Pan,1,* Yufei Zhou,2,* Shengjue Xiao,1,* Yue Hu,3,* Chunyan Huan,1 Qi Wu,1 Xiaotong Wang,1 Qinyuan Pan,1 Jie Liu,1 Hong Zhu1 1Department of Cardiology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, 221004, People’s Republic of China; 2Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital and Institutes of…

Decoding gene regulation in the fly brain

1. Li, H. et al. Classifying Drosophila olfactory projection neuron subtypes by single-cell RNA sequencing. Cell 171, 1206–1220 (2017). 2. Davie, K. et al. A single-cell transcriptome atlas of the aging Drosophila brain. Cell 174, 982–998 (2018). 3….

Bioinformation Analysis Reveals IFIT1 as Potential Biomarkers in Centr

Introduction Tuberculosis (TB) is considered to be one of the top ten causes of death in the world; about a quarter of the world’s population is infected with M. tuberculosis.1 The World Health Organization (WHO) divides tuberculosis into pulmonary tuberculosis (PTB) and extra-pulmonary tuberculosis (EPTB). Although breakthroughs have been made…

16S rRNA, One of the Most Important rRNAs

The 16S rRNA and 18S rRNA genes are the most frequently used targets for bacteria/archaea and eukaryotes, respectively. Based on the diversity of ribosomal RNA sequences, one can explore the structure of the microbiome in terms of presence and relative abundance. prok. -30S -50S What are the functional components of…

Mendelian randomization analyses support causal relationships between blood metabolites and the gut microbiome

1. Wang, J. & Jia, H. Metagenome-wide association studies: fine-mining the microbiome. Nat. Rev. Microbiol. 14, 508–522 (2016). 2. Moschen, A. R. et al. Lipocalin 2 protects from inflammation and tumorigenesis associated with gut microbiota alterations. Cell Host Microbe 19, 455–469 (2016). …

Monocle3 differential expression failed when active.assay is not “RNA”

After running estimate_size_factors, data with active.assay = ‘integrated’ works too, but there are no DEGs in the result. > seurat_object@active.assay = ‘integrated’ > cds_raw <- as.cell_data_set(seurat_object) Warning: Monocle 3 trajectories require cluster partitions, which Seurat does not calculate. Please run ‘cluster_cells’ on your cell_data_set object > cds <- cluster_cells(cds_raw) > pr_graph_test_res <-…

Bioconductor – bnem (development version)

DOI: 10.18129/B9.bioc.bnem. This is the development version of bnem; for the stable release version, see bnem. Training of logical models from indirect measurements of perturbation experiments. Bioconductor version: Development (3.15). bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of…

Sod1 integrates oxygen availability to redox regulate NADPH production and the thiol redoxome

Significance Cu/Zn superoxide dismutase (Sod1) is a key antioxidant enzyme, and its importance is underscored by the fact that its ablation in cell and animal models results in oxidative stress; metabolic defects; and reductions in cell proliferation, viability, and lifespan. Curiously, Sod1 detoxifies superoxide radicals (O2•−) in a manner that…

Job Opportunity: HPC Engineer at European Bioinformatics Institute (EMBL-EBI) (Hinxton, UK)

New Job opportunity posted by European Bioinformatics Institute (EMBL-EBI): We are seeking an HPC engineer to join our Compute team within our Technical Services Cluster (TSC), serving an institute of over 800 researchers and technical staff. You will be working closely with members of the department, and more widely with…

What is SNP array testing?

What is SNP array testing? The SNP array test looks for changes in specific areas of a person’s chromosomes, such as gains (duplications) or losses (deletions). These gains or losses result in extra or missing copies of genetic material. How does SNP array work? SNP array is a type of…

clusterExport, environment and variable scoping

You’re getting the error when calling clusterExport(cl, list(“a”, “b”, “data”)) because clusterExport is trying to find the variables in .GlobalEnv, but fn1 isn’t setting them in .GlobalEnv but in its own local environment. An alternative is to pass the local environment of fn1 to fn2, and specify that environment to…

Pan-AMPK activator O304 prevents gene expression changes and remobilisation of histone marks in islets of diet-induced obese mice

O304 treatment prevents islet gene expression signature changes induced by HFD We have previously demonstrated that the AMPK activator O304 improves blood glucose homeostasis in both human T2D subjects as well as in high-fat diet induced obese and diabetic mouse models. In the present study, we have now analysed the…

Microbiota and Body Composition During the Period of Complementary Feeding

In this study, scientists aimed to investigate the relationships between food category consumption, fecal microbial profile, and body composition throughout the complementary feeding period. In a cohort of 50 babies aged 6 to 24 months, the diet was examined using a quantitative food frequency questionnaire, fecal microbiota profile was analyzed using…

speed slow down on running CP2K

Dear All, I am a newbie to CP2K and am trying to run an MD simulation on a benzene-water cluster with the BLYP functional. I found a strange phenomenon when I checked one of the NVE output files with the .ener filename extension. The CPU time slowed down significantly after step 12283 and…

GeneTonic: an R/Bioconductor package for streamlining the interpretation of RNA-seq data | BMC Bioinformatics

1. Van den Berge K, Hembach KM, Soneson C, Tiberi S, Clement L, Love MI, Patro R, Robinson MD. RNA sequencing data: Hitchhikers guide to expression analysis. Annu Rev Biomed Data Sci. 2019;2(1):139–73. doi.org/10.1146/annurev-biodatasci-072018-021255. 2. Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A,…

Researchers Identify Four Lupus Subgroups Associated with Lupus Outcomes using Long-term Autoantibody Data and Artificial Intelligence

Updated antibody research by Lupus Foundation of America Gary S. Gilkeson Career Development Awardee May Choi identifies subgroups of lupus patients with different outcomes based on long-term autoantibody data with the aid of artificial intelligence. An autoantibody is a type of protein produced when the body’s immune system is attacking…

SLURM and tailoring walltime for different jobs –

Hi, so finally, I have access to a big cluster that uses SLURM as scheduler for Matlab. So far so good. Now, I would need to understand if I am planning the execution of my program properly. I have a Main file, with several batch jobs. At the moment, it…

Bioinformatics Engineer – Idealist

POSITION SUMMARY The Simons Foundation is seeking a Bioinformatics/Senior Bioinformatics Engineer (dependent upon experience) to develop and support whole exome and genome sequence data analysis pipelines in both research and operational modalities. This position will report to the Director of Data and Analytics in the informatics group and will work…

Adrenal aldosterone-producing adenoma | IJGM

Background Primary hyperaldosteronism (PA) is characterized by spontaneous secretion of excessive aldosterone and inhibition of plasma renin activity.1 The pathogenesis of adrenal aldosterone-producing adenoma (APA) involves the abnormal proliferation of adrenal cortex cells and the excessive secretion of aldosterone, accounting for nearly 30% of PA. Excessive secretion of aldosterone can…

Systems biology analysis of human genomes points to key pathways conferring spina bifida risk

Significance Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids…

Integrating Bulk RNA-seq data with Single cell RNA seq data

Hello all, recently I have been trying to integrate bulk RNA-seq data into single-cell data, where I treat each sample in my bulk RNA-seq data as a single cell and integrate it into the single-cell data based on the…

docker – SLURM cluster inside k8s cannot run srun command

I’m a beginner k8s user, I’m trying to recreate this docker-compose SLURM cluster with kubernetes. First I converted the docker-compose.yaml file into k8s yaml file in order to use kubectl apply -f . to create pods and services. I’m using minikube on my computer with the none driver (like this…

Novel bioinformatics pipeline for fast and scalable analysis of large viral phylogenies

A team of researchers recently developed a bioinformatics approach to analyze viral phylogenetic clusters and posted their findings to the bioRxiv* preprint server. Study: ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies. Background Coronavirus disease 2019 (COVID-19)…

Towards the biogeography of prokaryotic genes

1. Sunagawa, S. et al. Structure and function of the global ocean microbiome. Science 348, 1261359 (2015). 2. Zou, Y. et al. 1,520 reference genomes from cultivated human gut bacteria enable functional microbiome analyses. Nat. Biotechnol. 37, 179–185 (2019). 3. Mohammad,…

Single-cell delineation of lineage and genetic identity in the mouse brain

STICR lentiviral library preparation and validation We synthesized a high-complexity lentivirus barcode library that encodes approximately 60–70 million distinct oligonucleotide RNA sequences (STICR barcodes). STICR barcodes comprised three distinct oligonucleotide fragments cloned sequentially into a multicloning site within the 3′ UTR of an enhanced green fluorescent protein (eGFP) transgene under…

Import problem: Not a(n) QIIME1DemuxFormat file – Technical Support

Hi @emiliomastriani, Did you download the sequences from SRA? This previous question may give you some help: Hi there, I am familiar with QIIME1 but relatively new to QIIME2. In the past I have gotten my raw files from a facility in the CASAVA paired-end demultiplexed format and I had…

[workshop] Building HPC clusters with LXD, Slurm & GPU:s – community-workshop

You are welcome to join another exciting community workshop with Juju and related projects! This time @mmrezaie will take us through an interesting deployment of Slurm charms to build an HPC cluster with juju and lxd. Meeting link: meet.google.com/ceh-zber-jnf Date: Fri 2021-12-17 09:00 – 10:00 (UTC) Abstract In this workshop we will…

Error with file guillaumeKUnitigsAtLeast32bases_all.fasta, kUnitigLengths.txt is of size 0, must be at least of size 1.

Hello, I am trying to run an assembly with MaSuRCA but am getting an error at the step “Computing super reads from PE”. Here’s the output with the error: [xxxx@vic Bovidae]$ cd Assembly_test/ [xxxx@vic Assembly_test]$ ls assemble.sh guillaumeKUnitigsAtLeast32bases_all.fasta.tmp masurca_assembly.o4302352 meanAndStdevByPrefix.pe.txt pe_data.tmp quorum_mer_db.jf work1 environment.sh guillaumeKUnitigsAtLeast32bases_all.jump.fasta masurca_config.txt pe.cor.fa pe.renamed.fastq super1.err ESTIMATED_GENOME_SIZE.txt masurca_assembly.e4302352…

Researchers Develop i-Melt: A Deep Neural Network That Can Predict Glass Quality Based On Melt Composition

Source: medium.com/pytorch/from-windows-to-volcanoes-how-pytorch-is-helping-us-understand-glass-8720d480f4f2 Glass can be found all around us. It’s in our computer screens, next-generation batteries, medical implants, and even volcanoes. Glass is manufactured by melting a material and then swiftly cooling it. The chemical makeup of the molten liquid determines the physical qualities of glass. By altering these properties, it…

Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large scales

TL;DR: Using a variation on Amazon’s “Herring” technique, which leverages reduction servers, we can perform the all-reduce collective faster than NCCL: up to 2x as fast as NCCL in microbenchmarks, and up to 50% speedup in end-to-end training workloads. You can use it right away in your project, with no code…

Adding numbers and characters to legend key in ggplot2 of UMAP clusters

Hi everyone, I have a UMAP plot; however, there are so many clusters that the descriptions look clunky if I put them on the UMAP…but then there are too many colors if it is just colors. So, I…

qiime2-import data from non-working directory – User Support

Hello, qiime2 users community! I have the following set-up: a huge collection of .fastq files which I would like to process with the dada2 pipeline; and a remote cluster of servers with separate storage, where those files are stored, and working machines for computing. Question: Is it possible to…

No differentially expressed genes

Hi, I have a dataset now with 36k lncRNAs and I’m using DESeq2 to find differentially expressed lncRNAs between a healthy group and a disease group, but unfortunately I cannot find any DE lncRNAs with low padj values. However, when I explore my data by taking the log2 fold change…

alphafold2: HHblits failed – githubmemory

I’ve tried using the standard alphafold2 setup via docker (converted to a singularity container) via the setup described at github.com/kalininalab/alphafold_non_docker, and both result in the following error: […] E1210 12:01:01.009660 22603932526400 hhblits.py:141] – 11:49:18.512 INFO: Iteration 1 E1210 12:01:01.009703 22603932526400 hhblits.py:141] – 11:49:19.070 INFO: Prefiltering database E1210 12:01:01.009746 22603932526400 hhblits.py:141]…

Prognosis Biomarkers via WGCNA in HCC

Introduction According to the cancer statistics reported in 2020, hepatocellular carcinoma (HCC) is the main type of primary carcinoma of the liver and the second leading cause of cancer-related death globally, with a five-year survival rate < 20%.1 Currently, surgical resection, a standard therapy for HCC, contributes to the prognosis…

What is the single nucleotide polymorphism database ( dbsnp )?

The Single Nucleotide Polymorphism Database (dbSNP) is a free public archive for genetic variation within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI). Furthermore, are there any databases for single nucleotide polymorphisms? As there…

RStudio AI Blog: Training ImageNet with R

ImageNet (Deng et al. 2009) is an image database organized according to the WordNet (Miller 1995) hierarchy which, historically, has been used in computer vision benchmarks and research. However, it was not until AlexNet (Krizhevsky, Sutskever, and Hinton 2012) demonstrated the efficiency of deep learning using convolutional…

How can I test whether a treatment truncates transcript length using transcriptomics?

I am testing whether a treatment gives shorter transcripts than a solvent in E. coli using transcriptomics. I have the following dataset: (A) a treatment condition and (B) a solvent condition; 3 concentrations (1X, 2X, and 3X); 4…

Postdoctoral Fellow job with Cleveland Clinic – Genomic Medicine Institute

We are seeking multiple Experimental and Bioinformatics research positions (including Postdoctoral Research Fellows and Research Associate) to join the Alzheimer’s Network Medicine and Artificial Intelligence (AI) research group (www.lerner.ccf.org/gmi/cheng/) led by Dr. Feixiong Cheng at the Genomic Medicine Institute, Cleveland Clinic Lerner Research Institute, and Department of Molecular Medicine at…

Bioconductor – MLInterfaces

This package is for version 3.3 of Bioconductor; for the stable, up-to-date release version, see MLInterfaces. Uniform interfaces to R machine learning procedures for data in Bioconductor containers. Bioconductor version: 3.3. This package provides uniform interfaces to machine learning code for data in R and Bioconductor containers. Author:…

Design formula in DESeq2

Hello, I am using DESeq2 for analysis of RNA-seq data. I would like to ask you about the design formula in DESeq2. I have tissue from animals treated with a chemical and my animal model is a colorectal cancer model. My variables are gender (male or female), treatment (treated…

Microchip Analysis – an overview

33.3.1.3 Oil Field Microbiology: Asia Oilfields in Russia and China have been extensively studied to determine microbial ecology as well as understand the microbial community for bioprospecting aspects. Molecular approaches as well as radioisotopic activity measurements from formation water of a high temperature oil reservoir in Russia (Samotlor) indicated enrichment…

Discovery and Biosynthesis of Natural Products from New Zealand Soil Metagenome Libraries

Antibiotic discovery rates dramatically declined following the “golden age” of the 1940’s to the 1960’s. The platforms that underpinned that age of discovery rested upon laboratory cultivation of a small clade of bacteria, the actinomycetes, primarily isolated from soil environments. Fermentation extracts of these isolated bacteria have provided the majority…

Identification of Hub Genes in Patients with Alzheimer Disease and Obs

Introduction Alzheimer’s disease (AD) is the most common type of dementia worldwide. According to an epidemiological investigation by the International Alzheimer’s disease association, about 45 million people suffer from AD, and the number is expected to increase to 131 million in 2050.1 Despite the widespread prevalence of…

h2o.ai – Killing xxx because the cloud is no longer accepting new H2O nodes

Please help. I created an h2o-stateful-set with replicas: 3, then ran an H2O AutoML job, which worked well. But suddenly one of the pods broke down, so I used kubectl delete pod h2o-k8s-1 to delete it. The StatefulSet created a new pod with the same name, h2o-k8s-1. But here’s the problem: the…

Containerization: Kubernetes 1.23 stabilizes operation with two network stacks

Kubernetes 1.23 is the third and final release of container orchestration this year. Among other things, it stabilizes the dual-stack operation in the cluster, the horizontal pod autoscaler and generic ephemeral volumes. With the new functions, initially introduced as alpha, the server-side validation of fields and the connection to OpenAPI…

Continue Reading Containerization: Kubernetes 1.23 stabilizes operation with two network stacks

tSNE and UMAP of scATAC-seq data looks like spaghetti

I would like to use R to cluster my 20k cells from a single cell ATAC-seq experiment. I ran PCA then selected the first 50 components, which were put into tSNE’s normalize_input() then Rtsne(). This is the result I…

Continue Reading tSNE and UMAP of scATAC-seq data looks like spaghetti

Ttc30a affects tubulin modifications in a model for ciliary chondrodysplasia with polycystic kidney disease

Significance: Cilia are tubulin-based cellular appendages, and their dysfunction has been linked to a variety of genetic diseases. Ciliary chondrodysplasia is one such condition that can co-occur with cystic kidney disease and other organ manifestations. We modeled skeletal ciliopathies by mutating two established disease genes in Xenopus tropicalis frogs. Bioinformatic…

Continue Reading Ttc30a affects tubulin modifications in a model for ciliary chondrodysplasia with polycystic kidney disease

Transposition and duplication of MADS-domain transcription factor genes in annual and perennial Arabis species modulates flowering

Annual and perennial species occur in many plant families. Annual plants and some perennials are monocarpic (flowering once in their life cycle), characterized by a massive flowering and typically produce many seeds before the whole plant senesces. By contrast, most perennials live for many years, show delayed reproduction, and are…

Continue Reading Transposition and duplication of MADS-domain transcription factor genes in annual and perennial Arabis species modulates flowering

How to use “SingleR” on the marker genes from `FindAllMarkers` for each cluster?

Hi, I tried to use SingleR to identify cell types for clusters. I have the table of results from FindAllMarkers of the Seurat package. I know that I can use: SingleR(GetAssayData(seurat.object, assay = assay, slot = "data"),…

Continue Reading How to use “SingleR” on the marker genes from `FindAllMarkers` for each cluster?

Which trajectory method is better !?

Hello, I am stuck on a basic problem. I have a dataset of ~2000 cells comprising 8-9 clusters identified with the Seurat package, and I transferred the Seurat object to Monocle. I tried monocle2 and monocle3. The problem is, how to make the trajectory?…

Continue Reading Which trajectory method is better !?

Epistasis shapes the fitness landscape of an allosteric specificity switch

Computational design: Protein modeling and design were performed with Rosetta version 3.5 (2015.19.57819)35,37. Python and shell scripts for generating input from Rosetta and analyzing from Rosetta are available at: github.com/raman-lab/biosensor_design The high-resolution TtgR structure co-crystallized with tetracycline was selected as the starting point for computational design (PDB: 2UXH)28. The structure…

Continue Reading Epistasis shapes the fitness landscape of an allosteric specificity switch

Biopython Contact Us | Contact Information Finder

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collaborative effort to…

Continue Reading Biopython Contact Us | Contact Information Finder

Dissemination of Mycobacterium abscessus via global transmission networks

Dataset construction, cluster identification and definition of DCCs: Whole genome sequencing of two collections of isolates from Manchester, UK, and the Netherlands was carried out as previously described2. Briefly, DNA was extracted from colony sweeps of subcultured samples before paired-end sequencing using the Illumina HiSeq platform. These samples were…

Continue Reading Dissemination of Mycobacterium abscessus via global transmission networks

Trinity Statistic

Hello, I want to check the Trinity statistics and validation results. I am using a cluster. I used this command: perl TrinityStats.pl trinity_out_dir/Trinity.fasta > Trinity_stats1.txt The output is: Can't open perl script "TrinityStats.pl": No such file or directory. Do I need to add TrinityStats.pl to my directory, because it is…

Continue Reading Trinity Statistic

Unable to download fastq files in parallel / SOS

Hi! Very new to all this so bear with me if I’m using incorrect terminology. Also, English is my second language. I’m trying to download my fastq files in parallel but it doesn’t work and I keep receiving this error:…

Continue Reading Unable to download fastq files in parallel / SOS

How to get a profile of consistent cd-hit clusters across different sequence files?

I have 10 different nucleotide sequence fasta files. I would like to run cd-hit on them and get a cluster abundance profile. If I run the fasta files on cd-hit separately, the clusters will not be…

Continue Reading How to get a profile of consistent cd-hit clusters across different sequence files?

Ensembl vep singularity

Hello all, I would like to use the Variant Effect Predictor (VEP) on an HPC cluster. For that I use Singularity with the Docker image of VEP: singularity pull --name vep.sif But I have a problem using one plugin because a Perl module is missing…

Continue Reading Ensembl vep singularity

Network plot using NetCoMi and igraph

Hi, I am trying to plot the network plot as suggested here (github.com/stefpeschel/NetCoMi#single-association-network-on-genus-level) by using igraph and NetCoMi. But I am not getting the network plot as expected. I just want to label the hub genera and phyla, and I want the network plot in a spherical layout. # Agglomerate to genus…

Continue Reading Network plot using NetCoMi and igraph

Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova

Significance: Putatively ancient asexual species pose a challenge to theory because they appear to escape the predicted negative long-term consequences of asexuality. Although long-term asexuality is difficult to demonstrate, specific signatures of haplotype divergence, called the “Meselson effect,” are regarded as strong support for long-term asexuality. Here, we provide evidence…

Continue Reading Haplotype divergence supports long-term asexuality in the oribatid mite Oppiella nova

How to add metadata to dendrogram of hierarchical clustering in R

I have an RNAseq dataset with genes as rows and samples as columns. I have managed to do hierarchical clustering of the samples, and I want to look for subgroups in the disease. I additionally have metadata with…

Continue Reading How to add metadata to dendrogram of hierarchical clustering in R

Phylogeographic reconstruction of the marbled crayfish origin

Procambarus fallax collections and PCR genotyping: Animals were collected from various wild populations (Table S1) in compliance with state and local regulations (Georgia Department of Natural Resources scientific collection permit 115621108, state of Florida collection permits S-19-10 and S-20-04). DNA was isolated from abdominal muscle tissue using SDS-based extraction and precipitation…

Continue Reading Phylogeographic reconstruction of the marbled crayfish origin

Identification of Prognosis-Associated Biomarkers in Thyroid Carcinoma

Introduction: Thyroid cancer (TC) is a common endocrine malignancy with a rapidly increasing incidence worldwide, and the estimated new cases and deaths are notably higher in women than in men.1 Papillary thyroid carcinoma (PTC) is identified as the most common pathological type of TC, and accounts for approximately 80–85% of…

Continue Reading Identification of Prognosis-Associated Biomarkers in Thyroid Carcinoma

Mutational Analysis of Mitochondrial tRNA Genes

Introduction: Diabetes is a very complex disease characterized by the presence of chronic hyperglycemia. Clinically, insulin-dependent type 1 and non-insulin-dependent type 2 are the main types of diabetes. Among them, type 2 diabetes mellitus (T2DM, [MIM125853]) is a common endocrine disorder affecting approximately 10% of the adult population.1 In most cases,…

Continue Reading Mutational Analysis of Mitochondrial tRNA Genes

Align fastq SOLiD data

Hello everyone, I have downloaded some data from the short read archive using the sratoolkit. The data is SOLiD data. I have seen people using the Lifescope (Life Technologies) to align the reads, as I presume it works for this type of data. But unfortunately,…

Continue Reading Align fastq SOLiD data

Cosmo_00080 : CDS information — DoBISCUIT

Category: 1.1 PKS; Product: polyketide synthase chain length factor subunit; Product (GenBank): CosC; Gene: (empty); Gene (GenBank): cosC; EC number: (empty); Keyword: (empty); Note: (empty); Note (GenBank): ketosynthase – beta subunit; Reference: ACC Q2PZR8; PmId [16810496] Insights in the glycosylation steps during biosynthesis of the antitumor anthracycline cosmomycin: characterization of two glycosyltransferase genes. (Appl…

Continue Reading Cosmo_00080 : CDS information — DoBISCUIT

Converting between UCSC id and gene symbol with bioconductor annotation resources

You need to use the Homo.sapiens package to make that mapping. > library(Homo.sapiens) Loading required package: AnnotationDbi Loading required package: stats4 Loading required package: BiocGenerics Loading required package: parallel Attaching package: ‘BiocGenerics’ The following objects are masked from ‘package:parallel’: clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply,…

Continue Reading Converting between UCSC id and gene symbol with bioconductor annotation resources

Plot LFC with pheatmap of differentially expressed gene list from DESeq2.

Hi, all! First post, so apologies for any flaws with post structure. I am attempting to make a basic heatmap that shows the log fold change of differentially expressed genes, as identified by DESeq2. See below the code I am using for DESeq2: ##Load DESeq2 source("https://bioconductor.org/biocLite.R") biocLite("DESeq2") biocLite("stringi") biocLite("MASS") install.packages("survival")…

Continue Reading Plot LFC with pheatmap of differentially expressed gene list from DESeq2.

Clustering resolution for scRNA-seq

This is an open-ended question, but maybe you could share your heuristics. What’s your approach when clustering and annotating clusters in scRNA-seq data? Do you prefer to start with a low number of clusters (e.g. CD4, CD8, monocytes) and then re-cluster or look for low-level cell…

Continue Reading Clustering resolution for scRNA-seq

Pact_00210 : CDS information — DoBISCUIT

Category: 3.4 other modification; Product: putative 6-methylsalicylyltransferase; Product (GenBank): ketoacyl-ACP synthase; Gene: pctTptmR; Gene (GenBank): pctT; EC number: (empty); Keyword: (empty); Note: (empty); Note (GenBank): (empty); Reference: ACC A8R0K3; PmId [17827660] Cloning of the pactamycin biosynthetic gene cluster and characterization of a crucial glycosyltransferase prior to a unique cyclopentane ring formation. (J Antibiot (Tokyo)…

Continue Reading Pact_00210 : CDS information — DoBISCUIT

Legacy genetics of Arachis cardenasii in the peanut crop shows the profound benefits of international seed exchange

Significance: A great challenge for humanity is feeding its growing population while minimizing ecosystem damage and climate change. Here, we uncover the global benefits arising from the introduction of one wild species accession to peanut-breeding programs decades ago. This work emphasizes the importance of biodiversity to crop improvement: peanut cultivars…

Continue Reading Legacy genetics of Arachis cardenasii in the peanut crop shows the profound benefits of international seed exchange

High frequency of an otherwise rare phenotype in a small and isolated tiger population

Significance: Small and isolated populations have low genetic variation due to founding bottlenecks and genetic drift. Few empirical studies demonstrate visible phenotypic change associated with drift using genetic data in endangered species. We used genomic analyses of a captive tiger pedigree to identify the genetic basis for a rare trait,…

Continue Reading High frequency of an otherwise rare phenotype in a small and isolated tiger population

Genomic and phenotypic characteristics for Vibrio vulnificus

Background: Fisheries and aquaculture are becoming increasingly intensive to meet recent human consumption, resulting in proliferation of marine pathogens and food security concerns.1,2 Vibrio species, as one of the most dangerous foodborne pathogens, cause vibriosis in humans around the world.3 It has been reported that vibriosis resulted in 80,000 illnesses…

Continue Reading Genomic and phenotypic characteristics for Vibrio vulnificus

GATK’s GenomicsDBImport takes forever…

Hello! I have 90 samples in the form of vcf files, together they are a few terabytes in size. I wish to create a single multi-sample vcf file for downstream analysis. I am trying to use GenomicsDBImport for this, but it just takes too long…

Continue Reading GATK’s GenomicsDBImport takes forever…

BIOINFORMATICS SCIENTIST job in Dubai United Arab Emirates

Description Overview: We are looking for a highly motivated and creative bioinformatics research scientist to develop and apply innovative analytical approaches to understand the genetic modifications that drive the development and response to therapy of blood tissues. The scientist will contribute ideas to implement, automate, and improve existing analysis methods,…

Continue Reading BIOINFORMATICS SCIENTIST job in Dubai United Arab Emirates

WNN in Seurat

Dear all, I am trying to follow the WNN vignette here: satijalab.org/seurat/articles/weighted_nearest_neighbor_analysis.html After the steps below, I would like to annotate my clusters, hence I need to know the markers which best represent each cluster. pbmc <- FindMultiModalNeighbors(pbmc, reduction.list = list("pca", "lsi"), dims.list = list(1:50, 2:50)) pbmc <- RunUMAP(pbmc, nn.name

Continue Reading WNN in Seurat

ConsensusClusterPlus package

Hi guys, how can I check the plots individually? When I run the ConsensusClusterPlus command, they are generated on top of each other very fast, ending with the tracking plot, and I can’t find them anywhere else. I tried to generate them as PDFs or as PNGs but nothing…

Continue Reading ConsensusClusterPlus package

PacBio sequencing output increased through uniform and directional fivefold concatenation

Strategy and design of the method: We sought to develop a simple method to increase the sequencing capability of PacBio CCS to sequence several diverse DNA libraries ~870 bp in length that encoded protein variants originating from a directed evolution campaign. To achieve an increase in the throughput of a PacBio sequencing…

Continue Reading PacBio sequencing output increased through uniform and directional fivefold concatenation

Trouble running Cyclum for my scRNA-seq analysis

I have been analyzing some mouse T cell scRNA-seq data for a few months now using mostly the Seurat pipeline run with default parameters, and I have noticed that regressing out the ‘S.Score’ and ‘G2M.Score’ obtained from default Seurat::CellCycleScoring seems to be insufficient to remove (seemingly large) variation originating from…

Continue Reading Trouble running Cyclum for my scRNA-seq analysis

Gromacs 4.5.4 manual

Gromacs 4.5.4 manual. On the Stability of Negatively Charged Platelets in Calcium. g_energy(1) [debian man page]. Gromacs User Manual Version 4.6. The defence of Königsberg had cost the lives of 42,000 German soldiers and 25,000 civilians. He yanked the seaman out and shouted at him to report straight to…

Continue Reading Gromacs 4.5.4 manual

Bacterial endosymbionts protect beneficial soil fungus from nematode attack

A healthy soil nourishes plants and animals, purifies water and air, and promotes sustainable agriculture. Characteristic for highly complex and competitive soil ecosystems are the frequent and direct interactions between all soil-dwelling microorganisms, animals, and plants (1, 2), all of which need to be provided with minerals and carbon sources…

Continue Reading Bacterial endosymbionts protect beneficial soil fungus from nematode attack