How to properly combine two bam files of a paired-end data

Hi all! I am mapping paired-end reads separately using bowtie2. After that, I want to combine the two BAM files into one for downstream analysis. How do I properly do this combination? I tried: samtools sort -n R1.bam…

GWAS, quantitative traits

There are several types of molecular genetic markers. Until recently, microsatellites (SSR or STR) were very popular. However, microsatellites are not enough for fine mapping of individual regions of genomes; the high cost of equipment and reagents and the development of automated methods using SNP chips are pushing them out…

Estimating sequencing depth or mean reads per cell

I have been working with multiple single-cell RNA datasets. In order to compare the sequencing depth of multiple samples, I am trying to find the mean reads per cell. I used the following code: counts_per_cell <- Matrix::colSums(sample1); mean(counts_per_cell) The value is coming out to…
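
The calculation described above (sum the counts in each cell's column, then average over cells) can be sketched in Python as well. This is only a minimal illustration on a made-up genes-by-cells matrix; `counts_per_cell` and `toy_counts` are hypothetical names, and if the matrix holds UMI counts the result would be mean UMIs per cell rather than mean reads per cell.

```python
from statistics import mean


def counts_per_cell(matrix):
    """Return the column sums of a genes-by-cells count matrix (list of rows)."""
    return [sum(column) for column in zip(*matrix)]


# Toy matrix: 3 genes (rows) x 4 cells (columns); the values are invented.
toy_counts = [
    [5, 0, 2, 1],
    [3, 4, 0, 1],
    [2, 6, 8, 0],
]

per_cell = counts_per_cell(toy_counts)   # total counts observed in each cell
mean_counts_per_cell = mean(per_cell)    # average of those totals over cells
```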

A problem with MEGA-X tool for select the columns of nucleotides and edit them

Hello, I have a problem with the MEGA-X tool when I try to edit columns of nucleotides. In the last version of this tool the user could align the nucleotides and then select some columns…

How do I check the expression level of a unannotated lncRNA in RNAseq dataset?

I plan to check the expression level of a lncRNA in an RNA-seq dataset, but the lncRNA was only just identified and is not annotated yet. All I know is its position in chromatin and its sequence. So how…

normalization two different datasets tcga vs gtex

Using TCGA and GTEx to look for lncRNA DE (using raw files): what are the best ways to normalize? DESeq2 and edgeR? Also, if I want to look at lncRNAs of specific chromosomes, how should I approach normalization?

MAKER genome annotation error with SNAP ab initio prediction

I am trying to do a second round of maker genome annotation with ab initio prediction by snap. The error I am getting is as follows: error: unknown command “genome.hmm”, see ‘snap help’. ERROR: Snap failed –> rank=NA, hostname=bioinformatics ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2…

keep getting error with psiblast

Hello, I am trying to calculate the PSSM but I keep getting an error. Where is the problem in this code? import os import re def command_pssm(content, output_file,pssm_file): os.system(‘psiblast -in_msa 1ak4.fasta -db allseq.fasta -num_threads 10 -num_iterations 3 -evalue 0.001 -out output_file -out_ascii_pssm PSSM.txt’…

How can I add KEGG functional categories to whole genome alignments?

Hi! I am looking for packages/servers that can add KEGG gene functional categories to whole genome alignments done on ACT. A reference I am trying to emulate is in panel A here – www.nature.com/articles/nature06244/figures/2. How can I overlay…

How can i get sequences from a big list of accession number from NCBI?

Hi, everyone! I’m new to this world and I’m getting acquainted with Biopython. Is there a way in Biopython to retrieve the sequences from a txt file with several accession numbers?

China approves first combination vaccine trial as delta spreads

This study tests the efficacy of a combination of a Chinese Sinovac “inactivated” vaccine and a DNA-based vaccine. China’s drug regulators have approved the country’s first combination vaccine trial because the rapid spread of Delta variants has raised concerns about the effectiveness of domestic jabs, the company involved in the…

TPM to logFC and pvalues

Hi, I assume you have to find differential expression between two cell lines (Cx and Dx groups). Since you need logFC and p-values, this R code can work. You can then use the resulting matrix (mysample) to calculate the FDR of your interest. mysample <- read.table("./mymatrix.csv", sep=",", header=TRUE) for(i in 2:nrow(mysample)) {…

How to search for species with certain function or pathway in the annotated genome?

I’m trying to find bacterial species with certain metabolic pathways. I want to curate a “metagenome” of individual species that each have certain metabolic characteristics. So far, I have found KEGG has information on metabolic…

How to convert plink data from 38th assembly to 37

My goal is to downgrade PLINK (.bed, .bim, .fam) data from build 38 to build 37. What is the easiest way to do this? I was reading the liftOver documentation and I am confused, because it only accepts its own .bed…
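
The confusion above comes from two unrelated formats sharing the .bed extension: PLINK's binary genotype file and the UCSC BED interval format that liftOver reads. One workaround that is often described is to write the variant positions from the .bim file as UCSC BED intervals, lift those, and then update the positions in PLINK. Below is a rough sketch of the first step only. It assumes a standard six-column .bim line (chromosome, variant ID, cM, bp position, allele 1, allele 2) and a chain file that expects "chr"-prefixed names; `bim_line_to_bed` is a hypothetical helper, and numeric codes for X/Y/MT are not translated here.

```python
def bim_line_to_bed(line):
    """Turn one PLINK .bim line into a UCSC BED line (0-based start, 1-based end)."""
    chrom, variant_id, _cm, bp, _allele1, _allele2 = line.split()
    position = int(bp)
    # UCSC BED intervals are half-open, so a single base at `position`
    # becomes [position - 1, position).
    return f"chr{chrom}\t{position - 1}\t{position}\t{variant_id}"


# Example .bim line with invented values.
example = bim_line_to_bed("1\trs123\t0\t1000\tA\tG")
```

Variants that fail to lift, or that land on a different chromosome, would still need to be dropped from the PLINK dataset afterwards.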

assessing relative contribution of samples/ organs towards a diseases in multi-tissue RNAseq

I did RNA-seq on multiple tissues (liver, muscle, heart, kidney, etc.) in two groups of mice (disease vs. control). I wonder whether there is a type of analysis I can do so that I can conclude…

Beagle 5.2 phasing error

Hi everyone, I’m trying to phase a multi-sample (12 samples) VCF file with the first chromosome. I got this VCF after pruning with PLINK and recoding it back to VCF. The file looks like this: 1 112 . C T . . PR GT ./….

Error when Phasing with Beagle 5.2

I’m having trouble phasing a multi-sample (9-samples) vcf file produced by gatk HaplotypeCaller with Beagle 5.2. I do not have a genetic map or reference panel. I am working with a very heterozygous group of organisms (sea urchins). When I run beagle with…

Power calculation for microarray data

I have an initial sample of 228 patients from a microarray study. Recently I have obtained a new set of labels specifying different condition types for only 69 out of the 228 patients. I wanted to run a DEG analysis on this set of…

How to consider batch effect and multiple variable to identify differential gene expressions for a given Phenotype in DESeq2

Hello all, I am working on RNA-seq data for which I have the Phenotype (DPN or PDPN), Gender (Female or Male), Age and the batch (1st_round or 2nd_round):

ID         PiNS.ID  Phenotype  PiNS       Gender  Age  batch
PINS_0112  112      PDPN       PINS_0112  Female  64   1st_round
PINS_0171  171      DPN        PINS_0171  Male    74…

KING struggle: Relatedness

I struggle to infer relationships in a dataset of 20K exomes from tens of kits. At first I found a well-covered union of regions – check. Second, I performed everything to merge 20K VCFs into one. Removed indels and multi-allelic variants. Check. Still, when I run KING with “kinship” option,…

How to colour points in cnetplot of clusterProfiler?

I have a cnetplot from running enrichment with KEGG using clusterProfiler. I have input the fold change as scores, but the genes in the plot do not vary in colour to show the difference in their fold change scores. My dataset is genes of entrez IDs and then…

Dual index barcode demultiplex issue

Hello, the following is the list of dual-indexed Illumina libraries we used for an Illumina run. But we were not able to demultiplex the data using the barcodes, because 99% of the data ends up as undetermined. Index1 ------ Index2 (case1) ATGAGC ---- TGAACCTT ATTCCT…

Why do I get a big log fold change but small mean change in b value when plotting differential methylation?

Hello I am doing differential methylation analysis using limma. I use the m values for testing and b values for plotting. I plotted a volcano plot visualizing the p…

Answer: alphafold online availability and use case

1. There is no need for heavy-duty methods such as AlphaFold2 (AF2) in all cases. It is very unlikely that you have 1500 sequences that only have domains of unknown function, and even if you do, there were successful structure prediction servers in existence before AF2. So even though what…

Post-doctoral Fellow in Diabetes, Harvard Medical School

The mission of the Biddinger Lab is to improve the lives of patients with diabetes. Using bioinformatic, biochemical, genetic, and physiological approaches in cells, mice and humans, our goal is to prevent diabetes associated atherosclerosis and liver disease. We are looking for…

a specific transcript sequence from transcriptome mapped RNA seq

Hi, I want to start working on a gene for which more than 10 splice variants are shown on Ensembl. Fortunately, I have access to the BAM file of transcriptome-mapped (using STAR) RNA-seq data from the cells which…

how to calculate positive and negative for a given protein sequence

Hello, I would like to know how one could calculate positive and negative for a given sequence. The PDB ID of the protein is 1ak4. It has two chains, A and B. I am looking to find a way…

STAR Genome Indexing

One of the arguments that STAR takes in genomeGenerate mode is sjdbOverhang, which the manual says “specifies the length of the genomic sequence around the annotated junction to be used in constructing the splice junctions database” and that it should be equal to read length - 1….
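
The "read length - 1" rule quoted above is simple arithmetic, but it is easy to get wrong when a dataset mixes read lengths. A small sketch follows; the function name `sjdb_overhang` is made up, and taking the longest read in a mixed-length set is an assumption based on the STAR manual's wording about the maximum read length.

```python
def sjdb_overhang(read_lengths):
    """Suggest an sjdbOverhang value as (longest read length - 1)."""
    longest = max(read_lengths)
    if longest < 2:
        raise ValueError("read length must be at least 2 bases")
    return longest - 1


overhang_100bp = sjdb_overhang([100])       # e.g. 2x100 bp paired-end reads
overhang_mixed = sjdb_overhang([76, 151])   # mixed lengths: use the longest
```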

Introgression analyses

Hey, I’m working with whole-genome data of several non-model species and I’m studying the introgression patterns with the Dsuite software (github.com/millanek/Dsuite). I have several samples per species from different localities and I was wondering if I should look for introgression patterns specifying each sample independently or clustering together…

Looking for a tool which provides mapping quality score distributions from BAM files

Hello BioStars, Is there a tool which generates mapping quality score distributions from bam files? I know I could potentially do this myself, but I am looking for something which would essentially do the work for…
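
For the do-it-yourself route mentioned above, a tally of MAPQ values is a short script. The sketch below reads SAM-formatted text lines (for example the output of `samtools view`, which prints alignments as SAM text) and counts the fifth column, which is MAPQ in the SAM format. `mapq_distribution` is a hypothetical helper and this is not a replacement for a dedicated QC tool.

```python
from collections import Counter


def mapq_distribution(sam_lines):
    """Count how many alignment records carry each MAPQ value."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:           # skip malformed or empty lines
            continue
        counts[int(fields[4])] += 1   # column 5 of a SAM record is MAPQ
    return counts


# Toy records with invented values.
toy_sam = [
    "@HD\tVN:1.6\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r3\t16\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
toy_distribution = mapq_distribution(toy_sam)
```

On real data the same function could be fed `sys.stdin` with the alignments piped in from `samtools view`.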

Human genomics and translational data (ELIXIR/EMBL-EBI, Cambridge, UK)

ELIXIR is seeking a Scientific Officer to drive stakeholder engagement, document the landscape and identify key requirements relevant for the continued successful implementation of ELIXIR’s federated infrastructure for access to human genomics and translational data. We want to ensure…

Extract reads used for contigs assembly

Hi all, basically I want to make a plot with all reads from each contig resulting from my assembly with the latest version of Velvet. To do that I am looking for a script or anything else to parse an AFG file or…

Differential expression analysis of TCGA data based on tumor staging

Hi everyone, I wanted to analyze TCGA-BRCA data for identifying DEGs in different TNM stages (I to IV) between Normal and Tumor. How to change the following code to get the DEGs based on the staging? CancerProject <- "TCGA-BRCA" DataDirectory <- paste0("../GDC/",gsub("-","_",CancerProject)) FileNameData <- paste0(DataDirectory, "_","HTSeq_Counts",".rda") query <- GDCquery(project =…

Comment: alphafold online availability and use case

Not my area of expertise particularly but; 1. I don’t think you can use a structure prediction tool to really ‘validate’ HMMER predictions. I’m pretty sure most structure predictors are relying on HMMER or similar HMM based approaches (Martin told me AlphaFold leans on HHBlits API calls for example). I…

Center or fix a sample at 0 on PC1 in PCA plot

We have recently noticed a group that is consistently able to publish PCA plots with their sample of interest seemingly aligned at 0 on PC1 or both PC1 and PC2. In their methods they only mention…

Plotting PCA chart using biopython

Hi, I’m trying to plot a PCA chart using Biopython. I’m new to Biopython and Python in general, so excuse me if my code doesn’t look good. I tried to do something like this: from Bio.Cluster import pca import numpy as np import pandas as pd import matplotlib.pyplot as plt…

How to convert .mol files to cdx?

I have many .mol files that I want to convert to .cdx files (cdx is a ChemDraw format). How can I do this programmatically? Thanks.

Error when trying to run IGV on server

I want to use IGV on a server so I don’t have to download bam files to my local machine. I used conda to install igvtools. When I type igvtools in the command line I get this error: Using system JDK….

What do .bim, .bed and .fam stand for?

I have a hard time differentiating these files; maybe understanding the acronyms could help. If I’m right, .ped is from pedigree and .bed is a strange way to say ‘binary-pedigree’. The others are obscure.

Gaussian Curve fitting

I have data where my x column corresponds to the residue position and the y axis corresponds to the entropy value. I want to do a curve-fitting on the data obtained. I was trying to fit a Gaussian curve but the curve is not fitting according to what it should…

alphafold online availability and use case

I’m new to both protein structure prediction and the use of AI-based tools like AlphaFold2 or RoseTTAFold, and I have a few questions: 1. Is it possible to use structure prediction by AlphaFold2 to validate HMMER-based domain sequence predictions? If yes, what would be the steps? I have some…

How to split big .faa file into smaller .faa files

I have a 10 GB .faa proteome file that I want to run MAFFT on. But it is too big, so I need to divide the file. How do I split it into smaller files in Windows without…
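
Since Python runs on Windows as well, one way to approach the splitting described above is to stream the file and start a new chunk every N records, so the whole 10 GB never has to sit in memory. This is a sketch under the assumption that the .faa file is ordinary FASTA with one ">" header line per protein; `split_fasta` and the chunk size are illustrative choices, not a recommendation for MAFFT.

```python
def split_fasta(lines, records_per_chunk):
    """Yield lists of FASTA lines, each holding at most `records_per_chunk` records."""
    chunk, n_records = [], 0
    for line in lines:
        if line.startswith(">"):
            # A new record begins; close the current chunk first if it is full.
            if n_records == records_per_chunk:
                yield chunk
                chunk, n_records = [], 0
            n_records += 1
        chunk.append(line)
    if chunk:
        yield chunk


# Toy input: three short records, split into chunks of two.
toy_faa = [">a\n", "MK\n", ">b\n", "LL\n", "AA\n", ">c\n", "GG\n"]
toy_chunks = list(split_fasta(toy_faa, 2))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly and each yielded chunk written to its own numbered output file.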

Chipseq visualization how to draw the figure

Does anybody know how to draw the figure using ChIP-seq data? Any guidance would be appreciated! (The figure is from “The Histone Lysine Demethylase JMJD3/KDM6B Is Recruited to p53 Bound Promoters and Enhancer Elements in a p53 Dependent Manner”.)

How to retrieve KO IDs for a list of genes?

Hi, community! I have downloaded a list of genes from the MetaCyc database for some bacterial species. I want to find out their respective KO IDs. Can anyone please tell me how I can do that? Thanks

how to use ESTIMATE to infer tumor purity and stromal score from RNA-seq data?

Dear all: Has anyone used ESTIMATE (bioinformatics.mdanderson.org/main/ESTIMATE:Overview) to infer tumor purity and stromal score from RNA-seq before? I am not clear on how to use this tool and what the input file format is for this…

So many variants detected.

Dear All, I have done variant calling on germline data that has a single sample for each individual and two genes. I did the following steps, but after checking the results I found too many variants. After HaplotypeCaller (step 6) I found 140900 known variants, and the…

KEGG pathway draw

Hello, I would like to know the tool or software used for generating a similar illustration. Thank you for your help.

Use BLAST Command Line Applications to run a folder of many sequences against a database

Hi everyone, I am quite new to using the BLAST Command Line Applications and would love any and all help. I am trying to use the application to run many sequences stored in a…

How to generate feature Data(fData)

I’m trying to analyze scATAC-seq data using Cicero, but I lack the metadata for peaks (fData). I don’t know how to generate this metadata. Can someone tell me which tool can produce it? Note: This data was not generated using the 10X platform but Fluidigm C1….

extracting a gene from a gmt file

heatmap of genes pseudotime in monocle3

How to trim a GFF3 file based on specific coordinates?

Hi, I would like to create a GFF3 file containing information only for specific coordinates from the chromosome level GFF3 file. I know how to extract gene and CDS info separately but don’t know how to do trimming based…
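
Trimming by coordinates, as asked above, comes down to comparing columns 4 and 5 (start and end, 1-based and inclusive in GFF3) against a window on one sequence. Here is a minimal sketch, assuming that "trim" means keeping only features that lie entirely inside the window; features that merely overlap the boundary are dropped, which may or may not be what is wanted. `trim_gff3` is a hypothetical helper name.

```python
def trim_gff3(lines, seqid, start, end):
    """Yield comment/pragma lines plus features fully inside seqid:start-end."""
    for line in lines:
        if line.startswith("#"):
            yield line                    # keep directives such as ##gff-version
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:              # not a GFF3 feature line
            continue
        if fields[0] != seqid:
            continue
        feature_start, feature_end = int(fields[3]), int(fields[4])
        if feature_start >= start and feature_end <= end:
            yield line


# Toy annotation with invented features.
toy_gff = [
    "##gff-version 3\n",
    "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID=g1\n",
    "chr1\tsrc\tgene\t400\t900\t.\t+\t.\tID=g2\n",
    "chr2\tsrc\tgene\t100\t500\t.\t+\t.\tID=g3\n",
]
toy_trimmed = list(trim_gff3(toy_gff, "chr1", 50, 600))
```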

How to get accession number on NCBI and how to get their gene sequences?

Hi, everyone! I’m new to this world and I’m getting acquainted with Biopython. On NCBI I searched >> specie 1(organism) OR specie2(organism) OR specie3(organism) OR.. AND (gene1 OR gene2), from which I downloaded the…

CROP-seq data analysis

Hi, I am a newbie to single-cell sequencing analysis. I have to analyze CROP-seq data, and I am going through the following paper: www.nature.com/articles/nmeth.4177. I have to use Cell Ranger (instead of the Drop-seq software) as the first step to process the single-cell data. I wanted…

proportion of variance

Khatami – I have addressed this at length, elsewhere. Please see: GWAS: low explained heritability There are a variety of phrases that are used; variance explained is one of them – don’t be thrown off … the concept is very simple. For example, let’s say I am trying to predict…

Kenya is emerging as a major hub of CRISPR biomedical research in Africa

Kenyan scientist Dr Hussein Abkallo wound up in the sophisticated world of CRISPR-Cas9 almost by chance, unaware that the decision would catapult …

EMBL Australia Group Leader in neural regeneration and/or organ engineering and synthetic biology

Applications are invited from exceptionally motivated scientists with an ambitious and original vision for research that aligns with the Australian Regenerative Medicine Institute’s (ARMI) core research themes. ARMI is excited to be offering the opportunity for a…

ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data)

Hi there, can anyone explain to me how to use the ESTIMATE package in RNA-seq analysis? I want to calculate immune scores and stromal scores by employing the ESTIMATE algorithm, then analyze the relationship of…

Charpentier, winner of the Nobel Prize in Chemistry, was appointed to the Pontifical Academy of Sciences

Pope Francis has appointed Emmanuelle Charpentier, director of the Max Planck Unit for the Science of Pathogens in Berlin and Nobel laureate in Chemistry, to the Pontifical Academy of Sciences. The Holy See announced on Tuesday that the head of the Catholic Church has appointed the French microbiologist as an ordinary member….

Jobs with European Molecular Biology Laboratory (EMBL)

The European Molecular Biology Laboratory (EMBL) is one of the world’s leading research institutions, and Europe’s flagship laboratory for the life sciences. Research at EMBL emphasizes experimental analysis at multiple levels of biological organization, from the molecule to the organism; as well as computational…

Advice on organizing large GSVA heatmap

Hi there, I wanted to get some advice on how you might make your heatmap easier to read. In my case, I generated a heatmap from GSVA data which I filtered to only include significant pathways, here. I wanted to see how each…

Progress toward Porcine Reproductive and Respiratory Syndrome Virus resistant pigs

Using the CRISPR/Cas9 system, “gene-edited” pigs were produced that lack CD163. The pigs were then placed in pens with control pigs (animals that …

New CRISPR/Cas9 Technique Corrects Cystic Fibrosis?

Researchers from the group of Hans Clevers (Hubrecht Institute) corrected mutations that cause cystic fibrosis in cultured human stem cells. In collaboration with UMC Utrecht and Oncode Institute, they used a technique called prime editing to replace the ‘faulty’ piece of DNA with a healthy piece. New…

Alignment-free RNA-seq Differential Gene Expression Analysis with Kallisto & Sleuth

Pope names Nobel laureate Jennifer Doudna to Pontifical

Pope Francis appointed Dr Jennifer Doudna to the Pontifical Academy of Sciences on Wednesday. Aug 11, 2021 VATICAN: The Holy Father has appointed the distinguished Professor Jennifer Anne Doudna as ordinary member of the Pontifical Academy of Sciences. Professor Jennifer Doudna was born on 19 February 1964…

Healthy ‘beige’ fat protects brain from dementia in mouse models

Subcutaneous fat typically contains a mix of unhealthy white fat cells and “beige” adipocytes, which are similar to healthy brown fat in that they’re able to burn energy. Scientists from the Medical College of Georgia at Augusta University believe they’ve found another attribute for beige fat: It may protect against dementia. The researchers…

Biohackers Season 3 Release Date to be Announced by September 2021

Biohackers is a German techno-thriller Netflix original series created by Christian Ditter. The main cast includes Luna Wedler, Jessica Schwarz and Benno Fürmann. The first two seasons have six episodes of 35-45 minutes each and have made an impression with their authenticity. It has an IMDb rating of 6.8/10. The…

GSEA and over-representation analysis of many genes

Hello everyone! I’ve been doing some differential expression analysis on specific samples. It happens that I found a lot of genes that are DE. In total, of 24000 features, 11000 were up- or down-regulated in control vs group. Even though the…

is local ancestry inference typically always run w/ array genotypes instead of imputed genotypes?

This is a very difficult question to answer precisely. The theoretical argument is clear (based on information content literature), but in practice there are a lot of ways to muddy the waters. Let me give a theoretical argument first, then make several practical arguments afterwards. I hope that will do an…

3 Questions on… Picking the Right Tumor Cell Models for Research

Technologies that have enabled this include tumoroids, and genome engineering techniques, such as CRISPR/Cas9. Patient-derived xenografts …

Error when trying to import GTF files using rtracklayer’s import function

I’m trying to use rtracklayer’s import function to import a GTF file. I downloaded the current comprehensive genome annotation for human from GENCODE, gunzipped the .gz file and tried the following: library(rtracklayer) granges <- import("gencode.v36.annotation.gtf") I am getting the…

Output Polar Contacts Between Chains to Text File

I am new to PyMOL but have a very specific task that I need to do. I have a PDB structure file of a protein homo-oligomer, and I want to use PyMOL to determine polar contacts between chain A…

Creating a variation graph for Giraffe alignment from assemblies

I have a collection of ~100 4.5 megabase haploid assemblies that I would like to map to using giraffe. However, I am not completely clear on what the best practices are to construct the graph starting from the assemblies. I…

integrative multiomic data

I have to integrate cancer patient cohort data (protein, mRNA, methylation, mutation profile) to identify genes associated with survival. What may be the best strategy? I have looked at iNMF, iClusterPlus, etc., which are based on different methods, but could not find out what…

Inquiry related to vcf file and formatting

Hello everyone, I am trying to run the PrediXcan software, but it is showing a segmentation fault error, implying that there is something wrong with my VCF files. I am sharing the header of the VCF file.

##fileformat=VCFv4.1
##INFO=<ID=LDAF,Number=1,Type=Float,Description="MLE Allele Frequency Accounting for LD">
##INFO=<ID=AVGPOST,Number=1,Type=Float,Description="Average posterior probability from MaCH/Thunder">
##INFO=<ID=RSQ,Number=1,Type=Float,Description="Genotype imputation quality from…

Difference in alignment length between FASTA and HitTable

Hello all, I’ve a horrible feeling this is going to be a stupidly obvious answer but I’ve had no luck finding a similar question amongst the forum or in the BLAST manual. I’ve used BLAST on some sequences. I’ve then downloaded…

Data Stewardship Community Manager

We are looking for a new Community Manager to join ELIXIR-UK and lead our new data steward training Fellowship programme and manage the community that develops. Do you have great communication skills, want to help others and enjoy working in an interdisciplinary team? Come and…

Is subtelomeric region and pericentromeric region defined in human genome?

Is subtelomeric region and pericentromeric region defined in human genome? 2 I’ve been trying to see if there are any coordinates for these but haven’t had much luck. I saw a number of people defining them as ±2 Mb around the centromere gap and 30 kb from the telomere. I was wondering if…

Continue Reading Is subtelomeric region and pericentromeric region defined in human genome?

Can someone explain the differences between various 1000 genome project and gnomad call sets? Also any straightforward PCA implementation on them?

I’ve been trying to delve into whole genome sequencing data, specifically by looking at the existing data in the 1000 Genomes Project and gnomAD, and I have a lot of questions. Does gnomAD contain the 1000 Genomes Project samples? I’ve found many VCF files, including these: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/ gnomad.broadinstitute.org/downloads…

Continue Reading Can someone explain the differences between various 1000 genome project and gnomad call sets? Also any straightforward PCA implementation on them?

Ten simple rules for biologists initiating a collaboration with computer scientists

I think it depends. It is probably quite a good guide for bioinformaticians who want to collaborate with “proper” computer scientists when they need new computer science to solve their problems. I think it’s less good for biologists who need to collaborate with someone to give more computational…

Continue Reading Ten simple rules for biologists initiating a collaboration with computer scientists

How to create a BED12 file defining UTR sequences

Hello, I am doing an experiment and I need to build a BED12 file for some UTR sequences that I have. I have done a blast for those sequences and with that I was able to build a successful BED6 file, like this: 19 20752377 20758767 ENSDARG00000062634_Kat2b_Tscan 0 + 15…

Continue Reading How to create a BED12 file defining UTR sequences

What does samtools mean by ‘orientation’ when marking duplicates?

What does samtools mean by ‘orientation’ when marking duplicates? 0 Hello Biostars, It is my understanding that samtools marks duplicates on the basis of the 5′ position of reads and also the orientation of reads. This is based on my reading of the following: www.htslib.org/algorithms/duplicate.html However, I am not sure…

Continue Reading What does samtools mean by ‘orientation’ when marking duplicates?

Tools for Alternative Splicing Events in RNA-Seq analysis

Tools for Alternative Splicing Events in RNA-Seq analysis 0 Hello community, I’ve been doing a bit of a literature search on alternative splicing analysis using RNA-Seq data, there are quite a few tools that I have come across which I will list below. As well as some of these tools…

Continue Reading Tools for Alternative Splicing Events in RNA-Seq analysis

heatmap column(sample) names disordered

heatmap column(sample) names disordered 0 Hi dear Biostars community, I have a problem with my heatmap output. I want to draw a heatmap of the 45 top-variance genes for 52 samples, but my column (sample) names are shown out of order (e.g. f1d#num is female-1dpi-disease, and f1c#num is female-1dpi-control). As you can see,…

Continue Reading heatmap column(sample) names disordered

Filtering MSA by similarity to a consensus sequence/motif

Filtering MSA by similarity to a consensus sequence/motif 0 Dear all, does anyone know a good way of filtering or sorting a large multiple sequence alignment (~8000 sequences) by similarity to a given consensus sequence? A solution using Python/Biopython would be optimal. Any help is appreciated! Best, Jonathan

Continue Reading Filtering MSA by similarity to a consensus sequence/motif

blastpgp -b parameter?

blastpgp -b parameter? 0 Hey, do you know what the -b parameter in blastpgp is for? blastpgp -d db -a 6 -e 0.005 -h 0.005 -j 5 -v 50 -b 750 Cheers

Continue Reading blastpgp -b parameter?

Upcoming online training courses

News: Upcoming online training courses 0 Physalia-courses will be hosting numerous online training courses and workshops throughout the rest of 2021! Please visit our website for the specific topics: www.physalia-courses.org/courses-workshops/ Should you have any questions, please do not hesitate to get in touch with us. Best regards, Carlo

Continue Reading Upcoming online training courses

Sequencing file conversion

Sequencing file conversion 0 Hi, friends, I downloaded a set of scATAC-seq BAM files from an article database, and the author said that a BAM file is information about a cell. However, after a few days’ analysis of the script given by the author, I found that a CSV file…

Continue Reading Sequencing file conversion

GSVA R packages

GSVA R packages 1 Hello everyone, I’m trying to do a gene set variation analysis using R to detect a specific gene set signature of a specific pathway from 20 RNA-seq samples. I have these files in BAM format but I don’t know what to do in order to…

Continue Reading GSVA R packages

PLINK ASSOC understanding the results

PLINK ASSOC understanding the results 1 Hello to all, I have 10 VCF files: 5 female fish and 5 male fish. I have merged all 10 fish into one VCF file (all_fish.vcf). I performed the PLINK association analysis on all 10 fish with the command: --noweb --const-fid --allow-no-sex --allow-extra-chr --pheno…

Continue Reading PLINK ASSOC understanding the results

BMC issues passes to fully vaccinated people ahead of Mumbai local trains' resumption

Ahead of resumption of Mumbai local trains, Brihanmumbai Municipal Corporation (BMC) began issuance of local train passes to fully vaccinated …

Continue Reading BMC issues passes to fully vaccinated people ahead of Mumbai local trains' resumption

how to demultiplex paired end reads when R1 and R2 are identified by two different substrings?

I am struggling to find a solution to a problem which seems easy but is not. I found many questions that seem to be related (and I believe they are), but they are confusing and you never know which one fits your case. So here we go. I’ll try…

Continue Reading how to demultiplex paired end reads when R1 and R2 are identified by two different substrings?

Where do I get a WES dataset of size <1GB

Where do I get a WES dataset of size <1GB 1 Can someone please tell me where I can get a WES or WGS dataset of size <1GB? Just browse sra-explorer.info for datasets. I doubt you can meaningfully query for file size as…

Continue Reading Where do I get a WES dataset of size <1GB

Replace multiple text with corresponding text

Replace multiple text with corresponding text 1 Hi, I ran an analysis and the software replaced the bacteria names with codes, and I have a txt file as below: Order Original Name Code 1 Allostreptomyces_psammosilenae_DSM_42178 S1_f1 2 Embleya_hyalina_NBRC_13850 S2_f2 3 Embleya_scabrispora_DSM_41855 S3_f3 Because the analysis involved a few hundred bacteria, it would…

Continue Reading Replace multiple text with corresponding text
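
A minimal Python sketch of the kind of lookup-and-replace the question above describes. It assumes the goal is to turn codes such as S1_f1 back into the original names, that the lookup table is whitespace-separated with three columns (order, name, code), and that names contain no spaces. The function names `build_code_map` and `replace_codes` are hypothetical, not taken from the original thread.

```python
import re


def build_code_map(table_lines):
    """Parse 'Order  Name  Code' rows into {code: name}; non-data rows are skipped."""
    mapping = {}
    for line in table_lines:
        parts = line.split()
        # A data row has exactly three fields and starts with a number.
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        _, name, code = parts
        mapping[code] = name
    return mapping


def replace_codes(text, mapping):
    """Replace every whole-token code in `text` with its original name."""
    if not mapping:
        return text
    # Longest codes first, and no word characters on either side,
    # so that S1_f1 is never matched inside S11_f11.
    codes = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(c) for c in codes) + r")(?!\w)"
    )
    return pattern.sub(lambda m: mapping[m.group(1)], text)


if __name__ == "__main__":
    table = [
        "Order Original Name Code",
        "1 Allostreptomyces_psammosilenae_DSM_42178 S1_f1",
        "2 Embleya_hyalina_NBRC_13850 S2_f2",
    ]
    print(replace_codes("(S1_f1:0.1,S2_f2:0.2);", build_code_map(table)))
```

Compiling all codes into one pattern means the text is scanned once, which matters little for a few hundred names but avoids accidental double replacement when one name happens to contain another code.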

Antibody Matching Transcripts

Antibody Matching Transcripts 1 Hello everyone, I am interested in finding the isoforms of a gene that my HPA antibody is able to hit. When I check the Human Protein Atlas website it gives me a list of 4 transcripts my antibody can recognize. However, when I open Ensembl…

Continue Reading Antibody Matching Transcripts

How to set variant FILTER in a VCF file based on overlap with regions in a BED file

I figured out how to do the annotation using BCFtools. Two steps are needed. The input BED file requires a 1 for each region where the annotation should be set: Chr_01 1000 2000 1 Chr_05 5000 6000 1 Input header file: ##INFO=<ID=BAD_REGION,Number=0,Type=Flag,Description="My bad region for some reason"> bgzip and tabix the bed…

Continue Reading How to set variant FILTER in a VCF file based on overlap with regions in a BED file
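
The overlap logic behind the annotation described above can be illustrated in plain Python. This is only a sketch and not how BCFtools does it internally. It assumes tab-separated VCF records, BED intervals that are 0-based and half-open, and that the goal is to add a BAD_REGION flag to the INFO column as in the excerpt. The function names `load_bed` and `flag_records` are hypothetical.

```python
from collections import defaultdict


def load_bed(bed_lines):
    """Group BED intervals (0-based start, exclusive end) by chromosome."""
    regions = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return regions


def flag_records(vcf_lines, regions, flag="BAD_REGION"):
    """Yield VCF lines, adding `flag` to INFO when POS lies inside a region."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Header lines pass through unchanged.
            yield line
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        # VCF POS is 1-based, so a BED interval [start, end) covers
        # the positions start+1 .. end.
        hit = any(start < pos <= end for start, end in regions.get(chrom, []))
        if hit:
            info = fields[7]
            fields[7] = flag if info in (".", "") else info + ";" + flag
        yield "\t".join(fields)


if __name__ == "__main__":
    bed = ["Chr_01\t1000\t2000\t1", "Chr_05\t5000\t6000\t1"]
    vcf = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "Chr_01\t1500\t.\tA\tG\t50\tPASS\t.",
    ]
    for record in flag_records(vcf, load_bed(bed)):
        print(record)
```

The linear scan over regions is fine for a handful of intervals; for large BED files a sorted list with bisection or an interval tree would be the usual choice, and for real data the BCFtools route from the post remains the practical one.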

print only columns with data from every line

print only columns with data from every line 0 Hi, I have a VCF file with about 60 000 columns. Here is an example of the first three lines: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10022-20416-17 10024-34469-18A 10025-34469-18B 10034-31625-18A 10035-31625-18B 10036-31625-18C 10042-29083-18 10044-34485-18A 10045-34485-18B 10046-34485-18C 10069-33802-18 10070-20895-17…

Continue Reading print only columns with data from every line

Sortmerna error

Sortmerna error 0 Hello, I am facing this error in one of my files during sortmerna [split:646] ERROR: Failed deflating readstring: @A00489:986:HGYHKDRXY:1:2123:9824:27242 1:N:0:CAGTGCTT+ACCTGGAA CGGCTGCCTCTCAGGGGCGGTGGGGGGCGCGGCCGGCAGCGGCCCGCGGGGCGCGGGGGGCACCGAGTCGCTGCTGAAGTCCAGCAGCGGTGCGGCGGCGGGGGGCACCGGAGCCGCGGACAGCCCGGCTGCGGGCTTCCTCTCCAGCAC + FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFF:F:FFFFFFFFF I have used the tool before many times and in this run all the other samples are also working without errors. The md5sum doesn’t…

Continue Reading Sortmerna error

Postdoc position in phylogenomics and evolution of beetles

The research group led by Dr Dagmara Żyła at the Museum and Institute of Zoology, Polish Academy of Sciences (MIZ, PAS) is looking for candidates for a postdoc position to work within the project entitled: “The Impact of the Paleocene-Eocene Thermal Maximum on diversification dynamics in Paederinae rove beetles” funded…

Continue Reading Postdoc position in phylogenomics and evolution of beetles

How to select dataset from 1000 genome

How to select dataset from 1000 genome 0 I am new to genomics and want to start practicing. For that, I would like to select a dataset (.vcf file) from the 1000 Genomes Project. On what basis should I choose?

Continue Reading How to select dataset from 1000 genome