Error when trying to import GTF files using rtracklayer’s import function

I’m trying to use rtracklayer’s import function to import a GTF file. I downloaded the current comprehensive genome annotation for human from GENCODE, gunzipped the .gz file and tried the following: library(rtracklayer) granges <- import("gencode.v36.annotation.gtf") I am getting the…

Output Polar Contacts Between Chains to Text File

I am new to PyMOL but have a very specific task that I need to do. I have a PDB structure file of a protein homo-oligomer, and I want to use PyMOL to determine polar contacts between chain A…

Creating a variation graph for Giraffe alignment from assemblies

I have a collection of ~100 4.5 megabase haploid assemblies that I would like to map to using Giraffe. However, I am not completely clear on what the best practices are to construct the graph starting from the assemblies. I…

integrative multiomic data

I have to integrate cancer patient cohort data (protein, mRNA, methylation, mutation profile) to identify genes associated with survival. What may be the best strategy? I have looked at iNMF, iClusterPlus, etc., which are based on different methods, but could not find out what…

Inquiry related to vcf file and formatting

Hello everyone, I am trying to run the PrediXcan software, but it is showing a segmentation fault error, implying that there is something wrong with my VCF files. I am sharing the header of the VCF file. ##fileformat=VCFv4.1 ##INFO=<ID=LDAF,Number=1,Type=Float,Description="MLE Allele Frequency Accounting for LD"> ##INFO=<ID=AVGPOST,Number=1,Type=Float,Description="Average posterior probability from MaCH/Thunder"> ##INFO=<ID=RSQ,Number=1,Type=Float,Description="Genotype imputation quality from…

Difference in alignment length between FASTA and HitTable

Hello all, I’ve a horrible feeling this is going to be a stupidly obvious answer but I’ve had no luck finding a similar question amongst the forum or in the BLAST manual. I’ve used BLAST on some sequences. I’ve then downloaded…

Data Stewardship Community Manager

We are looking for a new Community Manager to join ELIXIR-UK and lead our new data steward training Fellowship programme and manage the community that develops. Do you have great communication skills, want to help others and enjoy working in an interdisciplinary team? Come and…

Is subtelomeric region and pericentromeric region defined in human genome?

I’ve been trying to see if there are any coordinates for these but haven’t had much luck. I saw a bunch of people defining them as ±2 Mb around the centromere gap and 30 kb away from the telomere. I was wondering if…

Can someone explain the differences between various 1000 genome project and gnomad call sets? Also any straightforward PCA implementation on them?

I’ve been trying to delve into the data from whole genome sequencing, specifically by looking at the already existing data in the 1000 Genomes Project and gnomAD, and I have a lot of questions. Does gnomAD contain the 1000GP samples? I’ve found many VCF files, including these: ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/ ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/ gnomad.broadinstitute.org/downloads…

Ten simple rules for biologists initiating a collaboration with computer scientists

I think it depends. I think it is probably quite a good guide for bioinformaticians who want to collaborate with “proper” computer scientists when they need new computer science to solve their problems. I think it’s less good for biologists who need to collaborate with someone to give more computational…

How to create a BED12 file defining UTR sequences

Hello, I am doing an experiment and I need to build a BED12 file for some UTR sequences that I have. I ran BLAST on those sequences, and with that I was able to build a working BED6 file, like this: 19 20752377 20758767 ENSDARG00000062634_Kat2b_Tscan 0 + 15…

What does samtools mean by ‘orientation’ when marking duplicates?

Hello Biostars, It is my understanding that samtools marks duplicates on the basis of the 5′ position of reads and also the orientation of reads. This is based on my reading of the following: www.htslib.org/algorithms/duplicate.html However, I am not sure…

Tools for Alternative Splicing Events in RNA-Seq analysis

Hello community, I’ve been doing a bit of a literature search on alternative splicing analysis using RNA-Seq data, and there are quite a few tools that I have come across, which I will list below. As well as some of these tools…

heatmap column(sample) names disordered

Hi dear Biostar community, I have a problem with my heatmap output. I want to draw a heatmap of 45 topvar genes for 52 samples, but my column (sample) names are shown disordered (e.g. f1d#num is female-1dpi-disease, and f1c#num is female-1dpi-control). As you can see,…

Filtering MSA by similarity to a consensus sequence/motif

Dear all, does anyone know a good way of filtering or sorting a large multiple sequence alignment (~8000 sequences) by similarity to a given consensus sequence? A solution using Python/Biopython would be optimal. Any help is appreciated! Best, Jonathan
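One possible way to approach the question above, as a minimal sketch in plain Python: score each aligned sequence by its fraction of identical positions against the consensus, then sort and filter on that score. The function names, the toy alignment and the 0.5 cut-off are illustrative assumptions, not something taken from the original thread; in practice the (name, sequence) pairs could be read with Biopython's `AlignIO.read`, which is not used here to keep the sketch self-contained.

```python
def identity_to_consensus(aligned_seq, consensus):
    """Fraction of aligned columns where the sequence matches the consensus.

    Columns where the consensus itself is a gap ('-') are ignored.
    """
    if len(aligned_seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same aligned length")
    compared = 0
    matches = 0
    for residue, ref in zip(aligned_seq.upper(), consensus.upper()):
        if ref == "-":
            continue  # no consensus residue to compare against
        compared += 1
        if residue == ref:
            matches += 1
    return matches / compared if compared else 0.0


def rank_by_consensus(records, consensus, min_identity=0.0):
    """Return (name, identity) pairs, most similar first, above a cut-off."""
    scored = [(name, identity_to_consensus(seq, consensus)) for name, seq in records]
    kept = [pair for pair in scored if pair[1] >= min_identity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy alignment (hypothetical data) to show the call pattern.
alignment = [
    ("seq1", "ACGT-ACGT"),
    ("seq2", "ACGTTACGA"),
    ("seq3", "TTTT-TTTT"),
]
ranked = rank_by_consensus(alignment, "ACGT-ACGT", min_identity=0.5)
for name, identity in ranked:
    print(name, identity)
```

Plain identity is the simplest possible score; a substitution-matrix score or a motif PSSM may be more appropriate for protein alignments.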

blastpgp -b parameter?

Hey, do you know what the -b parameter in blastpgp is for? blastpgp -d db -a 6 -e 0.005 -h 0.005 -j 5 -v 50 -b 750 Cheers

Upcoming online training courses

Physalia-courses will be hosting numerous online training courses and workshops throughout the rest of 2021! Please visit our website for the specific topics: www.physalia-courses.org/courses-workshops/ Should you have any questions, please do not hesitate to get in touch with us. Best regards, Carlo

Sequencing file conversion

Hi, friends, I downloaded a set of scATAC-seq BAM files from an article database, and the author said that a BAM file is information about a cell. However, after a few days’ analysis of the script given by the author, I found that a CSV file…

GSVA R packages

Hello everyone, I’m trying to do a gene set variation analysis using R to detect a specific gene set signature of a specific pathway from 20 RNA-seq samples. I have these files in BAM format but I don’t know what to do in order to…

PLINK ASSOC understanding the results

Hello to all, I have 10 VCF files: 5 female fish and 5 male fish. I have merged all 10 fish into one VCF file (all_fish.vcf). I performed the PLINK association analysis on all 10 fish with the command: -noweb --const-fid --allow-no-sex --allow-extra-chr --pheno…

BMC issues passes to fully vaccinated people ahead of Mumbai local trains' resumption

Ahead of resumption of Mumbai local trains, Brihanmumbai Municipal Corporation (BMC) began issuance of local train passes to fully vaccinated …

how to demultiplex paired end reads when R1 and R2 are identified by two different substrings?

I am struggling to find a solution to a problem which seems easy but is not. I found many, many questions that seem to be related (and I believe they are) but they are confusing and you never know which one fits your case. So there we go. I’ll try…

Where do I get a WES dataset of size <1GB

Can someone please tell me where I can get a WES or WGS dataset of size <1GB? Just browse sra-explorer.info for datasets. I doubt you can meaningfully query for file size as…

Replace multiple text with corresponding text

Hi, I ran an analysis and the software replaced the bacteria names with codes. I have a txt file as below:

Order  Original Name  Code
1  Allostreptomyces_psammosilenae_DSM_42178  S1_f1
2  Embleya_hyalina_NBRC_13850  S2_f2
3  Embleya_scabrispora_DSM_41855  S3_f3

Because the analysis involved a few hundred bacteria, it would…
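A minimal Python sketch of one way to put the original names back, assuming the mapping file has the three whitespace-separated columns shown above and that names contain no spaces. The function names and the Newick-style example string are hypothetical. All codes are replaced in a single regular-expression pass, and the look-around guards are there so that a short code such as S1_f1 is not replaced inside a longer one such as S1_f10.

```python
import re


def parse_mapping(table_text):
    """Parse 'Order  Original_Name  Code' rows into a {code: name} dict."""
    mapping = {}
    for line in table_text.strip().splitlines():
        fields = line.split()
        # Data rows have exactly three fields and start with an integer;
        # this skips the header line and anything malformed.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        _, name, code = fields
        mapping[code] = name
    return mapping


def restore_names(text, mapping):
    """Replace every whole-token code in `text` with its original name."""
    if not mapping:
        return text
    # Longest codes first, so the alternation prefers the most specific match.
    codes = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])(" + "|".join(map(re.escape, codes)) + r")(?![A-Za-z0-9_])"
    )
    return pattern.sub(lambda match: mapping[match.group(1)], text)


table = """
Order Original Name Code
1 Allostreptomyces_psammosilenae_DSM_42178 S1_f1
2 Embleya_hyalina_NBRC_13850 S2_f2
3 Embleya_scabrispora_DSM_41855 S3_f3
"""
mapping = parse_mapping(table)
print(restore_names("(S1_f1:0.1,S2_f2:0.2,S3_f3:0.3);", mapping))
```

A single compiled pattern scales to a few hundred codes without running one substitution per code over the whole file.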

Antibody Matching Transcripts

Hello everyone, I am interested in finding the isoforms of a gene that my HPA antibody is able to hit. When I check the Human Protein Atlas website it gives me a list of 4 transcripts my antibody can recognize. However, when I open Ensembl…

How to set variant FILTER in a VCF file based on overlap with regions in a BED file

I figured out how to do the annotation using BCFtools. Two steps are needed. The input BED file requires a 1 for each region where the annotation should be set: Chr_01 1000 2000 1 Chr_05 5000 6000 1 Input header file: ##INFO=<ID=BAD_REGION,Number=0,Type=Flag,Description="My bad region for some reason"> bgzip and tabix the bed…
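To make the overlap rule behind such a flag concrete, here is a small Python sketch. It is not the BCFtools procedure described above, only an illustration of the coordinate logic: BED intervals are 0-based and half-open, while VCF POS is 1-based, so a variant at POS falls inside a region when start < POS <= end. The function names are hypothetical and the regions are the two example rows from the post.

```python
def parse_bed(bed_text):
    """Read BED rows into {chrom: [(start, end), ...]}, ignoring extra columns."""
    regions = {}
    for line in bed_text.strip().splitlines():
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions


def in_bad_region(chrom, pos, regions):
    """True if the 1-based VCF position lies in any BED interval on `chrom`."""
    # BED start is 0-based and the end is exclusive, so in 1-based terms
    # the interval covers positions start+1 .. end inclusive.
    return any(start < pos <= end for start, end in regions.get(chrom, []))


bed = """
Chr_01 1000 2000 1
Chr_05 5000 6000 1
"""
regions = parse_bed(bed)
for chrom, pos in [("Chr_01", 1000), ("Chr_01", 1001), ("Chr_05", 6000), ("Chr_02", 1500)]:
    print(chrom, pos, in_bad_region(chrom, pos, regions))
```

A linear scan per variant is fine for a handful of regions; for large BED files an interval tree or sorted bisect lookup would be the usual choice.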

print only columns with data from every line

Hi, I have a VCF file with about 60,000 columns. Here is an example of the first three lines: #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10022-20416-17 10024-34469-18A 10025-34469-18B 10034-31625-18A 10035-31625-18B 10036-31625-18C 10042-29083-18 10044-34485-18A 10045-34485-18B 10046-34485-18C 10069-33802-18 10070-20895-17…

Sortmerna error

Sortmerna error 0 Hello, I am facing this error in one of my files during sortmerna [split:646] ERROR: Failed deflating readstring: @A00489:986:HGYHKDRXY:1:2123:9824:27242 1:N:0:CAGTGCTT+ACCTGGAA CGGCTGCCTCTCAGGGGCGGTGGGGGGCGCGGCCGGCAGCGGCCCGCGGGGCGCGGGGGGCACCGAGTCGCTGCTGAAGTCCAGCAGCGGTGCGGCGGCGGGGGGCACCGGAGCCGCGGACAGCCCGGCTGCGGGCTTCCTCTCCAGCAC + FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFF:F:FFFFFFFFF I have used the tool before many times and in this run all the other samples are also working without errors. The md5sum doesn’t…

Postdoc position in phylogenomics and evolution of beetles

The research group led by Dr Dagmara Żyła at the Museum and Institute of Zoology, Polish Academy of Sciences (MIZ, PAS) is looking for candidates for a postdoc position to work within the project entitled: “The Impact of the Paleocene-Eocene Thermal Maximum on diversification dynamics in Paederinae rove beetles” funded…

How to select dataset from 1000 genome

I am new to genomics and want to start practicing. For that, I would like to select a dataset (.vcf file) from the 1000 Genomes Project. On what basis should I choose?

How To Perform Peptide-Protein Docking

Hi, could anybody give me an example of how to perform peptide-protein docking: the software and steps? Here is my problem: (1) a peptide of about 10 amino acids (2) a target protein (an enzyme with a known PDB structure file and known active sites) (3) how to find the most…

Mapping transcripts to mitogenome

I have performed de novo assembly of a plant mitogenome (non-model plant). In addition, I also performed de novo transcriptome assembly of this particular plant species. How can I verify that the mitogenome assembly is correct using the RNA-Seq data? Should I align my…

%>% error in RStudio

dc.markers %>% group_by(cluster) %>% top_n(2, wt = avg_logFC) The above code gives an error even after loading the dplyr and Matrix libraries during a Seurat analysis in RStudio. Error: Problem with filter() input ..1. i Input ..1 is top_n_rank(2, avg_logFC). x object 'avg_logFC' not found…

scRNAseq STAR create index how to set --sjdbOverhang

Hello everyone, I want to create an index before aligning reads using STAR. The sjdbOverhang is described as ReadLength-1, so does that mean I should set it to 149 if the data is from Illumina 2×150 bp? Thanks
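The arithmetic in the question can be written down as a tiny Python helper. It only encodes the ReadLength-1 rule quoted above, taking the length of a single mate as the read length for paired-end data; the function name is made up for illustration, and whether that value suits a particular single-cell protocol is a separate question this sketch does not answer.

```python
def sjdb_overhang(read_length):
    """Return the --sjdbOverhang value implied by the ReadLength-1 rule."""
    if read_length < 2:
        raise ValueError("read length must be at least 2")
    return read_length - 1


# For 2x150 bp paired-end data each mate is 150 bases long.
mate_length = 150
print(f"--sjdbOverhang {sjdb_overhang(mate_length)}")
```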

Seurat to Trajectory Analysis

Analysing RNA seq data

Hello all! I am new to analysing RNA-seq data. I was given a directory containing 1 normal sample and 15 tumor samples, each with its cel file, exp file, and dcl file. I was asked to do a reanalysis, but I think I need more…

Signac CallPeaks from multiple fragment files

I am attempting to run MACS2 CallPeaks on some multiome data and am running into a problem when running the CallPeaks command on multiple fragment file paths in a Seurat object. peaks <- CallPeaks(DataCombined, macs2.path = "/anaconda3/bin/macs2") FileNotFoundError: [Errno 2] No such file or directory: '/Users/Desktop/multiome/sc291/atac_fragments.tsv.gz…

How can I get the exact 3D structure of the protein to use the PDB file for PPI docking

I am working on an uncharacterized protein, and I need to know its PPI with RNA polymerase in humans. Please, how can I get the exact 3D structure of the…

How to write a function (Python 3.8) to find FASTA entry for DNA sequence

Prompt: FASTA files can contain an unlimited number of sequences, but are too large for most text editors to open and manipulate. One that can load a large FASTA file and extract a sequence of…
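Since the prompt is cut off, the exact requirement is unknown; the following is only a sketch of one common reading of it, namely looking up a single record by its ID without loading the whole file. It streams the file line by line, so memory use stays small even for very large FASTA files. The function name and the in-memory example data are assumptions made for illustration, and the code avoids anything newer than Python 3.8.

```python
import io


def find_fasta_entry(handle, wanted_id):
    """Stream a FASTA file and return (header, sequence) for `wanted_id`.

    The ID is taken to be the first whitespace-separated word after '>'.
    Returns None when no record matches.
    """
    header = None  # set only once the wanted record has been found
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                # The next record begins, so the wanted one is complete.
                return header, "".join(chunks)
            fields = line[1:].split()
            if fields and fields[0] == wanted_id:
                header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is not None:
        return header, "".join(chunks)
    return None


# Hypothetical two-record file held in memory; a real call would pass open(path).
example = io.StringIO(">seq1 first record\nACGT\nAC\n>seq2\nGGGG\n")
print(find_fasta_entry(example, "seq1"))
```

For repeated lookups in the same file, building an index once (for example with `samtools faidx`) is usually faster than rescanning.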

CWL capture multiple output files with prefix

I have a CommandLineTool demo, and run it as: demo --AA file1.txt --BB file2.txt --CC file3.txt --nthreads 4 input file: --AA output file: --BB --CC The following CWL doesn’t work. error: ("Error collecting output for parameter 'ofile1':\ndemo.cwl:32:7: Did not find output file with…

What is the best way to publish data from LIMS/ELN to DataLake

What is the best strategy or technique (DB sync, push using APIs, etc.) to publish experiment/study data from a LIMS/ELN to a data lake or data warehouse?

Sequence alignment in Subio on personal PC

Is the Subio platform a good and trusted platform for performing sequence alignment? I have FASTQ files that I downloaded from NCBI, and now I need to get raw counts for further analysis such as differential gene expression analysis.

Codon usage with unknown/unspecified Nucleotides

Hi, I am performing codon usage analysis with the coRdon package, and I am wondering how this package or other bioinformatics tools handle sequences with a significant number of “Ns”. I failed to find any tips about it or even an in-depth discussion…

Alignment using bwa-mem2

Hello, I need help in aligning the sequence with reference using bwa-mem2. I used the following code: bwa-mem2 mem -t 8 gch38.fa DE98NGSUKBD117612_1_1.fq DE98NGSUKBD117612_1_2.fq > d3_align.sam I got the following error: ERROR! Unable to open the file: gch38.fa.bwt.2bit.64 There is no gch38.fa.bwt.2bit.64 file. I have the…

Multiple data scientist positions in computational biomedicine

Several NIH-funded data scientist positions are available in Prof. Gaurav Pandey’s (research.mssm.edu/gpandey/) lab at the Icahn School of Medicine at Mount Sinai in New York City. The overall project for these positions is the design and implementation of novel machine/deep…

Is it normal for RCorrector to remove millions of reads?

I’m trying to build de novo transcriptomes for unsequenced plants to do sequence analysis. I’m trying to choose a tool for my first pass of quality filtering after running FastQC on my raw reads. I’ve tried AfterQC and RCorrector…

How to find the coordinates of long reads (simulated by Badreads) with respect to the reference genome

Hi, I have simulated a set of ONT long reads (10x) of E. coli using the Badreads simulator tool. I was wondering whether there is any way I can know the co…

Downloading WGBS sequencing reads for cancer sample and healthy sample

Hello, I am completely new to WGBS data analysis. I want to learn how to analyse differentially methylated regions between cancer and normal samples. How can I find these data easily? I’m looking for WGBS read data. Your help is highly…

Comment: Direct – indirect binding of a transcription factor in chip-seq analysis

Try different tools (HOMER, MEME, etc.). Maybe try +/- 10 bp from the summit of the peak. If the same motif comes out, then the probability of direct binding increases, although this approach only gives an indication that there is a direct interaction.

Annotating cell types via integrating a query dataset with a reference dataset and then cluster

  Mostly because it’s typically unnecessary given that reference-based classification should yield a similar result without being subjected to potential biases introduced during the integration process. SingleR (and presumably Seurat, I don’t know as I don’t use it) uses a reference dataset and asks “Which reference sample’s expression profile is…

Help speeding up HMMER’s HMMSearch algorithm for large fasta file with GNU Parallel

I’ve seen that HMMER can be sped up with GNU Parallel: Speed of hmmsearch I have around 100,000 sequences and a HMMER database of around 300 HMM profiles. I’m running everything at once but I’m wondering if it’ll be faster to split up the sequences and/or split up the jobs….

Survival Analysis Cut-off

Hello guys, I am doing a survival analysis using TCGA-BRCA project data. I am trying different cut-offs to separate my samples into high and low risk groups, but since it is my first time I would like to ask a question just to be fully sure…

antiSMASH output

Hello. Can someone help me interpret the antiSMASH output and count the number of BGCs using the command line? It would be great to receive your help. Thank you

linkage disequilibrium and haplotype analysis of GWAS

Hi all, I have GWAS data. I have my data in 22 chromosome files in PLINK format. I have imputed genotypes with the Sanger Imputation Server. I use PLINK for my analysis, but because PLINK 1.9 no longer supports --hap…

How to visualise a phylogenetic tree with amino acids (double letter repeat) multiple-sequence alignment?

I have a fasta file as shown below, rvd.fasta:

>t1
NI-NG-NR-NN-NG-HD-HD
>t_temp5
NG-NG-NI-N*-NR-NI-NN-NG-NG-HD
>tal8
NG-NG-NI-N*-ND-NI-NN-NG-NG-H*-NH-NI

I have a newick file as follows, tree.newick:

(tal8:0.49999997,t_temp5:0.47298786,t1:28.37858179);

I need to visualise both the tree and rvd.fasta file (multiple-sequence…

How to get the assembly larger than 3.5Gb in NextDenovo?

Hey there! NextDenovo assembles only 3.5Gb. Is it possible to somehow make NextDenovo go to 17Gb? Thanks, Ural

Understanding bcftools command

I need to perform the following action to combine multiple VCF files into one: BCF=/path_to_bcftools export BCFTOOLS_PLUGINS=$BCF/plugins DIR=/path_to_normal_vcf_file $BCF/bcftools merge -m all -f PASS,. --force-samples $DIR/*.vcf.gz | $BCF/bcftools plugin fill-AN-AC | $BCF/bcftools filter -i 'SUM(AC)>1' > panel_of_normal.vcf I don’t have access to command-line bcftools, and since…

Positive preliminary data on CRISPR treatment for blood diseases

Stephan Grupp, MD, PhD, is Chief and Medical Director of the Cell Therapy and Transplantation Section of the Institute for Cell and Gene Therapy at Children’s Hospital of Philadelphia (CHOP) and a pioneer in cell immunotherapy in childhood. A collaborative team of researchers, including Cancer, recently…

Allele frequency calculation

Hello everyone, I use vcftools to find AF values with this command: vcftools --gzvcf $SUBSET_VCF --freq2 --out $OUT --max-alleles 2 The output I got from this is: chr pos nalleles nchr a1 a2 <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 22 16050408 2 846…

Fact Check-mRNA vaccines are distinct from gene therapy, which alters recipient’s genes

Vaccines that use mRNA technology are not gene therapy because they do not alter your genes, experts have told Reuters after contrary claims were posted online. Thousands of social media users have shared such posts since the rollout of COVID-19 vaccines began (here) – and have continued to do so…

Here’s Why Fulcrum Therapeutics Stock Is Shooting Higher Today

What happened Shares of Fulcrum Therapeutics (NASDAQ: FULC), an early clinical-stage biopharmaceutical company, are rocketing higher after an encouraging clinical trial readout. Investors excited about the company’s sickle-cell disease candidate pushed the stock 76% higher as of 10:35 a.m. EDT on Monday. So what This morning Fulcrum Therapeutics shared data…

From EC terms to GO terms to GO enrichment

Hi, I did a KEGG annotation of RNA-seq data. I want to do GO enrichment next. I have the EC numbers of the genes of interest, so I converted the EC numbers to GO terms using EC2GO. But I…

Error in DESeqDataSetFromMatrix Function in DESeq Library

I am trying to create a DESeq dataset object from a dataset in GEO as follows: deseq2_142731 <- DESeqDataSetFromMatrix(countData = GSE142731[,2:ncol(GSE142731)],colData = labels_gse142731,design = ~V1) However, I get an error: Error in DESeqDataSet(se, design = design, ignoreRank) : some values in assay…

Environmental DNA: Seeing the Unseen

We produce antibodies after vaccination, but how many do we need to prevent infection and the spread of COVID?

What’s the difference between enrichKEGG and gseKEGG

Hi, I was wondering what the difference is between enrichKEGG and gseKEGG in the R package clusterProfiler. Thanks!

Manual annotation of cell types in single cell RNA-seq

I have recently started working with scRNA-seq data. I am following the tutorials by the creators of Seurat. In the final section titled “Assigning cell type identity to clusters”, the authors mention that Fortunately in the case of this dataset,…

Am I Sabotaging Myself By Getting A Masters Instead Of A Phd?

Hello everyone! I realize that this question has been asked before and I have read through some of the other threads but I figured I’d see if there are any more perspectives out there. I am currently…

The usage of sed

sed -e 's/_scATAC_hg19_noDup_noMT.bam//g' -e 's/\/directory\/to\/singleCell\///g' bamlist.txt | sed -e 's/\//\t/g' | awk 'OFS="\t"{print $2}' | tr '\n' '\t' > header.txt This replacement command is too complex. Can someone explain what this means?

SRA splitting for each metagenome-assembled genome

Hi everybody, we obtained viruses from water and sequenced them with Illumina. We assembled several metagenome-assembled genomes and got a BioProject number and BioSample numbers (one for each of them). Now I should do the SRA submission, but I cannot submit for all my genomes…

Samtools difference between Mapped and Unmapped read

Hello, I am wondering what the difference is between a mapped and unmapped read in samtools. I am extremely new to this whole process, just trying to learn my way through so if you could “dumb” it down or try and explain…

align using file.ht2

I downloaded the UCSC hg19 index file in my terminal, and when I uncompressed it I found two files, genome.5.ht2 and genome.8.ht2. Every time I try to align my samples against the index, this error shows up: [e::bwa_idx_load_from_disk] fail to locate the index files…

question about running CIRI-full

I’m using CIRI-full to calculate the full-length sequences of circRNAs, and I can run the test data set successfully, but I can’t run my own data. Running the test data set: java -jar ../CIRI-full.jar Pipeline -1 test_1.fq.gz -2 test_2.fq.gz -a test_anno.gtf -r test_ref.fa -d test_output/…

DESeq2 design question

I have a count matrix from an RNA-seq experiment that I’d like to normalize using DESeq2 and perform DE analysis on. My code is below: dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design= ~ condition) My experiment is performed over two time periods, week1 (with treated vs untreated) and…

R Programming – how to make a simple heat map

Hi, can anyone guide me on how to make a simple heat map in R? There is github.com/XiaoLuo-boy/ggheatmap which is fully ggplot in case you feel more comfortable with it rather than the suggested pheatmap/ComplexHeatmap…

Caribou Biosciences Raises $304M in Potentially the Largest Gene Editing IPO | Rothwell, Figg, Ernst & Manbeck, P.C.

Caribou Biosciences, Inc., a Berkeley, California-based CRISPR genome-editing biopharmaceutical company, raised $304M in an initial public offering, one of the most lucrative IPOs in gene-editing. In June 2021, Gene editing biotech Verve raised $267M in IPO proceeds and later added another $40 million after its financial underwriters opted to buy…

Continue Reading Caribou Biosciences Raises $304M in Potentially the Largest Gene Editing IPO | Rothwell, Figg, Ernst & Manbeck, P.C.

Trimming of adapters and indexes

Trimming of adapters and indexes 0 I investigate a protein which binds small DNA (<30 nt) and have a library of these small DNA. I know that adapters and indexes are from this site (5′ adapter has T instead of U). [To reach the page I want to show click…

Continue Reading Trimming of adapters and indexes

EOF marker absent in VCF

EOF marker absent in VCF – can this be safely ignored? 0 Hi, I generated a VCF file using a bcftools mpileup | bcftools call pipeline. I have done this before, and the file produced then looks fine. However, the log for this one had [W::bgzf_read_block] EOF marker is absent….

Continue Reading EOF marker absent in VCF
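
For the EOF-marker entry above, here is a minimal Python sketch of how one might check whether a BGZF-compressed file ends with the 28-byte empty block that the SAM/BAM specification defines as the EOF marker. The function name `has_bgzf_eof` is an illustrative assumption, not part of bcftools or htslib, and the check only looks at the last bytes of the file; it says nothing about whether the rest of the file is complete.

```python
import os

# The 28-byte empty BGZF block used as the end-of-file marker,
# written as adjacent hex pieces so each field is easy to verify.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "42430200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF block."""
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A missing marker usually suggests the writer was interrupted or the file was truncated, so it seems worth confirming the last records are present before relying on the file.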

Error while subsetting VCF – error doesn’t check out with (z)grep

Error while subsetting VCF – error doesn’t check out with (z)grep 0 I’m using bcftools view -s to subset a VCF.gz file. I ran into an error: [E::vcf_parse_format] Number of columns at chr9:44897051 does not match the number of samples (90 vs 99) To look at this site, I ran…

Continue Reading Error while subsetting VCF – error doesn’t check out with (z)grep
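
For the column-count error in the entry above, a small Python sketch that reports every record whose number of sample columns differs from the `#CHROM` header line. It assumes a tab-delimited VCF with the nine standard fixed columns (CHROM through FORMAT); the function name `find_column_mismatches` is hypothetical.

```python
def find_column_mismatches(lines):
    """Yield (chrom, pos, found, expected) for records whose sample-column
    count differs from the number of samples on the #CHROM header line."""
    expected = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Nine fixed columns (CHROM..FORMAT) precede the sample columns.
            expected = len(fields) - 9
            continue
        if expected is None:
            raise ValueError("data line seen before the #CHROM header")
        found = len(fields) - 9
        if found != expected:
            yield fields[0], int(fields[1]), found, expected
```

For a compressed file, the lines could come from `gzip.open(path, "rt")`. Splitting strictly on tabs can surface problems that a text search misses, such as a record broken across lines.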

Low assignment rate with featureCounts

Low assignment rate with featureCounts 0 I used STAR to align my reads (brain samples) to human reference genome. Getting good unique mapping rates (~70-90%). However, when I use featureCounts I get really low assignment rates. Here is an example command featureCounts -p -t exon -g gene_id -s 2 -T…

Continue Reading Low assignment rate with featureCounts

seurat `@assays$RNA@counts` vs `@assays$RNA@data`?

seurat `<obj>@assays$RNA@counts` vs `<obj>@assays$RNA@data`? 1 I have two matrices called <object>@assays$RNA@counts and <object>@assays$RNA@data that are both real non-negative. What is the difference between these?

Continue Reading seurat `@assays$RNA@counts` vs `@assays$RNA@data`?
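
As a hedged illustration for the entry above: in a typical Seurat workflow (an assumption, since the excerpt does not say how the object was built), `counts` holds raw counts and `data` holds the values after `NormalizeData`. The default LogNormalize method divides each cell's counts by that cell's total, multiplies by a scale factor of 10,000, and applies log1p. The Python sketch below reproduces that arithmetic for a single cell; `log_normalize` is an illustrative name, not a Seurat function.

```python
import math


def log_normalize(cell_counts, scale_factor=10_000):
    """Apply LogNormalize-style scaling to the raw counts of one cell."""
    total = sum(cell_counts)
    if total == 0:
        # An empty cell stays all zeros instead of dividing by zero.
        return [0.0 for _ in cell_counts]
    return [math.log1p(count / total * scale_factor) for count in cell_counts]
```

Both inputs and outputs are non-negative, which is consistent with the two matrices described in the question. If normalization has not been run, the two slots may simply hold the same values.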

k-mer counters – presence/absence matrix

k-mer counters – presence/absence matrix 2 Hi lizabe, You’re right that this tutorial is out of date. The --matrix option is no longer valid as an option to jellyfish count. However, I don’t think its original intent was to do what you wanted anyway. It doesn’t write out a binary…

Continue Reading k-mer counters – presence/absence matrix
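
To make the idea of a k-mer presence/absence matrix in the entry above concrete, here is a toy Python sketch. It is not how jellyfish works and will not scale to real genomes. The function names are hypothetical, and it treats a k-mer and its reverse complement as different, which real analyses often do not.

```python
def kmers(seq, k):
    """Return the set of k-mers present in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(samples, k):
    """Build a presence/absence matrix from {sample_name: sequence}.

    Returns (sorted list of all k-mers, {sample_name: [0 or 1, ...]}),
    with one column per k-mer in the sorted list.
    """
    per_sample = {name: kmers(seq, k) for name, seq in samples.items()}
    all_kmers = sorted(set().union(*per_sample.values()))
    matrix = {
        name: [int(kmer in found) for kmer in all_kmers]
        for name, found in per_sample.items()
    }
    return all_kmers, matrix
```

With real data, the per-sample sets would more plausibly be read from a k-mer counter's text output than computed in memory like this.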

remove effect of latent variables from log fold change

The FindMarkers function of Seurat allows users to specify latent variables to be adjusted for when finding differentially expressed genes. I am testing for differences in gene expression between 2 groups – disease vs normal. For the statistical test, I am using LR, described below: LR: disease state is modelled…

Continue Reading remove effect of latent variables from log fold change

Python fast way to get ONLY MAIN metadata for GSE ? (not walking through thousands underlying GSM-samples : slow or even endless)

Not Python but using EntrezDirect you can get: $ esearch -db bioproject -query "GSE118723" | esummary | xtract -pattern DocumentSummary -element Project_Description Quantification of gene expression levels at the single cell level has revealed that gene expression can vary substantially even across a population of homogeneous cells. However, it…

Continue Reading Python fast way to get ONLY MAIN metadata for GSE ? (not walking through thousands underlying GSM-samples : slow or even endless)

STAR rna-seq for bacterial genomes

Hi, I would like to use STAR for bacterial genomes. I wanted to ask if this is strongly discouraged or if there is a way to manage the main challenges of mapping reads to prokaryotes. (I know there are specific tools for this purpose, i.e. EdgePro, but I’m a beginner in…

Continue Reading STAR rna-seq for bacterial genomes

Help with finding p value

Help with finding p value 1 Hello, I’m doing a study to compare 3 groups with different n. I was wondering if anyone could guide me to find the best way to get a P-value between these groups using Excel.

Continue Reading Help with finding p value

I am converting the fq.gz files (which are the results of the MGI study) to BAM files to view in IGV.

I am converting the fq.gz files (which are the results of the MGI study) to BAM files to view in IGV. 0 Hey everyone, before I start, apologies for any inconvenience caused by my wrong or inappropriate use of terms. I have had some failures with bwa mem lately. As I…

Continue Reading I am converting the fq.gz files (which are the results of the MGI study) to BAM files to view in IGV.

Map Entire Directory of Paired-End Reads at Once

Map Entire Directory of Paired-End Reads at Once 0 Is there a way to map an entire directory of reads at once? Would I just have to write a script for this specific to my directory structure and data? I’m using BWA MEM to map 49 paired-end reads and have…

Continue Reading Map Entire Directory of Paired-End Reads at Once
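
For the directory-mapping entry above, one possible approach is a short script that pairs the files by name and builds one `bwa mem` command per sample. The Python sketch below assumes mates are distinguished by an `_R1`/`_R2` token in the file name, which is a common convention but not something the excerpt states. The function names and the thread count are illustrative.

```python
import re

# Assumed naming: <sample>_R1<anything>.fastq(.gz) or .fq(.gz), same for R2.
_MATE_PATTERN = re.compile(
    r"^(?P<sample>.+)_R(?P<mate>[12])(?P<rest>.*\.f(?:ast)?q(?:\.gz)?)$"
)


def pair_reads(filenames):
    """Group FASTQ file names into sorted (sample, R1, R2) triples."""
    mates = {}
    for name in sorted(filenames):
        match = _MATE_PATTERN.match(name)
        if match:
            mates.setdefault(match["sample"], {})[match["mate"]] = name
    # Keep only samples for which both mates were found.
    return [
        (sample, found["1"], found["2"])
        for sample, found in sorted(mates.items())
        if "1" in found and "2" in found
    ]


def bwa_mem_commands(filenames, reference, threads=4):
    """Build one `bwa mem` argument list per paired sample."""
    return [
        ["bwa", "mem", "-t", str(threads), reference, r1, r2]
        for _sample, r1, r2 in pair_reads(filenames)
    ]
```

Each argument list could be passed to `subprocess.run` with `stdout` pointing at a per-sample SAM file, since `bwa mem` writes its alignments to standard output.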

some values in assay are negative

some values in assay are negative 0 Hi, We are trying to analyze information and we have this problem. Our data does not contain any negative values at all.

Continue Reading some values in assay are negative

What Galaxy tools add Ns to variable length FASTQ sequences to get uniform length? (FASTA if needed)

What Galaxy tools add Ns to variable length FASTQ sequences to get uniform length? (FASTA if needed) 1 Hello! I am attempting to perform alignments for a variety of FASTQ files. I need the sequences to be the same length, 250 bp. That being said, I do not want to…

Continue Reading What Galaxy tools add Ns to variable length FASTQ sequences to get uniform length? (FASTA if needed)
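
This does not answer which Galaxy tool to use, but the operation asked about in the entry above is simple enough to sketch in Python. Each read shorter than the target length gets `N` bases appended, and the quality string is padded with `!`, the lowest score in Phred+33 encoding. The function name `pad_record` and the choice to reject reads longer than the target rather than trim them are assumptions.

```python
def pad_record(seq, qual, length=250, pad_qual="!"):
    """Pad one FASTQ record to `length` bases with N and lowest-quality scores."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) > length:
        raise ValueError("read is longer than the target length")
    missing = length - len(seq)
    # '!' is Phred score 0 in Phred+33, so padded bases carry no confidence.
    return seq + "N" * missing, qual + pad_qual * missing
```

For example, `pad_record("ACGT", "IIII", length=6)` returns `("ACGTNN", "IIII!!")`.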

DESeq2 analysis result differences

Hello, I performed patch-seq for 2 sets of neurons and then used DESeq2 to look for transcriptomic differences between the groups. One group consists of 7 neurons and the second group consists of 9 neurons. Genes that meet a threshold criterion of L2FC of more than 1.5 and adjusted p-value…

Continue Reading DESeq2 analysis result differences

Converting mouse Gene IDs to Human while keeping genes that don’t convert

Hi there, I am using biomaRt to convert some gene IDs from mouse to human for some data I generated through RNA-seq. I am currently mapping using the following function: convertMouseGeneList <- function(x){ require("biomaRt") human = useMart("ensembl", dataset = "hsapiens_gene_ensembl") mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl") genesV2 = getLDS(attributes =…

Continue Reading Converting mouse Gene IDs to Human while keeping genes that don’t convert
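
The "keep genes that don't convert" part of the entry above comes down to a lookup with a fallback. The sketch below is Python rather than R and does not call biomaRt; it assumes a mouse-to-human table has already been obtained (for instance from an orthologue query), and the function name and example symbols are illustrative.

```python
def convert_keep_unmapped(mouse_genes, mouse_to_human):
    """Return (mouse_id, output_id, converted) for every input gene.

    Genes missing from the mapping keep their mouse ID and are flagged
    with converted=False instead of being dropped.
    """
    rows = []
    for gene in mouse_genes:
        human = mouse_to_human.get(gene)
        rows.append((gene, human if human is not None else gene, human is not None))
    return rows
```

The flag column makes it easy to see later which rows still carry mouse identifiers.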

Problems with fragment length distribution output with Salmon

Problems with fragment length distribution output with Salmon 1 Hi all, New to RNA-Seq and I’m struggling with my Salmon alignment output. I tried to find an answer to this question on older posts but I couldn’t locate any other discussions, so apologies in advance if this has been covered…

Continue Reading Problems with fragment length distribution output with Salmon

Histone marks enrichment analysis

Histone marks enrichment analysis 0 Hello everyone, here’s my question: I have a bed file of human genomic coordinates (hg19), and I would like to know whether ChIP-seq peaks for specific histone marks (such as those from ENCODE) are significantly more represented within my test regions compared to a background…

Continue Reading Histone marks enrichment analysis

PyMol – molecule export problem

PyMol – molecule export problem 0 Dear all, I have a molecule in PyMol which I want to open in ChimeraX. But after exporting the molecule as .pdb, some structural information seems to be lost – regions that in the PyMol session file are embedded into helices are now loops…

Continue Reading PyMol – molecule export problem

MinION Data Examples (FAST5) Database

MinION Data Examples (FAST5) Database 0 Hello everyone, I am constructing a pipeline to analyze Oxford Nanopore MinION data. I start from FAST5 files, and for some optimizations I will try multiple tools for each step. So I will need several datasets to try. As I see most of…

Continue Reading MinION Data Examples (FAST5) Database

Relative abundance of differentially abundant ASVs after DESeq2

Relative abundance of differentially abundant ASVs after DESeq2 0 Hello, I used DESeq2 to see which ASVs were differentially abundant between different treatments on 16S metabarcoding data. I now want to plot the relative abundance (in %) of those ASVs. However, I am unsure which data would make the most…

Continue Reading Relative abundance of differentially abundant ASVs after DESeq2
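
The excerpt above is cut off before it lists the candidate inputs, so the following is only one common option: relative abundance as each ASV's share of its sample's total raw count, expressed in percent. The Python sketch assumes counts are held as nested dictionaries; `relative_abundance` is a hypothetical name.

```python
def relative_abundance(counts_by_sample):
    """Convert {sample: {asv: count}} to {sample: {asv: percent of sample total}}."""
    result = {}
    for sample, counts in counts_by_sample.items():
        total = sum(counts.values())
        result[sample] = {
            # Samples with no reads get 0.0 rather than a division error.
            asv: (100.0 * count / total if total else 0.0)
            for asv, count in counts.items()
        }
    return result
```

The percentages for the differentially abundant ASVs can then be selected from this table for plotting.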

R Programming

R Programming 4 Hi, can anyone guide me on how to make a simple heat map in R? There is github.com/XiaoLuo-boy/ggheatmap, which is fully ggplot, in case you feel more comfortable with it rather than the suggested pheatmap/ComplexHeatmap packages and want to have a consistent…

Continue Reading R Programming

GEO submission when I have raw data in SRA

GEO submission when I have raw data in SRA 0 I am trying to submit my scRNA-seq data to GEO. GEO submission guidelines state that I should upload metadata, raw and processed data. And they submit the raw data to SRA on my behalf. But I already submitted my raw…

Continue Reading GEO submission when I have raw data in SRA

phasing with shapeit for non-human species

phasing with shapeit for non-human species 0 I am trying to phase genotyping data in a non-human organism. I have a reference for my plant species only in FASTA format, but the required inputs for shapeit are phased.gz, legend.gz and .sample. How can I do phasing with my reference panel? Any…

Continue Reading phasing with shapeit for non-human species

PDRA in Computational Biophysics and Cancer Research, University of Manchester, UK

Research Associate in Computational Biophysics, University of Manchester Job reference: BMH-017047 Location: Oxford Road, Manchester, UK Closing date: 19/08/2021 Salary: £32,816 per annum Employment type: Fixed Term Faculty/Organisation: Biology, Medicine & Health School/ Directorate: Molecular & Cellular Function Hours per week: Full Time Contract Duration: 01 August 2021 until 31…

Continue Reading PDRA in Computational Biophysics and Cancer Research, University of Manchester, UK