Tag: dgelist
Error in CIBERSORTx
Hello, I am trying to use CIBERSORT to deconvolute the immune cells in pancreatic cancer after my treatments. I have 3 biological replicates each of Control and Treatments A, B, and C. Using edgeR, I created the CPM matrix, which is not log-transformed, and converted it to the required format as follows: # Load…
Does gene raw read count mean and dispersion preserve batch effects if I used raw counts from different batches to calculate them?
Question: Does gene raw read count mean and dispersion preserve batch effects if I used raw counts from different batches to calculate them? Context: I ask this because I need these values as inputs to the ssizeRNA_vary function of the ssizeRNA package. Alternatively, I can remove batch effects with the removeBatchEffect() function…
how to test for differential expression in samples where a global increase in gene expression is expected
As the title suggests, I’m wondering what the best way is to test each gene in a count matrix containing two groups, where one group is expected to have a global increase in gene expression. I need to use spike-in normalized RPKM data, so from my understanding of DESeq, it…
Limma/DESeq2 for unbalanced nested design (paired samples)
I have an RNAseq dataset that I want to perform differential gene-expression analysis on. The dataset consists of 3 groups: macrophages derived from adults (n=6), term-born infants (n=5), and preterm infants (n=3). Each sample has been treated with an immune stimulus or left untreated (paired samples). Group Treatment Sample_Nr Sample_within_group…
duplicateCorrelation(): block by duplicate sample or by cell line?
I have bulk RNAseq data from 3 drug exposures (vehicle, low and high dose) x 2 replicates per condition x 6 cell lines, so it’s a total of 36 samples. I am interested in the exposure effect. I am using duplicateCorrelation() and limma-voom, but my cell line effect is eating…
How to get RPKM from count matrix
Hi Biostars, I have a count matrix with mouse gene names and need to get RPKM. I know it is not a good metric, but biologists are used to it. gtf <- readGFF("/reference_genome/mm39.ncbiRefSeq.gtf") gtf_exon <- gtf[gtf$type == "exon", ] width <- gtf_exon$end -…
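The RPKM arithmetic itself can be sketched in base R as below. This is only an illustration under stated assumptions: `counts`, `gene_length` and `rpkm_manual()` are hypothetical names with toy values, and the library size is taken as the plain column sum. edgeR's `rpkm()` performs the equivalent calculation on a DGEList and, by default, uses the normalized library sizes instead.

```r
# Toy inputs (hypothetical values, for illustration only)
counts <- matrix(c(100, 200, 300,
                   400, 500, 600),
                 nrow = 3,
                 dimnames = list(c("GeneA", "GeneB", "GeneC"), c("S1", "S2")))
gene_length <- c(GeneA = 1000, GeneB = 2000, GeneC = 500)  # lengths in bp

# RPKM = reads / (gene length in kb * library size in millions)
rpkm_manual <- function(counts, gene_length) {
  stopifnot(nrow(counts) == length(gene_length))
  lib_size_millions <- colSums(counts) / 1e6
  # counts per million: divide each column by its library size
  per_million <- sweep(counts, 2, lib_size_millions, "/")
  # divide each row by the gene's length in kilobases
  per_million / (unname(gene_length) / 1000)
}

rpkm_values <- rpkm_manual(counts, gene_length)
```

The gene length vector must be in the same row order as the count matrix; a mismatch does not raise an error, it silently gives wrong values.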
Complex multifactorial DE analysis with limma/edgeR based on rnaseq data
Dear Biostars, I would like to ask you one specific question regarding the DE analysis of an RNASeq dataset spanning a multi-factor experimental design. Briefly, unstimulated neutrophils from 4 healthy donors were cultivated under distinct treatment conditions, that is, supernatant of organoids from different cancer/normal patient samples. There are…
Answer: trajectory-like graph for a data frame
There are better ways to do this than trajectory-type things. In reality, all you need is boxplots or violin plots or whatever grouped by timepoint, optionally split into additional groups if you have other conditions. For example: [image] This is split by tissue, grouped by…
Read data file and produce RPKM in edgeR
This post is in response to a number of emails and posts asking about reading data into edgeR and producing RPKM. Suppose we start with a tab-delimited file counts.txt like this: To read this into edgeR: library(edgeR) Data <- read.delim("counts.txt", sep="\t", row.names=1) y <- DGEList(Data, annotation="Length") To normalize the library…
A query about selecting the coefficient in glmLRT to test for by index
I am using the below set of equations for my differential expression analysis. In the glmLRT line, I was previously specifying the coef by index and since I am…
Filter gene with low count in RNA-seq using a function from edgeR
Hi all, I am trying to filter out genes with low counts from a raw count matrix. I run: d <- DGEList(counts=counts,group=factor(conditions)) keep <- filterByExpr(d) bcv <- 0.2 et <- exactTest(keep, dispersion=bcv^2) Error in exactTest(d, dispersion = bcv^2) : Currently only supports DGEList objects as the object argument. d <- estimateTagwiseDisp(d) Error…
DGEList error in EdgeR – Warning: Error in $: $ operator is invalid for atomic vectors
Hello everyone, I am developing a simple Shiny app where users can upload a counts file (here called ‘counts’) and this can be analyzed later on using edgeR and limma. This is a…
Question about RNAseq analysis in EdgeR to identify common and donor specific differentially expressed genes
Hi, this question is about RNAseq analysis in edgeR to identify common and specific differentially expressed genes. I have in vitro cultured tissue from 3 different donors, each with two infection statuses: one infected with virus (high dose, 6 hr) and one uninfected (baseline, 0 hr). This was sequenced using RNAseq, then aligned…
smallRNA-seq analysis batch correction in limma
Good morning, I am currently trying to analyse a small-RNA sequencing dataset using the limma package and the voomLmFit function. However, I am experiencing some issues with the p-value distribution, with some of the comparisons I am testing showing odd p-value distributions. This probably indicates that there is some kind of…
How to do GSEA over limma + voom DGE ?
I am doing DGE over RNASeq data of two types of cancer. Here is the code: keep <- filterByExpr(RNA_data, design = design) RNA_data <- RNA_data[keep,] RNA_data <- DGEList(counts = RNA_data, genes = rownames(RNA_data)) # Normalize the counts using the TMM method RNA_data <- calcNormFactors(RNA_data, method = "TMM") # Create the…
Limma differential expression analysis across subtypes
I want to retrieve the ANOVA-based differential expression between KIRP1a, KIRP1b, KIRP1c, KIRP2a, KIRP2b, and KIRP2c groups to identify biomarkers for each group. My code below returned DEGs that are statistically significant, but my heatmap (z-scaled) did not show any observable differences across the subtypes. Is there an issue with…
glmLRT contrast (compare group with processed/extracted group)
Hello, experts. I’m here to ask for your kind help. I’m currently working on DEG analysis. Briefly, I want to compare DEG differences between (P07_T01-P07_N01 & P08_T01-P08_N01) vs (P07_T02 & P08_T02). This is to compare T01’s solely with T02’s. Yet, there are 2 problems. First, I keep getting an error from…
Limma returned only positive logFC values
I want to obtain the upregulated and downregulated genes using limma. However, all the DEGs returned by my code have positive LogFC and none are downregulated (negative LogFC). This observation is consistent across multiple distinct dataframes. Is there something wrong with my code?…
manually calculate log2 fold change and compare
Hi everybody, I am struggling to calculate log2FC manually for an RNA-seq experiment that has no replicates. I know this question has been posted before, but I haven’t been able to transfer the answers to my data. So I have 3 conditions, let’s say HpA, SpA and Empty. I would like…
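A manual log2 fold change for unreplicated conditions can be sketched in base R as follows. The condition names HpA, SpA and Empty come from the post; the counts, the helper `log2_cpm()` and the pseudo-count of 0.5 are assumptions made here for illustration. Without replicates these values are descriptive only and carry no p-values.

```r
# Toy counts for three unreplicated conditions (hypothetical values)
counts <- matrix(c(10, 0, 250,
                   40, 5, 250,
                   20, 0, 500),
                 nrow = 3,
                 dimnames = list(c("Gene1", "Gene2", "Gene3"),
                                 c("HpA", "SpA", "Empty")))

# log2 counts-per-million; the pseudo-count avoids log2(0)
log2_cpm <- function(counts, pseudo = 0.5) {
  scaled <- sweep(counts + pseudo, 2, colSums(counts), "/") * 1e6
  log2(scaled)
}

lcpm <- log2_cpm(counts)

# log2FC is a difference of log2 values, here each condition versus Empty
log2fc_HpA_vs_Empty <- lcpm[, "HpA"] - lcpm[, "Empty"]
log2fc_SpA_vs_Empty <- lcpm[, "SpA"] - lcpm[, "Empty"]
```

For genes with very low counts the result is dominated by the pseudo-count, so such genes are usually filtered out before interpreting fold changes.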
Having a problem doing edgeR analysis: can’t create the DGEList through the readBismark2DGE function.
I am facing a problem getting the DGEList from the readBismark2DGE() function. It didn’t show any error. The output of readBismark2DGE only shows Hashing, counting. I am using Google Colab to get more system RAM. I am using R version R4.2.3 and BiocManager 3.16. if (!requireNamespace("BiocManager", quietly = TRUE))…
coef /makeContrasts very different results
I have a situation where I have a categorical independent variable ‘suicide’ with three levels: ‘non_suicide’, ‘suicide’, and ‘unkown’. I have been setting up my analysis thus: suicide.non.undet <- as.factor(df$suicide) CauseofDeath.recode <- as.factor(df$CauseofDeath) Sex <- as.factor(df$Sex) Smoking <- as.factor(df$Smoking) Ethanol <- as.factor(df$Ethanol) svseq1 <- as.numeric(df$svseq1) design1 <- model.matrix(~ suicide +…
non-numeric values found in counts
When I tried to create a list with edgeR, I encountered an error: "Error in DGEList(counts = cnt, group = group) : non-numeric values found in counts" ….
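One way to track down such a problem is to check which columns of the imported table are not numeric before the table is handed to DGEList(). The base R sketch below is an assumption-laden illustration: the data frame `raw`, its column names and its values are invented, and the actual cause in the post is not known.

```r
# Toy table as it might come out of read.delim(): an ID column plus
# sample columns, one of which was read as character (hypothetical data)
raw <- data.frame(GeneID = c("g1", "g2", "g3"),
                  S1 = c(5, 0, 12),
                  S2 = c("7", "NA", "3"),
                  stringsAsFactors = FALSE)

# Which columns are not numeric?
non_numeric <- names(raw)[!vapply(raw, is.numeric, logical(1))]

# Move the IDs to row names and coerce the remaining columns to numeric
cnt <- raw[, setdiff(names(raw), "GeneID"), drop = FALSE]
rownames(cnt) <- raw$GeneID
cnt[] <- lapply(cnt, function(x) suppressWarnings(as.numeric(x)))
cnt <- as.matrix(cnt)

# Entries that could not be converted are now NA and need a decision
bad <- which(is.na(cnt), arr.ind = TRUE)
```

A gene identifier column left among the sample columns is a common source of non-numeric values, which is why the sketch moves it to the row names first.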
Calculate RPKM
Hi bioinformaticians, could anyone explain how to calculate RPKM from the count matrix with edgeR or DESeq2? I found some guides on this, but there are one or two steps I don’t know, such as this: y <- DGEList(counts=counts,genes=data.frame(Length=GeneLength)) y <- calcNormFactors(y) RPKM <- rpkm(y) How to get GeneLength?…
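One common definition of gene length is the total length of the union of a gene's exons, so that overlapping exons from different transcripts are not counted twice. The base R sketch below illustrates that idea only; the exon table, its column names and the helper `union_length()` are hypothetical. GTF coordinates are 1-based and inclusive, hence the `+ 1`. In Bioconductor the same result is usually obtained with `exonsBy()` from GenomicFeatures followed by `reduce()` and `width()`.

```r
# Toy exon table (hypothetical coordinates); two exons of geneA overlap
exons <- data.frame(gene_id = c("geneA", "geneA", "geneA", "geneB"),
                    start   = c(100, 150, 400, 10),
                    end     = c(200, 250, 450, 59))

# Length of the union of closed intervals [start, end]
union_length <- function(start, end) {
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end + 1) {
      # overlapping or adjacent exon: extend the current interval
      cur_end <- max(cur_end, end[i])
    } else {
      # gap: close the current interval and start a new one
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

# One length per gene, named by gene ID
GeneLength <- vapply(split(exons, exons$gene_id),
                     function(g) union_length(g$start, g$end),
                     numeric(1))
```

Before use, the resulting vector has to be reordered to match the rows of the count matrix, for example with `GeneLength[rownames(counts)]`.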
paired test for differential expression on RNAseq data
I need to run a paired differential expression test for metadata like the below: colData <- DataFrame(row.names = c("1", "2", "3", "4", "5", "6"), Patient=c("b1","c1","d1","b1","c1","d1"), condition=c("ctr","ctr","ctr","tre","tre","tre")) DataFrame with 6 rows and 2 columns Patient condition <character> <character> 1 b1 ctr 2 c1 ctr 3 d1 ctr 4 b1 tre…
do I have to do this step in creating DGEList ?
Hi guys, at the moment I have a dataframe in which the Gene ID is in one column and the others have sample IDs as column names and expression values as observations….
Error running edgeR: please check my code to help find and solve the error. I have difficulty working out the next steps of the code because of the errors.
library(edgeR) counts <- read.delim("GSE116959_series_matrix.txt", row.names = 1) head(counts) data <- read.table("annotation.txt",header=TRUE , sep = "\t") data head(data) d0<- DGEList(counts=counts , group = factor(counts)) d0 dim(d0) d0.full <- d0 #keep the old one in case we mess up countsPerMillion <- cpm(d0) summary(countsPerMillion) countCheck <- countsPerMillion > 1 head(countCheck) keep <- which(rowSums(countCheck)…
Which input file is used for DGEList in edgeR?
Hi, I used the nf-core/rnaseq pipeline with the default star_salmon aligner on a strand-specific dataset. I have a question about the gene counts data obtained as a result of salmon quantification. I am interested in…
finding error to run edgeR, error in plotting MDS and after that in model matrix also
library(edgeR) counts <- read.delim("GSE116959_series_matrix.txt", row.names = 1) head(counts) d0 <- DGEList(counts) d0 <- calcNormFactors(d0) d0 cutoff <- 1 drop <- which(apply(cpm(d0), 1, max) < cutoff) d <- d0[-drop,] dim(d) # number of genes…
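One possible pitfall in the `which()` plus negative-index filtering pattern, not necessarily the cause of the errors in this post, is that it removes every row when no gene falls below the cutoff. The base R sketch below uses an invented CPM matrix (`cpm_mat`) in place of `cpm(d0)` to show the effect and a logical-index alternative.

```r
# Toy CPM matrix (hypothetical values): every gene passes the cutoff
cpm_mat <- matrix(c(5, 8, 12, 3, 9, 20), nrow = 3,
                  dimnames = list(c("g1", "g2", "g3"), c("S1", "S2")))
cutoff <- 1

# Index-based removal: which() returns integer(0) when nothing fails,
# and x[-integer(0), ] selects zero rows instead of all rows
drop_idx <- which(apply(cpm_mat, 1, max) < cutoff)
lost_everything <- cpm_mat[-drop_idx, , drop = FALSE]

# Logical indexing keeps exactly the rows that pass, even if all pass
keep <- apply(cpm_mat, 1, max) >= cutoff
filtered <- cpm_mat[keep, , drop = FALSE]
```

The same logical vector can be used to subset a DGEList by rows; edgeR's `filterByExpr()` also returns a logical vector intended to be used that way.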
Which RUVr batch corrected output is better to calculate TPM?
I have downloaded several samples from 5 studies (5 batches). Example of my count table: S_rep1_batch1 S_rep2_batch1 S_rep1_batch2 S_rep2_batch2 S_rep3_batch2 . . . Gene1 34 54 65 76 67 Gene2 87 77 90 35 19 Gene3 47 67 70…
Hugely different results between edgeR and DESeq2
I am working on an 18-patient dataset. I am trying to calculate DE genes with both edgeR and DESeq2 for further analyses. It’s a 2-factor design (Status, Fraction). I want to calculate DE genes of different Fractions using the Status as covariate….
Subset count data in DGEList using edgeR/DESeq2
Hello, I have an enormous methylation dataset (12.6Gb) that is proving too much for my computer to handle. This data is loaded into R as a DGEList (with counts, samples, genes and a…
Estimating Fold-Changes of Lowly Expressed Genes
I am doing a DGE analysis using RNAseq data to compare three conditions. I am using a standard pipeline (Create DGEList > Filter very lowly expressed genes > TMM normalize > DGE). Since there is a…
Differential gene expression analysis with no replicates using edgeR
Dear all, I have an experimental design where I have only one sample in each condition (2 conditions in total) and want to do differential gene expression analysis using edgeR. This is the script I want to use for the analysis and it runs without any errors – with this…
TPM normalization starting with read counts
Hello everyone, I have multiple bulk RNA-seq datasets that I need to apply the same pipeline to. I want to normalize them from count data to TPM. In all datasets, I have the genes as rows and samples as columns. Unfortunately, I don’t have the fastq files, all I…
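When only a count matrix is available, TPM can be approximated from the counts and a per-gene length, as in the base R sketch below. The inputs and the function name `counts_to_tpm()` are hypothetical, and the gene lengths are assumed to be known in base pairs. Quantifiers that work from the reads use effective transcript lengths, so this is an approximation and not a reproduction of their output.

```r
# Toy inputs (hypothetical values): genes in rows, samples in columns
counts <- matrix(c(100, 200, 300,
                   400, 500, 600),
                 nrow = 3,
                 dimnames = list(c("GeneA", "GeneB", "GeneC"), c("S1", "S2")))
gene_length <- c(1000, 2000, 500)  # bp, same order as the rows of counts

counts_to_tpm <- function(counts, gene_length) {
  stopifnot(nrow(counts) == length(gene_length))
  # reads per kilobase: divide each row by the gene's length in kb
  rate <- counts / (gene_length / 1000)
  # rescale each sample so that its values sum to one million
  sweep(rate, 2, colSums(rate), "/") * 1e6
}

tpm <- counts_to_tpm(counts, gene_length)
```

Because the function takes any genes-by-samples matrix, the same call can be applied to each dataset in turn, provided the length vector is matched to that dataset's rows.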
I have a query regarding differential gene expression using limma-voom.
I used the following pipeline for RNA-seq analysis: FASTQ, Trimmomatic, HISAT2 (GTF file was annotated), featureCounts. After featureCounts I tried to run limma-voom, but I get an error saying this: An error occurred with this…
Gene Expression Analysis Steps ?
Hi everybody, I’m new in this field. I’m trying to replicate a paper to train myself. The results come out pretty much the same but not exactly the same, so I wanted to know if all my steps are right or if I’m missing something (or even completely…
rna seq – R – [DESeq2] – How use TMM normalized counts (from EdgeR) in inputs for DESeq2?
I have several RNAseq samples, from different experimental conditions. After sequencing, and alignment to reference genome, I merged the raw counts to get a dataframe that looks like this: > df_merge T0 DJ21 DJ24 DJ29 DJ32 Rec2 Rec6 Rec9 G10 421 200 350 288 284 198 314 165 G1000 17208…
Unsuccessful DE analysis using limma
This might be a bit long, please bear with me. I’m conducting a differential expression analysis using limma – voom. My comparison is regarding response vs non-response to a cancer drug. However, I’m not getting any DE genes, absolutely zero. Someone here once recommended not to use a contrast matrix for…
Removing replicate not clustering and group with replicate Vs without -edgeR rnaseq analysis
I am working with bacteria samples – in 3 groups that include the control, Treatment A, and Treatment B. From the PCA I find that the replicates are far apart. So I have removed the treatment…
r – Contrast for Limma – Voom
I’m doing a differential expression analysis for RNA-seq data with limma – voom. My data is about a cancer drug, 49 samples in total; some of them are responders and some are not. I need some help building the contrast. I’m dealing with only one factor here, so two…
Limma Differential Analysis on Proteomics data
Hi, I have a proteomics data set and I am doing the differential analysis on that. I used the limma package to do that. I first removed the negative counts and did the analysis but…
DGE from tumor-adjacent normal pair RNA-seq data. For an individual, no replicate
Single sample analysis can be done in edgeR, while it was deprecated in DESeq2, probably in 2018 (if you can use an old DESeq version, then it will also work with a single sample). For edgeR you just do: library(edgeR) setwd() rawdata <- read.delim("filename", check.names=FALSE, stringsAsFactors=FALSE) ngenes <- 10000 #no. of genes…
keep only genes expressed in sample at the same time as a particular gene of interest
Hello, need some help here as I’m kind of stuck with the edgeR DGEList format. I have a DGEList named x with the following dimensions: > dim(x ) [1]…
Calculate fold change in edgeR with one sample per condition
Hi, we have run a pilot RNA-Seq study with one sample per condition; this is just a test run. I understand there is no valid statistical test in this case; however, I am just curious to obtain differential expression through the edgeR package in R, assuming dispersion = 0.4 for the human data….