Tag: PCA

corral: Single-cell RNA-seq dimension reduction, batch integration, and visualization with correspondence analysis

Abstract: Effective dimension reduction is an essential step in the analysis of single-cell RNA-seq (scRNAseq) count data, which are high-dimensional, sparse, and noisy. Principal component analysis (PCA) is widely used in analytical pipelines, and since PCA requires continuous data, it is often coupled with log-transformation in scRNAseq applications. However, log-transformation of…
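The title names correspondence analysis (CA) as the alternative to PCA on log-transformed counts. As a rough illustration of that idea, the sketch below runs classical CA on a toy count matrix by taking the SVD of the standardized (Pearson) residuals. It is not the corral package's implementation; the function name `correspondence_analysis` and the toy matrix are assumptions made for this example, and it assumes no all-zero rows or columns.

```python
import numpy as np

def correspondence_analysis(counts, n_components=2):
    """Classical CA of a non-negative count matrix (hypothetical helper)."""
    X = np.asarray(counts, dtype=float)
    P = X / X.sum()                          # correspondence matrix (sums to 1)
    r = P.sum(axis=1)                        # row masses
    c = P.sum(axis=0)                        # column masses
    expected = np.outer(r, c)                # expected proportions under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    row_coords = (U[:, :k] / np.sqrt(r)[:, None]) * s[:k]    # principal row coordinates
    col_coords = (Vt[:k].T / np.sqrt(c)[:, None]) * s[:k]    # principal column coordinates
    return row_coords, col_coords, s ** 2    # s**2 = inertia per axis

# Toy matrix: 4 rows (e.g. genes) x 3 columns (e.g. cells), made up for illustration.
toy_counts = np.array([[10, 0, 3], [2, 8, 1], [0, 5, 9], [4, 4, 4]])
rows, cols, inertia = correspondence_analysis(toy_counts)
```

The counts are used directly, with no log step: the residuals are scaled by the expected value in each cell, and the total inertia equals the chi-square statistic of the table divided by its grand total.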

A series of scripts that facilitate the prediction of protein structures in multiple conformations using AlphaFold2

This repository accompanies the manuscript “Sampling the conformational landscapes of transporters and receptors with AlphaFold2” by Diego del Alamo, Davide Sala, Hassane S. Mchaourab, and Jens Meiler. The code used to generate these models can be found in scripts/ and was derived from the closely related repository ColabFold. This repository…

Diagnostic markers of AIDS combined with TM infection

Introduction: The prevalence of human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) remains a public health problem that threatens the health of all human beings. According to the latest report of the World Health Organization, at the end of 2019 there were 38 million patients with human immunodeficiency virus/acquired immunodeficiency…

Rstudio Online Free

RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach and learn data science online. Analyze your data using the RStudio IDE, directly from your browser. Share projects with your team, class, workshop or the world…

Analyzing DEG from different dataset in my study design

Hi, I have a situation that confuses me, and I am not sure I have everything right. I compared 5 different datasets in one study with each other separately, for example data1 vs data2, data1 vs data3, data1…

Senior Bioinformatics Scientist in Cambridge, Cambridgeshire | The Tec Recruitment Group Limited

Senior Bioinformatics Scientist – Cambridge Remote/hybrid working option Role overview: You will be part of an industry leading Genomics company, who are working in the development and accessibility of sequencing products to push the boundaries of drug discovery and therapy development. You will be part of a global team of…

Identification of downstream effectors of retinoic acid specifying the zebrafish pancreas by integrative genomics

Retinoic acid affects the transcriptome of zebrafish endodermal cells To identify genes regulated by RA in zebrafish endodermal cells, we used the transgenic Tg(sox17:GFP) line which drives GFP expression in endodermal cells and allows their selection by fluorescence activated cell sorting (FACS). Tg(sox17:GFP) embryos were treated either with RA, BMS493…

r – Change line width of specific boxplots with ggplot2

I am using the following code to generate a box plot: df %>% ggplot2::ggplot(ggplot2::aes(x = group, y = count, fill = batch)) + ggplot2::geom_boxplot(ggplot2::aes(lwd = stroke)) + ggplot2::scale_y_log10() + ggplot2::theme_bw() + ggplot2::theme( axis.text.x = ggplot2::element_text(angle = 90, hjust = 1), legend.position = "none" ) + ggplot2::labs(title = nm_dds) which produces…

Matlab cross-validation in PCA in bioinformatics data analysis – Freelance Job in Quantitative Analysis – Less than 30 hrs/week – undefined

I require help in MATLAB with a bioinformatics dataset produced by 3 alignment methods: 1) Clustal, 2) MUSCLE, and 3) MAFFT. Each method has its own data, an N × M matrix where N = number of observations and M = number of variables. Clustal = 32…

european-soccer from arthur960304 – Github Help

Data Analysis and Machine Learning with Kaggle European Soccer Database Getting Started These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system. Dataset Kaggle…

Asc-Seurat: analytical single-cell Seurat-based web application | BMC Bioinformatics

To demonstrate Asc-Seurat’s functionalities, we analyzed the publicly available 10× Genomics’ 3k Peripheral Blood Mononuclear Cells (PBMC) dataset [26], showcasing the analysis of an individual sample. In addition, we used a second PBMC dataset to demonstrate the analysis integrating multiple samples in Asc-Seurat. The second PBMC dataset was generated by Hang…

PCA plot from read count matrix from RNA-Seq

NB: this is now a Bioconductor R package: github.com/kevinblighe/PCAtools. You should normalise your data prior to performing PCA. In the code below, you’ll have to add plot legends yourself, and also colour vectors (passed to the ‘col’ parameter). Then, assuming that you have transcripts as rows and samples…
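The post's own code is cut off in this excerpt. As a stand-in, here is a minimal sketch of the steps it describes: normalise a transcripts-by-samples count matrix, then run PCA on the samples. The log2 counts-per-million normalisation is an assumption chosen for illustration, not necessarily what the original answer used, and `pca_from_counts` and the toy matrix are hypothetical.

```python
import numpy as np

def pca_from_counts(counts, n_components=2):
    """PCA of samples from a transcripts x samples count matrix (hypothetical helper)."""
    X = np.asarray(counts, dtype=float)
    cpm = X / X.sum(axis=0, keepdims=True) * 1e6   # scale each sample by its library size
    logged = np.log2(cpm + 1.0)                    # log2(CPM + 1), an assumed normalisation
    samples = logged.T                             # PCA expects samples as rows
    centred = samples - samples.mean(axis=0)       # centre each transcript
    U, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = U * s                                 # sample coordinates on each PC
    explained = s ** 2 / np.sum(s ** 2)            # fraction of variance per PC
    return scores[:, :n_components], explained

# Toy data: 5 transcripts (rows) x 4 samples (columns), made up for illustration.
toy = np.array([[100, 120, 10, 12],
                [50, 40, 60, 55],
                [0, 2, 30, 25],
                [200, 180, 190, 210],
                [5, 8, 90, 70]])
scores, explained = pca_from_counts(toy)
```

`scores[:, 0]` and `scores[:, 1]` can then be passed to any scatter-plot routine, with `explained` supplying the percentages for the axis labels.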

Interploidy gene flow involving the sexual-asexual cycle facilitates the diversification of gynogenetic triploid Carassius fish

1. Muller, H. J. The relation of recombination to mutational advance. Mutat. Res. Mol. Mech. Mutagen. 1, 2–9 (1964).
2. Maynard Smith, J. The Evolution of Sex (Cambridge University Press, 1978).
3. Avise, J. C. Clonality (Oxford University Press, 2008).
4. Hamilton, W. D.,…

how to find each cluster in single-cell represent which cell type?

I have a gene expression matrix and I would like to cluster it and find the different cell types. Let’s suppose we would like to cluster our gene expression matrix (genes × cells) and use one of the clustering methods such…

python – predictions on datasets

We were given 3 datasets: X_public, y_public and X_eval. We are supposed to create…

Genes and Pathways Involved in Postmenopausal Osteoporosis

Introduction Many postmenopausal women suffer from postmenopausal osteoporosis (PMO). A survey of 3247 Italian postmenopausal women found that according to bone mineral density (BMD) diagnostic criteria, the prevalence of osteoporosis was 36.6%.1 PMO patients may suffer from chronic pain and fractures, and as a result their quality of life is…

Use yyplot to modify the font of ggplot2 drawing, causing the arrow position in the figure to shift

Hi, I am using ggplot2 to draw a scatter plot of PCA analysis. In order to make the results more intuitive, I used geom_label_repel to add labels and arrows. In addition, I want to modify the font in the figure and use yyplot. But after using yyplot to modify the…

Principal Component Analysis Rna Seq

Genomatix: Principal Component Analysis for RNA-Seq Data. This is explained in detail in “RNA-Seq workflow: gene-level exploratory analysis and differential expression”. The matrix of raw counts is input to the DESeq2 rlog function and the resulting transformed matrix is used…

95% confidence intervals for PCA in DESEQ2

Dear Help, we have used your package DESeq2 (including the vst transform) on some RNA-seq data in order to perform PCA. We were hoping to add Monte-Carlo noise to the data in order…

Plink2 --keep Removing All Samples

I am trying to include the --keep and --remove options in my plink2 command. I am finding that despite my files for these options having identical text to the main file’s IDs, all the samples are removed. Command: plink2 --bfile pca_final --keep EUR.sample --remove…

Linked supergenes underlie split sex ratio and social organization in an ant

Significance Some social insects exhibit split sex ratios, wherein a subset of colonies produce future queens and others produce males. This phenomenon spawned many influential theoretical studies and empirical tests, both of which have advanced our understanding of parent–offspring conflicts and the maintenance of cooperative breeding. However, previous studies assumed…

Metagenomic Sequencing Analysis for Acne Using Machine Learning Methods Adapted to Single or Multiple Data

The human health status can be assessed by the means of research and analysis of the human microbiome. Acne is a common skin disease whose morbidity increases year by year. The lipids which influence acne to a large extent are studied by metagenomic methods in recent years. In this paper,…

Python Scikit learn PCA plt.bar

Can anyone help explain the items in this code, which plots and checks the variance of each component: var = np.round(pca.explained_variance_ratio_*100, decimals = 1) lbls = [str(x) for x in range(1,len(var)+1)] plt.bar(x=range(1,len(var)+1), height = var, tick_label = lbls) The original article about Implementation of Principal Component Analysis(PCA) in K…
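To unpack the three lines in the question, the sketch below reproduces the first two with plain Python lists, so that each value handed to `plt.bar` can be inspected. The function name `scree_bar_inputs` and the example ratios are made up for illustration; in the original code the ratios come from a fitted scikit-learn `PCA` object.

```python
def scree_bar_inputs(explained_variance_ratio):
    """Return (tick labels, bar heights) for a scree plot (hypothetical helper)."""
    # Each ratio is the fraction of total variance captured by one component;
    # multiplying by 100 and rounding to 1 decimal gives a percentage.
    var = [round(r * 100, 1) for r in explained_variance_ratio]
    # Components are numbered from 1, so the tick labels are '1', '2', ...
    lbls = [str(x) for x in range(1, len(var) + 1)]
    return lbls, var

lbls, var = scree_bar_inputs([0.5, 0.3, 0.2])
# With matplotlib available, the plotting call from the question would be:
# plt.bar(x=range(1, len(var) + 1), height=var, tick_label=lbls)
```

In the `plt.bar` call, `x` gives the bar positions (1 to the number of components), `height` gives the percentage of variance for each bar, and `tick_label` prints the component number under each bar.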

Archaeogenetic analysis of Neolithic sheep from Anatolia suggests a complex demographic history since domestication

We analyzed DNA from 180 archaeological sheep bone and tooth samples from late Pleistocene and early Holocene Anatolia, originating from six different sites from central and west Anatolia and spanning the Epipaleolithic/Pre-Pottery Neolithic (n = 7) and early to late Pottery Neolithic (n = 173) periods (Fig. 1 and Supplementary Data 1). We generated genome-wide ancient…

VCF file generation from multiple samples for PCA

I am trying to generate a VCF file for 80 samples (human) and use it for PCA. But when trying to get eigenvectors using plink, it says the genotyping rate is 0.12, and when I remove SNPs with a missing-data threshold all data…

Comment: how to improve PCA visualization using ggplot?

Thanks, the legend worked. The PC1 and PC2 axes didn’t! It says: Error in "PC1 (" + percentage[1] : non-numeric argument to binary operator. This is how my percentage looks: > percentage [1] " (36.61%)" " (33.7%)" " (19.75%)" " (18.03%)" " (-0.1%)" I can only adjust the `hjust=0.4`…

Answer: how to improve PCA visualization using ggplot?

I like to use `labs` instead of `xlab` and `ylab` and add the information to the data.frame directly. Do you only have 4 data points, and are you certain that the names correspond to the correct PC values? If so, do: percentage <- round((eigenval/(sum(eigenval))*100), 2) percentage <- as.matrix(percentage)…

how to improve PCA visualization using ggplot?

I have used this piece of script to draw the PCA based on eigenvalues and eigenvectors: percentage <- round((eigenval/(sum(eigenval))*100), 2) percentage <- as.matrix(percentage) percentage <- paste0(names(percentage), " (", percentage, "%)") Names <- c ("mn27hd", "mdkk987", "mnsdnu83", "sjednu83", "bjeo972s") pop.colour <- c("blue", "red", "green", "orange", "brown") ggplot(eigenvec, aes(x=PC1, y=PC2, colour=pop.colour, label=Names))…
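The percentage step in this script turns eigenvalues into axis labels. The sketch below shows the same arithmetic in Python, with the component name added to each label explicitly rather than taken from the names of the vector. The function name `pc_axis_labels` and the eigenvalues are hypothetical, chosen only to illustrate the calculation.

```python
def pc_axis_labels(eigenvalues):
    """Build labels such as 'PC1 (40.0%)' from eigenvalues (hypothetical helper)."""
    total = sum(eigenvalues)
    labels = []
    for i, value in enumerate(eigenvalues, start=1):
        percentage = round(value / total * 100, 2)   # share of total variance
        labels.append(f"PC{i} ({percentage}%)")      # prefix the component name
    return labels

labels = pc_axis_labels([4.0, 3.0, 2.0, 1.0])
x_label, y_label = labels[0], labels[1]   # labels for the PC1 and PC2 axes
```

Building each label as one complete string means the plotting code only has to pass `x_label` and `y_label` through, without concatenating text inside the plot call.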

Assessment of CircRNA Expression Profiles and Potential Functions in Brown Adipogenesis

doi: 10.3389/fgene.2021.769690. eCollection 2021. Affiliations: 1 Department of Biotechnology, College of Life Sciences, Xinyang Normal University, Xinyang, China. 2 Institute of Animal Science and Veterinary Medicine, Hainan Academy of Agricultural Sciences, Haikou, China. 3 Institute for Conservation and Utilization of Agro-Bioresources in Dabie Mountains, Xinyang Normal University, Xinyang,…

Problem with vcf file columns

Hello. I’m having trouble with a VCF file I just generated with Stacks. The thing is that the column of the first sample (the first individual in my VCF file), instead of having the information about the genotype, the depth and other things, it…

Weird WGCNA plot

Dear Seniors and all members, I am wondering whether anyone has done weighted gene co-expression network analysis (WGCNA) from bulk RNA-seq before. I followed the WGCNA tutorial using my data and the module detection outputs are here. However, once I visualized the modules in clustering, I got…

Population stratification with PCA

Hi all! I have a genotype dataset in plink format. Now I want to correct for population structure with PCA in association analysis. I split my dataset into training and testing datasets. I want to do the PCA only in the training dataset and use…
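The general idea of fitting PCA on the training samples only and then projecting held-out samples can be sketched as follows. This is a generic NumPy illustration, not a plink procedure: the helper names `fit_pca` and `project` and the random toy genotypes are assumptions, and real genotype PCA usually also standardizes each variant by allele frequency, which is omitted here.

```python
import numpy as np

def fit_pca(train, n_components=2):
    """Learn the centring vector and PC axes from training samples only."""
    mean = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(samples, mean, components):
    """Project samples onto PC axes, centring with the TRAINING mean."""
    return (samples - mean) @ components.T

# Toy 0/1/2 allele counts: 30 samples x 12 variants (random, for illustration only).
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(30, 12)).astype(float)
train, test = genotypes[:20], genotypes[20:]

mean, components = fit_pca(train)
train_pcs = project(train, mean, components)   # covariates for the training set
test_pcs = project(test, mean, components)     # same axes, no refit on test data
```

Because the test samples never influence `mean` or `components`, the two sets of PCs live in the same coordinate system and no information leaks from the test set into the training set.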

Microarrays analysis for differential gene expression by R

Do you want to become a professional in bioinformatics with R? If you don’t know how to apply R to data problems or what it can be used for, this course will be beneficial for you. R is an emerging part of bioinformatics. There are many sources to…

DNA methylation batch effect remove

Hi, I’m studying DNA methylation. I drew a heatmap with beta-values, but there are batch effects. How can I remove these cgIDs with RStudio? Reply: Can you explain what you mean by ‘beta-value’? Generally batch effect…

analyzing spatial transcriptome data (Part 2)

Identification of spatially variable features: Seurat provides two workflows to identify molecular features related to spatial location within a tissue. The first is differential expression based on pre-labeled anatomical regions in the tissue, which can be determined by unsupervised clustering or prior knowledge…

Discovery and construction of prognostic model for clear cell renal cell carcinoma based on single-cell and bulk transcriptome analysis

Originally published in Transl Androl Urol. 2021 Sep;10(9):3540-3554. doi: 10.21037/tau-21-581. Abstract. Background: Clear cell renal cell carcinoma (ccRCC) is the most common malignant kidney tumor in adults. Single-cell transcriptome sequencing can provide accurate gene expression data of individual cells. Integrated single-cell and bulk transcriptome data from ccRCC…

Covariate correction which data take for downstream analysis?

Hi, I am really bad at stats, so I am really sorry if this question is inappropriate or too stupid (also, I wasn’t sure if this was the right forum… if not, apologies again!). A collaborator asked me to correct for age and sex (using linear regression) in our bulk-RNAseq dataset (6 human…

Bioconductor – courses and conferences

Online – flexible short courses. This course is the ideal introduction to English garden history. It provides an overview of five centuries of development, from baroque…

Gene expression (RNA-seq) clustering

Unsupervised class discovery is a data mining method to identify unknown possible groups (clusters) of items based solely on intrinsic features and no external variables. Basically, clustering includes four steps: (1) data preparation and feature selection, (2) dissimilarity matrix calculation, (3) applying clustering algorithms, (4) assessing cluster assignment. I use…

How to generate a 2D PCA plot from bulk RNA-seq data (log2 CPM) using the PCAtools?

Hi all, I have bulk RNA-seq data with 12 samples – WT (x4), ‘A’ KO (x4), and ‘B’ KO (x4). I want to generate a 2D PCA plot (biplot) like the figure below to…

Bioconductor – Bioconductor 3.14 Released

October 27, 2021. Bioconductors: We are pleased to announce Bioconductor 3.14, consisting of 2083 software packages, 408 experiment data packages, 904 annotation packages, 29 workflows and 8 books. There are 89 new software packages, 13 new data experiment packages, 10 new annotation packages, 1 new workflow,…

X-shaped PCA plot

What does an X-shaped PCA plot indicate? (I performed PCA based on gene expression data. Each dot indicates a gene expression profile, and the colors reflect the cell lines.) I have never seen PCA plots like this before. Please someone help me! I really…

Distance matrix PCA

Hi all, I generated PCA values for the 1000genomes dataset using PLINK. I know how to plot the values for PC1 and PC2, but my question is: how can I generate a distance matrix to select nearby samples based on populations? For example, if I…
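One plausible reading of this question is a pairwise Euclidean distance matrix computed on the PC coordinates, from which the nearest samples can be looked up. The sketch below assumes that reading; the sample IDs, coordinates, and helper names `pc_distance_matrix` and `nearest` are invented for illustration, and in practice the coordinates would be read from the PLINK eigenvector output.

```python
import math

def pc_distance_matrix(pcs):
    """Pairwise Euclidean distances between samples in PC space (hypothetical helper)."""
    ids = list(pcs)
    return {a: {b: math.dist(pcs[a], pcs[b]) for b in ids} for a in ids}

def nearest(dist, sample, k=1):
    """Return the k samples closest to `sample`, excluding itself."""
    others = sorted((d, s) for s, d in dist[sample].items() if s != sample)
    return [s for _, s in others[:k]]

# Made-up (PC1, PC2) coordinates keyed by sample ID.
pcs = {"s1": (0.0, 0.0), "s2": (3.0, 4.0), "s3": (0.5, 0.0)}
dist = pc_distance_matrix(pcs)
closest_to_s1 = nearest(dist, "s1")
```

More than two PCs can be used by extending each coordinate tuple; whether to weight the PCs by their eigenvalues before computing distances is a separate design choice that this sketch leaves out.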

General question about clustering in scRNAseq

I have recently finished an online scRNAseq course. I was a complete beginner in the field and I really enjoyed the course and have learnt a lot. Now that I have an overview of single cell, I have a flood of maybe dumb questions that escaped me during the course….

How to normalize miRNAs from RNA-seq data?

Hi, I have miRNA expression data from RNA-seq. I have performed differential gene expression and machine learning methods to find a signature to differentiate between patients and controls. With the training cohort, the results look great, but with the validation cohort,…

Bioinformatics Prediction and Analysis of MicroRNAs and Their Targets as Biomarkers for Prostate Cancer: A Preliminary Study

Originally published in Mol Biotechnol. 2021 Oct 19. doi: 10.1007/s12033-021-00414-8. Online ahead of print. Abstract: Prostate cancer (PCa) is the second most common form of cancer in men around the world. Due to its heterogeneity, presentations range from aggressive lethal disease to indolent disease. There is a…

Population structure, biogeography and transmissibility of Mycobacterium tuberculosis

Detailed population structure of L1–4 and a hierarchical sub-lineage naming system We assembled a high-quality data set of whole genomes, antibiotic resistance phenotypes, and geographic sites of isolation for 9584 clinical Mtb samples (“Methods” section and Supplementary Data 1). Of the total, 4939 (52%) were pan-susceptible, i.e., susceptible to at least…

A TME-Related Signature as a Biomarker in Liver Cancer

Introduction As one of the most frequent causes of cancer deaths across the globe, liver cancer, characterized by high mortality, recurrence, metastasis and poor prognosis, is the only one of the top five deadliest cancers to have an annual percentage increase in occurrence.1 Surgery, local destructive therapies, and liver transplantation…

The Profile and Function of Gut Microbiota in Diabetic Nephropathy

Introduction Diabetic nephropathy (DN) is characterized by kidney function loss caused by diabetes mellitus.1 Almost one-third of patients with diabetes have DN, and the prevalence of DN is increasing worldwide.2 DN is one of the most important factors of chronic kidney disease and end-stage renal disease (ESRD). The signs and…

Distant residues modulate conformational opening in SARS-CoV-2 spike protein

Correlation between RBD Opening and Backbone Dihedral Angle. Multiple unbiased trajectories were propagated from different regions of the S-protein RBD opening conformational space, priorly explored by SMD and umbrella sampling (details in SI Appendix). Three of those trajectories were assigned as the closed, partially open, and fully open states, based…

Differential enrichment of H3K9me3 at annotated satellite DNA repeats in human cell lines and during fetal development in mouse

The removal of problematic regions The removal of problematic genomic regions is considered essential for the accurate analysis of data obtained by chromatin immunoprecipitation followed by genome sequencing (ChIP-Seq) [27, 35]. Repetitive regions including satellite DNA arrays comprise a majority of such problematic regions, mainly because they reside in the…

Why do we log transform and scale data prior to PCA for scRNA-seq analysis

Hi, I have a query as to why it is necessary to both log transform, and center and scale, scRNA-seq data prior to performing PCA. I thought the purpose of both of these steps was…
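To make the two steps in the question concrete, the sketch below applies a log transform and then per-gene centering and scaling to a toy cells-by-genes matrix. The helper name `log_and_scale` and the counts are assumptions for illustration; real pipelines normally also normalise by library size first, which is left out here.

```python
import numpy as np

def log_and_scale(counts):
    """log1p-transform, then z-score each gene (column) across cells (hypothetical helper)."""
    logged = np.log1p(counts)                # compresses the dynamic range of the counts
    mean = logged.mean(axis=0)
    sd = logged.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0                        # leave constant genes at zero instead of dividing by 0
    return (logged - mean) / sd              # every gene now has mean 0 and unit variance

# Toy matrix: 4 cells x 2 genes; gene 2 is far more highly expressed than gene 1.
counts = np.array([[0, 100], [1, 400], [3, 900], [0, 1600]], dtype=float)
raw_variance = counts.var(axis=0, ddof=1)    # dominated by the highly expressed gene
scaled = log_and_scale(counts)
```

On the raw counts, the highly expressed gene has by far the larger variance and would dominate the first principal component. The log step reduces that gap, and the scaling step removes it, so each gene contributes on an equal footing.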

High counts of rRNA in RNA-Seq

Hi there, I have recently performed RNA-Seq on the total RNA of a mosquito tissue, where I have three biological replicates of the tissue at three different time points. The pipeline I used was HISAT2 -> featureCounts -> DESeq2. Looking at the normalized counts (output of DESeq2), the counts of…

How to get SNPs (variant calling) from .gff or .align file?

I have masked several genomes, each belonging to a separate population, against a TE (transposon) library, and I got the GFF file and the alignment files of the masked regions as outputs. I want to run a…

The difference between merge and integration with Seurat objects

Hi everyone, I have two questions: When can we use merge for Seurat objects? When can we use integration for Seurat objects? I have two datasets from different experiments. Each dataset has cancer samples and healthy samples. To do clustering,…

Batch correction in DESeq2

For RNA-seq data analysis using DESeq2, a recommended method for batch effect removal is to introduce the batch in the design of the experiment as design = ~ batch + condition. The presence of batch was already known from the experiment design and also detected by…
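What the formula ~ batch + condition encodes can be illustrated with a small design matrix. The sketch below is only an analogy: DESeq2 fits a negative binomial GLM to counts, whereas this toy example uses ordinary least squares on made-up, noise-free log-scale values, and the sample layout is an assumption.

```python
import numpy as np

# Hypothetical layout: 8 samples, 2 batches crossed with 2 conditions.
batch = np.array([0, 0, 1, 1, 0, 0, 1, 1])
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Columns: intercept, batch indicator, condition indicator (i.e. ~ batch + condition).
design = np.column_stack([np.ones(8), batch, condition])

# Toy log-scale expression for one gene: baseline 5, batch shift 2, condition effect 1.
y = 5.0 + 2.0 * batch + 1.0 * condition

# Least-squares fit; the batch column absorbs the batch shift, so the
# condition coefficient reflects the condition effect alone.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, batch_effect, condition_effect = coef
```

The point of putting batch in the design rather than subtracting it beforehand is that the model estimates the condition effect while accounting for batch, and the data themselves are left unmodified.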

ANGPTL8/betatrophin improves glucose tolerance | DMSO

Introduction Insulin resistance is a major risk factor for metabolic syndrome (MetS), including type 2 diabetes mellitus (T2DM).1 T2DM is characterized by insulin resistance in the liver.2 Insulin resistance is an abnormal physiological state that occurs when cells are unable to use insulin effectively, leading to T2DM, a major health…

Can we merge two VCF files

Can we merge two VCF files, an RNAseq VCF and a whole-genome sequencing (WGS) VCF, to do PCA? I’m quite new to this variant calling and analysis area. We have around 30 samples of RNAseq data of tumor and normal samples. I performed variant calling and obtained…

mtDNA microevolution in Southern Chile’s archipelagos

Abstract/Review The genetic variability of four predominantly Indian populations of southern Chile’s archipelagos was examined by determining the frequencies of four mitochondrial DNA haplogroups that characterize the American Indian populations. Over 90% of the individuals analyzed presented Native American mtDNA haplogroups. By means of an unweighted group pair method with…

Is it advisable to input a count matrix that consists of reads aligned using different algorithms (HT-Seq and Salmon)?

Hello! First of all, thank you for the great package and the excellent documentation that supports it, much appreciated! Sadly, I could not find an answer to my problem, so I wanted to ask here. I have two different bulk RNA-seq datasets, one obtained from TCGA using the TCGAbiolinks package,…

command not found, in IMPUTE2

Edit June 7, 2020: The code below is for phased imputation using the output of SHAPEIT2 and ultimate production of phased VCFs. For the initial pre-phasing process with SHAPEIT2, see my answer here: Phasing with SHAPEIT So, the steps are usually: pre-phasing into pre-existing haplotypes available from HERE ( Phasing…

Genomestudio2 output to use in Tassel5?

Hi, I have genotyping data in GenomeStudio 2. I want to export the data for use in Tassel5 for PCA, LD, and GWAS analysis. The problem is, when I load the data into Tassel5, it is read in as numerical data,…

Comparative cellular analysis of motor cortex in human, marmoset and mouse

Statistics and reproducibility For multiplex fluorescent in situ hybridization (FISH) and immunofluorescence staining experiments, each ISH probe combination was repeated with similar results on at least two separate individuals per species, and on at least two sections per individual. The experiments were not randomized and the investigators were not blinded…

Continue Reading Comparative cellular analysis of motor cortex in human, marmoset and mouse

Analyzing gene expression in different RNAseq datasets

Hello! I really need some assistance here. I came up with an analysis of my own that makes sense to me, but I'm really new to this (started studying bioinformatics on my own during the pandemic) and I’m not sure if I…

Continue Reading Analyzing gene expression in different RNAseq datasets

Up-to-date RNA-Seq Analysis Training/Courses/Papers (Dec 2017)

Hi all, I am a PhD student with a biology background. I recently inherited an RNA sequencing project from another PhD student in my lab. We already have paired-end RNA-Seq data generated on an Illumina HiSeq but haven’t started the analysis yet. I have basic Linux command line training but have no idea about how…

Continue Reading Up-to-date RNA-Seq Analysis Training/Courses/Papers (Dec 2017)

Single cell RNA sequencing (scRNA-seq) in cardiac tissue

Introduction Cardiovascular diseases (CVDs) are the leading cause of death globally, taking an estimated 17.9 million (32.1%) lives in 2015, up from 12.3 million (25.8%) in 1990.1,2 CVDs are highly heterogeneous diseases involving a group of disorders of the heart and blood vessels, which include cardiomyopathy, hypertensive heart disease, heart…

Continue Reading Single cell RNA sequencing (scRNA-seq) in cardiac tissue

Chromosome-level genome assemblies of five Prunus species and genome-wide association studies for key agronomic traits in peach

Genome assembly In this study, we de novo assembled the plum, Prunus mira, and Prunus davidiana genomes for the first time and improved the peach and apricot genomes by integrating single-molecule real-time (SMRT) long-read sequencing (PacBio), short high-quality Illumina paired-end sequencing, and Hi-C technology. First, we used SMRT reads (99−130 Gb,…

Continue Reading Chromosome-level genome assemblies of five Prunus species and genome-wide association studies for key agronomic traits in peach

Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)

NB – Update July 29, 2020 – this thread will no longer be watched and, for all intents and purposes, will now be archived NB – Version 2 of tutorial can be found here and should be used going forward –> Produce PCA bi-plot for 1000 Genomes Phase III –…

Continue Reading Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)

visualisation of proteomics data using r and bioconductor

R package version 0.99.7, 2014. [9] have developed an approach to estimate peptide isoelectric point values using a Support Vector Machine classifier from the caret package [10]. [73] De Duve, C., Beaufay, H., A Short History of Tissue Fractionation. The readers are referred to the respective documentation and vignettes for…

Continue Reading visualisation of proteomics data using r and bioconductor

PCA result and batch effect?

Hello, I am processing a data frame that consists of about 55,000 genes (TPM values, no access to raw data) and 400 samples. After removing the zero-variance genes, I am performing a PCA on the samples to try to detect outliers. I have noticed that there are…

Continue Reading PCA result and batch effect?

Automatic Outlier Detection for RNA-seq data

Hi all, So I am looking for an automated approach to detect outliers in RNA-seq data. I usually looked at a PCA plot and decided visually. Now I would like to automate this. So I have been looking at the PcaHubert() function in…

Continue Reading Automatic Outlier Detection for RNA-seq data
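
As a rough illustration of what automating the "look at the PCA plot" step could mean, here is a minimal sketch that flags samples whose score on one principal component sits far from the rest, using a median/MAD robust z-score. This is not the PcaHubert() method mentioned in the excerpt. The sample names, the PC1 scores, and the 3.5 cutoff are all hypothetical values chosen for the example.

```python
from statistics import median


def robust_z_scores(values):
    """Return MAD-based robust z-scores for a list of numbers."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined for this input")
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # when the data are roughly normal.
    return [(v - med) / (1.4826 * mad) for v in values]


def flag_outliers(scores_by_sample, threshold=3.5):
    """Return the sample names whose |robust z-score| exceeds the threshold."""
    names = list(scores_by_sample)
    z = robust_z_scores([scores_by_sample[name] for name in names])
    return [name for name, zi in zip(names, z) if abs(zi) > threshold]


# Hypothetical PC1 scores for six samples (assumed, not taken from any dataset).
pc1 = {"s1": -1.2, "s2": 0.4, "s3": 0.9, "s4": -0.3, "s5": 0.1, "s6": 14.0}
outliers = flag_outliers(pc1)
print(outliers)
```

In practice one would probably apply the same check to several leading PCs and inspect any flagged sample before dropping it, since a sample that is extreme on one component may reflect real biology rather than a technical problem.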

Principal Component Analysis Pca Using Python Scikit Learn.htmlprincipal Component Analysis With Scikit Learn Kaggle.htmlcara Mendapatkan Bni M Secure

Result for: Principal Component Analysis Pca Using Python Scikit Learn.htmlprincipal Component Analysis With Scikit Learn Kaggle.htmlcara Mendapatkan Bni M Secure Principle Component Analysis (PCA) with Scikit-Learn – Python Fri, 24 Sep 2021…

Continue Reading Principal Component Analysis Pca Using Python Scikit Learn.htmlprincipal Component Analysis With Scikit Learn Kaggle.htmlcara Mendapatkan Bni M Secure

How to perform DE analysis on a data set in which biological replicates have high variance?

Hello guys! I would really appreciate it if someone could help me with DE analysis. This is my challenge: I have four conditions and I have five biological replicates for each condition. Performing a…

Continue Reading How to perform DE analysis on a data set in which biological replicates have high variance?

Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Organoid culture of small intestinal cells and lentiviral transduction C57BL/6J mice and BALB/cAnu/nu immune-deficient nude mice were purchased from CLEA Japan (Tokyo, Japan). The small intestine was harvested from wild-type male C57BL/6J mice at 3–5 weeks of age (Additional file 1: Figure S9A). Crypts were purified and dissociated into single cells,…

Continue Reading Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Bioconductor – Harman

DOI: 10.18129/B9.bioc.Harman This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see Harman. The removal of batch effects from datasets using a PCA and constrained optimisation based technique Bioconductor version: 3.12 Harman is a PCA and constrained optimisation based technique that maximises the…

Continue Reading Bioconductor – Harman

Selection of DNA Aptamers Recognizing EpCAM-Positive Prostate Cancer b

Introduction Prostate cancer (PCa) is one of the most common malignant tumors of the genitourinary system in men worldwide. In Asian countries, many patients with PCa are diagnosed at an advanced stage, perhaps mostly because of a large population base with relatively backward economic development and imperfect cancer screening system of…

Continue Reading Selection of DNA Aptamers Recognizing EpCAM-Positive Prostate Cancer b

Phylogeographic reconstruction of the marbled crayfish origin

Procambarus fallax collections and PCR genotyping Animals were collected from various wild populations (Table S1) in compliance with state and local regulations (Georgia department of natural resources scientific collection permit 115621108, state of Florida collection permits S-19-10 and S-20-04). DNA was isolated from abdominal muscle tissue using SDS-based extraction and precipitation…

Continue Reading Phylogeographic reconstruction of the marbled crayfish origin

rna seq batch effects and cross species study

Hi, I am doing an expression analysis of an RNA-seq dataset and have some doubts to clarify. Can we do an analysis by choosing controls and test samples from two different datasets? One of the RNA-seq datasets is human (test is…

Continue Reading rna seq batch effects and cross species study

Classifiers for predicting coronary artery disease

Introduction Coronary artery disease (CAD) is a complex pathology associated with behavioral and environmental factors.1–3 CAD shows high prevalence and is associated with a high fatality rate among cardiovascular diseases. The main manifestations of CAD are stable or unstable angina pectoris and identifiable or unrecognized myocardial infarction.4 The main risk…

Continue Reading Classifiers for predicting coronary artery disease

Genetic basis and adaptation trajectory of soybean from its temperate origin to tropics

Resequencing of soybean accessions from low latitudes To investigate the genomic basis for the natural variation in soybean adaptation to low latitudes, we conducted whole-genome resequencing of a panel of 329 soybean accessions collected from 15 countries and covering all soybean subgroups, of which 165 accessions are from low-latitude…

Continue Reading Genetic basis and adaptation trajectory of soybean from its temperate origin to tropics

Phasing with SHAPEIT

Edit June 7, 2020: The code below is for pre-phasing with SHAPEIT2. For phased imputation using the output of SHAPEIT2 and ultimate production of phased VCFs, see my answer here: A: ERROR: You must specify a valid interval for imputation using the -int argument, So, the steps are usually: pre-phasing…

Continue Reading Phasing with SHAPEIT

Produce PCA bi-plot for 1000 Genomes Phase III

Note1 – Previous version: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old) Note2 – this data is for hg19 / GRCh37 Note3 – GRCh38 data is available HERE The tutorial has been updated based on the 1000 Genomes Phase III imputed genotypes. The original tutorial was…

Continue Reading Produce PCA bi-plot for 1000 Genomes Phase III

Biomolecular insights into North African-related ancestry, mobility and diet in eleventh-century Al-Andalus

Uniparental genetic background of the Segorbe Giant We confirmed that the individual was genetically male (RY > 0.077; Supplementary Fig. S3), and both his uniparental markers point towards North African origins (Supplementary Table S2). He belongs to mtDNA haplogroup U6a1a1a (nomenclature according to Hernández et al.28). Although U6 in general, and U6a in…

Continue Reading Biomolecular insights into North African-related ancestry, mobility and diet in eleventh-century Al-Andalus

WNN in Seurat

Dear all, I am trying to follow the WNN vignette here satijalab.org/seurat/articles/weighted_nearest_neighbor_analysis.html After the steps below, I would like to annotate my clusters, hence I need to know the markers which best represent each cluster. pbmc <- FindMultiModalNeighbors(pbmc, reduction.list = list("pca", "lsi"), dims.list = list(1:50, 2:50)) pbmc <- RunUMAP(pbmc, nn.name

Continue Reading WNN in Seurat

Gene Expression Analysis Reveals Key Genes and Signalings Associated with the Prognosis of Prostate Cancer

This article was originally published here Comput Math Methods Med. 2021 Aug 28;2021:9946015. doi: 10.1155/2021/9946015. eCollection 2021. ABSTRACT It is urgent to identify novel biomarkers for prostate cancer (PCa) prognosis and to understand the mechanisms regulating the tumorigenesis for PCa treatment. In this study, GSE17951 and TCGA were used to…

Continue Reading Gene Expression Analysis Reveals Key Genes and Signalings Associated with the Prognosis of Prostate Cancer

Network analysis using distinct centrality measurements

Hello everyone, I have a question regarding network analysis. Currently, I’m conducting this kind of analysis using data retrieved from STRING db to obtain highly central nodes. I have re-constructed the network in R using the igraph package to subsequently proceed with…

Continue Reading Network analysis using distinct centrality measurements

Normalization and differential analysis in ATAC-seq data

Hello everyone! I would like to know if anyone has experience with normalization and differential analysis on ATAC-seq data. After using MACS2 for the peak calling, how can we use DESeq2 or edgeR on these data? Has anyone tried this? What is the…

Continue Reading Normalization and differential analysis in ATAC-seq data

Weird PCA plot based on WGCNA results

Dear all, I used WGCNA to find the associated gene modules with the different subtypes of a given cancer as trait. I obtained multiple modules associated with some of the cancer subtypes as shown in . Next, I plotted PCA to visualize…

Continue Reading Weird PCA plot based on WGCNA results

Ancestral polymorphisms shape the adaptive radiation of Metrosideros across the Hawaiian Islands

Significance Some of the most spectacular adaptive radiations of plants and animals occur on remote oceanic islands, yet such radiations are preceded by founding events that severely limit genetic variation. How genetically depauperate founder populations give rise to the spectacular phenotypic and ecological diversity characteristic of island adaptive radiations is…

Continue Reading Ancestral polymorphisms shape the adaptive radiation of Metrosideros across the Hawaiian Islands

Bioconductor – conclus

DOI: 10.18129/B9.bioc.conclus ScRNA-seq Workflow CONCLUS – From CONsensus CLUSters To A Meaningful CONCLUSion Bioconductor version: Release (3.13) CONCLUS is a tool for robust clustering and positive marker features selection of single-cell RNA-seq (sc-RNA-seq) datasets. It takes advantage of a consensus clustering approach that greatly simplifies sc-RNA-seq data analysis…

Continue Reading Bioconductor – conclus

Use of GenotypeGVCFs in population genetic studies

I have 16 whole-genome sequenced samples from two populations (8 for each population). My goal is the detection of signatures of selection and introgression. I performed read cleaning, mapping to the reference, and duplicate marking. SNP calling was performed using HaplotypeCaller in GATK for…

Continue Reading Use of GenotypeGVCFs in population genetic studies

ENHANCED GRAVITROPISM 2 encodes a STERILE ALPHA MOTIF–containing protein that controls root growth angle in barley and wheat

Significance To date, the potential of utilizing root traits in plant breeding remains largely untapped. In this study, we cloned and characterized the ENHANCED GRAVITROPISM2 (EGT2) gene of barley that encodes a STERILE ALPHA MOTIF domain–containing protein. We demonstrated that EGT2 is a key gene of root growth…

Continue Reading ENHANCED GRAVITROPISM 2 encodes a STERILE ALPHA MOTIF–containing protein that controls root growth angle in barley and wheat

when explained variation per PC is too low while running PCA with SNP data

I ran PCA with 91 samples (consisting of 23 breeds and one outgroup, which is a different subspecies). About 18,000,000 SNPs were used when running PCA, but the variation explained was too low, which was about 5% for…

Continue Reading when explained variation per PC is too low while running PCA with SNP data
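
For context on what "variation explained" means here, the sketch below computes the share of variance per PC from a list of eigenvalues and compares it with the share each PC would get if variance were spread evenly. It assumes centered data with many more SNPs than samples, in which case n samples yield at most n − 1 components with non-zero variance. The eigenvalues are made up for illustration; only the sample count of 91 comes from the excerpt.

```python
def explained_variance_ratio(eigenvalues):
    """Return each eigenvalue as a fraction of the total variance."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [ev / total for ev in eigenvalues]


def uniform_baseline(n_samples):
    """Share per PC if variance were spread evenly over the n - 1 possible PCs."""
    if n_samples < 2:
        raise ValueError("need at least two samples")
    return 1.0 / (n_samples - 1)


# Hypothetical eigenvalues for 91 samples: three larger axes, then a flat tail.
eigenvalues = [5.0, 3.0, 2.0] + [1.0] * 87
ratios = explained_variance_ratio(eigenvalues)
baseline = uniform_baseline(91)

print(f"PC1 explains {ratios[0]:.1%}; even-spread baseline is {baseline:.1%}")
```

Read this way, a first PC at about 5% is several times the even-spread baseline of roughly 1.1% for 91 samples, so a low absolute percentage does not by itself mean the PCA failed.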

Differential Gene Expression

Can you analyze in GEO2R? => No, because this is RNA-seq and not microarrays. You are lucky, though, that the authors seem to provide raw counts, so you can easily feed them into DESeq2. Here is a code suggestion; for details please read the DESeq2 vignette extensively, it contains answers…

Continue Reading Differential Gene Expression