Tag: fastQC

Weird over-represented sequence in sn/scRNASeq fastqc

Dear fellows, I have weird over-represented sequences in my sn/scRNA-seq samples, found in the FastQC report as shown in the attached photo. Could these sequences be the source of the ambient RNA warning? Any idea how to remove them? Thank you…

Help me understand the Nanopore fastqc results

Hi, I have got my first Nanopore sequencing data, and the first step was to check whether the data is good. Does anyone have experience with this kind of data and can tell me how to interpret the results? The whole…

Strange Per base sequence content of fastqc

Hi all! I downloaded the fastq.gz files of GSE162708 from ENA, which has only 2 files per sample (scRNA-seq usually has 3 files: I1, R1 and R2). Then I ran fastp as follows. Then I got the QC report, but I can’t understand why the Per base sequence content of…

Running fastqc

I am trying to run FastQC from my home directory using the command line. I do not have access to the root directory. I have downloaded the FastQC zip file and unzipped it. Inside the FastQC directory, the following are available: Configuration, LICENSE_JHDF5.txt, cisd-jhdf5.jar, net…

FastQC per base sequence content

I’m running FastQC on some paired-end FASTQ files. I get a warning on per-base sequence content, as the first 5 to 6 bases show significant bias towards T and G, as shown below. I was wondering what the sequence in the first 5 or…

Linkage mapping, comparative genome analysis, and QTL detection for growth in a non-model teleost, the meagre Argyrosomus regius, using ddRAD sequencing


Sequence Duplication Levels failed FastQC Report

Hi all, I’m checking the quality of my RNA-Seq data with FastQC, and all my FASTQ files failed on “Per base sequence content” and “Sequence Duplication Levels”, with a warning on “Overrepresented sequences” only for the read 1 files (it’s paired-end; the sequences match between samples). Below is…

High-Throughput Transcriptome Analysis for Investigating Host-Pathogen Interactions

The protocol presented here describes a complete pipeline to analyze RNA-sequencing transcriptome data from raw reads to functional analysis, from quality control and preprocessing steps to advanced statistical approaches. Welcome to the protocol of high-throughput transcriptome analysis for investigating host-pathogen interactions. This protocol is divided into the following steps…

Per base sequence quality – fastqc

Hi everyone, I am new to bioinformatics and have a very basic question. I have paired-end FASTQ data and ran FastQC; in the per base sequence quality plot, a few reads are in the red region, and there is no adapter and…

MultiQC not working correctly – dataset-collection

@stealsh I’ve seen this before, too, for about the last 4-5 months, since collections changed a bit in the 21.09 release. These two tools don’t work in series the way they used to. Details: The problem comes from the way the data is organized and where the sample names…

nf-core/circrna

circRNA quantification, differential expression analysis and miRNA target prediction of RNA-Seq data. Introduction: nf-core/circrna is a best-practice analysis pipeline for the quantification, miRNA target prediction and differential expression analysis of circular RNAs in paired-end RNA sequencing data. The pipeline is built using Nextflow, a workflow tool to run tasks across…

FastQC for paired end data

Hi, I have 36 FASTQ files of paired-end RNA-seq, so I was wondering if anyone knows how to run FastQC on paired-end data, and how it differs from running FastQC on single-end data? I have done this with single-end data before but…

Genomic analysis on Galaxy using Azure CycleCloud

Cloud computing and digital transformation have been powerful enablers for genomics. Genomics is expected to be an exabase-scale big data domain by 2025, posing data acquisition and storage challenges on par with other major generators of big data. Embracing digital transformation offers a practically limitless ability to meet the genomic…


python – Packages Not Found Error: Not available from current channel- Bioconda

Using a Mac with an M1 chip, I’m trying to install the following Bioconda packages: cutadapt, trim-galore, samtools, bedtools, htseq, bowtie2, deeptools, macs2. I’ve been able to install picard and fastqc with no issues, but all the others produce one of two error messages: PackagesNotFoundError: The following packages are not available from current channels: or Found conflicts! Looking…

Cell Strain-Derived Induced Pluripotent Stem Cells as an Isogenic Approach To Investigate Age-Related Host Response to Flaviviral Infection

INTRODUCTION Dengue is the most common mosquito-borne viral disease globally (1). This acute disease, which can be life-threatening, is caused by four different dengue viruses (DENVs) (DENV-1, DENV-2, DENV-3, and DENV-4). An estimated 390 million people are infected with these DENVs annually (2), and populations throughout the tropics face frequent…


identify and remove adapter sequence

Hi all, I am trying to identify the adapter sequences in my ATAC-seq data. To do this, I ran the FASTQ file through FastQC, hoping the sequence would be picked up and shown in the report. In the report, there…

FastQC analysis

Hi, could anyone please tell me whether I can run a FastQC analysis on data available online rather than on my own system? The sequence data that I want to check is huge, so I was wondering if there is any way to access it online through FastQC?…

ChaoXianSen/TrimGalore – Giters

Trim Galore is a wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data. Installation: Trim Galore is a Perl wrapper around two tools: Cutadapt and FastQC. To use, ensure that these two pieces of software are available…

python – Missing input files after defining them in function

I am trying to do QC on RNA-seq data that is tarballed. I am using Snakemake as a workflow manager and am aware that Snakemake does not like one-to-many rules. I thought defining a checkpoint would fix the problem, but when I run the script I get this error message…

Index of /readarchive/Miseq/2014_05_21_run_miseq/Adapter_trimmed_nextera_samples/QC_after_trimming/Geo33_S19.R1.trimmed.paired_fastqc/Images

Name                             Last modified       Size   Description
Parent Directory                                     –
duplication_levels.png           24-May-2014 17:58   17K
kmer_profiles.png                24-May-2014 17:58   433K
per_base_gc_content.png          24-May-2014 17:58   48K
per_base_n_content.png           24-May-2014 17:58   26K
per_base_quality.png             24-May-2014 17:58   32K
per_base_sequence_content.png    24-May-2014 17:58   96K
per_sequence_gc_content.png      24-May-2014 17:58   25K
per_sequence_quality.png         24-May-2014 17:58   19K
…

Average Read length

Hello everyone! Is there a standard tool commonly used to calculate the average read length of FASTQ files? If so, please mention it here, because I want to know the average read length of my FASTQ files so that I can decide the cutoff for…

Different FastQC results after name-sorting BAM file, sequence duplication increases

Okay, so what I did might have been stupid, but I was determined to examine a lot of things on my own and experiment a bit with tools. At one point I decided to do this: I had a BAM file…

Total sequences – FASTQC report

Hello! Can someone please explain to me in simple terms what exactly total sequences are? I am practicing some bioinformatics problems, and in my FastQC report I get this: Total Sequences 85988702, Sequence length 76. Does that mean that I have one target region of interest with…

Find right adapter sequence for trimming

Hello everyone, I have just started working on RNA-seq analysis. I am trying to clean single-end read data according to the FastQC results. The results looked like the example for SRR309133. I tried to find the adapter among the Illumina Adapter Sequences. But after trimming, the result…

16s rRNA Sequencing Meta-analysis Reconstruction Tool (using mothur).

16SMaRT is a bioinformatics analysis pipeline for 16s rRNA gene sequencing data. 16SMaRT is a “one-click” solution towards performing microbial community analysis of amplicon sequencing data. 16SMaRT aims to be your go-to solution for your next microbiome/metagenomics project. The primary objective of 16SMaRT analysis is to determine what genes are…


get rRNA FASTA file for a particular bacteria

Hey all, I was trying to find a way to get all rRNA (5S, 16S and 23S) FASTA sequences for a particular bacterium (B. thetaiotaomicron VPI-5482, which is the type strain). I wanted this file so that I could use something…

Trimming DNAStringSet

Hello, I am currently dealing with the problem of reading in a FASTQ file with “readDNAStringSet”, trimming the sequences and then writing them to a new FASTQ file. Reading the FASTQ file with “readDNAStringSet” works just fine. I am then trying to trim a fixed length…

Scripts for BGC analysis in large MAGs and results of their application to soil metagenomes within Chernevaya Taiga RSF-funded project

This repository includes scripts for the analysis of biosynthetic gene clusters (BGCs) in large metagenome assemblies. All scripts were created within the Chernevaya Taiga project funded by the Russian Science Foundation (grant 19-16-00049). The repository also contains the results of applying the scripts to four hybrid (Illumina + ONT) assemblies of various…

Single-cell DNA and RNA sequencing reveals the dynamics of intra-tumor heterogeneity in a colorectal cancer model | BMC Biology

Organoid culture of small intestinal cells and lentiviral transduction C57BL/6J mice and BALB/cAnu/nu immune-deficient nude mice were purchased from CLEA Japan (Tokyo, Japan). The small intestine was harvested from wild-type male C57BL/6J mice at 3–5 weeks of age (Additional file 1: Figure S9A). Crypts were purified and dissociated into single cells,…


How can I get PHRED score?

Hi all. I am trying to get the assembly stats (Table S1) according to the following paper about de novo assembly: www.ncbi.nlm.nih.gov/pmc/articles/PMC7266049/ In the table, there is an item “Mean read PHRED score after filtering and trimming”. How can I get this? Is there…

The sardine run in southeastern Africa is a mass migration into an ecological trap

INTRODUCTION Large-scale annual migrations occur in an extraordinary range of animals, from insects to the great whales. While the driving mechanisms of these migrations are varied and sometimes poorly understood, they often represent a way of optimizing conditions for breeding and adult fitness when these are in conflict. Often, populations…


I downloaded fastq files from a repository and tried to run fastqc, how can the average sequence length be only 8 bp?

I downloaded sequencing files from 2 patients from here: www.ebi.ac.uk/ena/browser/view/PRJNA588461?show=reads There is one FASTQ file each for the forward (1) and reverse (2) reads. I wanted to look…

Should I expect high duplicate read frequency in my scRNA fastq’s?

Hi there, I’m new to scRNA-seq data and I am looking at a MultiQC report for some of my samples. The screenshot below shows the sequence counts for my FASTQ files. My question is, should I expect such a…

Fastqc user manual – vodosp.ru

FASTQ format – Wikipedia 06 September 2021 – by TC Collin · 2020 · Cited by 3 — Be accompanied by a step-by-step user-friendly manual, If the user performs FastQC prior to the removal of adapters (step 3), the length Both programs can be used on Linux/MacOS X machines and quite…


Weird 8bp peak in (clean) small RNA data

Hi everyone, I’ve received some small RNA sequencing data from a collaborator, and after removing adapters and trimming (with TrimGalore) I have small peaks at ~20 and 30 bp, as I was expecting, plus a huge peak at 8 bp that…

FASTQC not showing adapters—cutadapt sanity check—

Hello, a newbie here. I am reanalyzing the data from an article (GSE83931) for training purposes. I have two concerns/questions. 1- I ran FastQC on the sequences, followed by MultiQC. When I look at the reports individually, they don’t show any adapter sequence (please see pic1). (The authors reported that they used Trimmomatic to…

Gene mutation analysis in papillary thyroid carcinoma

Introduction: Thyroid tumors are the most common malignant tumors of the endocrine system, and their incidence has been increasing in recent decades. Currently, there are some targeted drugs that can effectively treat PTC, and next-generation sequencing (NGS) can be used for targeted therapy. In order to make better informed…

How to view fastqc output file in html format which is on remote server

I currently have a few FastQC HTML output files on a remote server that I ssh into. I want to view them in my browser. How can I do this?…

An epigenetic basis of inbreeding depression in maize

INTRODUCTION Charles R. Darwin documented inbreeding depression as growth disadvantages from self-fertilization compared to outcrossing in many plants (1). Prevailing hypotheses suggest that inbreeding depression results from the exposure of deleterious recessive alleles and/or loss of overdominant alleles due to increased homozygosity (2, 3) or reduced recombination frequency in some…


Global phylogenomic analyses of Mycobacterium abscessus provide context for non cystic fibrosis infections and the evolution of antibiotic resistance


Any program to check resources used by Bioinfo Tools

Hello everyone, I have a bash script that runs FastQC and Trimmomatic. I want to know how many resources each of these tools uses. Is there any bash command or tool that helps with this? If…

removing adaptor

I have two RNA-seq datasets from two different sequencing core facilities; the FASTQ files they provide are already demultiplexed and adapter-trimmed. This is the adapter content following FastQC -> MultiQC. According to MultiQC, all files passed the QC for adapter content. So I…

Pybedtools error sans

20-08-2021 pysam – Error when I install samtools for Python on Windows – I am trying to install the pysam and pybedtools modules for Python and got an error: […] conda install pysam bedtools hisat2 [snip]. However,…

Trimming Nextera adapter from scRNA paired reads with different length

Hi everyone, I have two FASTQ files (R1 and R2), where R1 is 50 bp (cDNA) and R2 is 17 bp (BC+UMI). I would like to trim the Nextera adapters with Trim Galore and keep only reads that are >35 bp…

Is it normal for RCorrector to remove millions of reads?

I’m trying to build de novo transcriptomes for unsequenced plants to do sequence analysis. I’m trying to choose a tool for my first pass of quality filtering after running FastQC on my raw reads. I’ve tried AfterQC and RCorrector…

How to interpret bimodal distribution of GC-content for RNAseq and can it be remedied ?

A colleague of mine got the following distribution of GC content for RNA-seq. How should a bimodal distribution of GC content for RNA-seq be interpreted? Does it indicate contamination? Is there any method to…