Tag: fq.gz

python – Snakemake wrappers suddenly stopped working

I have these wrappers in my Snakemake file: rule fastqc: input: "reads/{sample}_trimmed.fq.gz" output: html="qc/fastqc/{sample}.html", zip="qc/fastqc/{sample}_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file params: extra = "--quiet" log: "logs/fastqc/{sample}.log" threads: config["resources"]["fastqc"]["cpu"] conda: "envs/qc.yaml" wrapper: "v1.31.1/bio/fastqc" qc.yaml: name: qc channels: - bioconda dependencies: - python - fastqc…

Cutadapt error: too many parameters.

Hi Biostars community! I am having trouble looping cutadapt over gunzipped samples. This is the script I am using: #!/bin/bash #SBATCH --account GRINFISH #SBATCH -c 8 #SBATCH --mem 96g #SBATCH --output logfile.out #SBATCH --error logfile.err # This script performs trimming for PE sequences…

Trimmomatic run error

Hello, I have a problem when running: input_dir="$HOME/workdir/group" output_dir="$HOME/workdir/group/fqdata_trimmed" adap="$CONDA_PREFIX/share/trimmomatic-0.39-1/adapters" f1="$HOME/workdir/group/P4.R1.fq.gz" f2="$HOME/workdir/group/P4.R2.fq.gz" newf1="$HOME/workdir/group/P4.R1.pe.trim.fq.gz" newf2="$HOME/workdir/group/P4.R2.pe.trim.fq.gz" newf1u="$HOME/workdir/group/P4.R1.se.trim.fq.gz" newf2u="$HOME/workdir/group/P4.R2.se.trim.fq.gz" mismatch_values=(1 2 3 4 5) for mismatch_value in "${mismatch_values[@]}" do trimmomatic PE -threads 1 -phred33 -trimlog trimLogFile -summary statsSummaryFile \ $f1 $f2 $newf1 $newf1U $newf2 $newf2U \ ILLUMINACLIP:$adap/TruSeq3-PE-2.fa:${mismatch_value}:30:10:1 \ SLIDINGWINDOW:4:15…

No differentially expressed genes after multiple testing correction in mice

Hi all, I am working with RNA-seq data from mice (group A N=3 vs group B N=3). The mice are littermates; group A overexpresses a human transgene, which I verified. I had .cram files from mouse…

High number of duplicates and low percentage properly paired

I have some paired-end sequencing data that I have trimmed using cutadapt. It was sequenced on an Illumina NovaSeq 6000 and is low-coverage RADseq data (2-3x). My cutadapt script used forward and reverse adapters from Illumina: cutadapt…

How to split a fastq file to multiples fastq files

Dear all, I have a fastq.gz file that has more than 100 million reads. My aim is to divide this fastq file into three separate fastq files, ensuring that all reads from the original fastq file are distributed and…
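
One way to make sure every read lands in exactly one of the three files is to hand out the 4-line records in round-robin order. The following is only a sketch under assumptions: the function names and the `.partN.fq.gz` output naming are hypothetical, and the input is assumed to be a well-formed FASTQ with exactly four lines per record.

```python
import gzip
from itertools import islice


def read_records(handle):
    """Yield FASTQ records as tuples of four lines (assumes 4-line records)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def split_fastq_round_robin(in_path, out_prefix, n_parts=3):
    """Write record i to part (i mod n_parts); return output paths and read counts."""
    out_paths = [f"{out_prefix}.part{i + 1}.fq.gz" for i in range(n_parts)]
    outs = [gzip.open(path, "wt") for path in out_paths]
    counts = [0] * n_parts
    try:
        with gzip.open(in_path, "rt") as handle:
            for i, record in enumerate(read_records(handle)):
                part = i % n_parts
                outs[part].writelines(record)
                counts[part] += 1
    finally:
        for out in outs:
            out.close()
    return out_paths, counts
```

Because records are streamed one at a time, memory use stays small even for a file with more than 100 million reads. The per-part counts should sum to the number of reads in the input.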

Error, fewer reads in file specified with -1 than in file specified with -2

Hi all, this is my first time attempting to align sequences to a reference index. I am using bowtie2 with the -1 and -2 arguments and got the following error message: Error, fewer…
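
The error message says the two mate files do not hold the same number of reads, so a quick first check is to count the records in each file. This is a minimal sketch, assuming 4-line FASTQ records; the function names are made up for illustration and are not part of bowtie2.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a gzipped or plain FASTQ file, assuming 4 lines per read."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def mates_match(path_1, path_2):
    """Return (same_count, n1, n2) for the files given to -1 and -2."""
    n1 = count_fastq_reads(path_1)
    n2 = count_fastq_reads(path_2)
    return n1 == n2, n1, n2
```

A line count that is not a multiple of four usually points to a truncated file, which is worth ruling out before looking at the trimming step.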

Snakemake workflow for trimmomatic

Hello everybody! I'm a novice in Snakemake. I want to create a workflow for Illumina data analysis. I'm currently writing a trimmomatic rule and I'm facing an issue. This is the code: SAMPLES = ["1G_S15", "7G_S13"] rule trimmomatic_pe: input: adaptaters ="Illumina/adaptaters/TruSeq2-PE.fa", forward = expand("HHV8/fastq_raw_/fastq_H8/{sample}_R1.fastq.gz", sample…

Correct script for featurecounts in Rsubread

I am new to R and RStudio but have been trying to work through different examples using Rsubread for my data. I have tried reading vignettes and manuals prior to posting here but I am stuck and could really use some advice. I have 7 paired-end, fastq files from Illumina…

Hard clip fastq

I hope this is not a silly question. I have 2x 200bp fastqs generated from an MGI G400 sequencer. I would like to do a comparison with Illumina, but these only come as 2x 150bp fastqs. Is it possible to hard clip the 2x 200bp fastqs down…
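
Hard clipping amounts to truncating the sequence line and the quality line of every record to the same length. Here is a rough sketch, assuming gzipped 4-line FASTQ input and a target length of 150 (taken from the Illumina read length mentioned above); the function names are hypothetical.

```python
import gzip
from itertools import islice


def hard_clip_record(record, max_len=150):
    """Truncate sequence and quality of one 4-line FASTQ record to max_len."""
    header, seq, plus, qual = (line.rstrip("\n") for line in record)
    if len(seq) != len(qual):
        raise ValueError(f"sequence/quality length mismatch in {header}")
    return (header + "\n", seq[:max_len] + "\n", plus + "\n", qual[:max_len] + "\n")


def hard_clip_fastq(in_path, out_path, max_len=150):
    """Stream a gzipped FASTQ file and write the clipped records to out_path."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        while True:
            record = list(islice(src, 4))
            if not record:
                break
            dst.writelines(hard_clip_record(record, max_len))
```

Running the same function on the R1 and R2 files keeps the pairs in sync, since no reads are dropped or reordered.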

Nextflow memory issues custom config -c

Hi all, I am trying to run nextflow on my laptop: nextflow run nf-core/rnaseq \ --input samplesheet.csv \ --genome mm10 \ -profile docker I am having issues with memory: Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQC_UMITOOLS_TRIMGALORE:FASTQC (KO_3)' Caused by: Process requirement exceed available memory --…

How to select or subset process outputs in Nextflow DSL2?

I have a DSL2 Nextflow workflow. I would like to use just the outputs named "paired.fq.gz" (index 0 and 2 in the tuple) in downstream processes. Is there a way to filter or select a subset of the…

Tool To Find Out If Fastq Is In Sanger Or Phred64 Encoding?

Is there a simple tool I can use to quickly find out if a FASTQ file is in Sanger or Phred64 encoding? Ideally something that tells me 'Encoding XX' somewhere in the terminal output.
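
The usual trick is to look at the range of ASCII codes in the quality lines: Phred+33 (Sanger) qualities start at `!` (code 33), while Phred+64 qualities start no lower than `;` (code 59, Solexa) or `@` (code 64, Illumina 1.3+). The sketch below applies that heuristic; the function names, the sampling limit, and the cut-off of 74 (`J`, the usual Phred+33 ceiling for Illumina 1.8+ data) are assumptions for illustration, and the result is a guess, not a proof.

```python
import gzip


def sample_quality_lines(path, max_reads=10000):
    """Yield the quality line (every 4th line) of the first max_reads records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 3:
                yield line


def guess_phred_offset(quality_lines):
    """Return 33, 64, or None (ambiguous) based on the ASCII range seen."""
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' cannot occur in any +64 encoding.
        return 33
    if highest > 74:
        # Nothing below ';' and values above 'J': most likely a +64 encoding.
        return 64
    return None
```

For example, `guess_phred_offset(sample_quality_lines("reads.fq.gz"))` would inspect the first 10,000 reads of a hypothetical file. Data with only high-quality bases can fall in the overlapping range, in which case the function returns `None` rather than guessing.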

10x 3′ library creates R1 and R2 fastq files with the same read length

Let me show you an example: trace.ncbi.nlm.nih.gov/Traces/index.html?view=run_browser&acc=SRR16093385&display=metadata This data contains two reads, R1 and R2. The read lengths of R1 and R2 are the same, 150bp. However, this experiment was performed following the 10x 3' library protocol. The methods section describes it as follows: The scRNA-seq libraries were generated using the…

trimmomatic on scRNA seq data

Hello, I'm struggling with an scRNA pipeline. I downloaded data from the 10x Genomics database: support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_1k_v3 When I checked the size of the files I found this: -rw-r--r-- 1 5062 5000 753851810 Nov 2 2018 pbmc_1k_v3_S1_L001_R1_001.fastq.gz -rw-r--r-- 1 5062 5000 1772725195 Nov…

Trimmomatic generated two (reverse-forward) paired-files with different number of reads

Hi all, in the RNA-seq analysis workflow on Linux, Trimmomatic generates 4 output files: forward-paired.fq.gz, reverse-paired.fq.gz, and the 2 unpaired files. As I read in several threads, Trimmomatic is expected to: Remove the adapters and the low-quality reads. Generate…

Can’t add read group correctly to minimap2 sam alignmnet

Hello, I am running minimap2 in a pipeline with GATK that needs read group data (@RG) with sample information. minimap2 -ax sr -t 20 -I 100G -R @RG\\tID:A00253_251_HTN2JDSXY.2\\tPL:ILLUMINA\tLB:LB1\\tSM:TA90 ref.mmi reads_1.fq.gz reads_2.fq.gz | samtools view -bh -F 260 -T ref.fa >out.bam…

snakemake wildcard in shell

I'm a newbie at Snakemake. I'm trying to implement the GATK FastqToSam in a rule. I have the following, and it works if I hard-code the sample name into the shell part, but I wanted to get the sample name from the config file. I…

forcing read error correction using SPAdes

Given that this is my code below, why is SPAdes giving me the following message? Mode: ONLY assembling (without read error correction) Debug mode is turned OFF I would like the assembly to complete the read error correction step if possible. Based…

XenoCell fq.gz output files

Hi, I followed the XenoCell tutorial (see below) and extracted the graft barcodes using the hgmm_5k_v3 example dataset. I got 3 output files (cellular_barcodes.txt, fq_barcode.fq.gz, and fq_transcript.fq.gz) under the graft folder. Does anyone know how to convert these files and feed them into 10x genomic…

STAR is running but .sam file size does not increase after hours mapping

Hi there, I'm using STAR with a small genome. My samples are paired. The commands are: For genome indexes STAR --runThreadN 20 --runMode genomeGenerate --genomeDir /path/to/folder/Analyses/STAR/ --genomeFastaFiles /path/to/genome_reference/genome.fna --readFilesCommand zcat path/to/folder/with/giz_samples/R1.fq.gz R2.fq.gz --sjdbGTFfile path/to/genome_reference/genome.gff --genomeSAindexNbases 11…

RNAseq for DE purpose

Hi all, I am totally new to bioinformatic analysis. I am working on a project that looks at DGE among different time treatments. There is also no reference genome (meaning that I need a de novo assembly step). So far, after struggling and navigating…

Randomize Read Order In Multigbp Fastq File?

Is there any method to randomize the read order in a multi-Gbp fastq file? Assuming you are talking about a single-end file, you can use awk to put each 4-line fastq entry on a single line. You then…
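
The key idea in the answer is to treat each 4-line entry as one unit before shuffling, so that headers, sequences, and qualities never get separated. The sketch below shows the same idea in Python instead of awk; it is a swapped-in illustration, not the answer's own command, the function name is hypothetical, and it holds every record in memory, so a multi-Gbp file would need a disk-based approach.

```python
import random


def shuffle_fastq_lines(lines, seed=None):
    """Shuffle 4-line FASTQ records as units and return the reordered lines."""
    lines = list(lines)
    if len(lines) % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    # Group the flat list of lines into records of four lines each.
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    random.Random(seed).shuffle(records)
    return [line for record in records for line in record]
```

Passing a fixed `seed` makes the order reproducible, which also allows the two files of a paired-end run to be shuffled identically.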

Error 134 while aligning using hisat2

Hello, I am using the command below to align the reads and get a bam file: hisat2 -x /hisat/grch38/genome -1 /fastq/output_forward_paired.fq.gz -2 /fastq/output_reverse_paired.fq.gz | samtools sort -o /bams/outout.bam This ran perfectly well on the last try; however, on the new try I got…

Error in trimmomatic

Hi! Trust you are well. I am trying to run this program but I get the following error, and I don't know how to fix it or understand it. Could you help me please? java -jar trimmomatic-0.36.jar PE -phred33 white_replicate1.R1paired.fq.gz white_replicate1.R2paired.fq.gz white_replicate1.R1paired.fq.gz white_replicate1.R1unpaired.fq.gz white_replicate1.R2paired.fq.gz white_replicate1.R2unpaired.fq.gz ILLUMINACLIP:/mnt/g/poolseq_tutorial_1/poolseq_tutorial/adapters/TruSeq3- PE.fa:2:20:10:1:true…

cutadapt installed via conda igzip error for some fastq files

Only very recently (~2 weeks ago), cutadapt installed via conda has the following error: This is cutadapt 3.2 with Python 3.8.6 Command line parameters: -j 4 -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC in2438_3_CKDL210000739-2a-AK5142-AK6697_HVHF2DSXY_L2_1.fq.gz Processing reads on 4 cores in single-end mode … [———>8 ] 00:00:26 5,536,084 reads @…

htslib/c what is the correct way to use bgzf_thread_pool ?

I try to split fastq files into 'N' chunks using a simple C program and htslib-C. It works fine: ./split2file -o TMP S1.R1.fq.gz S1.R2.fq.gz -n 10 but when I use a thread pool (as far as I…

Getting information on CRAM files from headers inside the files

Hello. I wish to know if one can find the following information in CRAM files' headers: 1) Whether or not sequencing data in CRAM files is from WGS or WES, and if so, where? and 2) In case one…

Converting Bam file to Fasta (Zipped)

I would like to convert .bam files to fq.gz (zipped fasta files) for paired reads. bedtools bamtofastq seems to be a commonly recommended method; I have also seen samtools fastq as a possible alternative. bedtools bamtofastq -i inputfile.bam -fq outputR1.fq -fq2 outputR2.fq samtools…

error when fastp filters data

When filtering with fastp, the error "sequence and quality have different length" appears: fastp -i CK-2_R1.fq.gz -o CK-2_R1.clean.fq.gz -I CK-2_R2.fq.gz -O CK-2_R2.clean.fq.gz After filtering the data for a while, the output is no longer updated. [pengliang@fat01 CK-2]$ ERROR: sequence and quality have different length: WARNNIG: different read numbers of the 4852 packRead1…

BBmap bbduk.sh for filtering reads

I’m looking to filter reads that contain a stretch of A’s, I found these posts looking for polyA tails, meaning this should work all the same (Identify RNA-seq reads containing polyA sequence, Identifying RNA-seq reads containing polyA stretch). However, I cannot get it to work. Given just these two reads,…

Hisat2 – stringtie – deseq2 pipeline for bulk RNA seq

Official software websites: HISAT2: Manual | HISAT2. StringTie: StringTie. Article: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown | Nature Protocols. Recommended step-by-step tutorials: 1. RNA-seq: Hisat2+Stringtie+DESeq2 – Hengnuo Xinzhi 2. RNA-seq analysis using hisat2, stringtie, DESeq2 – Jianshu. Basic usage…

Detailed differences between sambamba and samtools

In March, my first post in the new student group, "Do false-positive mutations appear because duplicate marking is insufficient?", described the problem that supplementary reads are not marked as duplicates by GATK MarkDuplicates. Afterwards, in response to this question, I began…

The low successful assignment ratio of FeatureCounts

Hello, I would like to confirm if the low assignment ratio (54%) is normal, and please check the possible reason I found. I used Hisat2 to assign paired-end strand-specific transcriptomic sequences (rRNA removed) to a reference genome. Because I filtered out the unmapped sequences in advance, the overall assignment ratio…

Trimmomatic/ linux system

Hi all, I am trying to remove adapters and clean my RNA-seq .gz files using Trimmomatic, loaded on a Linux system (supercomputer server), following the steps for paired-end reads explained in the manual (www.usadellab.org/cms/?page=trimmomatic): java -jar trimmomatic-0.39.jar PE input_forward.fq.gz input_reverse.fq.gz output_forward_paired.fq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3…

Fastp file merge append | Develop Paper

Interpretation of the fastq file format: www.jianshu.com/p/39115d21ee17 Sometimes the sequencing results for a species come back as two sets of paired-end fastq files: r1.fq.gz l1.fq.gz and r2.fq.gz l2.fq.gz. The content of the sequencing data is actually one piece, but it was divided into two parts during transmission. When we use it, we are used to merging it into a paired-end…

Snakemake using multi inputs – Stackify

You need to define target output files using rule all.

    SAMPLES = ['1', '2', '3', '4']

    rule all:
        input:
            expand("sample{sample}.R{read_no}.fq.gz.out", sample=SAMPLES, read_no=['1', '2'])

    rule fastp:
        input:
            reads1="sample{sample}.R1.fq.gz",
            reads2="sample{sample}.R2.fq.gz"
        output:
            reads1out="sample{sample}.R1.fq.gz.out",
            reads2out="sample{sample}.R2.fq.gz.out"
        shell:
            "fastp -i {input.reads1} -I {input.reads2} -o {output.reads1out} -O {output.reads2out}"

This is the output of command snakemake -np, with…
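
To see which target files rule all ends up requesting, it can help to reproduce the combination step in plain Python. The sketch below mimics the effect of the `expand(...)` call above with `itertools.product`; `expand_targets` is a hypothetical stand-in written for illustration, not Snakemake's own implementation.

```python
from itertools import product


def expand_targets(pattern, **wildcards):
    """Fill the pattern with every combination of the given wildcard values."""
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


SAMPLES = ["1", "2", "3", "4"]
targets = expand_targets(
    "sample{sample}.R{read_no}.fq.gz.out",
    sample=SAMPLES,
    read_no=["1", "2"],
)
for target in targets:
    print(target)
```

Four samples times two read numbers gives eight target files, and each one matches an output pattern of rule fastp, which is how Snakemake works out the jobs it has to run.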

bwa , 2 files fastq to 1 sam

I have this problem, please help me. I'm running this on macOS Catalina. I am creating a sam file from 2 fastq files using bwa. I apply the following command: bwa mem -t 2 GRCh38.primary_assembly.genome.fa.gz V350019555_L03_B5GHUMqcnrRAABA-556_1.fq.gz V350019555_L03_B5GHUMqcnrRAABA-556_2.fq.gz > V350019555_L03_B5GHUMqcnrRAABA-556.sam…

Secret BBMAP helper page – HRGV/Marmics_Metagenomics Wiki

How to map to the assembled scaffolds.fasta: bbmap is a powerful and highly flexible read mapper (jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbmap-guide/). For the upcoming analysis you are not interested in the typical mapping output but in statistics on the coverage of every scaffold; you can get them with scaffstats. We want to be specific…

Trimmomatic parameters

$java -jar /apps/eb/Trimmomatic/0.39-Java-1.8.0_144/trimmomatic-0.39.jar PE -phred33 seq1_L2_1.fq.gz seq1_L2_2.fq.gz _L2_r1_paired_fq.gz seq1_L2_r1_unpaired.fq.gz seq_L2_r2_paired.fq.gz Seq1_L2_r2_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:5 ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:5

Using STAR SJ.out.tab file to identify novel ncRNAs

Hi All, I am attempting to identify novel ncRNAs from a circadian RNAseq dataset. Specifically I have a ribo-depleted RNAseq timecourse with 31 samples (sample every 2 hours for 60hrs). I have run STAR (code below). I am trying to follow…

Mapping multiples

Hi, I am coming to you for help. I am mapping short- and long-read files with BWA and MINIMAP2. My problem is that I want to write an if condition that would allow me to choose either BWA if I work with short…

STAR+RSEM pippline without gtf

Dear all, I have a question. I mapped reads to CDS sequences with STAR. I don't have a GTF file and want to calculate read counts using RSEM, but I am stuck on the error "RSEM error: RSEM currently does not support gapped alignments" as I don't have…

BBMerge / Tadpole error correction

I’ve been using BBMerge recently to address a very specific problem: I am sequencing pooled short DNA molecules (< 400bps) using paired end reads (average length ~ 230 bps post trimming) Each molecule can be assumed to be different (i.e. contains sequence differences – substitutions & indels – with respect…

How to pass custom software specific variables to nf-core/sarek nextflow pipeline?

I'm attempting to call whole-genome variants using the nf-core/sarek nextflow pipeline. In the QC step there is an option that invokes trim_galore quality trimming, but I don't know how to pass my custom adapters to be cut as well…

STAR align multiple files

Hi everybody, I am aligning 36 PE samples using STAR. To make the task a little easier, I wrote a bash loop to align them all with the same command. Here is my loop: for i in $(ls raw_data); do STAR --genomeDir index.150…

Biostar Systems

Comment: STAR vs Novoalign IGV Browser visualization by chasem: That is good to know that it isn't just my set of reads… still concerning, though. Comment: STAR vs Novoalign IGV Browser visualization by chasem: I was not expecting this, not sure what to make of it…

question about running CIRI-full

I'm using CIRI-full to calculate the full-length sequence of circRNAs, and I can run the test data set successfully, but I can't run it on my own data. Running the test data set: java -jar ../CIRI-full.jar Pipeline -1 test_1.fq.gz -2 test_2.fq.gz -a test_anno.gtf -r test_ref.fa -d test_output/…

I am converting the fq.gz. files (which are the results of the mgi study) to bam files to view on igv.

Hey everyone, before I start, apologies for any inconvenience caused by my wrong or inappropriate use of terms. I have had some failures with bwa mem lately. As I…
