Category: fastq

Anyone could help me with cutadapt?

I’m trying to trim my primers off Illumina sequences. I’ve amplified with Diat_rbcL_F1 AGGTGAAGTAAAAGGTTCWTACTTAAA, Diat_rbcL_F2 AGGTGAAGTTAAAGGTTCWTAYTTAAA and Diat_rbcL_F3 AGGTGAAACTAAAGGTTCWTACTTAAA as forward primers, and Diat_rbcL_R1 5’CCTTCTAATTTACCWACWACTG 3’ (Reverse Complement: 3’CAGTWGTWGGTAAATTAGAAGG 5’) and Diat_rbcL_R2 5’CCTTCTAATTTACCWACAACAG 3’ (Reverse Complement: 3’CTGTTGTWGGTAAATTAGAAGG 5’) as reverse primers. First, I used PEAR to assemble paired reads…
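
The primers above contain IUPAC ambiguity codes (W, Y), so a plain A/C/G/T complement table is not enough to check the quoted reverse complements. The following is a minimal Python sketch, not part of the original post, that recomputes them; the function name `reverse_complement` is hypothetical.

```python
# IUPAC nucleotide codes and their complements (S, W and N are self-complementary).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq):
    """Return the reverse complement of a primer, keeping IUPAC ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


# Reverse primers as quoted in the post, written 5'->3'.
reverse_primers = {
    "Diat_rbcL_R1": "CCTTCTAATTTACCWACWACTG",
    "Diat_rbcL_R2": "CCTTCTAATTTACCWACAACAG",
}

for name, primer in reverse_primers.items():
    print(name, reverse_complement(primer))
```

In reads merged from both mates, the reverse primer would be expected to appear in its reverse-complemented form near the 3' end, which is why the reverse complement is the sequence usually searched for there.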

EBI European Nucleotide Archive (ERA) aspera access broken

I’m trying to download FASTQ files from the ENA via aspera. FTP still works. ascp -QT -P33001 -l 200m -i /home/me/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR663/009/SRR6639099/SRR6639099_1.fastq.gz ./ As of sometime last week I am constantly getting: Session Stop (Error: failed to authenticate) ascp: failed to…

Stacks process_radtags question and scripts

How can I modify the quality score to retain more reads in Stacks process_radtags? I have paired-end reads from an Illumina NextSeq 500. This is my script: process_radtags -1 MAGICR1_T.fastq.gz -2 MAGICR2_T.fastq.gz -b Barcode2.txt -o /home/mdominguez/demult_MAGIC_b --renz_1 sphI --renz_2 sau3AI -E phred33 --barcode_dist_2 3…

how to map Pacbio CCS fastq

I have a PacBio CCS fastq like this. I want to map it to a genome, and this is my command and output. I want to know how to solve it. Is this fastq correct? Thanks. It might pay to…

Empty output from UMItools dedup after fastp processing

Hello, I have been running into an interesting error during my sequencing analysis. We have a paired-end library with UMIs on both ends of the fragments, 6 nt each, for a total of 12 nt. Now, I have been using fastp for preprocessing…

I’m not sure my Bulk RNAseq read counts extracted from fastq file are correct

Hi. I’m new to bioinformatics and I’m trying to extract read counts from fastq files. I used the STAR alignment method with GENCODE annotation files. (I didn’t trim my reads because I heard that trimming is…

use files with same name as input? : bioinformatics

Hi everyone, I have a question regarding the input of several fastq files into the ‘CellRanger count’ pipeline. I performed scRNA-seq of different samples at a partner institute and the sequencing facility started by sequencing all the samples at a lower depth (to test the quality of the libraries) and only then…

Does hisat2 --rg flag eat the “/” character in multiline definitions?

Hello, I am trying to use hisat2, but I noticed something weird. When running it like so: hisat2 -p 8 --rg-id=UHR_Rep2 --rg SM:UHR --rg LB:UHR_Rep2_ERCC-Mix1 --rg PL:ILLUMINA --rg PU:CXX1234-TGACAC.1 -x $RNA_REF_INDEX --dta --rna-strandness RF -1 "$RNA_DATA_DIR/${SAMPLE}_1.fastq.gz" -2 "$RNA_DATA_DIR/${SAMPLE}_2.fastq.gz"…

How to perform sequence quality filtering of raw reads for single cell RNA seq?

I have a few single-cell RNA-seq samples (raw fastq reads). When I process raw reads, what should I do first? Is it demultiplexing? If so, what is the best tool I can use? Can I…

FASTQ to VCF pipeline question

Hello all, I am new to programming within bioinformatics and, long story short, I’m practicing writing pipeline scripts, starting with the fastq to VCF pipeline. I am basically at the point where I went from fastq to sorted BAM files, and as I went to…

use files with same name as input

Hello, I have a question regarding the input of several fastq files into the ‘CellRanger count’ pipeline. I performed scRNA-seq of different samples at a partner institute and the sequencing facility started by sequencing all the samples at a…

Identifying RNA-seq reads containing polyA stretch

I need to filter our RNA-seq data for reads that contain a polyA stretch (e.g. with more than 6 A’s). I need to recover those reads, not discard them. The data is paired-end and stranded. So, when dealing with paired reads in 2 files they should always…
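
A minimal Python sketch of the idea described here: keep (rather than discard) read pairs in which a mate contains a run of more than 6 A's, and always keep or drop both mates together so the two files stay in sync. It assumes records have already been parsed into (header, sequence, quality) tuples; the function names are hypothetical, and only runs of A are checked. Whether poly-T on the opposite mate should also count depends on the library's strandedness and is not covered.

```python
import re

# A run of at least 7 A's, i.e. "more than 6 A's" as in the post.
POLYA = re.compile(r"A{7,}")


def has_polya(seq, pattern=POLYA):
    """Return True if the sequence contains a poly-A run matching the pattern."""
    return pattern.search(seq.upper()) is not None


def keep_polya_pairs(pairs, pattern=POLYA):
    """Yield (read1, read2) record pairs when either mate has a poly-A run.

    Each record is a (header, sequence, quality) tuple. Both mates are kept
    or dropped together, so the paired output stays synchronized.
    """
    for read1, read2 in pairs:
        if has_polya(read1[1], pattern) or has_polya(read2[1], pattern):
            yield read1, read2
```

Passing a different compiled pattern (for example one with a longer minimum run) changes the threshold without touching the pairing logic.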

trimming fastq files

I have the fastq files of the data I want to trim. In the FastQC report I saw that Nextera adapters were present in my sample. I saw a few tutorials, and they required copying the sequences to the current directory, so this was the command I…

EDGE-pro paired end read input

Hi, I am running EDGE-pro for prokaryotic RNA seq analysis for differential gene expression. ccb.jhu.edu/software/EDGE-pro/ I have paired end read data. The manual states ( ccb.jhu.edu/software/EDGE-pro/MANUAL ) // *MANDATORY FILES: -g genome: fasta file containing bacterial genome. If multiple chromosomes/plasmids exist, they must be combined into one file before running…

[Solved] tatements are true about the commands used in mothur during the process of generating contigs, filtering and merging the contigs and countin…

Which of the following statements are true about the commands used… in mothur during the process of generating contigs, filtering and merging the contigs and counting contigs? Question 17 options: The make.contigs command will extract the sequence and quality score data from your fastq files, create the reverse complement of the…

Generate histogram Via BBmap tool

Hello there, I am trying to generate a histogram for paired-end reads with the bbmap tool. However, I am getting this output: root@cemb-ls1-pgs:/mnt/cemb/S017679/raw# /home/fizzah/bbmap/reformat.sh BU21R1.fastq qchist=qchist.txt java -ea -Xms300m -cp /home/fizzah/bbmap/current/ jgi.ReformatReads BU21R1.fastq qchist=qchist.txt Executing jgi.ReformatReads [BU21R1.fastq, qchist=qchist.txt] Set qcount histogram output to qchist.txt…

Mapping SRA files without unpacking?

Hi, Is it possible to submit SRA files for alignment directly to short read aligners like Bowtie, avoiding the intermediate step of unpacking data to fastq? Thanks! As of version 2.3.5 bowtie2 now supports aligning SRA reads….

reformat.sh and stats.sh program in BBmap

Hello there, I am working with RNA-seq data and have paired-end reads. While running the reformat.sh command I am getting this output: Set INTERLEAVED to false Exception in thread "main" java.lang.AssertionError: File 19213R-08-01_S16_L002_R1_001.fastq exists and overwrite=false What does it mean?…

Server denied on SRA download

Hey guys, I’ve recently been having some problems downloading .fastq files from SRA. I’m trying to download from the command line as follows (for example, inside a bash script modeled on sra-explorer.info): curl -L ftp.sra.ebi.ac.uk/vol1/fastq/SRR508/000/SRR5088930/SRR5088930_1.fastq.gz -o Pt35_On_RNA-Seq_1.fastq.gz The problem is…

Rsubread FeatureCounts return 0.0% assigned

Using featureCounts in the Rsubread package I am getting 0 annotations. I started from raw sequencing data and the Refseq genome and Refseq Genomic GTF files downloaded from here: www.ncbi.nlm.nih.gov/assembly/GCF_000001635.27/ through the download assembly button on the side. I had the top option to RefSeq for both downloads and chose…

paired reads have different names BWA MEM after samtools bam > fastq

I have used bwa mem to align to a host genome and output the unmapped reads. I have then sorted the resulting BAM and split it into paired fastq files. Why, after sorting the BAM file, am I…

mapping long-reads to a reference library

Hi, I have long PacBio reads and a reference library of only repeats. I want to map the long reads onto the repeat library using bwa mem. Is this command correct? bwa index mmm.pacbio.fastq.gz bwa mem mmm.pacbio.fastq.gz repeat-library.fasta | samtools sort…

How can I be sure that raw read counts are well processed from fastq files?

Hi. I’m new to bioinformatics and am trying to process fastq files to get a raw read count matrix. I downloaded fastq files from www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63452 I used fasterq-dump to download fastq files from SRR Aligned…

How to get FASTQ reads from the Short Read Archive (SRA)

Getting data out of the Short Read Archive is a tedious and error-prone process, thanks to the clunky interfaces and changing methodologies. If you want a subset of the reads, say 1000 reads, use fastq-dump -X 1000 SRR14575325; if you want the entire file, use fasterq-dump SRR14575325; if you…

wc -l or wc -c which linux command is best to find total read count in fastq file?

Hello there, I am working on RNA-seq data, and while counting reads I came across two types of commands, using either wc -l or wc -c. The out…
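
For context: wc -l counts lines and wc -c counts bytes, so neither number is a read count by itself. In the common FASTQ layout where every record takes exactly four lines, the read count is the line count divided by four. The Python sketch below illustrates that arithmetic; it is not from the original thread, the function names are hypothetical, and it assumes sequences are not wrapped over several lines.

```python
import gzip


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_reads(lines):
    """Count reads in a FASTQ with four lines per record.

    `lines` is any iterable of lines (an open file handle works).
    """
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4; file may be truncated")
    return n_lines // 4


# Example use with a hypothetical file name:
# with open_fastq("sample_R1.fastq.gz") as handle:
#     print(count_reads(handle))
```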

Bulk download w prefetch but unable to ‘zip’ (plus more…)

Hi (beginner here so go easy on me). I’m practicing different ways of downloading. I have various questions despite doing a lot of googling on the matter. 1) I’m trying to run something like this (I know these aren’t…

obtaining unique identifier for a sample from fastq files

In my fastq file, which is named “8230-001-001_CTGATCGT-GCGCATAT_L004_R1.fastq.gz”, I am looking for the “unique identifier for a sample”. Here are the first few lines of the file:
@A00379:446:HGTTYDSX2:4:1101:1217:1094_GTGCCAAAGCAC 1:N:0:CTGATCGT+GCGCATAT
ATGTGGGCAAGGAGGCCCAGAGCAAGAGAGGCATCCTGACCCTGAAGTACCCCATGGAACACGGCATCATCACCAACTGGGATGACATGGAGAAGATCTGGCACCACACCTTCTACAACGAGCTGCGTGTGGCCCCTGAGGAGC
+
FF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF,FFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFF:
@A00379:446:HGTTYDSX2:4:1101:11487:1094_ACGTTCAGCGTG 1:N:0:CTGATCGT+GCGCATAT
GGCGCTTGGCCTGTTCCATCTCCTCGTCCTTCTCTGCCAGCTTCCGCTCGATCTATGCCTTGATCTGGTTGAACTCTAGCTGGGCCCGGAGGATCTTGCCCTCCTCGTGCTCCAGGGAGGCCTGGGAAGGGGTGGGGTGAGGGC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFF:FFF,,FFFFFFFFFFFFFFFF
do you know…
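
Headers like the ones quoted above follow the usual Illumina layout: instrument, run number, flowcell ID, lane, tile, x and y, then a space and read number, filter flag, control number and index sequence. The Python sketch below splits one header into those fields. It is an illustration rather than an answer from the thread; the function name is hypothetical, and treating the text after the underscore as an appended tag (possibly a UMI) is an assumption.

```python
def parse_illumina_header(header):
    """Split an Illumina-style FASTQ header line into named fields."""
    name, _, comment = header.strip().lstrip("@").partition(" ")
    instrument, run, flowcell, lane, tile, x, y = name.split(":")
    # Assumption: anything after "_" in the last field is an appended tag.
    y, _, suffix = y.partition("_")
    read, filtered, control, index = comment.split(":")
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
        "tile": tile,
        "x": x,
        "y": y,
        "suffix": suffix,
        "read": read,
        "filtered": filtered,
        "control": control,
        "index": index,
    }


fields = parse_illumina_header(
    "@A00379:446:HGTTYDSX2:4:1101:1217:1094_GTGCCAAAGCAC 1:N:0:CTGATCGT+GCGCATAT"
)
print(fields["flowcell"], fields["lane"], fields["index"])
```

The index pair in the header matches the one embedded in the file name, which is a quick sanity check that a file and its reads belong to the same demultiplexed sample.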

Can I pass a fastq file twice in trimmomatic?

I would like to know if I can pass a fastq file through Trimmomatic, and then pass its results through again. Also, how can I be sure that a fastq file that I downloaded from public sources has already been trimmed…

Aspera: Failed to authenticate

I tried to download some fasta files from ENA today with the following code: cat fq.txt |while read id; do ascp -QT -l 300m -P33001 -k 1 -v -i /home/tomas/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@${id} . ; done One example line in my fq.txt: fasp.sra.ebi.ac.uk:/vol1/fastq/ERR249/001/ERR2497991/ERR2497991_1.fastq.gz But it comes with ascp:…

adapter trimming using trimmomatic

Hi All, I ran FastQC on my ChIP-seq dataset and a red flag is raised for overrepresented sequences. 29% are this sequence: GATCGGAAGAGCACACGTCTGAACTCCAGTCACACA (possible source: TruSeq adapter). I ran Trimmomatic with both TruSeq2 and TruSeq3 but neither seems to trim anything. Any suggestions? Thanks, Ritu…

Please advise a tutorial/course on genetic data analysis

Hello everyone! I’m by no means a bioinformaticist, but would like to learn some of the art (my background is chemistry/computer science/machine learning; I do ML-supported drug design). I would like to analyse human genetic data. Specifically, the task is as follows: given…

Bioinformatics Analyst – New York

POSITION RESPONSIBILITIES: The person will: Utilize existing pipelines to process and analyze high-throughput sequencing data, including bisulfite sequencing data. Manage and organize all bioinformatics sequencing data in the lab, including papillomavirus sequence, microbiome data (both 16S and other), and human genomic data. Construct phylogenic trees. The individual will be responsible for downloading large Fastq, BAM…

Difference between total number of reads in fastq file and no of bases/nt sequences in fastq file?

Hello there; I am a beginner in the data analysis domain and want to clarify the concepts of number of reads and number of bases in a fastq file. What is the difference between total…

How to find newline character in fastq file? Is it essential to remove them?

Hello there, I am a beginner in the data analysis domain and want to ask: How do I find newline characters in a fastq file? Is it essential to remove them? If not, then what impact they can…

What is the most optimal way to count the nucleotide bases of each fastq file in my directory using UNIX commands?

I have a bunch of fastq files, and I need to write a one-line UNIX command that will write the word count (wc) of how many nucleotides…
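
The post asks for a UNIX one-liner; the following is instead a Python sketch of the same count, included only to make the logic explicit. It sums the lengths of the sequence lines (the second line of every four-line record) for each file matching a glob pattern. The function names and the default pattern are hypothetical, and unwrapped four-line records are assumed.

```python
import glob
import gzip


def count_bases(lines):
    """Sum sequence lengths; the sequence is the 2nd line of each 4-line record."""
    return sum(len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1)


def count_bases_in_directory(pattern="*.fastq"):
    """Return {path: total bases} for every FASTQ file matching the pattern."""
    totals = {}
    for path in sorted(glob.glob(pattern)):
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as handle:
            totals[path] = count_bases(handle)
    return totals


# Example use in the current directory:
# for path, n_bases in count_bases_in_directory("*.fastq").items():
#     print(path, n_bases)
```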

bowtie2; using for multiple fastq files, linux loop code

Hi, I’m pretty new to Linux and ChIP-seq analysis. At this point, I have 100 fastq.gz files to be aligned to hg19. I already indexed my genome and called it hg19, and could align my reads individually with it but…

BBDuk quality filtering not producing expected results

I’m trying to trim/filter low quality reads from paired-end exome-seq data, using BBDuk. I used the command: for ea in $files; do R1="$ea" R2=$(echo $R1 | sed "s/R1/R2/") /home/shared/programs/bbmap/bbduk.sh -Xmx1g in1=$R1 in2=$R2 out1="$(echo $ea | sed s/.fastq.gz/_trimmed_filtered.fastq.gz/)" out2="$(echo $(echo $ea | sed…

extract specific information from fastq and sam files

I have fastq and SAM files that look like this: 7090-001-001_CTGATCGT-GCGCATAT_L004_R1.fastq.gz 7090-001-001_CTGATCGT-GCGCATAT_L004.sam I need to extract the following information from the files: 1) Read Group Library Identifier 2) Read Group Platform (usually Illumina) 3) Read Group Platform Unit 4) Read Group Sample…

Read at offset 59230953264 does not begin with “@”.

I have been trying to assemble a human genome that has been sequenced on Nanopore-PromethION, basecalled on the latest version of bonito, and am now trying to run an assembly using shasta. The basecalling had no issues, but now when…
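
An error like the one in the title suggests that some record in the FASTQ does not start where a parser expects it to. The Python sketch below shows one way to locate the first such record independently of the assembler. It is a hedged illustration: the function name is hypothetical, strict four-line records are assumed, and whether the offset Shasta reports is a byte offset in the same sense is an assumption.

```python
def find_first_bad_header(lines):
    """Return (record_number, byte_offset, line) for the first record whose
    header line does not begin with b"@", or None if all headers look fine.

    `lines` is an iterable of bytes lines, e.g. a file opened in "rb" mode.
    """
    offset = 0
    for i, line in enumerate(lines):
        # In a four-line-per-record FASTQ, lines 0, 4, 8, ... are headers.
        if i % 4 == 0 and not line.startswith(b"@"):
            return i // 4 + 1, offset, line
        offset += len(line)
    return None


# Example use with a hypothetical file name:
# with open("reads.fastq", "rb") as handle:
#     print(find_first_bad_header(handle))
```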

Variants in untargeted genes identified after targeted exome sequencing analysis

Hi, I recently analyzed some targeted exome sequencing samples, which were provided to us by our collaborators, for which I do not possess the target gene list. Upon analysis, I am informed that some of the genes – whose…

What does “total read count” mean in a FastQC file, and how is it helpful for analysis?

Hello there, I am working on RNA-seq data and I am confused about the read count in a fastq file. Can anyone explain what “Total read count” means? Does it mean we are counting…

bioconductor – I can’t launch FastqCleaner I always get a warning message and the application never starts

I tried to install all the needed and related packages, but I still do not know what the problem is. Can anyone please help, or is there anything else I can do? I always get this over and over: Warning: Error in : Navigation containers expect a collection of `bslib::nav()`/`shiny::tabPanel()`s and/or `bslib::nav_menu()`/`shiny::navbarMenu()`s….

Bioconductor – esATAC

DOI: 10.18129/B9.bioc.esATAC     This package is for version 3.12 of Bioconductor; for the stable, up-to-date release version, see esATAC. An Easy-to-use Systematic pipeline for ATACseq data analysis Bioconductor version: 3.12 This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq Reads. It covers raw…

samtools to count the number of reads mapped to each spike-in for each sample

My goal is to use STAR to create a new genome with the spike-ins listed below by combining both hg38.fa and spike-in. Once I have the genomes created, I’ll align FASTQs to this newly created…

randomreads.sh adding abundances for metagenomic like distribution

Hi, I have 9 genomes and would like to produce a metagenome-like distribution using randomreads.sh. I concatenated the genome fasta files into one reference file, then ran as below. ../bbmap/randomreads.sh ref=simplified_catgenome.fasta out1=20M.read1.fastq out2=20M.read2.fastq length=125 paired=t metagenome=t genome=9 reads=20000000 However, I would like…

Aligning multiple fastq files with genome in one script/one line with STAR

Hi there! This is probably a VERY basic question but I don’t have the best terminal skills so I’m struggling a little. I want to apply what I wrote below for all my fastq scripts without doing a for loop or manually writing the code for each (ideally they all…

RRBS methylation analysis

Hello, I want to use the HMST-Seq analyzer (www.sciencedirect.com/science/article/pii/S2001037020304232) tool for my RRBS data analysis (directional) but I am stuck at the first step. In order to run the HMST-Seq analyzer pipeline, I need a CpG.txt or CpG.bed file as input. So, I first performed bismark analysis on my control and mutated…

NGSeq/DHPGIndex: This tool is for compressing and indexing pan-genomes and genome sequence collections for scalable sequence and read alignment purposes.

General: This tool is for compressing and indexing pan-genomes and genome sequence collections for scalable sequence and read alignment purposes. The pipeline can be deployed in a cloud computing environment or in a dedicated computing cluster. The tool extends the CHIC aligner gitlab.com/dvalenzu/CHIC with distributed and scalable features. DHPGIndex has been tested…

How to run MLST with multiple fastq files

Hi, I am trying to run a bash script for MLST at CGE (cge.cbs.dtu.dk/services/MLST/). I have fastq files and downloaded the MLST program. When I ran the program with a single fastq file (R1 and R2) it is able to generate the…

fasterq-dump only downloads certain runs?

Hey all, I wrote a script in Python that gets the SRR IDs from an SRP ID and then downloads all FASTQ files for every SRR. How can it only work on some SRRs? For example, SRP000124 provides these: SRR954969 SRR001030 and downloads the…

Comparison of sequencing data processing pipelines and application to underrepresented African human populations | BMC Bioinformatics

Literature survey We reviewed the processing pipelines of 29 HTS studies, 23 of which focus on human populations and six on other mammals (listed in Table 1). Table 1 List of studies included in the literature survey We summarized the information for some processing steps in Table 2 (see Additional…

fasted

I’m using an M1 Mac and I’ve installed fastqc using Homebrew and also from the official website www.bioinformatics.babraham.ac.uk/projects/fastqc/ but I’m not able to run the command (./fastqc) in the terminal. This is the output I get: zsh: permission denied: ./fastqc and my fastqc does not show file option to…

Twist Bioscience Staff Bioinformatics Engineer, Biopharma

Twist Biopharma is seeking a Bioinformatics Engineer to develop and integrate workflows, analyses, and computational tools involved in the production and research of antibodies and proteins. While you have a broad interest in biotech and related scientific technologies, you also understand that computer science resources must be utilized to reach…

fastp 0.23.0 released, runs 2x faster, and generates reproducible outputs.

fastp, the widely used ultra-fast FASTQ preprocessing and QC tool, has been cited over 2,000 times to date. Today, a new version, v0.23.0, has been released, with a great improvement in performance. The threading and I/O modules have been…

How to quantify piRNAs ?

Hi, I’m trying to create a piRNA count table from samples enriched with small RNAs (~31 bases) using the piRBase reference. Despite all my reading I haven’t found a simple way to do that. In the first place I tried to quantify piRNAs with…

kallisto genomebam not showing reads on igv

Hello! I am trying to produce BAM files to load into IGV after kallisto quant with the --genomebam option. After producing and loading the pseudoalignment BAM into IGV, it is empty. This is my initial command: kallisto quant -i Homo_sapiens.GRCh38.cdna.all.release-100.idx -o pseudo -t 10 --genomebam -g Homo_sapiens.GRCh38.100.gtf -c hg38.chrom.sizes R1.fastq.gz.trim_1.fq.gz…

Single-cell DNA sequencing on Pediatric MDS

Study Description: Single-cell DNA sequencing with antibody-oligonucleotide staining was performed using the Mission Bio Tapestri single-cell DNA sequencing platform, per the manufacturer’s instructions. All libraries were sized and quantified using an Agilent Bioanalyzer and pooled for sequencing on an Illumina NovaSeq 6000 with 150 bp paired-end multiplexed runs. Fastq files generated…

Wyss Institute Spinout Pluto Biosciences Bets on Collaborative Bioinformatics Platform

CHICAGO – Pluto Biosciences, a recent spinout of the Wyss Institute, seeks to facilitate collaborations between researchers through a shared bioinformatics platform. Next week, the firm plans to announce that it has closed a seed round worth slightly more than $1 million and that it has launched a free version of…

genome load in a for-loop

I am running multiple samples defined in files.txt. How should I load the genome correctly to avoid it being loaded for each iteration in the loop? I tried the following, but the --genomeLoad LoadAndKeep required fastq-files to be loaded, in addition to the…

Best tools for miRNA target prediction

Hi, I have sequencing data for microRNA on common bean (Phaseolus vulgaris), all stored in fastq format. I have used sRNAtoolbox to identify the miRNAs; now I would like to understand which might be the targets of these miRNAs in the common bean…

Remove reads from FASTQ file based on missing fixed base

Hello everybody, I have a question regarding processing raw FASTQ files based on a specific UMI approach. Basically, we employed a strategy for our paired-end sequencing experiment, where we use 6 nt UMIs in our library. Following the UMI sequence…

Frontiers | Free DNA and Metagenomics Analyses: Evaluation of Free DNA Inactivation Protocols for Shotgun Metagenomics Analysis of Human Biological Matrices

Introduction The advent of modern culture-independent bacterial DNA sequencing technologies allows to achieve an in-depth characterization of the microbial communities inhabiting the human and animal bodies as well as the microbial consortia residing in other environments (Browne et al., 2016; Milani et al., 2017). These innovative metagenomics approaches, such as…

How to obtain a protein abundance profile based on a BAM file created using bowtie2 (alignment performed with nucleotide alignment)?

I have a set of metatranscriptomics sequence samples. Let’s say the following sample fastq file is what I have. I would like to map these reads to a bacterial…

Comparative cellular analysis of motor cortex in human, marmoset and mouse

Statistics and reproducibility For multiplex fluorescent in situ hybridization (FISH) and immunofluorescence staining experiments, each ISH probe combination was repeated with similar results on at least two separate individuals per species, and on at least two sections per individual. The experiments were not randomized and the investigators were not blinded…

chimeric vs unaligned reads

Hello, I’m trying to obtain the chimeric alignments from a BAM file that originally was generated by STAR and that only has a list of unaligned reads at the end of the file (so no SA tag). If I convert that BAM file back into…

Trimmomatic unknown trimmer

Hi all, I am really struggling with Trimmomatic these days and getting confused. I’m trying to run trimmomatic: java -jar trimmomatic-0.39.jar PE -threads 2 -phred33 ly1_1.fq ly1_2.fq ly1_paired_1.trimmed.fq ly1_unpaired_1.trimmed.fq ly1_paired_2.trimmed.fq ly1_unpaired_2.trimmed.fq HEADCROP:9 This is my error: TrimmomaticPE: Started with arguments:` -threads 2 -phred64 ly1_1.fq ly1_2.fq ly1_paired_1.trimmed.fq…

variants only found on inversion reads in IGV

Hi, I used the GATK germline variant calling pipeline to call short variants on paired-end fastq files. After I got the final analysis-ready VCF and applied some extra filters, I inspected the BAM files in IGV for those variants of interest and found…

Analyzing gene expression in different RNAseq datasets

Hello! I really need some assistance here. I came up with an analysis of my own that makes sense to me, but I’m really new to this (I started studying bioinformatics on my own during the pandemic) and I’m not sure if I…

FastUniq deduplicate only working for forward read

I am using FastUniq to deduplicate Illumina MiSeq paired-end data, and using FastQC to compare quality control (QC) reports before and after deduplication. I figured out how to use FastUniq, but for some reason, it only seems to be effective on the first read pair, and not nearly as much…

Trimmomatic errors

Any ideas here? I have run this code and keep getting the error “cannot execute binary file”. Is this a problem with my input files or with Trimmomatic itself? module load trimmomatic module load jdk/1.8.0.221 export TRIMMOMATIC=/opt/software/trimmomatic/0.39/trimmomatic-0.39.jar java -jar $TRIMMOMATIC PE-phred33 /scratch/marionr/Raw_Muscle_2021/41_L4_R1_001.fastq/scratch/marionr/Raw_Muscle_2021/41_L4_R2_001.fastq /scratch/marionr/Raw_Muscle_2021/trimmed_41_L004_R1_paired.fastq /scratch/marionr/Raw_Muscle_2021/trimmed_41_L004_R1_unpaired.fastq /scratch/marionr/Raw_Muscle_2021/trimmed_41_L004_R2_paired.fastq /scratch/marionr/Raw_Muscle_2021/trimmed_41_L004_R2_unpaired.fastq LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15…

Bioconductor – rSFFreader

DOI: 10.18129/B9.bioc.rSFFreader. rSFFreader reads in sff files generated by Roche 454 and Life Sciences Ion Torrent sequencers. Bioconductor version: Release (3.6). rSFFreader reads sequence, qualities and clip point values from sff files generated by Roche 454 and Life Sciences Ion Torrent sequencers into similar classes as are present…

How To Extract A Sequence From A Big (6Gb) Multifasta File?

I want to extract some sequences by ID from a multifasta file. Using Perl is not possible because it gave an error when indexing the database, maybe because of its size? Is there any way to do this…
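
One possible approach, sketched here under assumptions rather than taken from the original thread, is to skip indexing altogether and stream the file line by line, keeping only the records whose ID is wanted. The function name `extract_fasta_records`, the demo records, and the choice of "first token after `>`" as the ID are all illustrative assumptions.

```python
import io


def extract_fasta_records(handle, wanted_ids):
    """Yield (header, sequence) for records whose ID is in wanted_ids.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    The input is read one line at a time, so memory use stays small
    even for a multi-gigabyte file.
    """
    wanted = set(wanted_ids)
    header, chunks, keep = None, [], False
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it was wanted.
            if keep:
                yield header, "".join(chunks)
            header = line[1:]
            tokens = header.split()
            keep = bool(tokens) and tokens[0] in wanted
            chunks = []
        elif keep:
            chunks.append(line.strip())
    # Emit the last record of the file if it was wanted.
    if keep:
        yield header, "".join(chunks)


# Small in-memory demo with made-up records; for a real file,
# pass an open file handle instead of the StringIO object.
demo = io.StringIO(">seq1 first\nACGT\nACGT\n>seq2\nGGGG\n>seq3 third\nTTTT\n")
hits = list(extract_fasta_records(demo, {"seq1", "seq3"}))
```

Because nothing is indexed or held in memory beyond the current record, the file size should matter only for run time. Dedicated tools with an on-disk index may still be faster when the same file is queried repeatedly.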

Segmentation fault (core dumped) during bwa mem mapping

Hi, I ran bwa mem with trimmed fastq files (ERR2593198) but I saw the following error: bwa mem CHO-PICR.fasta ../2.ngsShort/trimmed_ERR2593198_1.fastq ../2.ngsShort/trimmed_ERR2593198_2.fastq [M::bwa_idx_load_from_disk] read 0 ALT contigs @PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:../downloads/bwa-0.7.17/bwa mem CHO-PICR.fasta ../2.ngsShort/trimmed_ERR2593198_1.fastq ../2.ngsShort/trimmed_ERR2593198_2.fastq [M::process] read 92156 sequences (10000179 bp)… Segmentation fault (core dumped) To figure out what’s happening, I…

Running into StopIteration Error on UMITools

Hello all, I am trying to transfer UMIs (already extracted) from the headers of raw reads to the headers of reads that had already been filtered for rRNA (using SortMeRNA) and trimmed (for adapters and quality using Trimmomatic). The command I’m running in…

BCL files conversion to FASTQ without SampleSheet.csv

Dear community, I have got NGS data which is basically the BaseCalls folder with .bcl files. I want to know how to successfully convert .bcl files to .fastq format. So far, I have been using the bcl2fastq program, however, I have no…

Very high read count variation in WGS alignments

Hello everyone, I am new to NGS analysis, but have tried to learn through test data before starting with my own data. And now, it seems I am stuck somewhere. So, looking for some help/suggestions/ideas for the same. I have few…

How to search for primer sequences in fastq files generated after amplicon sequencing

Hi all, I need some help with grep or any other command that will help do the job. I am very new to the command line. Any help is appreciated, thank you. I recently did some amplicon sequencing of a multiplexed PCR reaction. I used nearly 90 primer pairs to…
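
As a rough alternative to grep, and only as a sketch under assumptions, a short Python function can count how many reads contain each primer in either orientation. The names `count_primer_hits` and `reverse_complement`, the primer names, and the demo sequences below are made up for illustration. The sketch assumes plain 4-line FASTQ records and exact matches (no mismatches, no IUPAC ambiguity codes).

```python
import io

# Translation table for complementing plain A/C/G/T bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_primer_hits(handle, primers):
    """Count reads that contain each primer, in either orientation.

    primers: dict mapping a primer name to its sequence.
    Assumption: every FASTQ record spans exactly four lines, so the
    sequence is always the second line of each group of four.
    """
    patterns = {
        name: (seq.upper(), reverse_complement(seq.upper()))
        for name, seq in primers.items()
    }
    counts = {name: 0 for name in primers}
    for line_number, line in enumerate(handle):
        if line_number % 4 != 1:
            continue  # skip header, separator and quality lines
        read = line.strip().upper()
        for name, (forward, reverse) in patterns.items():
            if forward in read or reverse in read:
                counts[name] += 1
    return counts


# Made-up demo reads; for a gzipped file, a handle from
# gzip.open(path, "rt") could be passed instead.
demo = io.StringIO(
    "@r1\nTTACGTGGCC\n+\nIIIIIIIIII\n"
    "@r2\nGGCCACGTAA\n+\nIIIIIIIIII\n"
    "@r3\nAAAAAAAAAA\n+\nIIIIIIIIII\n"
)
primer_counts = count_primer_hits(demo, {"p1": "ACGTGG", "p2": "CCCCCC"})
```

With around 90 primer pairs this simple loop checks every primer against every read, which is easy to follow but not fast. Tools built for primer matching would be a better fit if mismatches or degenerate bases need to be tolerated.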

Half sense and half anti-sense

Hello, I have small RNA-seq data from plasma, but it seems strange to me that I get so many “antisense” hits for both RefSeq and lncRNAs. I searched and people say there might be something wrong with the way in which the reads are processed…

0 + 0 mapped when used flagstat

I’ve downloaded BAM files from ENA and tried samtools flagstat to confirm mapping. But the result was like this. How do I interpret this? Should I download FASTQ files? Thanks. You may have downloaded unaligned BAM (uBAM)…

Batch File For Loop Help

Hello, I’m trying to run bowtie for several files using a loop so I don’t have to repeat the command over and over. The loop looks something like this: mm10="GenomeFolder/Bowtie2Index/genome" for (( i = 76; i <= 79; i++ )) do bowtie2 -x…

Research Assistant in Genomics / Bioinformatics Jobs at Nutrition Technologies, Singapore

Less than a year of experience. Job description: Research Assistant in Genomics / Bioinformatics, Nutrition Technologies. Nutrition Technologies is an innovative company producing insects as a sustainable protein source for the animal feed…

Bioinformatics Biomedical Scientist – Bilsborough Lab

Bioinformatics Biomedical Scientist – Bilsborough Lab – Inflammatory Bowel Diseases Drug Discovery and Development. Requisition # HRC0697538. Join us in accelerating the pace of research and discovery within our unique IBD3 lab! Cedars-Sinai provides virtually every known gastroenterologic analytical procedure and treatment…

Bad Per sequence GC content

Hello, Biostars! I have two fastq files of paired-end reads, which I want to use for SNV calling. Quality checking in FastQC showed bad Per base sequence content and a couple of warnings in both Per sequence GC content and Sequence Length Distribution – you can see it in the…

What current tools are used to phase haplotypes from a FASTQ file?

I have a FASTQ file representing a WGS from Dante Labs. Is it possible to phase the sequence into haplotypes, and what software should I be looking at to do this? Any working command line examples would…

Crossing design shapes patterns of genetic variation in synthetic recombinant populations of Saccharomyces cerevisiae

Population creation. All yeast strains used in this study originated from heterothallic, haploid, barcoded derivatives of the SGRP yeast strain collection [30]. A subset of 12 of these haploid strains, originally isolated from distinct geographic locations worldwide, were used to create the synthetic populations we describe here (See Supplementary Fig. S1…

package does not ship resource files

Control: found -1 38.90+dfsg-1 Control: tag -1 confirmed Hi all, Andreas Tille, on 2021-09-30: > On Thu, Sep 30, 2021 at 01:22:23PM -0400, Robert wrote: > > The bbmap package does not ship the needed resource files, which causes some of the included tools not to…

Unusual FastQC sequence distribution from small RNA seq

Hi, I am attempting to analyse some small RNA sequencing data produced using an Illumina TruSeq Small RNA Library Preparation Kit. The RNA was isolated from sheep serum. Pre-sequencing QC was fine and post sequencing looked good too – apart from…

bowtie2 results with NGS data

Hello, I made an index from my reference file and ran a command to align my metagenomic data with bowtie2. The command is bowtie2 -x <index_referance> -1 <paired_end_read_path_1.fastq> -2 <paired_end_read_path_2.fastq> -s <outputname.sam> and I got this result on screen without getting a SAM file in the output folder. 16190304…

Is it okay to convert bam files to fastq and getting seqkit results?

I’m sorry if this is a nonsense question, but I am a rookie and although I searched about this I couldn’t find any decent answer. I had the fastq files but they somehow became corrupted. I…

biopython write fasta

Step 1 − Create a file named blast_example.fasta in the Biopython directory and give the below sequence information as input. 3. """Bio.SeqIO support for the "fasta" (aka FastA or Pearson) file format. Then we save this line of text to the output file: Now we have finished all the genes,…

Splitting fastq files

Hi there, I was just wondering if anyone could offer any advice on splitting two merged fastq files (R1 and R2) into per-sample fastq files? I’ve downloaded several biosamples from SRA via ftp, but they are merged into one file and I am unsure how…

Why is bcl2fastq2 taking so long to calculate stats?

Our lab has been using bcl2fastq v2.20.0.422 to demultiplex RNA-seq data sequenced on an Illumina NovaSeq machine on a beefy EC2 instance, and we’ve run into a strange problem: while demultiplexing is very fast, generating the stats files is unbearably slow. In a recent example, we demultiplexed one…

Converting .1 or .man files from NCBI SRA to fastq

Hi there, Sorry if this is a stupid question, but I’m hoping someone can help me. I’m trying to access fastq files from the SRA run browser (see this link: trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR3138122). However, all 53 samples seem to be contained…

Can’t download files from the ENA (European Nucleotide Archive) site

Looks like ENA does not have the data links. If you are simply interested in following the tutorial then you can let ENA know about this problem. If you actually want the data then do the following. Use sra-explorer…

NGS-Barcode-Count

Multithreaded barcode counter originally written for DNA encoded libraries (DEL). I expanded it for use with other data types, such as sequencing from high throughput CRISPR screens. It works very well with all test datasets I’ve tried. As a comparison, the group I was working with was…

How to align incomplete pairs and single end reads?

Hi everyone, I converted bam files to fastq files to realign them with BWA. For some bam files, I get single end read files and files with incomplete pairs. I used the tool bam2fastq by biobambam2 which states single end…

Per base sequence Quality in Fast QC report

Hello, I am working with RNA-seq data and generated a FastQC quality report. While checking the “per base sequence quality” section of the report, I noticed that the box plots for bases (showing upper and lower quartiles) present towards the end of the…

galaxy training network

You can also use the following Docker image for these tutorials: It will launch a flavored Galaxy instance available on localhost:8080. Galaxy Remote Servicing Suite kit with license: R057-CD-DG: Galaxy Remote Servicing Suite (Network enabled) with one Dongle: R058-CD-DG: Galaxy User Management Suite, with one Dongle: YY0-0010: Additional 1-user dongle…

biopython extract sequence from fasta

My two questions are: What is the simplest way to do this? This unique book shows you how to program with Python, using code examples taken directly from bioinformatics. using python-bloom-filter, just replace the set with seen = BloomFilter(max_elements=10000, error_rate=0.001). This book is suitable for use as a classroom textbook,…

cellranger count help for fastq files with different sample order/number

Hello, I am trying to analyze my dataset of fastq files of one sample with different sample order, like Test1_S1_L001_R1_001.fastq.gz, Test1_S2_L001_R1_001.fastq.gz and Test1_S3_L001_R1_001.fastq.gz. I am not sure whether I should specify the sample order?…
